Aaron Pham
ef94c6b98a
feat(container): vLLM build and base image strategies (#142)
2023-07-31 02:44:52 -04:00
Aaron Pham
4fae00b68b
fix(ci): correct tag for checkout (#150)
2023-07-25 14:11:03 -04:00
Aaron Pham
c391717226
feat(ci): automatic release semver + git archival installation (#143)
2023-07-25 04:18:49 -04:00
Aaron Pham
7eabcd4355
feat: vLLM integration for PagedAttention (#134)
2023-07-24 15:42:17 -04:00
Aaron Pham
c1ddb9ed7c
feat: GPTQ + vLLM and LlaMA (#113)
...
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
2023-07-19 18:12:12 -04:00
Aaron Pham
fc963c42ce
fix: build isolation (#116)
...
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
2023-07-16 01:52:21 -04:00
aarnphm-ec2-dev
d37d14e52b
fix(tests): mark package on CI to xfail
...
XXX: @aarnphm to solve build isolation when bandwidth allows. Currently
this is not a problem when running locally.
`openllm build` just works, whereas `openllm.build` won't work
sequentially.
Address some type stubs for jupytext
Signed-off-by: aarnphm-ec2-dev <29749331+aarnphm@users.noreply.github.com>
2023-07-15 12:48:28 +00:00
Aaron Pham
b2dba6143f
fix(resource): correctly parse CUDA_VISIBLE_DEVICES (#114)
2023-07-15 07:19:35 -04:00
aarnphm-ec2-dev
7824332a01
chore: remove auto workers
...
Signed-off-by: aarnphm-ec2-dev <29749331+aarnphm@users.noreply.github.com>
2023-07-10 21:51:12 +00:00
Aaron Pham
c7f4dc7bb2
feat(test): snapshot testing (#107)
...
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
2023-07-10 17:23:19 -04:00
Aaron Pham
fb849a384e
feat: GPTNeoX (#106)
2023-07-07 03:05:40 -04:00
Aaron Pham
d6303d306a
perf: fixing import custom paths and cleanup serialisation (#102)
2023-07-04 12:49:14 -04:00
Aaron Pham
bd4cc9b3ff
fix: loading local (#87)
...
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
2023-06-28 11:25:54 -04:00
Aaron Pham
db1494a6ae
feat(start): starting bento and fix load (#80)
2023-06-27 12:45:17 -04:00
Aaron Pham
74fdd5e259
feat: release binary distribution (#66)
2023-06-25 10:38:03 -04:00
Aaron Pham
f0773f2d01
chore: add more test matrices (#70)
2023-06-25 03:44:03 -04:00
Aaron Pham
3593c764f0
fix(test): robustness (#64)
2023-06-24 11:10:07 -04:00
Aaron
622a2fb37d
fix: separate hatch config
...
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>
2023-06-19 03:39:05 -04:00