Aaron Pham | 8c2867d26d | style: define experimental guidelines (#168) | 2023-07-31 07:54:26 -04:00
Aaron Pham | ef94c6b98a | feat(container): vLLM build and base image strategies (#142) | 2023-07-31 02:44:52 -04:00
aarnphm-ec2-dev | 084786c898 | fix(cli): `openllm models` for showing available | 2023-07-24 23:00:03 +00:00
    Signed-off-by: aarnphm-ec2-dev <29749331+aarnphm@users.noreply.github.com>
Aaron Pham | 7eabcd4355 | feat: vLLM integration for PagedAttention (#134) | 2023-07-24 15:42:17 -04:00
aarnphm-ec2-dev | e4ac0ed8b7 | fix(cuda): support loading in single GPU | 2023-07-21 08:10:01 +00:00
    add available_devices for getting # of available GPUs
    Signed-off-by: aarnphm-ec2-dev <29749331+aarnphm@users.noreply.github.com>
aarnphm-ec2-dev | 3e50f0a851 | fix(cli): implement latest bentoml 1.0.25 features | 2023-07-20 20:51:27 +00:00
    Signed-off-by: aarnphm-ec2-dev <29749331+aarnphm@users.noreply.github.com>
Aaron Pham | c1ddb9ed7c | feat: GPTQ + vLLM and LlaMA (#113) | 2023-07-19 18:12:12 -04:00
    Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Aaron Pham | fc963c42ce | fix: build isolation (#116) | 2023-07-16 01:52:21 -04:00
    Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
HeTaoPKU | fd9ae56812 | fix(baichuan): add "cpm-kernel" as additional requirements (#117) | 2023-07-15 23:16:05 -04:00
    This is to support the 13b variant of baichuan
    Co-authored-by: the <tao.he@hulu.com>
    Co-authored-by: Aaron <29749331+aarnphm@users.noreply.github.com>
HeTaoPKU | 09b0787306 | feat(models): Baichuan (#115) | 2023-07-15 22:01:37 -04:00
    Co-authored-by: the <tao.he@hulu.com>
    Co-authored-by: Aaron <29749331+aarnphm@users.noreply.github.com>
Aaron Pham | b2dba6143f | fix(resource): correctly parse CUDA_VISIBLE_DEVICES (#114) | 2023-07-15 07:19:35 -04:00