Commit Graph

13 Commits

Author SHA1 Message Date
Aaron
d4fbfa5e5c fix: custom release strategy for correct naming
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>
2023-08-02 03:03:21 -04:00
Aaron Pham
acb81a6e1a fix(build): dispatch container via workflow calls (#174)
add OPENLLM_USE_LOCAL_LATEST as default behaviour within container
2023-08-02 01:54:10 -04:00
Aaron
6ba8899743 fix: remove invalid OPENLLMDEVDEBUG envvar
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>
2023-08-01 01:52:08 -04:00
Aaron Pham
8c2867d26d style: define experimental guidelines (#168)
2023-07-31 07:54:26 -04:00
Aaron Pham
ef94c6b98a feat(container): vLLM build and base image strategies (#142)
2023-07-31 02:44:52 -04:00
aarnphm-ec2-dev
fc66ff275b fix: make sure to add torch to dependencies
Signed-off-by: aarnphm-ec2-dev <29749331+aarnphm@users.noreply.github.com>
2023-07-28 00:01:52 +00:00
Aaron Pham
dcd34bd381 fix(build): running bento insider container (#141)
Behaviour of `docker run` should be the same with `openllm start`
2023-07-25 04:24:28 -04:00
Aaron Pham
7eabcd4355 feat: vLLM integration for PagedAttention (#134)
2023-07-24 15:42:17 -04:00
aarnphm-ec2-dev
e4ac0ed8b7 fix(cuda): support loading in single GPU
add available_devices for getting # of available GPUs

Signed-off-by: aarnphm-ec2-dev <29749331+aarnphm@users.noreply.github.com>
2023-07-21 08:10:01 +00:00
Aaron
9ccbd60584 revert: include configuration to labels
This is used for starting up the bento

Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>
2023-07-20 23:37:25 -04:00
aarnphm-ec2-dev
f91e750fcd fix(build): remove configuration from labels
labels will only include model_id for it to work with bentocloud

Signed-off-by: aarnphm-ec2-dev <29749331+aarnphm@users.noreply.github.com>
2023-07-21 03:30:59 +00:00
aarnphm-ec2-dev
b31cd0460b fix: correct tag inference for model-id
in the case of build, the model_id is passed as a full valid tag under
bento store

XXX: We will need to fix this later

Signed-off-by: aarnphm-ec2-dev <29749331+aarnphm@users.noreply.github.com>
2023-07-20 21:40:56 +00:00
Aaron Pham
c1ddb9ed7c feat: GPTQ + vLLM and LlaMA (#113)
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
2023-07-19 18:12:12 -04:00