Aaron
d4fbfa5e5c
fix: custom release strategy for correct naming
...
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>
2023-08-02 03:03:21 -04:00
Aaron Pham
acb81a6e1a
fix(build): dispatch container via workflow calls (#174)
...
add OPENLLM_USE_LOCAL_LATEST as default behaviour within container
2023-08-02 01:54:10 -04:00
Aaron
6ba8899743
fix: remove invalid OPENLLMDEVDEBUG envvar
...
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>
2023-08-01 01:52:08 -04:00
Aaron Pham
8c2867d26d
style: define experimental guidelines (#168)
2023-07-31 07:54:26 -04:00
Aaron Pham
ef94c6b98a
feat(container): vLLM build and base image strategies (#142)
2023-07-31 02:44:52 -04:00
aarnphm-ec2-dev
fc66ff275b
fix: make sure to add torch to dependencies
...
Signed-off-by: aarnphm-ec2-dev <29749331+aarnphm@users.noreply.github.com>
2023-07-28 00:01:52 +00:00
Aaron Pham
dcd34bd381
fix(build): running bento inside container (#141)
...
Behaviour of `docker run` should be the same as `openllm start`
2023-07-25 04:24:28 -04:00
Aaron Pham
7eabcd4355
feat: vLLM integration for PagedAttention (#134)
2023-07-24 15:42:17 -04:00
aarnphm-ec2-dev
e4ac0ed8b7
fix(cuda): support loading in single GPU
...
add available_devices for getting # of available GPUs
Signed-off-by: aarnphm-ec2-dev <29749331+aarnphm@users.noreply.github.com>
2023-07-21 08:10:01 +00:00
Aaron
9ccbd60584
revert: include configuration to labels
...
This is used for starting up the bento
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>
2023-07-20 23:37:25 -04:00
aarnphm-ec2-dev
f91e750fcd
fix(build): remove configuration from labels
...
labels will only include model_id for it to work with bentocloud
Signed-off-by: aarnphm-ec2-dev <29749331+aarnphm@users.noreply.github.com>
2023-07-21 03:30:59 +00:00
aarnphm-ec2-dev
b31cd0460b
fix: correct tag inference for model-id
...
in the case of build, the model_id is passed as a full valid tag under
bento store
XXX: We will need to fix this later
Signed-off-by: aarnphm-ec2-dev <29749331+aarnphm@users.noreply.github.com>
2023-07-20 21:40:56 +00:00
Aaron Pham
c1ddb9ed7c
feat: GPTQ + vLLM and LlaMA (#113)
...
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
2023-07-19 18:12:12 -04:00