Aaron Pham | 8c2867d26d | style: define experimental guidelines (#168) | 2023-07-31 07:54:26 -04:00
Aaron Pham | ef94c6b98a | feat(container): vLLM build and base image strategies (#142) | 2023-07-31 02:44:52 -04:00
aarnphm-ec2-dev | 084786c898 | fix(cli): `openllm models` for showing available | 2023-07-24 23:00:03 +00:00
    Signed-off-by: aarnphm-ec2-dev <29749331+aarnphm@users.noreply.github.com>
Aaron Pham | 7eabcd4355 | feat: vLLM integration for PagedAttention (#134) | 2023-07-24 15:42:17 -04:00
aarnphm-ec2-dev | e4ac0ed8b7 | fix(cuda): support loading in single GPU | 2023-07-21 08:10:01 +00:00
    add available_devices for getting # of available GPUs
    Signed-off-by: aarnphm-ec2-dev <29749331+aarnphm@users.noreply.github.com>
aarnphm-ec2-dev | 3e50f0a851 | fix(cli): implement latest bentoml 1.0.25 features | 2023-07-20 20:51:27 +00:00
    Signed-off-by: aarnphm-ec2-dev <29749331+aarnphm@users.noreply.github.com>
Aaron Pham | c1ddb9ed7c | feat: GPTQ + vLLM and LlaMA (#113) | 2023-07-19 18:12:12 -04:00
    Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Aaron Pham | fc963c42ce | fix: build isolation (#116) | 2023-07-16 01:52:21 -04:00
    Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
HeTaoPKU | fd9ae56812 | fix(baichuan): add "cpm-kernel" as additional requirements (#117) | 2023-07-15 23:16:05 -04:00
    This is to support the 13b variant of baichuan
    Co-authored-by: the <tao.he@hulu.com>
    Co-authored-by: Aaron <29749331+aarnphm@users.noreply.github.com>
HeTaoPKU | 09b0787306 | feat(models): Baichuan (#115) | 2023-07-15 22:01:37 -04:00
    Co-authored-by: the <tao.he@hulu.com>
    Co-authored-by: Aaron <29749331+aarnphm@users.noreply.github.com>
Aaron Pham | b2dba6143f | fix(resource): correctly parse CUDA_VISIBLE_DEVICES (#114) | 2023-07-15 07:19:35 -04:00