OpenLLM

mirror of https://github.com/bentoml/OpenLLM.git synced 2026-01-22 06:19:35 -05:00

Author	SHA1	Message	Date
Aaron Pham	8c2867d26d	style: define experimental guidelines (#168)	2023-07-31 07:54:26 -04:00
Aaron Pham	ef94c6b98a	feat(container): vLLM build and base image strategies (#142)	2023-07-31 02:44:52 -04:00
aarnphm-ec2-dev	56bf84a760	fix(ci): make sure to exclude generated _version.py Signed-off-by: aarnphm-ec2-dev <29749331+aarnphm@users.noreply.github.com>	2023-07-25 09:55:24 +00:00
Aaron Pham	dcd34bd381	fix(build): running bento insider container (#141) Behaviour of `docker run` should be the same with `openllm start`	2023-07-25 04:24:28 -04:00
Aaron Pham	c391717226	feat(ci): automatic release semver + git archival installation (#143)	2023-07-25 04:18:49 -04:00
Aaron Pham	7eabcd4355	feat: vLLM integration for PagedAttention (#134)	2023-07-24 15:42:17 -04:00
dependabot[bot]	9afbdc5198	chore(deps): update bitsandbytes requirement from <0.40 to <0.42 (#137) Updates the requirements on [bitsandbytes](https://github.com/TimDettmers/bitsandbytes) to permit the latest version. - [Release notes](https://github.com/TimDettmers/bitsandbytes/releases) - [Changelog](https://github.com/TimDettmers/bitsandbytes/blob/main/CHANGELOG.md) - [Commits](https://github.com/TimDettmers/bitsandbytes/compare/0.32.0...0.41.0) --- updated-dependencies: - dependency-name: bitsandbytes dependency-type: direct:production ... Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>	2023-07-24 07:59:50 +00:00
Aaron Pham	693631958a	feat(service): provisional API (#133)	2023-07-23 02:15:39 -04:00
aarnphm-ec2-dev	e4ac0ed8b7	fix(cuda): support loading in single GPU add available_devices for getting # of available GPUs Signed-off-by: aarnphm-ec2-dev <29749331+aarnphm@users.noreply.github.com>	2023-07-21 08:10:01 +00:00
Aaron Pham	f56f8ee782	feat: fine-tuning script for LlaMA 2 (#128)	2023-07-20 20:44:51 -04:00
aarnphm-ec2-dev	3e50f0a851	fix(cli): implement latest bentoml 1.0.25 features Signed-off-by: aarnphm-ec2-dev <29749331+aarnphm@users.noreply.github.com>	2023-07-20 20:51:27 +00:00
Aaron Pham	c1ddb9ed7c	feat: GPTQ + vLLM and LlaMA (#113) Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>	2023-07-19 18:12:12 -04:00
Aaron Pham	fc963c42ce	fix: build isolation (#116) Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>	2023-07-16 01:52:21 -04:00
HeTaoPKU	fd9ae56812	fix(baichuan): add "cpm-kernel" as additional requirements (#117) This is to support the 13b variant of baichuan Co-authored-by: the <tao.he@hulu.com> Co-authored-by: Aaron <29749331+aarnphm@users.noreply.github.com>	2023-07-15 23:16:05 -04:00
HeTaoPKU	09b0787306	feat(models): Baichuan (#115) Co-authored-by: the <tao.he@hulu.com> Co-authored-by: Aaron <29749331+aarnphm@users.noreply.github.com>	2023-07-15 22:01:37 -04:00
aarnphm-ec2-dev	d37d14e52b	fix(tests): mark package on CI to xfail XXX: @aarnphm to solve build isolation when have bandwidth. Currently this is not a problem when running locally. `openllm build` just works, where as `openllm.build` won't work sequentially. Address some type stubs for jupytext Signed-off-by: aarnphm-ec2-dev <29749331+aarnphm@users.noreply.github.com>	2023-07-15 12:48:28 +00:00
Aaron Pham	b2dba6143f	fix(resource): correctly parse CUDA_VISIBLE_DEVICES (#114)	2023-07-15 07:19:35 -04:00
aarnphm-ec2-dev	e2ae24b74c	fix(tests): building not being isolated We will need to fix this from BentoML Signed-off-by: aarnphm-ec2-dev <29749331+aarnphm@users.noreply.github.com>	2023-07-11 17:28:00 +00:00
aarnphm-ec2-dev	cea082e7bd	fix(cli): correct prune based on metadata Signed-off-by: aarnphm-ec2-dev <29749331+aarnphm@users.noreply.github.com>	2023-07-11 00:34:22 +00:00
aarnphm-ec2-dev	c2bb29b4f3	fix: building mpt dependencies Signed-off-by: aarnphm-ec2-dev <29749331+aarnphm@users.noreply.github.com>	2023-07-11 00:21:23 +00:00
Aaron Pham	c7f4dc7bb2	feat(test): snapshot testing (#107) Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>	2023-07-10 17:23:19 -04:00
Aaron Pham	fb849a384e	feat: GPTNeoX (#106)	2023-07-07 03:05:40 -04:00
aarnphm-ec2-dev	4c5b27495c	fix: bettertransformer check to bool already Signed-off-by: aarnphm-ec2-dev <29749331+aarnphm@users.noreply.github.com>	2023-07-04 17:58:02 +00:00
Aaron Pham	d6303d306a	perf: fixing import custom paths and cleanup serialisation (#102)	2023-07-04 12:49:14 -04:00
Aaron Pham	8ac2755de4	feat(llm): fine-tuning Falcon (#98) Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>	2023-06-30 21:25:16 -04:00
Aaron Pham	59b1d89971	feat: custom dockerfile templates (#95) Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>	2023-06-30 13:44:11 -04:00
Aaron Pham	f2457fcdaf	tests: add sanity check for openllm.client (#93) Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>	2023-06-29 12:07:00 -04:00
Aaron	7e8ca79c2d	chore: style [skip ci] Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>	2023-06-29 10:54:58 -04:00
Aaron Pham	e52045eda6	fix: running MPT on CPU (#92)	2023-06-29 10:54:12 -04:00
Aaron Pham	01db504e7d	feat: MPT (#91) Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>	2023-06-28 23:12:15 -04:00
Aaron Pham	5a4df53490	fix(load): tokenizer and adapter within a BentoLLM (#88)	2023-06-28 15:45:25 -04:00
Aaron Pham	bd4cc9b3ff	fix: loading local (#87) Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>	2023-06-28 11:25:54 -04:00
Aaron Pham	db1494a6ae	feat(start): starting bento and fix load (#80)	2023-06-27 12:45:17 -04:00
Aaron Pham	74fdd5e259	feat: release binary distribution (#66)	2023-06-25 10:38:03 -04:00
Aaron Pham	3593c764f0	fix(test): robustness (#64)	2023-06-24 11:10:07 -04:00
Aaron Pham	1435478f6c	fix(cli): ensure we parse tag for download (#58)	2023-06-23 21:24:53 -04:00
Aaron Pham	dfca956fad	feat: serve adapter layers (#52)	2023-06-23 10:07:15 -04:00
Aaron	752c2e60a5	fix: remove direct url reference Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>	2023-06-19 13:25:29 -04:00
Aaron	1ed0ae7787	fix(log): make sure to configure OpenLLM logs correctly Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>	2023-06-19 06:19:06 -04:00
Aaron	622a2fb37d	fix: separate hatch config Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>	2023-06-19 03:39:05 -04:00
Aaron	e3fad40f21	fix(env): make tests with extra-dependencies Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>	2023-06-18 23:58:03 -04:00
Aaron Pham	03758a5487	fix(tools): adhere to style guidelines (#31)	2023-06-18 20:03:17 -04:00
Aaron Pham	4fcd7c8ac9	integration: HuggingFace Agent (#29) Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>	2023-06-18 00:13:53 -04:00
Aaron Pham	ded8a9f809	feat: quantization (#27)	2023-06-16 18:10:50 -04:00
Aaron Pham	19bc7e3116	feat: fine-tuning [part 1] (#23)	2023-06-16 00:19:01 -04:00
Aaron	528f76e1d0	fix(client): using httpx for running calls within async context This is so that client.query works within a async context Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>	2023-06-15 01:58:49 -04:00
Aaron	d07cc95ea0	ci: add hatch to dev envs Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>	2023-06-14 03:48:05 -04:00
Aaron	be41c23c10	codegen: remove black as dependencies Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>	2023-06-14 03:22:05 -04:00
Aaron Pham	dd20941050	chore: metadata (#19)	2023-06-13 04:09:33 -04:00
Aaron Pham	f8ebb36e15	tests: fastpath (#17) added fastpath cases for configuration and Flan-T5 fixes respecting model_id into lifecycle hooks. update CLI to cleanup models info	2023-06-12 14:18:26 -04:00


89 Commits