Commit Graph

62 Commits

Author SHA1 Message Date
pre-commit-ci[bot]
c2ed1d56da chore(release): update base container restriction (#173)
Prepare for 0.2.12 release

Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: Aaron <29749331+aarnphm@users.noreply.github.com>
2023-08-01 15:25:17 -04:00
Aaron Pham
8c2867d26d style: define experimental guidelines (#168) 2023-07-31 07:54:26 -04:00
Aaron Pham
ef94c6b98a feat(container): vLLM build and base image strategies (#142) 2023-07-31 02:44:52 -04:00
Aaron Pham
c391717226 feat(ci): automatic release semver + git archival installation (#143) 2023-07-25 04:18:49 -04:00
aarnphm-ec2-dev
084786c898 fix(cli): `openllm models` for showing available
Signed-off-by: aarnphm-ec2-dev <29749331+aarnphm@users.noreply.github.com>
2023-07-24 23:00:03 +00:00
Aaron Pham
7eabcd4355 feat: vLLM integration for PagedAttention (#134) 2023-07-24 15:42:17 -04:00
aarnphm-ec2-dev
e4ac0ed8b7 fix(cuda): support loading in single GPU
add available_devices for getting # of available GPUs

Signed-off-by: aarnphm-ec2-dev <29749331+aarnphm@users.noreply.github.com>
2023-07-21 08:10:01 +00:00
Aaron Pham
f56f8ee782 feat: fine-tuning script for LlaMA 2 (#128) 2023-07-20 20:44:51 -04:00
aarnphm-ec2-dev
3e50f0a851 fix(cli): implement latest bentoml 1.0.25 features
Signed-off-by: aarnphm-ec2-dev <29749331+aarnphm@users.noreply.github.com>
2023-07-20 20:51:27 +00:00
Aaron Pham
1b3508619e feat(llama): add default prompt for LlaMA-2 (#122) 2023-07-20 07:46:33 -04:00
Aaron Pham
c1ddb9ed7c feat: GPTQ + vLLM and LlaMA (#113)
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
2023-07-19 18:12:12 -04:00
Aaron Pham
fc963c42ce fix: build isolation (#116)
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
2023-07-16 01:52:21 -04:00
HeTaoPKU
fd9ae56812 fix(baichuan): add "cpm-kernel" as additional requirements (#117)
This is to support the 13b variant of baichuan

Co-authored-by: the <tao.he@hulu.com>
Co-authored-by: Aaron <29749331+aarnphm@users.noreply.github.com>
2023-07-15 23:16:05 -04:00
HeTaoPKU
09b0787306 feat(models): Baichuan (#115)
Co-authored-by: the <tao.he@hulu.com>
Co-authored-by: Aaron <29749331+aarnphm@users.noreply.github.com>
2023-07-15 22:01:37 -04:00
Aaron Pham
b2dba6143f fix(resource): correctly parse CUDA_VISIBLE_DEVICES (#114) 2023-07-15 07:19:35 -04:00
aarnphm-ec2-dev
c2bb29b4f3 fix: building mpt dependencies
Signed-off-by: aarnphm-ec2-dev <29749331+aarnphm@users.noreply.github.com>
2023-07-11 00:21:23 +00:00
Aaron Pham
c7f4dc7bb2 feat(test): snapshot testing (#107)
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
2023-07-10 17:23:19 -04:00
Aaron Pham
fb849a384e feat: GPTNeoX (#106) 2023-07-07 03:05:40 -04:00
Aaron Pham
d6303d306a perf: fixing import custom paths and cleanup serialisation (#102) 2023-07-04 12:49:14 -04:00
Aaron Pham
8ac2755de4 feat(llm): fine-tuning Falcon (#98)
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
2023-06-30 21:25:16 -04:00
aarnphm-ec2-dev
e81203884b fix(nightly-requirements): missing new lines [skip ci]
Signed-off-by: aarnphm-ec2-dev <29749331+aarnphm@users.noreply.github.com>
2023-06-29 16:23:46 +00:00
aarnphm-ec2-dev
d3633a9430 chore(ci): update correct submodules for compiling triton [skip ci]
Signed-off-by: aarnphm-ec2-dev <29749331+aarnphm@users.noreply.github.com>
2023-06-29 16:22:09 +00:00
Aaron Pham
e52045eda6 fix: running MPT on CPU (#92) 2023-06-29 10:54:12 -04:00
Aaron Pham
01db504e7d feat: MPT (#91)
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
2023-06-28 23:12:15 -04:00
Aaron Pham
bd4cc9b3ff fix: loading local (#87)
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
2023-06-28 11:25:54 -04:00
Aaron Pham
db1494a6ae feat(start): starting bento and fix load (#80) 2023-06-27 12:45:17 -04:00
Aaron
6e281cd4cd chore: simplify actions
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>
2023-06-25 11:05:16 -04:00
Aaron
bcf3ef76f3 revert: "chore: script to exit on error"
This reverts commit e1ce8f9c20.

Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>
2023-06-25 10:43:46 -04:00
Aaron
e1ce8f9c20 chore: script to exit on error
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>
2023-06-25 10:41:48 -04:00
Aaron Pham
74fdd5e259 feat: release binary distribution (#66) 2023-06-25 10:38:03 -04:00
Aaron Pham
3593c764f0 fix(test): robustness (#64) 2023-06-24 11:10:07 -04:00
Aaron Pham
dfca956fad feat: serve adapter layers (#52) 2023-06-23 10:07:15 -04:00
Aaron
752c2e60a5 fix: remove direct url reference
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>
2023-06-19 13:25:29 -04:00
Aaron
1ed0ae7787 fix(log): make sure to configure OpenLLM logs correctly
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>
2023-06-19 06:19:06 -04:00
Aaron Pham
4fcd7c8ac9 integration: HuggingFace Agent (#29)
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
2023-06-18 00:13:53 -04:00
Aaron Pham
ded8a9f809 feat: quantization (#27) 2023-06-16 18:10:50 -04:00
Aaron Pham
19bc7e3116 feat: fine-tuning [part 1] (#23) 2023-06-16 00:19:01 -04:00
Aaron
74c8323e42 docs: update generated with href
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>
2023-06-13 07:30:43 -04:00
Aaron
764d86289c chore(readme): update table with model_ids matrix
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>
2023-06-12 16:57:40 -04:00
aarnphm-ec2-dev
c669d38dea fix(flan-t5): casting model to CUDA
Add a note about GPU support for Flax

Signed-off-by: aarnphm-ec2-dev <29749331+aarnphm@users.noreply.github.com>
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>
2023-06-10 02:55:55 -04:00
Aaron
e90d90e9a0 feat(docs): copy button from table list
The script now generates an HTML table, which allows us to use the
copy button from the README.md

Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>
2023-06-10 01:23:56 -04:00
Aaron
7d382ced4f chore(docs): update notes about flan-t5
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>
2023-06-10 00:22:12 -04:00
Chaoyu
e2b26adf2f chore(docs): update README.md
See #12
2023-06-10 00:21:21 -04:00
Aaron
16df0f4393 chore(infra): increase timeout to 60m
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>
2023-06-08 18:18:51 -04:00
Aaron
ebe5ae797e fix(script): avoid using private variable
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>
2023-06-08 17:59:06 -04:00
Aaron
f5edd4fcf4 feat(script): add easy script to release
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>
2023-06-08 17:52:39 -04:00
Aaron
067a7a8e81 chore(ci): add check script for README table update
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>
2023-06-08 09:16:28 -04:00
Aaron
c0418b76ec feat(infra): add tools for managing optional-dependencies
based on llm config

Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>
2023-06-08 08:57:19 -04:00
Aaron
23d98a2729 feat(tooling): add script to auto update readme table of supported models

Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>
2023-06-08 08:22:55 -04:00
Aaron
44ac29b9dd infra: update release scripts to run on actions only
setup release notes to make sure it runs after pushing tag

Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>
2023-06-06 08:45:51 -04:00