Commit Graph

100 Commits

Author SHA1 Message Date
Aaron Pham
5a1fcc9cd5 fix: set default serialisation methods (#355)
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>
2023-09-18 02:26:53 -04:00
Aaron Pham
ad9107958d feat: continuous batching with vLLM (#349)
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>

* feat: continuous batching

Signed-off-by: paperspace <29749331+aarnphm@users.noreply.github.com>

* chore: add changelog

Signed-off-by: paperspace <29749331+aarnphm@users.noreply.github.com>

* chore: add one shot generation

Signed-off-by: paperspace <29749331+aarnphm@users.noreply.github.com>

---------

Signed-off-by: paperspace <29749331+aarnphm@users.noreply.github.com>
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>
2023-09-14 03:09:36 -04:00
Aaron Pham
35e6945e86 fix(serialisation): vLLM safetensors support (#324)
* fix(serialisation): vllm support for safetensors

Signed-off-by: aarnphm-ec2-dev <29749331+aarnphm@users.noreply.github.com>

* chore: running tools

Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>

* chore: generalize one shot generation

Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>

* chore: add changelog [skip ci]

Signed-off-by: paperspace <29749331+aarnphm@users.noreply.github.com>

---------

Signed-off-by: aarnphm-ec2-dev <29749331+aarnphm@users.noreply.github.com>
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>
Signed-off-by: paperspace <29749331+aarnphm@users.noreply.github.com>
2023-09-12 17:44:01 -04:00
Aaron
c6c23bc959 fix(actions): hermetic dependencies
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>
2023-09-12 13:44:18 -04:00
Aaron
70f4ccfae6 fix(ratchet): lock correctly on cron job
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>
2023-09-11 15:09:49 -04:00
aarnphm-ec2-dev
c7f915fa71 chore: update documentation wrt to envvar correctness
Signed-off-by: aarnphm-ec2-dev <29749331+aarnphm@users.noreply.github.com>
2023-09-08 17:43:03 +00:00
Aaron
0d50aa00b9 chore: add openllm-core as meta dependencies
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>
2023-09-07 10:31:40 -04:00
Aaron
887ffa9aa0 chore: cleanup pre-commit jobs and update usage
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>
2023-09-05 10:06:36 -04:00
Aaron Pham
956b3a53bc fix(gptq): use upstream integration (#297)
* wip

Signed-off-by: aarnphm-ec2-dev <29749331+aarnphm@users.noreply.github.com>

* feat: GPTQ transformers integration

Signed-off-by: aarnphm-ec2-dev <29749331+aarnphm@users.noreply.github.com>

* fix: only load if variable is available and add changelog

Signed-off-by: aarnphm-ec2-dev <29749331+aarnphm@users.noreply.github.com>

* chore: remove boilerplate check

Signed-off-by: aarnphm-ec2-dev <29749331+aarnphm@users.noreply.github.com>

---------

Signed-off-by: aarnphm-ec2-dev <29749331+aarnphm@users.noreply.github.com>
2023-09-04 14:05:50 -04:00
aarnphm-ec2-dev
7d893e6cd2 chore: ignore new lines split [skip ci]
Signed-off-by: aarnphm-ec2-dev <29749331+aarnphm@users.noreply.github.com>
2023-09-01 17:00:49 +00:00
Aaron Pham
b7af7765d4 fix(yapf): align weird new lines break [generated] [skip ci] (#284)
fix(yapf): align weird new lines break

Signed-off-by: aarnphm-ec2-dev <29749331+aarnphm@users.noreply.github.com>
2023-09-01 05:34:22 -04:00
Aaron Pham
3e45530abd refactor(breaking): unify LLM API (#283)
Signed-off-by: aarnphm-ec2-dev <29749331+aarnphm@users.noreply.github.com>
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>
2023-09-01 05:15:19 -04:00
Aaron
b545ad2ad1 style: google
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>
2023-08-30 13:52:35 -04:00
Aaron Pham
c9cef1d773 fix: persistent styling between ruff and yapf (#279)
2023-08-30 11:37:41 -04:00
Aaron Pham
2036d4e015 chore(build): use latest vllm pre-built kernel (#261)
2023-08-26 09:02:52 -04:00
aarnphm-ec2-dev
806a663e4a chore(style): add one blank line
to conform with Google style

Signed-off-by: aarnphm-ec2-dev <29749331+aarnphm@users.noreply.github.com>
2023-08-26 11:36:57 +00:00
Aaron Pham
938fd362bb feat(vllm): streaming (#260)
2023-08-26 07:27:32 -04:00
aarnphm-ec2-dev
dae38cdba1 chore: update external dependencies [skip ci]
Signed-off-by: aarnphm-ec2-dev <29749331+aarnphm@users.noreply.github.com>
2023-08-25 09:27:26 +00:00
Aaron Pham
bbd9aa7646 refactor(contrib): similar namespace [clojure-ui build] (#251)
2023-08-23 00:21:59 -04:00
aarnphm-ec2-dev
1488fbb167 chore(style): enable yapf to match with style guidelines
Signed-off-by: aarnphm-ec2-dev <29749331+aarnphm@users.noreply.github.com>
2023-08-22 14:03:06 +00:00
Aaron Pham
3ffb25a872 refactor: packages (#249)
2023-08-22 08:55:46 -04:00
Aaron
9fb46e1676 chore(release): add manual workflow dispatch run on new release tag
[skip ci]

Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>
2023-08-17 16:38:58 -04:00
Aaron Pham
4140d160b8 feat(embedding): Adding generic endpoint (#227)
2023-08-17 15:17:00 -04:00
Aaron Pham
665233c30f chore: conditional commit for running jobs (#232)
2023-08-17 10:13:53 -04:00
Aaron Pham
d7a6859c40 chore(gh): use setup-bentoml-action (#230)
2023-08-17 08:34:35 -04:00
GutZuFusss
4cad367ab5 feat(contrib): ClojureScript UI (#89)
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: aarnphm-ec2-dev <29749331+aarnphm@users.noreply.github.com>
2023-08-16 03:30:44 -04:00
Aaron Pham
58527032e0 feat: add default python version for development [skip ci] (#212)
2023-08-15 02:39:43 -04:00
Aaron Pham
cd872ef631 refactor: monorepo (#203)
2023-08-15 02:11:14 -04:00
Aaron Pham
f6317d8003 infra: enable compiled wheels for all supported Python (#201)
2023-08-12 04:54:50 -04:00
Aaron Pham
5329853b10 perf: compiled modules and enable lazyeval (#200)
2023-08-11 05:53:45 -04:00
Aaron Pham
c083990edd infra: migrate to initial openllm-node library (#199)
2023-08-10 18:54:00 -04:00
aarnphm-ec2-dev
dfc4b489c5 feat(build): notes on compiled wheels for Bento
Signed-off-by: aarnphm-ec2-dev <29749331+aarnphm@users.noreply.github.com>
2023-08-09 21:52:34 +00:00
Aaron Pham
b1445c6516 refactor(cli): compiled wheels and extension modules (#191)
2023-08-09 17:10:15 -04:00
Aaron
ae11e487d9 fix(brew): specific installation from gzip [skip ci]
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>
2023-08-08 22:32:11 -04:00
Aaron
21143fdfab fix(brew): set correct url for release
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>
2023-08-08 22:18:26 -04:00
Aaron Pham
b9dd54f634 feat: homebrew tap (#190)
2023-08-08 22:11:48 -04:00
Aaron Pham
21ea7e493f feat(generation): initial work for generating tokens (#186)
2023-08-06 20:04:40 -04:00
Aaron Pham
2541a0f8dc infra: initial work on compiling mypyc wheels (#182)
2023-08-04 10:20:03 -04:00
pre-commit-ci[bot]
c2ed1d56da chore(release): update base container restriction (#173)
Prepare for 0.2.12 release

Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: Aaron <29749331+aarnphm@users.noreply.github.com>
2023-08-01 15:25:17 -04:00
Aaron Pham
8c2867d26d style: define experimental guidelines (#168)
2023-07-31 07:54:26 -04:00
Aaron Pham
ef94c6b98a feat(container): vLLM build and base image strategies (#142)
2023-07-31 02:44:52 -04:00
Aaron Pham
c391717226 feat(ci): automatic release semver + git archival installation (#143)
2023-07-25 04:18:49 -04:00
aarnphm-ec2-dev
084786c898 fix(cli): `openllm models` for showing available
Signed-off-by: aarnphm-ec2-dev <29749331+aarnphm@users.noreply.github.com>
2023-07-24 23:00:03 +00:00
Aaron Pham
7eabcd4355 feat: vLLM integration for PagedAttention (#134)
2023-07-24 15:42:17 -04:00
aarnphm-ec2-dev
e4ac0ed8b7 fix(cuda): support loading in single GPU
add available_devices for getting # of available GPUs

Signed-off-by: aarnphm-ec2-dev <29749331+aarnphm@users.noreply.github.com>
2023-07-21 08:10:01 +00:00
Aaron Pham
f56f8ee782 feat: fine-tuning script for LlaMA 2 (#128)
2023-07-20 20:44:51 -04:00
aarnphm-ec2-dev
3e50f0a851 fix(cli): implement latest bentoml 1.0.25 features
Signed-off-by: aarnphm-ec2-dev <29749331+aarnphm@users.noreply.github.com>
2023-07-20 20:51:27 +00:00
Aaron Pham
1b3508619e feat(llama): add default prompt for LlaMA-2 (#122)
2023-07-20 07:46:33 -04:00
Aaron Pham
c1ddb9ed7c feat: GPTQ + vLLM and LlaMA (#113)
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
2023-07-19 18:12:12 -04:00
Aaron Pham
fc963c42ce fix: build isolation (#116)
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
2023-07-16 01:52:21 -04:00