Aaron Pham
|
5a1fcc9cd5
|
fix: set default serialisation methods (#355)
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>
|
2023-09-18 02:26:53 -04:00 |
|
Aaron Pham
|
ad9107958d
|
feat: continuous batching with vLLM (#349)
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>
* feat: continuous batching
Signed-off-by: paperspace <29749331+aarnphm@users.noreply.github.com>
* chore: add changelog
Signed-off-by: paperspace <29749331+aarnphm@users.noreply.github.com>
* chore: add one shot generation
Signed-off-by: paperspace <29749331+aarnphm@users.noreply.github.com>
---------
Signed-off-by: paperspace <29749331+aarnphm@users.noreply.github.com>
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>
|
2023-09-14 03:09:36 -04:00 |
|
Aaron Pham
|
35e6945e86
|
fix(serialisation): vLLM safetensors support (#324)
* fix(serialisation): vLLM support for safetensors
Signed-off-by: aarnphm-ec2-dev <29749331+aarnphm@users.noreply.github.com>
* chore: running tools
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>
* chore: generalize one shot generation
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>
* chore: add changelog [skip ci]
Signed-off-by: paperspace <29749331+aarnphm@users.noreply.github.com>
---------
Signed-off-by: aarnphm-ec2-dev <29749331+aarnphm@users.noreply.github.com>
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>
Signed-off-by: paperspace <29749331+aarnphm@users.noreply.github.com>
|
2023-09-12 17:44:01 -04:00 |
|
Aaron
|
c6c23bc959
|
fix(actions): hermetic dependencies
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>
|
2023-09-12 13:44:18 -04:00 |
|
Aaron
|
70f4ccfae6
|
fix(ratchet): lock correctly on cron job
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>
|
2023-09-11 15:09:49 -04:00 |
|
aarnphm-ec2-dev
|
c7f915fa71
|
chore: update documentation wrt envvar correctness
Signed-off-by: aarnphm-ec2-dev <29749331+aarnphm@users.noreply.github.com>
|
2023-09-08 17:43:03 +00:00 |
|
Aaron
|
0d50aa00b9
|
chore: add openllm-core as meta dependencies
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>
|
2023-09-07 10:31:40 -04:00 |
|
Aaron
|
887ffa9aa0
|
chore: cleanup pre-commit jobs and update usage
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>
|
2023-09-05 10:06:36 -04:00 |
|
Aaron Pham
|
956b3a53bc
|
fix(gptq): use upstream integration (#297)
* wip
Signed-off-by: aarnphm-ec2-dev <29749331+aarnphm@users.noreply.github.com>
* feat: GPTQ transformers integration
Signed-off-by: aarnphm-ec2-dev <29749331+aarnphm@users.noreply.github.com>
* fix: only load if variable is available and add changelog
Signed-off-by: aarnphm-ec2-dev <29749331+aarnphm@users.noreply.github.com>
* chore: remove boilerplate check
Signed-off-by: aarnphm-ec2-dev <29749331+aarnphm@users.noreply.github.com>
---------
Signed-off-by: aarnphm-ec2-dev <29749331+aarnphm@users.noreply.github.com>
|
2023-09-04 14:05:50 -04:00 |
|
aarnphm-ec2-dev
|
7d893e6cd2
|
chore: ignore new lines split [skip ci]
Signed-off-by: aarnphm-ec2-dev <29749331+aarnphm@users.noreply.github.com>
|
2023-09-01 17:00:49 +00:00 |
|
Aaron Pham
|
b7af7765d4
|
fix(yapf): align weird new lines break [generated] [skip ci] (#284)
fix(yapf): align weird new lines break
Signed-off-by: aarnphm-ec2-dev <29749331+aarnphm@users.noreply.github.com>
|
2023-09-01 05:34:22 -04:00 |
|
Aaron Pham
|
3e45530abd
|
refactor(breaking): unify LLM API (#283)
Signed-off-by: aarnphm-ec2-dev <29749331+aarnphm@users.noreply.github.com>
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>
|
2023-09-01 05:15:19 -04:00 |
|
Aaron
|
b545ad2ad1
|
style: google
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>
|
2023-08-30 13:52:35 -04:00 |
|
Aaron Pham
|
c9cef1d773
|
fix: persistent styling between ruff and yapf (#279)
|
2023-08-30 11:37:41 -04:00 |
|
Aaron Pham
|
2036d4e015
|
chore(build): use latest vllm pre-built kernel (#261)
|
2023-08-26 09:02:52 -04:00 |
|
aarnphm-ec2-dev
|
806a663e4a
|
chore(style): add one blank line
to conform with Google style
Signed-off-by: aarnphm-ec2-dev <29749331+aarnphm@users.noreply.github.com>
|
2023-08-26 11:36:57 +00:00 |
|
Aaron Pham
|
938fd362bb
|
feat(vllm): streaming (#260)
|
2023-08-26 07:27:32 -04:00 |
|
aarnphm-ec2-dev
|
dae38cdba1
|
chore: update external dependencies [skip ci]
Signed-off-by: aarnphm-ec2-dev <29749331+aarnphm@users.noreply.github.com>
|
2023-08-25 09:27:26 +00:00 |
|
Aaron Pham
|
bbd9aa7646
|
refactor(contrib): similar namespace [clojure-ui build] (#251)
|
2023-08-23 00:21:59 -04:00 |
|
aarnphm-ec2-dev
|
1488fbb167
|
chore(style): enable yapf to match with style guidelines
Signed-off-by: aarnphm-ec2-dev <29749331+aarnphm@users.noreply.github.com>
|
2023-08-22 14:03:06 +00:00 |
|
Aaron Pham
|
3ffb25a872
|
refactor: packages (#249)
|
2023-08-22 08:55:46 -04:00 |
|
Aaron
|
9fb46e1676
|
chore(release): add manual workflow dispatch run on new release tag
[skip ci]
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>
|
2023-08-17 16:38:58 -04:00 |
|
Aaron Pham
|
4140d160b8
|
feat(embedding): Adding generic endpoint (#227)
|
2023-08-17 15:17:00 -04:00 |
|
Aaron Pham
|
665233c30f
|
chore: conditional commit for running jobs (#232)
|
2023-08-17 10:13:53 -04:00 |
|
Aaron Pham
|
d7a6859c40
|
chore(gh): use setup-bentoml-action (#230)
|
2023-08-17 08:34:35 -04:00 |
|
GutZuFusss
|
4cad367ab5
|
feat(contrib): ClojureScript UI (#89)
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: aarnphm-ec2-dev <29749331+aarnphm@users.noreply.github.com>
|
2023-08-16 03:30:44 -04:00 |
|
Aaron Pham
|
58527032e0
|
feat: add default python version for development [skip ci] (#212)
|
2023-08-15 02:39:43 -04:00 |
|
Aaron Pham
|
cd872ef631
|
refactor: monorepo (#203)
|
2023-08-15 02:11:14 -04:00 |
|
Aaron Pham
|
f6317d8003
|
infra: enable compiled wheels for all supported Python (#201)
|
2023-08-12 04:54:50 -04:00 |
|
Aaron Pham
|
5329853b10
|
perf: compiled modules and enable lazyeval (#200)
|
2023-08-11 05:53:45 -04:00 |
|
Aaron Pham
|
c083990edd
|
infra: migrate to initial openllm-node library (#199)
|
2023-08-10 18:54:00 -04:00 |
|
aarnphm-ec2-dev
|
dfc4b489c5
|
feat(build): notes on compiled wheels for Bento
Signed-off-by: aarnphm-ec2-dev <29749331+aarnphm@users.noreply.github.com>
|
2023-08-09 21:52:34 +00:00 |
|
Aaron Pham
|
b1445c6516
|
refactor(cli): compiled wheels and extension modules (#191)
|
2023-08-09 17:10:15 -04:00 |
|
Aaron
|
ae11e487d9
|
fix(brew): specific installation from gzip [skip ci]
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>
|
2023-08-08 22:32:11 -04:00 |
|
Aaron
|
21143fdfab
|
fix(brew): set correct url for release
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>
|
2023-08-08 22:18:26 -04:00 |
|
Aaron Pham
|
b9dd54f634
|
feat: homebrew tap (#190)
|
2023-08-08 22:11:48 -04:00 |
|
Aaron Pham
|
21ea7e493f
|
feat(generation): initial work for generating tokens (#186)
|
2023-08-06 20:04:40 -04:00 |
|
Aaron Pham
|
2541a0f8dc
|
infra: initial work on compiling mypyc wheels (#182)
|
2023-08-04 10:20:03 -04:00 |
|
pre-commit-ci[bot]
|
c2ed1d56da
|
chore(release): update base container restriction (#173)
Prepare for 0.2.12 release
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: Aaron <29749331+aarnphm@users.noreply.github.com>
|
2023-08-01 15:25:17 -04:00 |
|
Aaron Pham
|
8c2867d26d
|
style: define experimental guidelines (#168)
|
2023-07-31 07:54:26 -04:00 |
|
Aaron Pham
|
ef94c6b98a
|
feat(container): vLLM build and base image strategies (#142)
|
2023-07-31 02:44:52 -04:00 |
|
Aaron Pham
|
c391717226
|
feat(ci): automatic release semver + git archival installation (#143)
|
2023-07-25 04:18:49 -04:00 |
|
aarnphm-ec2-dev
|
084786c898
|
fix(cli): `openllm models` for showing available
Signed-off-by: aarnphm-ec2-dev <29749331+aarnphm@users.noreply.github.com>
|
2023-07-24 23:00:03 +00:00 |
|
Aaron Pham
|
7eabcd4355
|
feat: vLLM integration for PagedAttention (#134)
|
2023-07-24 15:42:17 -04:00 |
|
aarnphm-ec2-dev
|
e4ac0ed8b7
|
fix(cuda): support loading in single GPU
add available_devices for getting # of available GPUs
Signed-off-by: aarnphm-ec2-dev <29749331+aarnphm@users.noreply.github.com>
|
2023-07-21 08:10:01 +00:00 |
|
Aaron Pham
|
f56f8ee782
|
feat: fine-tuning script for LlaMA 2 (#128)
|
2023-07-20 20:44:51 -04:00 |
|
aarnphm-ec2-dev
|
3e50f0a851
|
fix(cli): implement latest bentoml 1.0.25 features
Signed-off-by: aarnphm-ec2-dev <29749331+aarnphm@users.noreply.github.com>
|
2023-07-20 20:51:27 +00:00 |
|
Aaron Pham
|
1b3508619e
|
feat(llama): add default prompt for LlaMA-2 (#122)
|
2023-07-20 07:46:33 -04:00 |
|
Aaron Pham
|
c1ddb9ed7c
|
feat: GPTQ + vLLM and LlaMA (#113)
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
|
2023-07-19 18:12:12 -04:00 |
|
Aaron Pham
|
fc963c42ce
|
fix: build isolation (#116)
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
|
2023-07-16 01:52:21 -04:00 |
|