Aaron Pham
ef94c6b98a
feat(container): vLLM build and base image strategies (#142)
2023-07-31 02:44:52 -04:00
aarnphm-ec2-dev
fc66ff275b
fix: make sure to add torch to dependencies
...
Signed-off-by: aarnphm-ec2-dev <29749331+aarnphm@users.noreply.github.com>
2023-07-28 00:01:52 +00:00
Aaron Pham
15640a85cd
feat: supports embeddings for T5 and ChatGLM family generation (#153)
2023-07-27 16:43:43 -04:00
Aaron Pham
e075bd25ea
chore: add NousResearch's as non-gated Llama (#152)
2023-07-27 15:30:56 -04:00
aarnphm-ec2-dev
6dc0bf0b12
fix: remove breakpoint on CLI
...
Signed-off-by: aarnphm-ec2-dev <29749331+aarnphm@users.noreply.github.com>
2023-07-25 16:30:16 +00:00
aarnphm-ec2-dev
b23b59e1c9
fix(embeddings): correctly set JSON data via CLI client
...
Signed-off-by: aarnphm-ec2-dev <29749331+aarnphm@users.noreply.github.com>
2023-07-25 16:26:01 +00:00
Aaron Pham
1940086bec
feat(client): embeddings (#146)
2023-07-25 05:44:21 -04:00
Aaron Pham
dcd34bd381
fix(build): running bento inside container (#141)
...
Behaviour of `docker run` should be the same as `openllm start`
2023-07-25 04:24:28 -04:00
Aaron Pham
c391717226
feat(ci): automatic release semver + git archival installation (#143)
2023-07-25 04:18:49 -04:00
Aaron Pham
5635ce8d87
infra: bump to dev version of 0.2.10.dev0 [generated] [skip ci]
...
Signed-off-by: Aaron Pham <29749331+aarnphm@users.noreply.github.com>
2023-07-24 23:35:04 +00:00
Aaron Pham
fb656164e1
infra: prepare for release 0.2.9 [generated] [skip ci]
...
Signed-off-by: Aaron Pham <29749331+aarnphm@users.noreply.github.com>
2023-07-24 23:24:09 +00:00
aarnphm-ec2-dev
084786c898
fix(cli): `openllm models` for showing available
...
Signed-off-by: aarnphm-ec2-dev <29749331+aarnphm@users.noreply.github.com>
2023-07-24 23:00:03 +00:00
Aaron Pham
e72f0d55f4
infra: bump to dev version of 0.2.9.dev0 [generated] [skip ci]
...
Signed-off-by: Aaron Pham <29749331+aarnphm@users.noreply.github.com>
2023-07-24 19:58:13 +00:00
Aaron Pham
23a8ae44ed
infra: prepare for release 0.2.8 [generated] [skip ci]
...
Signed-off-by: Aaron Pham <29749331+aarnphm@users.noreply.github.com>
2023-07-24 19:44:11 +00:00
Aaron Pham
7eabcd4355
feat: vLLM integration for PagedAttention (#134)
2023-07-24 15:42:17 -04:00
aarnphm-ec2-dev
4cd0784ee2
chore: export generation items for lazy loading
...
Signed-off-by: aarnphm-ec2-dev <29749331+aarnphm@users.noreply.github.com>
2023-07-23 08:01:55 +00:00
aarnphm-ec2-dev
e2cdd767ef
chore(cli): simplify table for `openllm models`
...
Signed-off-by: aarnphm-ec2-dev <29749331+aarnphm@users.noreply.github.com>
2023-07-23 06:29:58 +00:00
Aaron Pham
693631958a
feat(service): provisional API (#133)
2023-07-23 02:15:39 -04:00
Aaron Pham
d88b069160
infra: bump to dev version of 0.2.8.dev0 [generated] [skip ci]
...
Signed-off-by: Aaron Pham <29749331+aarnphm@users.noreply.github.com>
2023-07-23 01:21:32 +00:00
Aaron Pham
b74bea36a7
infra: prepare for release 0.2.7 [generated] [skip ci]
...
Signed-off-by: Aaron Pham <29749331+aarnphm@users.noreply.github.com>
2023-07-23 01:10:45 +00:00
aarnphm-ec2-dev
99bb0e4446
fix(serialisation): using save_pretrained with import_model
...
Fix llm_post_init correct wrapper behaviour
Signed-off-by: aarnphm-ec2-dev <29749331+aarnphm@users.noreply.github.com>
2023-07-23 01:07:39 +00:00
aarnphm-ec2-dev
d4f3cf8b75
fix(llm): ignore quantization config when --quantize int4 is passed
...
Signed-off-by: aarnphm-ec2-dev <29749331+aarnphm@users.noreply.github.com>
2023-07-22 22:45:46 +00:00
aarnphm-ec2-dev
6f4c58175d
chore(llm): add envvar for making tag
...
the envvar is OPENLLM_USE_LOCAL_LATEST
Signed-off-by: aarnphm-ec2-dev <29749331+aarnphm@users.noreply.github.com>
2023-07-22 21:37:19 +00:00
Aaron Pham
57a0fec247
infra: bump to dev version of 0.2.7.dev0 [generated] [skip ci]
...
Signed-off-by: Aaron Pham <29749331+aarnphm@users.noreply.github.com>
2023-07-22 21:31:19 +00:00
Aaron Pham
71689e506d
infra: prepare for release 0.2.6 [generated] [skip ci]
...
Signed-off-by: Aaron Pham <29749331+aarnphm@users.noreply.github.com>
2023-07-22 21:19:04 +00:00
Aaron Pham
19f20c7dad
perf(serialisation): implement wrapper to reduce callstack (#132)
2023-07-22 17:15:03 -04:00
Aaron
ecf31e90b7
chore(configuration): remove unused call
...
to remove one call in the call stack
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>
2023-07-22 15:25:44 -04:00
aarnphm-ec2-dev
beb8c2bb08
fix(ft): set report_to none to avoid wandb setup
...
Signed-off-by: aarnphm-ec2-dev <29749331+aarnphm@users.noreply.github.com>
2023-07-21 18:33:38 +00:00
Aaron
5d2dd470d0
infra: bump to dev version of 0.2.6.dev0 [generated] [skip ci]
...
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>
2023-07-21 14:16:48 -04:00
Aaron Pham
d49ff95f7f
infra: prepare for release 0.2.5 [generated] [skip ci]
...
Signed-off-by: Aaron Pham <29749331+aarnphm@users.noreply.github.com>
2023-07-21 17:59:21 +00:00
Aaron Pham
81b0451685
feat(cli): query with per request instruction (#130)
2023-07-21 13:57:21 -04:00
aarnphm-ec2-dev
aa32bfcc4d
infra: bump to dev version of 0.2.5.dev0 [generated]
...
Signed-off-by: aarnphm-ec2-dev <29749331+aarnphm@users.noreply.github.com>
2023-07-21 15:45:13 +00:00
Aaron Pham
6b61217523
infra: prepare for release 0.2.4 [generated]
...
Signed-off-by: Aaron Pham <29749331+aarnphm@users.noreply.github.com>
2023-07-21 08:21:36 +00:00
aarnphm-ec2-dev
e4ac0ed8b7
fix(cuda): support loading in single GPU
...
add available_devices for getting # of available GPUs
Signed-off-by: aarnphm-ec2-dev <29749331+aarnphm@users.noreply.github.com>
2023-07-21 08:10:01 +00:00
aarnphm-ec2-dev
033358a991
infra: bump to dev version of 0.2.4.dev0 [generated]
...
Signed-off-by: aarnphm-ec2-dev <29749331+aarnphm@users.noreply.github.com>
2023-07-21 06:31:49 +00:00
Aaron Pham
e5cada218a
infra: prepare for release 0.2.3 [generated]
...
Signed-off-by: Aaron Pham <29749331+aarnphm@users.noreply.github.com>
2023-07-21 03:39:34 +00:00
Aaron
9ccbd60584
revert: include configuration to labels
...
This is used for starting up the bento
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>
2023-07-20 23:37:25 -04:00
aarnphm-ec2-dev
f91e750fcd
fix(build): remove configuration from labels
...
labels will only include model_id for it to work with bentocloud
Signed-off-by: aarnphm-ec2-dev <29749331+aarnphm@users.noreply.github.com>
2023-07-21 03:30:59 +00:00
Aaron
347ffaadbe
chore(playground): generate default dir to not set
...
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>
2023-07-20 21:25:03 -04:00
aarnphm-ec2-dev
f5b1c8ec1b
fix(ft): correctly set epochs args for TrainingArguments
...
Signed-off-by: aarnphm-ec2-dev <29749331+aarnphm@users.noreply.github.com>
2023-07-21 01:20:56 +00:00
Aaron
11f88b24ca
infra: bump to dev version of 0.2.3.dev0 [generated]
...
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>
2023-07-20 21:03:02 -04:00
Aaron Pham
16118dd28f
infra: prepare for release 0.2.2 [generated]
...
Signed-off-by: Aaron Pham <29749331+aarnphm@users.noreply.github.com>
2023-07-21 00:46:36 +00:00
Aaron Pham
f56f8ee782
feat: fine-tuning script for Llama 2 (#128)
2023-07-20 20:44:51 -04:00
Aaron
c101103d37
infra: bump to dev version of 0.2.2.dev0 [generated]
...
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>
2023-07-20 18:51:00 -04:00
Aaron Pham
804b30adc4
infra: prepare for release 0.2.1 [generated]
...
Signed-off-by: Aaron Pham <29749331+aarnphm@users.noreply.github.com>
2023-07-20 22:38:27 +00:00
aarnphm-ec2-dev
ea07ff6ce9
fix(llama): loosen requirements for running llama in container
...
Signed-off-by: aarnphm-ec2-dev <29749331+aarnphm@users.noreply.github.com>
2023-07-20 22:07:14 +00:00
aarnphm-ec2-dev
b31cd0460b
fix: correct tag inference for model-id
...
in the case of build, the model_id is passed as a full valid tag under
bento store
XXX: We will need to fix this later
Signed-off-by: aarnphm-ec2-dev <29749331+aarnphm@users.noreply.github.com>
2023-07-20 21:40:56 +00:00
aarnphm-ec2-dev
3e50f0a851
fix(cli): implement latest bentoml 1.0.25 features
...
Signed-off-by: aarnphm-ec2-dev <29749331+aarnphm@users.noreply.github.com>
2023-07-20 20:51:27 +00:00
Aaron Pham
858c2007c3
feat: revision parsed via model_id (#126)
2023-07-20 14:36:53 -04:00
Aaron Pham
1b3508619e
feat(llama): add default prompt for Llama-2 (#122)
2023-07-20 07:46:33 -04:00