Commit Graph

240 Commits

Author SHA1 Message Date
aarnphm-ec2-dev
bb37f7e238 feat(utils): lazy load modules and fix typo
Signed-off-by: aarnphm-ec2-dev <29749331+aarnphm@users.noreply.github.com>
2023-06-10 22:18:37 +00:00
Aaron
05fa34f9e6 refactor: pretrained => model_id
I think model_id makes more sense than calling it pretrained

Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>
2023-06-10 17:36:02 -04:00
Aaron
4841051fc5 feat(stablelm): CPU inference
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>
2023-06-10 07:53:29 -04:00
aarnphm-ec2-dev
53296111d0 fix(gpu): enable device_map 'auto' to multi-gpu setup only
The device_map value 'auto' is a special value that spreads the model
across all available GPUs. It should usually be set only when multiple
GPUs are available.

Signed-off-by: aarnphm-ec2-dev <29749331+aarnphm@users.noreply.github.com>
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>
2023-06-10 07:41:30 -04:00
Aaron Pham [bot]
66a87ef0b7 infra: bump to dev version of 0.0.34.dev0 [generated]
Signed-off-by: Aaron Pham [bot] <29749331+aarnphm@users.noreply.github.com>
2023-06-10 10:19:02 +00:00
Aaron Pham [bot]
56f50deab6 infra: prepare for release 0.0.33 [generated]
Signed-off-by: Aaron Pham [bot] <29749331+aarnphm@users.noreply.github.com>
v0.0.33
2023-06-10 10:09:12 +00:00
aarnphm-ec2-dev
2348946ada fix(starcoder): disable quant 8
Signed-off-by: aarnphm-ec2-dev <29749331+aarnphm@users.noreply.github.com>
2023-06-10 10:01:43 +00:00
aarnphm-ec2-dev
4db141c649 feat(gpu): support passing GPU per LLM
respect CUDA_VISIBLE_DEVICES and optionally --device

Signed-off-by: aarnphm-ec2-dev <29749331+aarnphm@users.noreply.github.com>
2023-06-10 09:47:16 +00:00
aarnphm-ec2-dev
ebfed3c116 fix(chatglm): generation tokens not concatenated correctly
Signed-off-by: aarnphm-ec2-dev <29749331+aarnphm@users.noreply.github.com>
2023-06-10 09:46:33 +00:00
Aaron
d70530cb0e chore: add stubs for deepmerge [generated]
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>
2023-06-10 03:04:56 -04:00
aarnphm-ec2-dev
8fbf352ec6 docs: add more information about pretrained weights
Signed-off-by: aarnphm-ec2-dev <29749331+aarnphm@users.noreply.github.com>
2023-06-10 06:58:36 +00:00
aarnphm-ec2-dev
c669d38dea fix(flan-t5): casting model to CUDA
Add a note about GPU support for Flax

Signed-off-by: aarnphm-ec2-dev <29749331+aarnphm@users.noreply.github.com>
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>
2023-06-10 02:55:55 -04:00
Aaron
afddaed08c fix(perf): respect per request information
remove use_default_prompt_template options

add pretrained to list of start help docstring

fix flax generation config

improve flax and tensorflow implementation

Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>
2023-06-10 02:14:13 -04:00
Aaron
e90d90e9a0 feat(docs): copy button from table list
The script now generates an HTML table, which allows us to use the
copy button from the README.md

Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>
2023-06-10 01:23:56 -04:00
Aaron
7d382ced4f chore(docs): update notes about flan-t5
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>
2023-06-10 00:22:12 -04:00
Chaoyu
9ffe1f40bf chore: rename LICENSE to LICENSE.md
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>
2023-06-10 00:21:54 -04:00
Chaoyu
e2b26adf2f chore(docs): update README.md
See #12
2023-06-10 00:21:21 -04:00
Aaron
1597d5d4bb chore(readme): update stablelm [generated]
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>
2023-06-10 00:21:21 -04:00
Aaron
bca133f389 revert: update metadata for Python 3.8 and 3.9
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>
2023-06-10 00:21:20 -04:00
Aaron Pham [bot]
11cedce974 infra: bump to dev version of 0.0.33.dev0 [generated]
Signed-off-by: Aaron Pham [bot] <29749331+aarnphm@users.noreply.github.com>
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>
2023-06-10 00:21:20 -04:00
Aaron Pham [bot]
03ac525949 infra: prepare for release 0.0.32 [generated]
Signed-off-by: Aaron Pham [bot] <29749331+aarnphm@users.noreply.github.com>
v0.0.32
2023-06-09 19:05:09 +00:00
Aaron
9bbe1ff4bf chore(stablelm): make stablelm run explicitly with GPU
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>
2023-06-09 14:57:12 -04:00
Aaron
c51e944cb2 chore(version): remove support for 3.8 and 3.9 for now
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>
2023-06-08 22:47:57 -04:00
Aaron
b72317db67 fix(import): lazy load torch
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>
2023-06-08 22:05:41 -04:00
Aaron
16df0f4393 chore(infra): increase timeout to 60m
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>
2023-06-08 18:18:51 -04:00
Aaron Pham [bot]
d005760c68 infra: bump to dev version of 0.0.32.dev0 [generated]
Signed-off-by: Aaron Pham [bot] <29749331+aarnphm@users.noreply.github.com>
2023-06-08 22:15:29 +00:00
Aaron Pham [bot]
e2813f843e infra: prepare for release 0.0.31 [generated]
Signed-off-by: Aaron Pham [bot] <29749331+aarnphm@users.noreply.github.com>
v0.0.31
2023-06-08 22:04:19 +00:00
Aaron
ebe5ae797e fix(script): avoid using private variable
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>
2023-06-08 17:59:06 -04:00
Aaron
f5edd4fcf4 feat(script): add easy script to release
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>
2023-06-08 17:52:39 -04:00
Aaron
f284c64370 docs: update release-notes run with ref for tags
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>
2023-06-08 17:18:23 -04:00
aarnphm-ec2-dev
acf78ce731 fix(saving): make sure to cleanup cuda cache after using default import
Signed-off-by: aarnphm-ec2-dev <29749331+aarnphm@users.noreply.github.com>
2023-06-08 21:11:07 +00:00
Aaron Pham [bot]
a451b03a0a infra: bump to dev version of 0.0.31.dev0 [generated]
Signed-off-by: Aaron Pham [bot] <29749331+aarnphm@users.noreply.github.com>
2023-06-08 21:10:01 +00:00
Aaron Pham [bot]
55d584a986 infra: prepare for release 0.0.30 [generated]
Signed-off-by: Aaron Pham [bot] <29749331+aarnphm@users.noreply.github.com>
v0.0.30
2023-06-08 20:55:39 +00:00
aarnphm-ec2-dev
2f9bd2f6fe fix(packaging): make sure to add BENTOML_CONFIG_OPTIONS into Dockerfile
Signed-off-by: aarnphm-ec2-dev <29749331+aarnphm@users.noreply.github.com>
2023-06-08 20:33:20 +00:00
Aaron
71198b66cc revert: move release-notes to separate actions
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>
2023-06-08 16:03:41 -04:00
Aaron
1902954463 infra: bump to dev version of 0.0.30.dev0 [generated]
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>
2023-06-08 16:03:36 -04:00
Aaron Pham [bot]
2db7663ba5 infra: prepare for release 0.0.29 [generated]
Signed-off-by: Aaron Pham [bot] <29749331+aarnphm@users.noreply.github.com>
v0.0.29
2023-06-08 19:56:51 +00:00
aarnphm-ec2-dev
42f8d0271c chore(model_name): shorten model name
Signed-off-by: aarnphm-ec2-dev <29749331+aarnphm@users.noreply.github.com>
2023-06-08 19:41:59 +00:00
aarnphm-ec2-dev
d86fb322d0 fix(containerize): Install base openllm for non OpenLLM dev build
Signed-off-by: aarnphm-ec2-dev <29749331+aarnphm@users.noreply.github.com>
2023-06-08 19:31:36 +00:00
aarnphm-ec2-dev
1c9c9645a7 fix(label): make sure to convert labels to all string
to avoid warning from bentoml

Signed-off-by: aarnphm-ec2-dev <29749331+aarnphm@users.noreply.github.com>
2023-06-08 19:25:55 +00:00
aarnphm-ec2-dev
0f7840626d fix(cli): make sure to allow user to pass endpoint
--endpoint http://0.0.0.0:3000

Signed-off-by: aarnphm-ec2-dev <29749331+aarnphm@users.noreply.github.com>
2023-06-08 19:23:04 +00:00
aarnphm-ec2-dev
f84b975a55 fix(llm): build to include openllm_client
Signed-off-by: aarnphm-ec2-dev <29749331+aarnphm@users.noreply.github.com>
2023-06-08 19:20:43 +00:00
Jian Shen
e6dd1b1c39 docs: Update README.md
Signed-off-by: Jian Shen <jianshen92@gmail.com>
2023-06-09 03:02:53 +08:00
aarnphm-ec2-dev
15cb13839d fix(load_model): make sure to use correct implementation
Signed-off-by: aarnphm-ec2-dev <29749331+aarnphm@users.noreply.github.com>
2023-06-08 19:01:01 +00:00
Aaron
a84661142c chore(cli): remove --local for query
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>
2023-06-08 14:53:11 -04:00
Aaron
7a162402a1 fix(llm): make sure to use correct load_model
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>
2023-06-08 14:50:58 -04:00
Aaron
20bc9153b1 fix(ci): checkout version on actions
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>
2023-06-08 14:40:38 -04:00
Aaron
20416ab107 infra: bump to dev version of 0.0.29.dev0 [generated]
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>
2023-06-08 14:40:29 -04:00
Aaron Pham [bot]
f6d6b08369 infra: prepare for release 0.0.28 [generated]
Signed-off-by: Aaron Pham [bot] <29749331+aarnphm@users.noreply.github.com>
v0.0.28
2023-06-08 13:25:55 +00:00
Aaron
400445da6f fix(deps): broken name for bitsandbytes
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>
2023-06-08 09:19:05 -04:00
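
Note on the GPU selection commits (53296111d0 and 4db141c649): the messages describe respecting CUDA_VISIBLE_DEVICES, optionally accepting --device, and enabling device_map 'auto' only on multi-GPU setups. The sketch below shows one way that decision could look. It is not taken from the repository: the function names `visible_gpus` and `pick_device_map` are hypothetical, and treating an unset CUDA_VISIBLE_DEVICES as "no GPUs selected" is a simplifying assumption made for illustration.

```python
import os
from typing import List, Optional, Sequence


def visible_gpus(device: Optional[Sequence[str]] = None) -> List[str]:
    """Return the GPU ids to use (hypothetical helper).

    Explicit ids, such as values passed through a --device option, take
    precedence. Otherwise the ids come from CUDA_VISIBLE_DEVICES.
    Assumption: an unset or empty variable means no GPU was selected.
    """
    if device:
        return [d.strip() for d in device if d.strip()]
    raw = os.environ.get("CUDA_VISIBLE_DEVICES", "")
    return [d.strip() for d in raw.split(",") if d.strip()]


def pick_device_map(gpu_ids: Sequence[str]) -> Optional[str]:
    """Return 'auto' only when more than one GPU is available, else None."""
    return "auto" if len(gpu_ids) > 1 else None


if __name__ == "__main__":
    ids = visible_gpus()
    print("GPUs:", ids, "device_map:", pick_device_map(ids))
```

With two or more ids the sketch returns 'auto'. With zero or one it returns None, leaving device placement to the caller.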