Aaron Pham
|
81688e0949
|
infra: prepare for release 0.4.34 [generated] [skip ci]
Signed-off-by: Aaron Pham <29749331+aarnphm@users.noreply.github.com>
|
2023-11-30 12:17:48 +00:00 |
|
yansheng
|
3cb7f14fc1
|
feat(models): Support qwen (#742)
* support qwen
* support qwen
* ci: auto fixes from pre-commit.ci
For more information, see https://pre-commit.ci
* Update openllm-core/src/openllm_core/config/configuration_qwen.py
Signed-off-by: Aaron Pham <29749331+aarnphm@users.noreply.github.com>
* chore: update correct readme and supports qwen models
Signed-off-by: Aaron Pham <29749331+aarnphm@users.noreply.github.com>
---------
Signed-off-by: Aaron Pham <29749331+aarnphm@users.noreply.github.com>
Co-authored-by: root <yansheng105@gmail.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: Aaron Pham <29749331+aarnphm@users.noreply.github.com>
|
2023-11-30 06:54:17 -05:00 |
|
Aaron Pham
|
9fa0dee406
|
infra: prepare for release 0.4.33 [generated] [skip ci]
Signed-off-by: Aaron Pham <29749331+aarnphm@users.noreply.github.com>
|
2023-11-29 18:01:13 +00:00 |
|
Aaron Pham
|
69deedd9b8
|
infra: prepare for release 0.4.32 [generated] [skip ci]
Signed-off-by: Aaron Pham <29749331+aarnphm@users.noreply.github.com>
|
2023-11-29 07:39:42 +00:00 |
|
Aaron Pham
|
77af72ed2a
|
infra: prepare for release 0.4.31 [generated] [skip ci]
Signed-off-by: Aaron Pham <29749331+aarnphm@users.noreply.github.com>
|
2023-11-26 23:49:04 +00:00 |
|
Aaron Pham
|
e157d3aa9e
|
infra: prepare for release 0.4.30 [generated] [skip ci]
Signed-off-by: Aaron Pham <29749331+aarnphm@users.noreply.github.com>
|
2023-11-26 09:54:42 +00:00 |
|
Aaron Pham
|
f7a803dfa2
|
infra: prepare for release 0.4.29 [generated] [skip ci]
Signed-off-by: Aaron Pham <29749331+aarnphm@users.noreply.github.com>
|
2023-11-26 07:50:48 +00:00 |
|
Aaron Pham
|
e27764fe6b
|
infra: prepare for release 0.4.28 [generated] [skip ci]
Signed-off-by: Aaron Pham <29749331+aarnphm@users.noreply.github.com>
|
2023-11-24 07:09:06 +00:00 |
|
Aaron Pham
|
d8a783772d
|
infra: prepare for release 0.4.27 [generated] [skip ci]
Signed-off-by: Aaron Pham <29749331+aarnphm@users.noreply.github.com>
|
2023-11-24 06:25:16 +00:00 |
|
Aaron
|
d0e12b1fb8
|
fix(metadata): remove unused packages
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>
|
2023-11-24 01:19:09 -05:00 |
|
Aaron Pham
|
5442d9cd10
|
fix(trust_remote_code): handle args correctly (#727)
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>
|
2023-11-22 17:03:13 -05:00 |
|
Aaron Pham
|
7eae50377d
|
infra: prepare for release 0.4.26 [generated] [skip ci]
Signed-off-by: Aaron Pham <29749331+aarnphm@users.noreply.github.com>
|
2023-11-22 11:50:50 +00:00 |
|
Aaron Pham
|
0189342730
|
infra: prepare for release 0.4.25 [generated] [skip ci]
Signed-off-by: Aaron Pham <29749331+aarnphm@users.noreply.github.com>
|
2023-11-22 09:22:45 +00:00 |
|
Aaron Pham
|
7f09f9daf2
|
infra: prepare for release 0.4.24 [generated] [skip ci]
Signed-off-by: Aaron Pham <29749331+aarnphm@users.noreply.github.com>
|
2023-11-22 06:34:30 +00:00 |
|
Aaron Pham
|
85e03a4b92
|
infra: prepare for release 0.4.23 [generated] [skip ci]
Signed-off-by: Aaron Pham <29749331+aarnphm@users.noreply.github.com>
|
2023-11-22 06:16:49 +00:00 |
|
Aaron Pham
|
f3fd32d596
|
infra: prepare for release 0.4.22 [generated] [skip ci]
Signed-off-by: Aaron Pham <29749331+aarnphm@users.noreply.github.com>
|
2023-11-21 01:38:46 +00:00 |
|
Aaron Pham
|
4c4bc82a47
|
infra: prepare for release 0.4.21 [generated] [skip ci]
Signed-off-by: Aaron Pham <29749331+aarnphm@users.noreply.github.com>
|
2023-11-20 22:32:44 +00:00 |
|
Aaron Pham
|
204cbd43d2
|
infra: prepare for release 0.4.20 [generated] [skip ci]
Signed-off-by: Aaron Pham <29749331+aarnphm@users.noreply.github.com>
|
2023-11-20 22:09:47 +00:00 |
|
Aaron Pham
|
46d6fcca98
|
infra: prepare for release 0.4.19 [generated] [skip ci]
Signed-off-by: Aaron Pham <29749331+aarnphm@users.noreply.github.com>
|
2023-11-20 08:06:53 +00:00 |
|
Aaron Pham
|
c1f86bda16
|
infra: prepare for release 0.4.18 [generated] [skip ci]
Signed-off-by: Aaron Pham <29749331+aarnphm@users.noreply.github.com>
|
2023-11-20 05:15:14 +00:00 |
|
Aaron Pham
|
6505abdb44
|
chore: update lower bound version of bentoml to avoid breakage (#703)
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>
|
2023-11-19 23:09:14 -05:00 |
|
Aaron Pham
|
d1915d7a9e
|
infra: prepare for release 0.4.17 [generated] [skip ci]
Signed-off-by: Aaron Pham <29749331+aarnphm@users.noreply.github.com>
|
2023-11-20 03:43:21 +00:00 |
|
Aaron Pham
|
e9207ff683
|
infra: prepare for release 0.4.16 [generated] [skip ci]
Signed-off-by: Aaron Pham <29749331+aarnphm@users.noreply.github.com>
|
2023-11-19 15:41:03 +00:00 |
|
Aaron
|
cb4386b013
|
fix(release): remove unnecessary check for client dependencies [skip ci]
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>
|
2023-11-19 10:39:38 -05:00 |
|
Aaron Pham
|
539f250c0f
|
feat(vllm): bump to 0.2.2 (#695)
* feat(vllm): bump to 0.2.2
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>
* chore: update changelog
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>
* chore: move up to CUDA 12.1
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>
* fix: remove auto-gptq installation
since the builder image doesn't have access to GPU
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>
* fix: update containerization warning
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>
---------
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>
|
2023-11-19 02:52:32 -05:00 |
|
Aaron Pham
|
206521e02d
|
feat(ctranslate): initial infrastructure support (#694)
* perf: compact and improve speed and agility
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>
* --wip--
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>
* chore: cleanup infrastructure
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>
* chore: update styles notes and autogen mypy configuration
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>
---------
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>
|
2023-11-19 01:48:33 -05:00 |
|
Aaron Pham
|
c19654adf3
|
infra: prepare for release 0.4.15 [generated] [skip ci]
Signed-off-by: Aaron Pham <29749331+aarnphm@users.noreply.github.com>
|
2023-11-19 00:47:18 +00:00 |
|
Aaron Pham
|
5402db1e61
|
infra: prepare for release 0.4.14 [generated] [skip ci]
Signed-off-by: Aaron Pham <29749331+aarnphm@users.noreply.github.com>
|
2023-11-17 21:54:10 +00:00 |
|
Aaron Pham
|
e14f3ffed5
|
infra: prepare for release 0.4.13 [generated] [skip ci]
Signed-off-by: Aaron Pham <29749331+aarnphm@users.noreply.github.com>
|
2023-11-17 21:06:56 +00:00 |
|
Aaron Pham
|
80ed400646
|
fix(build): lock lower version based on each release and update infra (#686)
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>
|
2023-11-17 15:57:31 -05:00 |
|
Aaron Pham
|
44f6db982d
|
infra: remove codegolf (#671)
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>
|
2023-11-16 17:38:47 -05:00 |
|
Aaron Pham
|
6102a67a83
|
infra: makes huggingface-hub requirements on fine-tune (#665)
infra: makes huggingface-hub core deps
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>
|
2023-11-16 03:12:52 -05:00 |
|
Aaron Pham
|
4a6f13ddd2
|
feat(type): provide structured annotations stubs (#663)
* feat(type): provide client stubs
separation of concerns for a more concise code base
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>
* docs: update changelog
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>
---------
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>
|
2023-11-16 02:58:45 -05:00 |
|
Aaron Pham
|
103156cd71
|
chore(cli): move playground to CLI components (#655)
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>
|
2023-11-14 23:20:50 -05:00 |
|
Aaron Pham
|
0bf6ec7537
|
fix(dependencies): lock build < 1 for now (#643)
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>
|
2023-11-14 00:36:08 -05:00 |
|
Zhao Shenyang
|
ae69524749
|
doc: update adding new model guide (#637)
* update
* Update openllm-python/ADDING_NEW_MODEL.md
Co-authored-by: Aaron Pham <29749331+aarnphm@users.noreply.github.com>
Signed-off-by: Zhao Shenyang <dev@zsy.im>
* Update openllm-python/ADDING_NEW_MODEL.md
Co-authored-by: Aaron Pham <29749331+aarnphm@users.noreply.github.com>
Signed-off-by: Zhao Shenyang <dev@zsy.im>
* Update openllm-python/ADDING_NEW_MODEL.md
Co-authored-by: Aaron Pham <29749331+aarnphm@users.noreply.github.com>
Signed-off-by: Zhao Shenyang <dev@zsy.im>
* move ADDING_NEW_MODEL.md to git root directory
---------
Signed-off-by: Zhao Shenyang <dev@zsy.im>
Co-authored-by: Aaron Pham <29749331+aarnphm@users.noreply.github.com>
|
2023-11-13 18:30:44 -05:00 |
|
Aaron Pham
|
e0632a85ed
|
refactor(cli): move out to its own packages (#619)
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>
|
2023-11-12 18:25:44 -05:00 |
|
Aaron Pham
|
b8a2e8cf91
|
refactor(cli): cleanup API (#592)
* chore: remove unused imports
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>
* refactor(cli): update to only need model_id
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>
* feat: `openllm start model-id`
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>
* chore: add changelog
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>
* chore: update changelog notice
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>
* chore: update correct config and running tools
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>
* chore: update backward compat options and treat JSON outputs
correspondingly
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>
---------
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>
|
2023-11-09 11:40:17 -05:00 |
|
Aaron Pham
|
cfd09bfc47
|
chore(runner): yield the outputs directly (#573)
update openai client examples to >1
Signed-off-by: Aaron Pham <29749331+aarnphm@users.noreply.github.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
|
2023-11-07 22:34:11 -05:00 |
|
Aaron Pham
|
e2029c934b
|
perf: unify LLM interface (#518)
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>
Signed-off-by: paperspace <29749331+aarnphm@users.noreply.github.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
|
2023-11-06 20:39:43 -05:00 |
|
Aaron Pham
|
72c6005d3b
|
chore(inference): update vllm to 0.2.1.post1 and update config parsing (#554)
chore(dependencies): update vllm to 0.2.1.post1 and update config
parsing
|
2023-11-04 04:01:56 -04:00 |
|
aarnphm-ec2-dev
|
65c76cace3
|
chore: update deps for transformers and vllm
Signed-off-by: aarnphm-ec2-dev <29749331+aarnphm@users.noreply.github.com>
|
2023-10-11 04:28:46 +00:00 |
|
XunchaoZ
|
04bb29a264
|
feat: OpenAI-compatible API (#417)
Co-authored-by: Aaron Pham <29749331+aarnphm@users.noreply.github.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
|
2023-10-07 00:50:03 -04:00 |
|
Aaron Pham
|
ad9107958d
|
feat: continuous batching with vLLM (#349)
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>
* feat: continuous batching
Signed-off-by: paperspace <29749331+aarnphm@users.noreply.github.com>
* chore: add changelog
Signed-off-by: paperspace <29749331+aarnphm@users.noreply.github.com>
* chore: add one shot generation
Signed-off-by: paperspace <29749331+aarnphm@users.noreply.github.com>
---------
Signed-off-by: paperspace <29749331+aarnphm@users.noreply.github.com>
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>
|
2023-09-14 03:09:36 -04:00 |
|
Aaron Pham
|
35e6945e86
|
fix(serialisation): vLLM safetensors support (#324)
* fix(serialisation): vllm support for safetensors
Signed-off-by: aarnphm-ec2-dev <29749331+aarnphm@users.noreply.github.com>
* chore: running tools
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>
* chore: generalize one shot generation
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>
* chore: add changelog [skip ci]
Signed-off-by: paperspace <29749331+aarnphm@users.noreply.github.com>
---------
Signed-off-by: aarnphm-ec2-dev <29749331+aarnphm@users.noreply.github.com>
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>
Signed-off-by: paperspace <29749331+aarnphm@users.noreply.github.com>
|
2023-09-12 17:44:01 -04:00 |
|
Aaron
|
0d50aa00b9
|
chore: add openllm-core as meta dependencies
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>
|
2023-09-07 10:31:40 -04:00 |
|
aarnphm-ec2-dev
|
8173cb09a5
|
fix(quantize): dyn quant for int8 and int4
only set tokenizer when it is gptq
Signed-off-by: aarnphm-ec2-dev <29749331+aarnphm@users.noreply.github.com>
|
2023-09-07 01:48:45 +00:00 |
|
Aaron
|
887ffa9aa0
|
chore: cleanup pre-commit jobs and update usage
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>
|
2023-09-05 10:06:36 -04:00 |
|
Aaron Pham
|
956b3a53bc
|
fix(gptq): use upstream integration (#297)
* wip
Signed-off-by: aarnphm-ec2-dev <29749331+aarnphm@users.noreply.github.com>
* feat: GPTQ transformers integration
Signed-off-by: aarnphm-ec2-dev <29749331+aarnphm@users.noreply.github.com>
* fix: only load if variable is available and add changelog
Signed-off-by: aarnphm-ec2-dev <29749331+aarnphm@users.noreply.github.com>
* chore: remove boilerplate check
Signed-off-by: aarnphm-ec2-dev <29749331+aarnphm@users.noreply.github.com>
---------
Signed-off-by: aarnphm-ec2-dev <29749331+aarnphm@users.noreply.github.com>
|
2023-09-04 14:05:50 -04:00 |
|
Aaron Pham
|
2036d4e015
|
chore(build): use latest vllm pre-built kernel (#261)
|
2023-08-26 09:02:52 -04:00 |
|