197 Commits

Author SHA1 Message Date
Aaron Pham
c33b071ee4 refactor: delete unused code (#716)
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>
2023-11-21 04:39:48 -05:00
Aaron Pham
a8a9f154ce fix(ci): tests (#715)
* fix: tests

Signed-off-by: Aaron Pham <29749331+aarnphm@users.noreply.github.com>

* chore: remove broken tests

Signed-off-by: Aaron Pham <29749331+aarnphm@users.noreply.github.com>

---------

Signed-off-by: Aaron Pham <29749331+aarnphm@users.noreply.github.com>
2023-11-21 03:05:22 -05:00
Aaron Pham
e70246ca5d feat(generation): add support for eos_token_id (#714)
chore: add support for custom eos_token_id

Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>
2023-11-21 02:01:36 -05:00
Aaron Pham
fde78a2c78 chore: cleanup unused prompt templates (#713)
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>
2023-11-21 01:56:51 -05:00
Aaron Pham
f3fd32d596 infra: prepare for release 0.4.22 [generated] [skip ci]
Signed-off-by: Aaron Pham <29749331+aarnphm@users.noreply.github.com>
2023-11-21 01:38:46 +00:00
Aaron Pham
ad4f388c98 refactor: update runner helpers and add max_model_len (#712)
* chore(runner): cleanup unnecessary checks for runnable backend

Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>

* chore: saving llm reference to runner

Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>

* chore: correct inject item

Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>

* chore: update support for max_seq_len

Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>

* fix: correct max_model_len

Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>

* chore: update and warning backward compatibility

Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>

* chore: remove unused sets

Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>

---------

Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>
2023-11-20 20:37:15 -05:00
Aaron Pham
4c4bc82a47 infra: prepare for release 0.4.21 [generated] [skip ci]
Signed-off-by: Aaron Pham <29749331+aarnphm@users.noreply.github.com>
2023-11-20 22:32:44 +00:00
Aaron
00e2666e48 fix(build): constraint packages for bentoml >1.1.10
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>
2023-11-20 17:30:38 -05:00
Aaron Pham
204cbd43d2 infra: prepare for release 0.4.20 [generated] [skip ci]
Signed-off-by: Aaron Pham <29749331+aarnphm@users.noreply.github.com>
2023-11-20 22:09:47 +00:00
Aaron
f753662ae6 fix(build): only load model when eager is True
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>
2023-11-20 17:06:25 -05:00
Aaron
5b92e848e2 fix: raises error if backend is not supported
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>
2023-11-20 17:03:30 -05:00
Aaron Pham
46d6fcca98 infra: prepare for release 0.4.19 [generated] [skip ci]
Signed-off-by: Aaron Pham <29749331+aarnphm@users.noreply.github.com>
2023-11-20 08:06:53 +00:00
Aaron Pham
c1f86bda16 infra: prepare for release 0.4.18 [generated] [skip ci]
Signed-off-by: Aaron Pham <29749331+aarnphm@users.noreply.github.com>
2023-11-20 05:15:14 +00:00
Aaron Pham
513c08ccda feat(openai): dynamic model_type registration (#704)
Signed-off-by: Aaron Pham <29749331+aarnphm@users.noreply.github.com>
2023-11-20 00:13:45 -05:00
Aaron Pham
6505abdb44 chore: update lower bound version of bentoml to avoid breakage (#703)
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>
2023-11-19 23:09:14 -05:00
Aaron Pham
d1915d7a9e infra: prepare for release 0.4.17 [generated] [skip ci]
Signed-off-by: Aaron Pham <29749331+aarnphm@users.noreply.github.com>
2023-11-20 03:43:21 +00:00
Aaron Pham
4491aa54d0 fix(backend): correct use variable for backend when initialisation (#702)
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>
2023-11-19 22:42:25 -05:00
Aaron Pham
e9207ff683 infra: prepare for release 0.4.16 [generated] [skip ci]
Signed-off-by: Aaron Pham <29749331+aarnphm@users.noreply.github.com>
2023-11-19 15:41:03 +00:00
Aaron
cb4386b013 fix(release): remove unnecessary check for client dependencies [skip ci]
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>
2023-11-19 10:39:38 -05:00
Aaron Pham
d80c392661 chore: update documentation about runtime (#699)
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>
2023-11-19 10:27:07 -05:00
Aaron Pham
816c1ee80e feat(engine): CTranslate2 (#698)
* chore: update instruction for dependencies

Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>

* feat(experimental): CTranslate2

Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>

---------

Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>
2023-11-19 10:25:08 -05:00
Aaron Pham
539f250c0f feat(vllm): bump to 0.2.2 (#695)
* feat(vllm): bump to 0.2.2

Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>

* chore: update changelog

Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>

* chore: move up to CUDA 12.1

Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>

* fix: remove auto-gptq installation

since the builder image doesn't have access to GPU

Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>

* fix: update containerization warning

Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>

---------

Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>
2023-11-19 02:52:32 -05:00
Aaron Pham
206521e02d feat(ctranslate): initial infrastructure support (#694)
* perf: compact and improve speed and agility

Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>

* --wip--

Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>

* chore: cleanup infrastructure

Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>

* chore: update styles notes and autogen mypy configuration

Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>

---------

Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>
2023-11-19 01:48:33 -05:00
Aaron Pham
c19654adf3 infra: prepare for release 0.4.15 [generated] [skip ci]
Signed-off-by: Aaron Pham <29749331+aarnphm@users.noreply.github.com>
2023-11-19 00:47:18 +00:00
Aaron Pham
1831d8f129 feat: heuristics logprobs (#692)
* fix(encoder): bring back T5 support on PyTorch

Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>

* feat: support logprobs and prompt_logprobs

Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>

* docs: update changelog

Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>

---------

Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>
2023-11-18 19:26:20 -05:00
Aaron Pham
4499469efb fix(annotations): check library through find_spec (#691)
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>
2023-11-18 02:02:16 -05:00
Aaron Pham
5402db1e61 infra: prepare for release 0.4.14 [generated] [skip ci]
Signed-off-by: Aaron Pham <29749331+aarnphm@users.noreply.github.com>
2023-11-17 21:54:10 +00:00
Aaron Pham
e14f3ffed5 infra: prepare for release 0.4.13 [generated] [skip ci]
Signed-off-by: Aaron Pham <29749331+aarnphm@users.noreply.github.com>
2023-11-17 21:06:56 +00:00
Aaron Pham
80ed400646 fix(build): lock lower version based on each release and update infra (#686)
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>
2023-11-17 15:57:31 -05:00
Aaron Pham
381d740a7a fix(llm): remove unnecessary check (#683)
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>
2023-11-17 11:23:22 -05:00
Aaron Pham
65370f6919 infra: prepare for release 0.4.12 [generated] [skip ci]
Signed-off-by: Aaron Pham <29749331+aarnphm@users.noreply.github.com>
2023-11-17 15:54:41 +00:00
Aaron Pham
14b3ceb436 fix(torch_dtype): correctly infer based on options (#682)
Users should be able to set the dtype during build, as it doesn't affect start time

Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>
2023-11-17 10:52:05 -05:00
Aaron Pham
7402408c5f fix(envvar): explicitly set NVIDIA_DRIVER_CAPABILITIES (#681)
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>
2023-11-17 10:40:45 -05:00
Aaron Pham
5752c3f0d8 infra: prepare for release 0.4.11 [generated] [skip ci]
Signed-off-by: Aaron Pham <29749331+aarnphm@users.noreply.github.com>
2023-11-17 14:53:12 +00:00
Aaron Pham
bce273ad47 fix(env): correct format environment on docker (#680)
* fix(env): correct format environment on docker

Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>

* docs: changelog

Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>

---------

Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>
2023-11-17 09:51:17 -05:00
Aaron Pham
c1e0e3eae7 fix(build): correctly parse default env for container (#679)
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>
2023-11-17 09:35:26 -05:00
Aaron Pham
60b60ed29a infra: update cbfmt options (#676)
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>
2023-11-17 07:51:33 -05:00
Aaron Pham
f4de4a9f13 infra: prepare for release 0.4.10 [generated] [skip ci]
Signed-off-by: Aaron Pham <29749331+aarnphm@users.noreply.github.com>
2023-11-17 06:16:58 +00:00
Aaron Pham
d60ca49d2f perf: potentially reduce image size (#675)
* perf: potentially reduce image size

Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>

* perf: use base python packages only

Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>

* fix: typo

Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>

* perf: Shave off 2GB

Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>

---------

Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>
2023-11-17 01:15:56 -05:00
Aaron Pham
09cc84a56c chore(loading): include verbose warning about trust_remote_code (#674)
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>
2023-11-16 20:09:50 -05:00
Aaron Pham
1a38de9b1f fix(docs): chatglm support on vLLM (#673)
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>
2023-11-16 17:54:06 -05:00
Aaron Pham
c850d76ccd feat(models): Phi 1.5 (#672)
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>
2023-11-16 17:48:10 -05:00
Aaron Pham
44f6db982d infra: remove codegolf (#671)
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>
2023-11-16 17:38:47 -05:00
Aaron Pham
8fdfd0491f perf(build): locking and improve build speed (#669)
* revert(build): not locking packages

Signed-off-by: Aaron Pham <29749331+aarnphm@users.noreply.github.com>

* perf: improve svars generation and unifying envvar parsing

Signed-off-by: Aaron Pham <29749331+aarnphm@users.noreply.github.com>

* docs: update changelog

Signed-off-by: Aaron Pham <29749331+aarnphm@users.noreply.github.com>

* chore: update stubs check for mypy

Signed-off-by: Aaron Pham <29749331+aarnphm@users.noreply.github.com>

---------

Signed-off-by: Aaron Pham <29749331+aarnphm@users.noreply.github.com>
2023-11-16 06:27:45 -05:00
Aaron Pham
fce8f223f3 perf: reduce footprint (#668)
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>
2023-11-16 04:45:49 -05:00
Aaron Pham
9e3f0fea15 types: update stubs for remaining entrypoints (#667)
* perf(type): static OpenAI types definition

Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>

* feat: add hf types

Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>

* types: update remaining missing stubs

Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>

---------

Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>
2023-11-16 04:26:13 -05:00
Aaron Pham
6102a67a83 infra: makes huggingface-hub requirements on fine-tune (#665)
infra: makes huggingface-hub core deps

Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>
2023-11-16 03:12:52 -05:00
Aaron Pham
86d23fd6f5 feat(llm): respect warnings environment for dtype warning (#664)
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>
2023-11-16 03:05:58 -05:00
Aaron Pham
4a6f13ddd2 feat(type): provide structured annotations stubs (#663)
* feat(type): provide client stubs

separation of concerns for a more concise code base

Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>

* docs: update changelog

Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>

---------

Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>
2023-11-16 02:58:45 -05:00
Kuan-Chun Wang
af88b9b077 fix(runner): remove keyword args for attrs.get() (#661)
2023-11-15 04:59:01 -05:00