222 Commits

Author SHA1 Message Date
Aaron
69aae34cf4 fix(style): reduce boilerplate and format to custom logics
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>
2023-11-26 01:44:59 -05:00
Aaron Pham
e27764fe6b infra: prepare for release 0.4.28 [generated] [skip ci]
Signed-off-by: Aaron Pham <29749331+aarnphm@users.noreply.github.com>
2023-11-24 07:09:06 +00:00
MingLiangDai
7b8d9024c4 fix(baichuan): only support baichuan 2 from now on (#728)
* config support multiple architectures

* chore: only support baichuan2 from now on

Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>

* chore: update notes

Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>

* chore: run script [skip ci]

Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>

---------

Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>
Co-authored-by: Aaron <29749331+aarnphm@users.noreply.github.com>
2023-11-24 02:07:06 -05:00
Aaron Pham
d8a783772d infra: prepare for release 0.4.27 [generated] [skip ci]
Signed-off-by: Aaron Pham <29749331+aarnphm@users.noreply.github.com>
2023-11-24 06:25:16 +00:00
Aaron
b4c9971678 fix(build): explicitly not lock packages
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>
2023-11-24 01:21:29 -05:00
Aaron
d0e12b1fb8 fix(metadata): remove unused packages
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>
2023-11-24 01:19:09 -05:00
Aaron
7dd4e3ac4b fix(build): don't lock packages for now, but do lock base requirements
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>
2023-11-24 01:17:45 -05:00
Aaron
7beaa92c2b fix(types): using correct refactored literal
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>
2023-11-24 01:14:29 -05:00
Aaron Pham
aab173cd99 refactor: focus (#730)
* perf: remove base images

Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>

* chore: update changelog

Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>

* chore: move dockerfile to run on release only

Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>

* chore: cleanup unused types

Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>

---------

Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>
2023-11-24 01:11:31 -05:00
Aaron Pham
52a44b1bfa chore: cleanup loader (#729)
Signed-off-by: Aaron Pham <29749331+aarnphm@users.noreply.github.com>
2023-11-22 21:51:51 -05:00
Aaron Pham
5442d9cd10 fix(trust_remote_code): handle args correctly (#727)
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>
2023-11-22 17:03:13 -05:00
Aaron Pham
7eae50377d infra: prepare for release 0.4.26 [generated] [skip ci]
Signed-off-by: Aaron Pham <29749331+aarnphm@users.noreply.github.com>
2023-11-22 11:50:50 +00:00
Aaron Pham
b28b5269b5 feat(openai): chat templates and complete control of prompt generation (#725)
* feat(openai): chat templates and complete control of prompt generation

Signed-off-by: Aaron Pham <29749331+aarnphm@users.noreply.github.com>

* fix: correctly use base chat templates

Signed-off-by: Aaron Pham <29749331+aarnphm@users.noreply.github.com>

* fix: remove symlink

Signed-off-by: Aaron Pham <29749331+aarnphm@users.noreply.github.com>

---------

Signed-off-by: Aaron Pham <29749331+aarnphm@users.noreply.github.com>
2023-11-22 06:49:14 -05:00
Aaron Pham
f83f64ffd7 fix(infra): setup higher timer for building container images (#723)
* fix(infra): setup higher timer for building container images

Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>

* chore: remove invalid tests

Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>

---------

Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>
2023-11-22 05:00:33 -05:00
Aaron Pham
0189342730 infra: prepare for release 0.4.25 [generated] [skip ci]
Signed-off-by: Aaron Pham <29749331+aarnphm@users.noreply.github.com>
2023-11-22 09:22:45 +00:00
Aaron Pham
63d86faa32 fix(openai): correct stop tokens and finish_reason state (#722)
Signed-off-by: Aaron Pham <29749331+aarnphm@users.noreply.github.com>
2023-11-22 04:21:13 -05:00
Aaron Pham
7f09f9daf2 infra: prepare for release 0.4.24 [generated] [skip ci]
Signed-off-by: Aaron Pham <29749331+aarnphm@users.noreply.github.com>
2023-11-22 06:34:30 +00:00
Aaron
d697ea3903 fix(image): setup correct installation
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>
2023-11-22 01:33:26 -05:00
Aaron Pham
85e03a4b92 infra: prepare for release 0.4.23 [generated] [skip ci]
Signed-off-by: Aaron Pham <29749331+aarnphm@users.noreply.github.com>
2023-11-22 06:16:49 +00:00
Aaron Pham
38b7c44df0 fix(base-image): update base image to include cuda for now (#720)
* fix(base-image): update base image to include cuda for now

Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>

* fix: build core and client on release images

Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>

* chore: cleanup style changes

Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>

---------

Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>
2023-11-22 01:15:19 -05:00
Aaron Pham
8bb2742a9a chore(types): append additional types change (#719)
* chore(types): append additional types change

Signed-off-by: Aaron Pham <29749331+aarnphm@users.noreply.github.com>

* chore: add arguments for parsing dir

Signed-off-by: Aaron Pham <29749331+aarnphm@users.noreply.github.com>

---------

Signed-off-by: Aaron Pham <29749331+aarnphm@users.noreply.github.com>
2023-11-21 22:38:20 -05:00
Aaron Pham
04ef08a7f8 chore(strategy): compact and add stubs (#718)
generate service_vars automatically inline without reading from files

Signed-off-by: Aaron Pham <29749331+aarnphm@users.noreply.github.com>
2023-11-21 21:49:28 -05:00
Aaron Pham
909db8c3bf refactor: reduce compiled cacheline
Signed-off-by: Aaron Pham <29749331+aarnphm@users.noreply.github.com>
2023-11-22 02:27:42 +00:00
Aaron Pham
77bd6f090a chore(logger): fix warnings and streamline style (#717)
Sorry, but there is too much wasted spacing in `_llm.py`, and I'm unhappy and unproductive any time I look at it or want to do anything with it

---------

Signed-off-by: Aaron Pham <29749331+aarnphm@users.noreply.github.com>
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
2023-11-21 18:54:51 -05:00
Aaron
14242a7ab8 fix(utils): correct import
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>
2023-11-21 05:03:20 -05:00
Aaron Pham
c33b071ee4 refactor: delete unused code (#716)
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>
2023-11-21 04:39:48 -05:00
Aaron Pham
a8a9f154ce fix(ci): tests (#715)
* fix: tests

Signed-off-by: Aaron Pham <29749331+aarnphm@users.noreply.github.com>

* chore: remove broken tests

Signed-off-by: Aaron Pham <29749331+aarnphm@users.noreply.github.com>

---------

Signed-off-by: Aaron Pham <29749331+aarnphm@users.noreply.github.com>
2023-11-21 03:05:22 -05:00
Aaron Pham
e70246ca5d feat(generation): add support for eos_token_id (#714)
chore: add support for custom eos_token_id

Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>
2023-11-21 02:01:36 -05:00
Aaron Pham
fde78a2c78 chore: cleanup unused prompt templates (#713)
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>
2023-11-21 01:56:51 -05:00
Aaron Pham
f3fd32d596 infra: prepare for release 0.4.22 [generated] [skip ci]
Signed-off-by: Aaron Pham <29749331+aarnphm@users.noreply.github.com>
2023-11-21 01:38:46 +00:00
Aaron Pham
ad4f388c98 refactor: update runner helpers and add max_model_len (#712)
* chore(runner): cleanup unnecessary checks for runnable backend

Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>

* chore: saving llm reference to runner

Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>

* chore: correct inject item

Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>

* chore: update support for max_seq_len

Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>

* fix: correct max_model_len

Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>

* chore: update and warn about backward compatibility

Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>

* chore: remove unused sets

Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>

---------

Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>
2023-11-20 20:37:15 -05:00
Aaron Pham
4c4bc82a47 infra: prepare for release 0.4.21 [generated] [skip ci]
Signed-off-by: Aaron Pham <29749331+aarnphm@users.noreply.github.com>
2023-11-20 22:32:44 +00:00
Aaron
00e2666e48 fix(build): constrain packages for bentoml >1.1.10
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>
2023-11-20 17:30:38 -05:00
Aaron Pham
204cbd43d2 infra: prepare for release 0.4.20 [generated] [skip ci]
Signed-off-by: Aaron Pham <29749331+aarnphm@users.noreply.github.com>
2023-11-20 22:09:47 +00:00
Aaron
f753662ae6 fix(build): only load model when eager is True
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>
2023-11-20 17:06:25 -05:00
Aaron
5b92e848e2 fix: raises error if backend is not supported
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>
2023-11-20 17:03:30 -05:00
Aaron Pham
46d6fcca98 infra: prepare for release 0.4.19 [generated] [skip ci]
Signed-off-by: Aaron Pham <29749331+aarnphm@users.noreply.github.com>
2023-11-20 08:06:53 +00:00
Aaron Pham
c1f86bda16 infra: prepare for release 0.4.18 [generated] [skip ci]
Signed-off-by: Aaron Pham <29749331+aarnphm@users.noreply.github.com>
2023-11-20 05:15:14 +00:00
Aaron Pham
513c08ccda feat(openai): dynamic model_type registration (#704)
Signed-off-by: Aaron Pham <29749331+aarnphm@users.noreply.github.com>
2023-11-20 00:13:45 -05:00
Aaron Pham
6505abdb44 chore: update lower bound version of bentoml to avoid breakage (#703)
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>
2023-11-19 23:09:14 -05:00
Aaron Pham
d1915d7a9e infra: prepare for release 0.4.17 [generated] [skip ci]
Signed-off-by: Aaron Pham <29749331+aarnphm@users.noreply.github.com>
2023-11-20 03:43:21 +00:00
Aaron Pham
4491aa54d0 fix(backend): correctly use variable for backend during initialisation (#702)
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>
2023-11-19 22:42:25 -05:00
Aaron Pham
e9207ff683 infra: prepare for release 0.4.16 [generated] [skip ci]
Signed-off-by: Aaron Pham <29749331+aarnphm@users.noreply.github.com>
2023-11-19 15:41:03 +00:00
Aaron
cb4386b013 fix(release): remove unnecessary check for client dependencies [skip ci]
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>
2023-11-19 10:39:38 -05:00
Aaron Pham
d80c392661 chore: update documentation about runtime (#699)
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>
2023-11-19 10:27:07 -05:00
Aaron Pham
816c1ee80e feat(engine): CTranslate2 (#698)
* chore: update instruction for dependencies

Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>

* feat(experimental): CTranslate2

Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>

---------

Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>
2023-11-19 10:25:08 -05:00
Aaron Pham
539f250c0f feat(vllm): bump to 0.2.2 (#695)
* feat(vllm): bump to 0.2.2

Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>

* chore: update changelog

Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>

* chore: move up to CUDA 12.1

Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>

* fix: remove auto-gptq installation

since the builder image doesn't have access to GPU

Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>

* fix: update containerization warning

Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>

---------

Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>
2023-11-19 02:52:32 -05:00
Aaron Pham
206521e02d feat(ctranslate): initial infrastructure support (#694)
* perf: compact and improve speed and agility

Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>

* --wip--

Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>

* chore: cleanup infrastructure

Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>

* chore: update styles notes and autogen mypy configuration

Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>

---------

Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>
2023-11-19 01:48:33 -05:00
Aaron Pham
c19654adf3 infra: prepare for release 0.4.15 [generated] [skip ci]
Signed-off-by: Aaron Pham <29749331+aarnphm@users.noreply.github.com>
2023-11-19 00:47:18 +00:00
Aaron Pham
1831d8f129 feat: heuristics logprobs (#692)
* fix(encoder): bring back T5 support on PyTorch

Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>

* feat: support logprobs and prompt_logprobs

Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>

* docs: update changelog

Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>

---------

Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>
2023-11-18 19:26:20 -05:00