OpenLLM

mirror of https://github.com/bentoml/OpenLLM.git synced 2026-01-17 03:47:54 -05:00

Author	SHA1	Message	Date
Aaron	727361ced7	chore: running updated ruff formatter [skip ci] Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>	2024-03-15 05:35:24 -04:00
Aaron Pham	072b3e97ec	feat: 1.2 APIs (#821) Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com> Signed-off-by: Aaron Pham <29749331+aarnphm@users.noreply.github.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>	2024-03-15 03:49:19 -04:00
Aaron	e3392476be	revert: "ci: pre-commit autoupdate [pre-commit.ci] (#931)" This reverts commit `7b00c84c2a`.	2024-03-15 03:47:23 -04:00
pre-commit-ci[bot]	7b00c84c2a	ci: pre-commit autoupdate [pre-commit.ci] (#931) * ci: pre-commit autoupdate [pre-commit.ci] updates: - [github.com/astral-sh/ruff-pre-commit: v0.2.2 → v0.3.2](https://github.com/astral-sh/ruff-pre-commit/compare/v0.2.2...v0.3.2) - [github.com/pre-commit/mirrors-eslint: v9.0.0-beta.0 → v9.0.0-beta.2](https://github.com/pre-commit/mirrors-eslint/compare/v9.0.0-beta.0...v9.0.0-beta.2) * ci: auto fixes from pre-commit.ci For more information, see https://pre-commit.ci --------- Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>	2024-03-15 03:46:28 -04:00
Fazli Sapuan	6b3a1bd708	chore: fix typo in list_models pydoc (#847)	2024-01-15 08:26:48 -05:00
Aaron Pham	5d27337e82	fix(cli): avoid runtime `__origin__` check for older Python (#798) fix(cli): avoid runtime __origin__ on older Python Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>	2023-12-18 12:33:36 -05:00
Aaron Pham	c8c9663d06	fix(infra): conform ruff to 150 LL (#781) Generally correctly format it with ruff format and manual style Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>	2023-12-14 17:27:32 -05:00
Aaron Pham	2dbcfa8a0c	fix(cli): correct set arguments for `openllm import` and `openllm build` (#775) * fix(cli): correct set arguments for `openllm import` and `openllm build` Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com> * chore: update changelog Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com> --------- Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>	2023-12-13 15:52:59 -05:00
Aaron Pham	9706228956	chore(vllm): add arguments for gpu memory utilization Probably not going to fix anything, just delaying the problem. Signed-off-by: Aaron Pham <29749331+aarnphm@users.noreply.github.com>	2023-11-29 06:45:14 +00:00
Aaron Pham	f0fa06004b	chore: revert back previous backend support PyTorch (#739) Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>	2023-11-29 01:44:41 -05:00
Aaron	96318b65ee	fix(sdk): remove broken sdk codespace now around 2.8k lines Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>	2023-11-26 04:53:36 -05:00
Aaron Pham	aab173cd99	refactor: focus (#730) * perf: remove based images Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com> * chore: update changelog Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com> * chore: move dockerifle to run on release only Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com> * chore: cleanup unused types Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com> --------- Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>	2023-11-24 01:11:31 -05:00
Aaron Pham	52a44b1bfa	chore: cleanup loader (#729) Signed-off-by: Aaron Pham <29749331+aarnphm@users.noreply.github.com>	2023-11-22 21:51:51 -05:00
Aaron Pham	5442d9cd10	fix(trust_remote_code): handle args correctly (#727) Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>	2023-11-22 17:03:13 -05:00
Aaron Pham	b28b5269b5	feat(openai): chat templates and complete control of prompt generation (#725) * feat(openai): chat templates and complete control of prompt generation Signed-off-by: Aaron Pham <29749331+aarnphm@users.noreply.github.com> * fix: correctly use base chat templates Signed-off-by: Aaron Pham <29749331+aarnphm@users.noreply.github.com> * fix: remove symlink Signed-off-by: Aaron Pham <29749331+aarnphm@users.noreply.github.com> --------- Signed-off-by: Aaron Pham <29749331+aarnphm@users.noreply.github.com>	2023-11-22 06:49:14 -05:00
Aaron Pham	63d86faa32	fix(openai): correct stop tokens and finish_reason state (#722) Signed-off-by: Aaron Pham <29749331+aarnphm@users.noreply.github.com>	2023-11-22 04:21:13 -05:00
Aaron Pham	38b7c44df0	fix(base-image): update base image to include cuda for now (#720) * fix(base-image): update base image to include cuda for now Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com> * fix: build core and client on release images Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com> * chore: cleanup style changes Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com> --------- Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>	2023-11-22 01:15:19 -05:00
Aaron Pham	77bd6f090a	chore(logger): fix warnings and streamline style (#717) Sorry but there are too much wasted spacing in `_llm.py`, and I'm unhappy and not productive anytime I look or want to do anything with it --------- Signed-off-by: Aaron Pham <29749331+aarnphm@users.noreply.github.com> Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>	2023-11-21 18:54:51 -05:00
Aaron Pham	c33b071ee4	refactor: delete unused code (#716) Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>	2023-11-21 04:39:48 -05:00
Aaron Pham	fde78a2c78	chore: cleanup unused prompt templates (#713) Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>	2023-11-21 01:56:51 -05:00
Aaron Pham	ad4f388c98	refactor: update runner helpers and add max_model_len (#712) * chore(runner): cleanup unecessary checks for runnable backend Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com> * chore: saving llm reference to runner Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com> * chore: correct inject item Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com> * chore: update support for max_seq_len Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com> * fix: correct max_model_len Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com> * chore: update and warning backward compatibility Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com> * chore: remove unused sets Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com> --------- Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>	2023-11-20 20:37:15 -05:00
Aaron	f753662ae6	fix(build): only load model when eager is True Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>	2023-11-20 17:06:25 -05:00
Aaron	5b92e848e2	fix: raises error if backend is not supported Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>	2023-11-20 17:03:30 -05:00
Aaron Pham	816c1ee80e	feat(engine): CTranslate2 (#698) * chore: update instruction for dependencies Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com> * feat(experimental): CTranslate2 Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com> --------- Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>	2023-11-19 10:25:08 -05:00
Aaron Pham	14b3ceb436	fix(torch_dtype): correctly infer based on options (#682) Users should be able to set the dtype during build, as we it doesn't effect start time Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>	2023-11-17 10:52:05 -05:00
Aaron Pham	bce273ad47	fix(env): correct format environment on docker (#680) * fix(env): correct format environment on docker Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com> * docs: changelog Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com> --------- Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>	2023-11-17 09:51:17 -05:00
Aaron Pham	c1e0e3eae7	fix(build): correctly parse default env for container (#679) Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>	2023-11-17 09:35:26 -05:00
Aaron Pham	8fdfd0491f	perf(build): locking and improve build speed (#669) * revert(build): not locking packages Signed-off-by: Aaron Pham <29749331+aarnphm@users.noreply.github.com> * perf: improve svars generation and unifying envvar parsing Signed-off-by: Aaron Pham <29749331+aarnphm@users.noreply.github.com> * docs: update changelog Signed-off-by: Aaron Pham <29749331+aarnphm@users.noreply.github.com> * chore: update stubs check for mypy Signed-off-by: Aaron Pham <29749331+aarnphm@users.noreply.github.com> --------- Signed-off-by: Aaron Pham <29749331+aarnphm@users.noreply.github.com>	2023-11-16 06:27:45 -05:00
Aaron Pham	9e3f0fea15	types: update stubs for remaining entrypoints (#667) * perf(type): static OpenAI types definition Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com> * feat: add hf types Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com> * types: update remaining missing stubs Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com> --------- Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>	2023-11-16 04:26:13 -05:00
Aaron Pham	4a6f13ddd2	feat(type): provide structured annotations stubs (#663) * feat(type): provide client stubs separation of concern for more brevity code base Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com> * docs: update changelog Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com> --------- Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>	2023-11-16 02:58:45 -05:00
Aaron Pham	a58d947bc8	perf: improve build logics and cleanup speed (#657) Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>	2023-11-15 00:18:31 -05:00
Aaron Pham	103156cd71	chore(cli): move playground to CLI components (#655) Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>	2023-11-14 23:20:50 -05:00
Aaron Pham	6a6d689a77	feat: Yi models (#651) Signed-off-by: Aaron Pham <29749331+aarnphm@users.noreply.github.com>	2023-11-14 21:55:24 -05:00
Aaron Pham	b4b70e2f20	fix(cli): update context name parsing correctly (#652) Signed-off-by: Aaron Pham <29749331+aarnphm@users.noreply.github.com>	2023-11-14 21:53:56 -05:00
Aaron Pham	b30a412398	fix(cli): set default dtype to auto infer (#642) Signed-off-by: Aaron Pham <29749331+aarnphm@users.noreply.github.com>	2023-11-13 23:05:27 -05:00
pre-commit-ci[bot]	52367d1e8b	ci: pre-commit autoupdate [pre-commit.ci] (#629) * ci: pre-commit autoupdate [pre-commit.ci] updates: - [github.com/pre-commit/mirrors-prettier: v3.0.3 → v3.1.0](https://github.com/pre-commit/mirrors-prettier/compare/v3.0.3...v3.1.0) * ci: auto fixes from pre-commit.ci For more information, see https://pre-commit.ci --------- Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>	2023-11-13 13:07:53 -05:00
Aaron Pham	852cd863a9	fix(cli): make sure to pass the dtype to subprocess service (#628) Signed-off-by: Aaron Pham <29749331+aarnphm@users.noreply.github.com>	2023-11-13 05:32:17 -05:00
Aaron Pham	099c0dc31b	feat(cli): `--dtype` arguments (#627) Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>	2023-11-13 05:25:50 -05:00
Aaron Pham	126e6c9d63	fix(ruff): correct consistency between isort and formatter (#624) Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>	2023-11-12 21:12:50 -05:00
Aaron Pham	de04de7136	fix(sdk): make sure build to quiet out stdout (#622) Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>	2023-11-12 18:59:48 -05:00
Aaron Pham	e667dac82f	chore(cli): always show available models (#621) Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>	2023-11-12 18:36:19 -05:00
Aaron Pham	c50a7db80d	fix(cli): correct set working_dir (#620) Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>	2023-11-12 18:34:11 -05:00
Aaron Pham	e0632a85ed	refactor(cli): move out to its own packages (#619) Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>	2023-11-12 18:25:44 -05:00

43 Commits