Aaron
727361ced7
chore: running updated ruff formatter [skip ci]
...
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com >
2024-03-15 05:35:24 -04:00
Aaron Pham
072b3e97ec
feat: 1.2 APIs ( #821 )
...
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com >
Signed-off-by: Aaron Pham <29749331+aarnphm@users.noreply.github.com >
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
2024-03-15 03:49:19 -04:00
Aaron
e3392476be
revert: "ci: pre-commit autoupdate [pre-commit.ci] ( #931 )"
...
This reverts commit 7b00c84c2a.
2024-03-15 03:47:23 -04:00
pre-commit-ci[bot]
7b00c84c2a
ci: pre-commit autoupdate [pre-commit.ci] ( #931 )
...
* ci: pre-commit autoupdate [pre-commit.ci]
updates:
- [github.com/astral-sh/ruff-pre-commit: v0.2.2 → v0.3.2](https://github.com/astral-sh/ruff-pre-commit/compare/v0.2.2...v0.3.2 )
- [github.com/pre-commit/mirrors-eslint: v9.0.0-beta.0 → v9.0.0-beta.2](https://github.com/pre-commit/mirrors-eslint/compare/v9.0.0-beta.0...v9.0.0-beta.2 )
* ci: auto fixes from pre-commit.ci
For more information, see https://pre-commit.ci
---------
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
2024-03-15 03:46:28 -04:00
Fazli Sapuan
6b3a1bd708
chore: fix typo in list_models pydoc ( #847 )
2024-01-15 08:26:48 -05:00
Aaron Pham
5d27337e82
fix(cli): avoid runtime __origin__ check for older Python ( #798 )
...
fix(cli): avoid runtime __origin__ on older Python
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com >
2023-12-18 12:33:36 -05:00
Aaron Pham
c8c9663d06
fix(infra): conform ruff to 150 LL ( #781 )
...
Format the code with ruff format, plus manual style adjustments
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com >
2023-12-14 17:27:32 -05:00
Aaron Pham
2dbcfa8a0c
fix(cli): correct set arguments for openllm import and openllm build ( #775 )
...
* fix(cli): correct set arguments for `openllm import` and `openllm build`
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com >
* chore: update changelog
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com >
---------
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com >
2023-12-13 15:52:59 -05:00
Aaron Pham
9706228956
chore(vllm): add arguments for gpu memory utilization
...
Probably not going to fix anything, just delaying the problem.
Signed-off-by: Aaron Pham <29749331+aarnphm@users.noreply.github.com >
2023-11-29 06:45:14 +00:00
Aaron Pham
f0fa06004b
chore: revert back previous backend support PyTorch ( #739 )
...
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com >
2023-11-29 01:44:41 -05:00
Aaron
96318b65ee
fix(sdk): remove broken sdk
...
codebase now around 2.8k lines
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com >
2023-11-26 04:53:36 -05:00
Aaron Pham
aab173cd99
refactor: focus ( #730 )
...
* perf: remove based images
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com >
* chore: update changelog
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com >
* chore: move Dockerfile to run on release only
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com >
* chore: cleanup unused types
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com >
---------
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com >
2023-11-24 01:11:31 -05:00
Aaron Pham
52a44b1bfa
chore: cleanup loader ( #729 )
...
Signed-off-by: Aaron Pham <29749331+aarnphm@users.noreply.github.com >
2023-11-22 21:51:51 -05:00
Aaron Pham
5442d9cd10
fix(trust_remote_code): handle args correctly ( #727 )
...
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com >
2023-11-22 17:03:13 -05:00
Aaron Pham
b28b5269b5
feat(openai): chat templates and complete control of prompt generation ( #725 )
...
* feat(openai): chat templates and complete control of prompt generation
Signed-off-by: Aaron Pham <29749331+aarnphm@users.noreply.github.com >
* fix: correctly use base chat templates
Signed-off-by: Aaron Pham <29749331+aarnphm@users.noreply.github.com >
* fix: remove symlink
Signed-off-by: Aaron Pham <29749331+aarnphm@users.noreply.github.com >
---------
Signed-off-by: Aaron Pham <29749331+aarnphm@users.noreply.github.com >
2023-11-22 06:49:14 -05:00
Aaron Pham
63d86faa32
fix(openai): correct stop tokens and finish_reason state ( #722 )
...
Signed-off-by: Aaron Pham <29749331+aarnphm@users.noreply.github.com >
2023-11-22 04:21:13 -05:00
Aaron Pham
38b7c44df0
fix(base-image): update base image to include cuda for now ( #720 )
...
* fix(base-image): update base image to include cuda for now
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com >
* fix: build core and client on release images
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com >
* chore: cleanup style changes
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com >
---------
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com >
2023-11-22 01:15:19 -05:00
Aaron Pham
77bd6f090a
chore(logger): fix warnings and streamline style ( #717 )
...
Sorry, but there is too much wasted spacing in `_llm.py`, and I'm unhappy and unproductive any time I look at it or want to do anything with it
---------
Signed-off-by: Aaron Pham <29749331+aarnphm@users.noreply.github.com >
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com >
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
2023-11-21 18:54:51 -05:00
Aaron Pham
c33b071ee4
refactor: delete unused code ( #716 )
...
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com >
2023-11-21 04:39:48 -05:00
Aaron Pham
fde78a2c78
chore: cleanup unused prompt templates ( #713 )
...
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com >
2023-11-21 01:56:51 -05:00
Aaron Pham
ad4f388c98
refactor: update runner helpers and add max_model_len ( #712 )
...
* chore(runner): cleanup unnecessary checks for runnable backend
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com >
* chore: saving llm reference to runner
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com >
* chore: correct inject item
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com >
* chore: update support for max_seq_len
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com >
* fix: correct max_model_len
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com >
* chore: update and warn about backward compatibility
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com >
* chore: remove unused sets
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com >
---------
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com >
2023-11-20 20:37:15 -05:00
Aaron
f753662ae6
fix(build): only load model when eager is True
...
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com >
2023-11-20 17:06:25 -05:00
Aaron
5b92e848e2
fix: raises error if backend is not supported
...
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com >
2023-11-20 17:03:30 -05:00
Aaron Pham
816c1ee80e
feat(engine): CTranslate2 ( #698 )
...
* chore: update instruction for dependencies
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com >
* feat(experimental): CTranslate2
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com >
---------
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com >
2023-11-19 10:25:08 -05:00
Aaron Pham
14b3ceb436
fix(torch_dtype): correctly infer based on options ( #682 )
...
Users should be able to set the dtype during build, as it doesn't affect start time
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com >
2023-11-17 10:52:05 -05:00
Aaron Pham
bce273ad47
fix(env): correct format environment on docker ( #680 )
...
* fix(env): correct format environment on docker
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com >
* docs: changelog
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com >
---------
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com >
2023-11-17 09:51:17 -05:00
Aaron Pham
c1e0e3eae7
fix(build): correctly parse default env for container ( #679 )
...
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com >
2023-11-17 09:35:26 -05:00
Aaron Pham
8fdfd0491f
perf(build): locking and improve build speed ( #669 )
...
* revert(build): not locking packages
Signed-off-by: Aaron Pham <29749331+aarnphm@users.noreply.github.com >
* perf: improve svars generation and unifying envvar parsing
Signed-off-by: Aaron Pham <29749331+aarnphm@users.noreply.github.com >
* docs: update changelog
Signed-off-by: Aaron Pham <29749331+aarnphm@users.noreply.github.com >
* chore: update stubs check for mypy
Signed-off-by: Aaron Pham <29749331+aarnphm@users.noreply.github.com >
---------
Signed-off-by: Aaron Pham <29749331+aarnphm@users.noreply.github.com >
2023-11-16 06:27:45 -05:00
Aaron Pham
9e3f0fea15
types: update stubs for remaining entrypoints ( #667 )
...
* perf(type): static OpenAI types definition
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com >
* feat: add hf types
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com >
* types: update remaining missing stubs
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com >
---------
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com >
2023-11-16 04:26:13 -05:00
Aaron Pham
4a6f13ddd2
feat(type): provide structured annotations stubs ( #663 )
...
* feat(type): provide client stubs
separation of concerns for a more concise codebase
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com >
* docs: update changelog
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com >
---------
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com >
2023-11-16 02:58:45 -05:00
Aaron Pham
a58d947bc8
perf: improve build logics and cleanup speed ( #657 )
...
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com >
2023-11-15 00:18:31 -05:00
Aaron Pham
103156cd71
chore(cli): move playground to CLI components ( #655 )
...
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com >
2023-11-14 23:20:50 -05:00
Aaron Pham
6a6d689a77
feat: Yi models ( #651 )
...
Signed-off-by: Aaron Pham <29749331+aarnphm@users.noreply.github.com >
2023-11-14 21:55:24 -05:00
Aaron Pham
b4b70e2f20
fix(cli): update context name parsing correctly ( #652 )
...
Signed-off-by: Aaron Pham <29749331+aarnphm@users.noreply.github.com >
2023-11-14 21:53:56 -05:00
Aaron Pham
b30a412398
fix(cli): set default dtype to auto infer ( #642 )
...
Signed-off-by: Aaron Pham <29749331+aarnphm@users.noreply.github.com >
2023-11-13 23:05:27 -05:00
pre-commit-ci[bot]
52367d1e8b
ci: pre-commit autoupdate [pre-commit.ci] ( #629 )
...
* ci: pre-commit autoupdate [pre-commit.ci]
updates:
- [github.com/pre-commit/mirrors-prettier: v3.0.3 → v3.1.0](https://github.com/pre-commit/mirrors-prettier/compare/v3.0.3...v3.1.0 )
* ci: auto fixes from pre-commit.ci
For more information, see https://pre-commit.ci
---------
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
2023-11-13 13:07:53 -05:00
Aaron Pham
852cd863a9
fix(cli): make sure to pass the dtype to subprocess service ( #628 )
...
Signed-off-by: Aaron Pham <29749331+aarnphm@users.noreply.github.com >
2023-11-13 05:32:17 -05:00
Aaron Pham
099c0dc31b
feat(cli): --dtype arguments ( #627 )
...
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com >
2023-11-13 05:25:50 -05:00
Aaron Pham
126e6c9d63
fix(ruff): correct consistency between isort and formatter ( #624 )
...
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com >
2023-11-12 21:12:50 -05:00
Aaron Pham
de04de7136
fix(sdk): make sure build to quiet out stdout ( #622 )
...
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com >
2023-11-12 18:59:48 -05:00
Aaron Pham
e667dac82f
chore(cli): always show available models ( #621 )
...
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com >
2023-11-12 18:36:19 -05:00
Aaron Pham
c50a7db80d
fix(cli): correct set working_dir ( #620 )
...
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com >
2023-11-12 18:34:11 -05:00
Aaron Pham
e0632a85ed
refactor(cli): move out to its own packages ( #619 )
...
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com >
2023-11-12 18:25:44 -05:00