Aaron Pham
1d817a7e01
fix: add support for min_tokens
Signed-off-by: Aaron Pham <29749331+aarnphm@users.noreply.github.com >
2024-04-02 04:22:14 +00:00
Aaron Pham
5c0d2787c0
feat: add dbrx support
Signed-off-by: Aaron Pham <29749331+aarnphm@users.noreply.github.com >
2024-04-02 04:10:19 +00:00
Aaron Pham
3ef93fe371
chore: update support development_mode as DEBUG and support for RELOAD envvar
Signed-off-by: Aaron Pham <29749331+aarnphm@users.noreply.github.com >
2024-03-22 01:19:32 +00:00
Aaron Pham
80b35f0d72
revert: correct type for openapi schema generation
Signed-off-by: Aaron Pham <29749331+aarnphm@users.noreply.github.com >
2024-03-21 07:51:00 +00:00
Aaron Pham
51bec78ee9
fix(load): make sure to respect MAX_MODEL_LEN from env
Signed-off-by: Aaron Pham <29749331+aarnphm@users.noreply.github.com >
2024-03-21 07:44:49 +00:00
Aaron Pham
072b3e97ec
feat: 1.2 APIs ( #821 )
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com >
Signed-off-by: Aaron Pham <29749331+aarnphm@users.noreply.github.com >
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
2024-03-15 03:49:19 -04:00
Aaron
e3392476be
revert: "ci: pre-commit autoupdate [pre-commit.ci] ( #931 )"
This reverts commit 7b00c84c2a.
2024-03-15 03:47:23 -04:00
pre-commit-ci[bot]
7b00c84c2a
ci: pre-commit autoupdate [pre-commit.ci] ( #931 )
* ci: pre-commit autoupdate [pre-commit.ci]
updates:
- [github.com/astral-sh/ruff-pre-commit: v0.2.2 → v0.3.2](https://github.com/astral-sh/ruff-pre-commit/compare/v0.2.2...v0.3.2)
- [github.com/pre-commit/mirrors-eslint: v9.0.0-beta.0 → v9.0.0-beta.2](https://github.com/pre-commit/mirrors-eslint/compare/v9.0.0-beta.0...v9.0.0-beta.2)
* ci: auto fixes from pre-commit.ci
For more information, see https://pre-commit.ci
---------
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
2024-03-15 03:46:28 -04:00
Zhao Shenyang
16d8caf2ee
chore: bump up bentoml version to 1.1.11 ( #883 )
2024-02-04 21:31:14 +08:00
Zhao Shenyang
9d0e292076
fix: limit BentoML version range ( #881 )
2024-02-04 16:59:21 +08:00
Zhao Shenyang
6c909aabdb
chore: set stop to empty list by default ( #878 )
2024-02-02 19:28:49 +08:00
Aaron Pham
2bb97f8ba2
chore: update discord link ( #838 )
* Update pyproject.toml
Signed-off-by: Aaron Pham <29749331+aarnphm@users.noreply.github.com >
* Update pyproject.toml
Signed-off-by: Aaron Pham <29749331+aarnphm@users.noreply.github.com >
* Update pyproject.toml
Signed-off-by: Aaron Pham <29749331+aarnphm@users.noreply.github.com >
* Update pyproject.toml
Signed-off-by: Aaron Pham <29749331+aarnphm@users.noreply.github.com >
---------
Signed-off-by: Aaron Pham <29749331+aarnphm@users.noreply.github.com >
2024-01-08 19:09:51 -05:00
Aaron Pham
88b6d3d6de
perf: upgrade mixtral to use expert parallelism ( #783 )
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com >
2023-12-15 11:45:08 -05:00
Aaron Pham
c8c9663d06
fix(infra): conform ruff to 150 LL ( #781 )
Generally correctly format it with ruff format and manual style
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com >
2023-12-14 17:27:32 -05:00
Aaron Pham
10f508d051
fix(mixtral): correct chat templates to remove additional spacing ( #774 )
Signed-off-by: Aaron Pham <29749331+aarnphm@users.noreply.github.com >
2023-12-13 12:34:06 -05:00
Aaron Pham
3ab78cd105
feat(mixtral): correct support for mixtral ( #772 )
feat(mixtral): support inference with pt
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com >
2023-12-13 09:03:56 -05:00
Aaron Pham
6a185bc88e
fix(logprobs): explicitly set logprobs=None ( #757 )
logprobs=0 still outputs logprobs for one generation
Signed-off-by: Aaron Pham <29749331+aarnphm@users.noreply.github.com >
2023-12-07 03:36:39 -05:00
Aaron
9a7e0cecf0
fix(types): make sure mypy is running strict
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com >
2023-11-30 09:42:24 -05:00
Aaron
9d1b16395e
infra: remove redundant mypy config
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com >
2023-11-30 09:33:52 -05:00
yansheng
3cb7f14fc1
feat(models): Support qwen ( #742 )
* support qwen
* support qwen
* ci: auto fixes from pre-commit.ci
For more information, see https://pre-commit.ci
* Update openllm-core/src/openllm_core/config/configuration_qwen.py
Signed-off-by: Aaron Pham <29749331+aarnphm@users.noreply.github.com >
* chore: update correct readme and supports qwen models
Signed-off-by: Aaron Pham <29749331+aarnphm@users.noreply.github.com >
---------
Signed-off-by: Aaron Pham <29749331+aarnphm@users.noreply.github.com >
Co-authored-by: root <yansheng105@gmail.com >
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: Aaron Pham <29749331+aarnphm@users.noreply.github.com >
2023-11-30 06:54:17 -05:00
Aaron Pham
d04309188b
chore(style): 2.7k
Signed-off-by: Aaron Pham <29749331+aarnphm@users.noreply.github.com >
2023-11-28 07:04:27 +00:00
Aaron
96318b65ee
fix(sdk): remove broken sdk
codebase now around 2.8k lines
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com >
2023-11-26 04:53:36 -05:00
Aaron
69aae34cf4
fix(style): reduce boilerplate and format to custom logics
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com >
2023-11-26 01:44:59 -05:00
MingLiangDai
7b8d9024c4
fix(baichuan): support only baichuan 2 from now on ( #728 )
* config support multiple architectures
* chore: only support baichuan2 from now on
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com >
* chore: update notes
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com >
* chore: run script [skip ci]
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com >
---------
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com >
Co-authored-by: Aaron <29749331+aarnphm@users.noreply.github.com >
2023-11-24 02:07:06 -05:00
Aaron Pham
aab173cd99
refactor: focus ( #730 )
* perf: remove based images
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com >
* chore: update changelog
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com >
* chore: move Dockerfile to run on release only
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com >
* chore: cleanup unused types
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com >
---------
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com >
2023-11-24 01:11:31 -05:00
Aaron Pham
52a44b1bfa
chore: cleanup loader ( #729 )
Signed-off-by: Aaron Pham <29749331+aarnphm@users.noreply.github.com >
2023-11-22 21:51:51 -05:00
Aaron Pham
5442d9cd10
fix(trust_remote_code): handle args correctly ( #727 )
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com >
2023-11-22 17:03:13 -05:00
Aaron Pham
b28b5269b5
feat(openai): chat templates and complete control of prompt generation ( #725 )
* feat(openai): chat templates and complete control of prompt generation
Signed-off-by: Aaron Pham <29749331+aarnphm@users.noreply.github.com >
* fix: correctly use base chat templates
Signed-off-by: Aaron Pham <29749331+aarnphm@users.noreply.github.com >
* fix: remove symlink
Signed-off-by: Aaron Pham <29749331+aarnphm@users.noreply.github.com >
---------
Signed-off-by: Aaron Pham <29749331+aarnphm@users.noreply.github.com >
2023-11-22 06:49:14 -05:00
Aaron Pham
7aa0918a6f
fix(client): correct schemas parser from correct response output ( #724 )
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com >
2023-11-22 05:01:35 -05:00
Aaron Pham
63d86faa32
fix(openai): correct stop tokens and finish_reason state ( #722 )
Signed-off-by: Aaron Pham <29749331+aarnphm@users.noreply.github.com >
2023-11-22 04:21:13 -05:00
Aaron Pham
38b7c44df0
fix(base-image): update base image to include cuda for now ( #720 )
* fix(base-image): update base image to include cuda for now
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com >
* fix: build core and client on release images
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com >
* chore: cleanup style changes
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com >
---------
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com >
2023-11-22 01:15:19 -05:00
Aaron Pham
8bb2742a9a
chore(types): append additional types change ( #719 )
* chore(types): append additional types change
Signed-off-by: Aaron Pham <29749331+aarnphm@users.noreply.github.com >
* chore: add arguments for parsing dir
Signed-off-by: Aaron Pham <29749331+aarnphm@users.noreply.github.com >
---------
Signed-off-by: Aaron Pham <29749331+aarnphm@users.noreply.github.com >
2023-11-21 22:38:20 -05:00
Aaron Pham
77bd6f090a
chore(logger): fix warnings and streamline style ( #717 )
Sorry, but there is too much wasted spacing in `_llm.py`, and I'm unhappy and unproductive any time I look at it or want to do anything with it
---------
Signed-off-by: Aaron Pham <29749331+aarnphm@users.noreply.github.com >
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com >
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
2023-11-21 18:54:51 -05:00
Aaron Pham
c33b071ee4
refactor: delete unused code ( #716 )
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com >
2023-11-21 04:39:48 -05:00
Aaron Pham
a8a9f154ce
fix(ci): tests ( #715 )
* fix: tests
Signed-off-by: Aaron Pham <29749331+aarnphm@users.noreply.github.com >
* chore: remove broken tests
Signed-off-by: Aaron Pham <29749331+aarnphm@users.noreply.github.com >
---------
Signed-off-by: Aaron Pham <29749331+aarnphm@users.noreply.github.com >
2023-11-21 03:05:22 -05:00
Aaron Pham
fde78a2c78
chore: cleanup unused prompt templates ( #713 )
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com >
2023-11-21 01:56:51 -05:00
Aaron Pham
ad4f388c98
refactor: update runner helpers and add max_model_len ( #712 )
* chore(runner): cleanup unnecessary checks for runnable backend
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com >
* chore: saving llm reference to runner
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com >
* chore: correct inject item
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com >
* chore: update support for max_seq_len
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com >
* fix: correct max_model_len
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com >
* chore: update and warning backward compatibility
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com >
* chore: remove unused sets
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com >
---------
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com >
2023-11-20 20:37:15 -05:00
Aaron Pham
816c1ee80e
feat(engine): CTranslate2 ( #698 )
* chore: update instruction for dependencies
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com >
* feat(experimental): CTranslate2
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com >
---------
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com >
2023-11-19 10:25:08 -05:00
Aaron Pham
206521e02d
feat(ctranslate): initial infrastructure support ( #694 )
* perf: compact and improve speed and agility
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com >
* --wip--
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com >
* chore: cleanup infrastructure
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com >
* chore: update styles notes and autogen mypy configuration
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com >
---------
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com >
2023-11-19 01:48:33 -05:00
Aaron Pham
099cc22a94
chore: update documentation ( #693 )
* chore: update documentation
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com >
* chore: update readme
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com >
* chore: update documentations for configuration
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com >
---------
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com >
2023-11-18 19:44:52 -05:00
Aaron Pham
1831d8f129
feat: heuristics logprobs ( #692 )
* fix(encoder): bring back T5 support on PyTorch
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com >
* feat: support logprobs and prompt_logprobs
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com >
* docs: update changelog
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com >
---------
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com >
2023-11-18 19:26:20 -05:00
Aaron Pham
e9a89b7a7e
fix(cattrs): strictly lock <23.2 until we upgrade validation logic ( #690 )
fix(cattrs): strictly lock <23.2 until we move converter to upper version
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com >
2023-11-17 17:11:15 -05:00
Aaron Pham
0891cde0b6
fix(dependencies): ignore broken cattrs release ( #689 )
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com >
2023-11-17 16:52:58 -05:00
Aaron Pham
80ed400646
fix(build): lock lower version based on each release and update infra ( #686 )
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com >
2023-11-17 15:57:31 -05:00
Aaron Pham
21a308538e
fix: correct set item for attrs >23.1 ( #678 )
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com >
2023-11-17 09:16:52 -05:00
Aaron Pham
1a38de9b1f
fix(docs): chatglm support on vLLM ( #673 )
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com >
2023-11-16 17:54:06 -05:00
Aaron Pham
c850d76ccd
feat(models): Phi 1.5 ( #672 )
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com >
2023-11-16 17:48:10 -05:00
Aaron Pham
4a6f13ddd2
feat(type): provide structured annotations stubs ( #663 )
* feat(type): provide client stubs
separation of concerns for a more concise code base
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com >
* docs: update changelog
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com >
---------
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com >
2023-11-16 02:58:45 -05:00
Aaron Pham
876586a30e
fix(falcon): remove early_stopping default arguments ( #660 )
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com >
2023-11-15 02:49:54 -05:00
Aaron Pham
034e08cf08
infra: update scripts to run update readme automatically ( #658 )
* infra: update scripts to run update readme automatically
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com >
* chore: cleanup mirror
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com >
* chore(dropdown): correctly format noteblock and important block
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com >
* fix: whitespace aware
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com >
---------
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com >
2023-11-15 02:22:49 -05:00