Aaron Pham
e667dac82f
chore(cli): always show available models ( #621 )
...
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com >
2023-11-12 18:36:19 -05:00
Aaron Pham
c50a7db80d
fix(cli): correct set working_dir ( #620 )
...
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com >
2023-11-12 18:34:11 -05:00
Aaron Pham
e0632a85ed
refactor(cli): move out to its own packages ( #619 )
...
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com >
2023-11-12 18:25:44 -05:00
Aaron Pham
c3416c0afd
feat(llm): update warning envvar and add embedded mode ( #618 )
...
* chore: unify warning envvar and update type inference
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com >
* chore: update documentation about embedded
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com >
---------
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com >
2023-11-12 17:39:06 -05:00
Aaron Pham
7e1fb35a71
chore(llm): expose quantise and lazy load heavy imports ( #617 )
...
* chore(llm): expose quantise and lazy load heavy imports
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com >
* chore: move transformers to TYPE_CHECKING block
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com >
---------
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com >
2023-11-12 14:55:37 -05:00
Aaron Pham
fad4186dbc
feat(server): helpers endpoints for conversation format ( #613 )
...
* feat: add support for helpers conversation conversion endpoint
also correct schema generation for openllm client
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com >
* chore: update clients to reuse `openllm-core` logics
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com >
* chore: add changelog
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com >
---------
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com >
2023-11-12 01:02:27 -05:00
Aaron Pham
7438005c04
refactor(config): simplify configuration and update start CLI output ( #611 )
...
* chore(config): simplify configuration and update start CLI output
handling
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com >
* chore: remove state and message sent after server lifecycle
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com >
* chore: update color stream and refactor reusable logic
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com >
* chore: update documentations and mypy
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com >
---------
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com >
2023-11-11 22:36:10 -05:00
Aaron Pham
c41828f68f
feat(client): support authentication token and shim implementation ( #605 )
...
* chore: sync generate_iterator to be the same as server
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com >
* --wip--
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com >
* wip
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com >
* feat: cleanup shim implementation
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com >
* ci: auto fixes from pre-commit.ci
For more information, see https://pre-commit.ci
* chore: fix pre-commit
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com >
* chore: update changelog
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com >
* chore: update check with tuple
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com >
---------
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com >
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
2023-11-10 17:44:31 -05:00
Aaron Pham
f89bec261c
fix: correct importmodules locally ( #601 )
...
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com >
2023-11-10 03:32:12 -05:00
Aaron Pham
fa2038f4e2
fix: loading correct local models ( #599 )
...
* fix(model): loading local correctly
Signed-off-by: Aaron Pham <29749331+aarnphm@users.noreply.github.com >
* chore: update repr and correct bentomodel processor
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com >
* ci: auto fixes from pre-commit.ci
For more information, see https://pre-commit.ci
* chore: cleanup transformers implementation
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com >
* fix: ruff to ignore I001 on all stubs
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com >
---------
Signed-off-by: Aaron Pham <29749331+aarnphm@users.noreply.github.com >
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com >
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
2023-11-10 02:36:12 -05:00
Aaron Pham
5e45245457
package: add openllm core dependencies to labels ( #600 )
...
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com >
2023-11-10 02:33:55 -05:00
Aaron Pham
665a41940e
revert: configuration not to dump flatten ( #597 )
...
Signed-off-by: Aaron Pham <29749331+aarnphm@users.noreply.github.com >
2023-11-09 14:48:23 -05:00
Aaron Pham
d60f2fb909
infra: remove tsconfig ( #595 )
...
* infra: remove tsconfig
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com >
* ci: auto fixes from pre-commit.ci
For more information, see https://pre-commit.ci
* chore: filter only ec python and jsx
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com >
* chore: update pnpm lock
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com >
* chore: run vendor
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com >
* chore: ignore blame
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com >
* chore: ignore on CI
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com >
---------
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com >
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
2023-11-09 13:06:31 -05:00
Aaron Pham
ac377fe490
infra: using ruff formatter ( #594 )
...
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com >
2023-11-09 12:44:05 -05:00
Aaron Pham
b8a2e8cf91
refactor(cli): cleanup API ( #592 )
...
* chore: remove unused imports
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com >
* refactor(cli): update to only need model_id
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com >
* feat: `openllm start model-id`
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com >
* chore: add changelog
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com >
* chore: update changelog notice
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com >
* chore: update correct config and running tools
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com >
* chore: update backward compat options and treat JSON outputs
correspondingly
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com >
---------
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com >
2023-11-09 11:40:17 -05:00
Aaron Pham
e87830ef0a
container: update tracing dependencies ( #591 )
...
* chore: update build message
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com >
* chore: add tracing dependencies to container
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com >
---------
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com >
2023-11-08 08:08:40 -05:00
Aaron Pham
0ea025da5a
fix(cli): append model-id instruction to build ( #590 )
...
Signed-off-by: Aaron Pham <29749331+aarnphm@users.noreply.github.com >
2023-11-08 07:44:36 -05:00
Aaron Pham
47107727b3
feat(vllm): squeezellm ( #588 )
...
* feat(vllm): squeezellm
Signed-off-by: Aaron Pham <29749331+aarnphm@users.noreply.github.com >
* fix: correct import_model with awq and gatekeep squeezellm for PyTorch
Signed-off-by: Aaron Pham <29749331+aarnphm@users.noreply.github.com >
---------
Signed-off-by: Aaron Pham <29749331+aarnphm@users.noreply.github.com >
2023-11-08 07:21:27 -05:00
Aaron Pham
ff8b6377c8
fix(awq): correct awq detection for support ( #586 )
...
* fix(awq): correct detection for awq
Signed-off-by: Aaron Pham <29749331+aarnphm@users.noreply.github.com >
* chore: update base docker to work
Signed-off-by: Aaron Pham <29749331+aarnphm@users.noreply.github.com >
* chore: disable awq on pytorch for now
Signed-off-by: Aaron Pham <29749331+aarnphm@users.noreply.github.com >
* ci: auto fixes from pre-commit.ci
For more information, see https://pre-commit.ci
---------
Signed-off-by: Aaron Pham <29749331+aarnphm@users.noreply.github.com >
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
2023-11-08 06:57:11 -05:00
Aaron Pham
387637405d
fix(gptq): update config fields ( #585 )
...
Signed-off-by: Aaron Pham <29749331+aarnphm@users.noreply.github.com >
2023-11-08 05:20:30 -05:00
Aaron Pham
85a7243ac3
fix: device imports using strategies ( #584 )
...
* fix: device imports using strategies
Signed-off-by: Aaron Pham <29749331+aarnphm@users.noreply.github.com >
* chore: support trust_remote_code for vLLM runners
Signed-off-by: Aaron Pham <29749331+aarnphm@users.noreply.github.com >
---------
Signed-off-by: Aaron Pham <29749331+aarnphm@users.noreply.github.com >
2023-11-08 05:10:50 -05:00
Aaron Pham
ea42108e45
chore(service): cleanup API ( #579 )
...
* chore(service): cleanup API
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com >
* chore: running tools
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com >
* fix: tests import
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com >
---------
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com >
2023-11-08 02:53:08 -05:00
Aaron Pham
7398ae0486
refactor(strategies): move logics into openllm-python ( #578 )
...
fix(strategies): move to openllm
Strategies shouldn't be a part of openllm-core
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com >
2023-11-08 02:23:08 -05:00
Aaron Pham
97d7c38fea
refactor: cleanup typing to expose correct API ( #576 )
...
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com >
2023-11-08 01:24:03 -05:00
Aaron Pham
cfd09bfc47
chore(runner): yield the outputs directly ( #573 )
...
update openai client examples to >1
Signed-off-by: Aaron Pham <29749331+aarnphm@users.noreply.github.com >
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
2023-11-07 22:34:11 -05:00
Aaron Pham
4d356f4b72
feat: Mistral support ( #571 )
...
* feat: Mistral support
Signed-off-by: Aaron Pham <29749331+aarnphm@users.noreply.github.com >
* ci: auto fixes from pre-commit.ci
For more information, see https://pre-commit.ci
* chore: fix style
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com >
* chore: update README docs about mistral
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com >
---------
Signed-off-by: Aaron Pham <29749331+aarnphm@users.noreply.github.com >
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com >
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
2023-11-07 17:28:02 -05:00
Aaron Pham
dc27b0e727
fix: update build dependencies and format chat prompt ( #569 )
...
chore: update correct check and format prompt
Signed-off-by: Aaron Pham <29749331+aarnphm@users.noreply.github.com >
2023-11-07 16:42:20 -05:00
Aaron Pham
8fade070f3
infra: update docs on serving fine-tuning layers ( #567 )
...
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com >
2023-11-06 21:34:44 -05:00
Aaron Pham
e2029c934b
perf: unify LLM interface ( #518 )
...
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com >
Signed-off-by: paperspace <29749331+aarnphm@users.noreply.github.com >
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
2023-11-06 20:39:43 -05:00
XunchaoZ
440e3d646f
fix: Max new tokens ( #550 )
...
Bug fix for retrieving user input max_new_tokens
2023-11-03 13:44:25 -04:00
XunchaoZ
392c7a8139
Fix chat template and message list bug ( #549 )
2023-10-30 14:28:42 -07:00
XunchaoZ
022130d0ac
fix(openai): Chat templates ( #519 )
...
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com >
Co-authored-by: Aaron <29749331+aarnphm@users.noreply.github.com >
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
2023-10-30 03:20:43 -04:00
Aaron
aedb1e4843
fix: correct classes for regression
...
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com >
2023-10-17 02:00:11 -04:00
Aaron Pham
d59a8860df
fix(build): check for parity ( #508 )
2023-10-16 17:33:47 -04:00
XunchaoZ
d9183267dc
feat: openai.Model.list() ( #499 )
...
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: Aaron Pham <29749331+aarnphm@users.noreply.github.com >
2023-10-14 16:33:49 -04:00
Aaron Pham
c1ca7ccd3b
fix(breaking): remove embeddings and update client implementation ( #500 )
2023-10-14 16:04:35 -04:00
Aaron Pham
1539c3f7dc
feat(client): simple implementation and streaming ( #256 )
2023-10-12 17:21:54 -04:00
Zhao Shenyang
bf96570eab
fix: do not rely on env var for built bento/docker ( #477 )
...
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
2023-10-10 12:29:20 -04:00
Aaron
625b82a0fc
fix(style): remove weird break on split item
...
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com >
2023-10-07 02:21:31 -04:00
XunchaoZ
04bb29a264
feat: OpenAI-compatible API ( #417 )
...
Co-authored-by: Aaron Pham <29749331+aarnphm@users.noreply.github.com >
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
2023-10-07 00:50:03 -04:00
Aaron
b43fabfff8
fix(playground): eager import jupytext
...
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com >
2023-10-04 19:24:03 -04:00
Aaron
d2a2af3ee2
fix: import nbformat for playground
...
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com >
2023-10-04 19:21:14 -04:00
MingLiangDai
a0e0f81306
feat: PromptTemplate and system prompt support ( #407 )
...
Signed-off-by: Aaron Pham <29749331+aarnphm@users.noreply.github.com >
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com >
Co-authored-by: Aaron Pham <29749331+aarnphm@users.noreply.github.com >
2023-10-03 09:53:37 -04:00
Aaron Pham
3b2ac1cd59
feat: support continuous batching on generate ( #375 )
...
* feat: support continuous batching on `generate`
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com >
* chore: add changelog
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com >
---------
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com >
2023-09-19 03:04:59 -04:00
Aaron Pham
5a1fcc9cd5
fix: set default serialisation methods ( #355 )
...
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com >
2023-09-18 02:26:53 -04:00
Aaron Pham
a32cf324d8
fix(prompt): correct export extra objects items ( #351 )
...
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com >
2023-09-14 03:42:28 -04:00
Aaron Pham
ad9107958d
feat: continuous batching with vLLM ( #349 )
...
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com >
* feat: continuous batching
Signed-off-by: paperspace <29749331+aarnphm@users.noreply.github.com >
* chore: add changelog
Signed-off-by: paperspace <29749331+aarnphm@users.noreply.github.com >
* chore: add one shot generation
Signed-off-by: paperspace <29749331+aarnphm@users.noreply.github.com >
---------
Signed-off-by: paperspace <29749331+aarnphm@users.noreply.github.com >
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com >
2023-09-14 03:09:36 -04:00
Aaron Pham
35e6945e86
fix(serialisation): vLLM safetensors support ( #324 )
...
* fix(serialisation): vllm support for safetensors
Signed-off-by: aarnphm-ec2-dev <29749331+aarnphm@users.noreply.github.com >
* chore: running tools
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com >
* chore: generalize one shot generation
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com >
* chore: add changelog [skip ci]
Signed-off-by: paperspace <29749331+aarnphm@users.noreply.github.com >
---------
Signed-off-by: aarnphm-ec2-dev <29749331+aarnphm@users.noreply.github.com >
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com >
Signed-off-by: paperspace <29749331+aarnphm@users.noreply.github.com >
2023-09-12 17:44:01 -04:00
Alan Poulain
88d7ba7ca8
fix(vllm): Make sure to use max number of GPUs available ( #326 )
...
* fix(serving): vllm bad num_gpus
Signed-off-by: Alan Poulain <contact@alanpoulain.eu >
* ci: auto fixes from pre-commit.ci
For more information, see https://pre-commit.ci
---------
Signed-off-by: Alan Poulain <contact@alanpoulain.eu >
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
2023-09-12 12:45:00 -04:00
Aaron Pham
fddd0bf95e
feat: bootstrap documentation site ( #252 )
...
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com >
Signed-off-by: Aaron Pham <29749331+aarnphm@users.noreply.github.com >
Signed-off-by: GutZuFusss <leon.ikinger@googlemail.com >
Co-authored-by: GutZuFusss <leon.ikinger@googlemail.com >
2023-09-12 12:28:29 -04:00