Commit Graph

285 Commits

Author SHA1 Message Date
Aaron Pham
f0ab6d44fa fix: make sure to include new implementation in bundle build
Signed-off-by: Aaron Pham <29749331+aarnphm@users.noreply.github.com>
2024-03-20 22:11:53 +00:00
Aaron
5c8c30a70b fix: uses --pre for alpha releases for now
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>
2024-03-20 13:38:10 -04:00
Aaron
2ddbe4eb22 fix(service): remove mounting ASGI app
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>
2024-03-20 11:51:09 -04:00
Aaron Pham
824ff68818 chore: update local script and update service
Signed-off-by: Aaron Pham <29749331+aarnphm@users.noreply.github.com>
2024-03-15 20:29:49 +00:00
Aaron
727361ced7 chore: running updated ruff formatter [skip ci]
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>
2024-03-15 05:35:24 -04:00
Aaron
c34db550a6 fix(build): explicit set to use alpha version
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>
2024-03-15 05:33:18 -04:00
Aaron
0274fb4c11 fix: don't lock openllm to support alpha release
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>
2024-03-15 05:29:35 -04:00
Aaron Pham
58c741c5aa infra: prepare for release 0.5.0-alpha [generated] [skip ci]
Signed-off-by: Aaron Pham <29749331+aarnphm@users.noreply.github.com>
2024-03-15 08:46:18 +00:00
Aaron Pham
072b3e97ec feat: 1.2 APIs (#821)
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>
Signed-off-by: Aaron Pham <29749331+aarnphm@users.noreply.github.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
2024-03-15 03:49:19 -04:00
Aaron
e3392476be revert: "ci: pre-commit autoupdate [pre-commit.ci] (#931)"
This reverts commit 7b00c84c2a.
2024-03-15 03:47:23 -04:00
pre-commit-ci[bot]
7b00c84c2a ci: pre-commit autoupdate [pre-commit.ci] (#931)
* ci: pre-commit autoupdate [pre-commit.ci]

updates:
- [github.com/astral-sh/ruff-pre-commit: v0.2.2 → v0.3.2](https://github.com/astral-sh/ruff-pre-commit/compare/v0.2.2...v0.3.2)
- [github.com/pre-commit/mirrors-eslint: v9.0.0-beta.0 → v9.0.0-beta.2](https://github.com/pre-commit/mirrors-eslint/compare/v9.0.0-beta.0...v9.0.0-beta.2)

* ci: auto fixes from pre-commit.ci

For more information, see https://pre-commit.ci

---------

Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
2024-03-15 03:46:28 -04:00
Aaron Pham
1b54d64eb0 infra: prepare for release 0.4.44 [generated] [skip ci]
Signed-off-by: Aaron Pham <29749331+aarnphm@users.noreply.github.com>
2024-02-06 03:07:09 +00:00
Zhao Shenyang
3299f463a6 fix: remove vllm dependency for pytorch bento (#893)
2024-02-05 18:36:14 -05:00
Aaron Pham
fe44c843ec infra: prepare for release 0.4.43 [generated] [skip ci]
Signed-off-by: Aaron Pham <29749331+aarnphm@users.noreply.github.com>
2024-02-05 10:48:05 +00:00
Zhao Shenyang
16d8caf2ee chore: bump up bentoml version to 1.1.11 (#883)
2024-02-04 21:31:14 +08:00
Zhao Shenyang
9d0e292076 fix: limit BentoML version range (#881)
2024-02-04 16:59:21 +08:00
Aaron Pham
d1583cc1bb infra: prepare for release 0.4.42 [generated] [skip ci]
Signed-off-by: Aaron Pham <29749331+aarnphm@users.noreply.github.com>
2024-02-02 12:21:09 +00:00
Zhao Shenyang
9f9195f74b fix: all runners sse output (#880)
2024-02-02 20:08:31 +08:00
Zhao Shenyang
6c909aabdb chore: set stop to empty list by default (#878)
2024-02-02 19:28:49 +08:00
Zhao Shenyang
aff5dc8ff2 fix: proper SSE handling for vllm (#877)
fix: proper SSE handling
2024-02-02 17:25:58 +08:00
Fazli Sapuan
1c0ff115a4 docs: update README.md telemetry code link (#842)
2024-01-15 11:41:49 -05:00
Fazli Sapuan
6b3a1bd708 chore: fix typo in list_models pydoc (#847)
2024-01-15 08:26:48 -05:00
Zhao Shenyang
8baaf122ae improv(package): use python slim base image and let pytorch install cuda (#807)
2024-01-11 23:23:03 -05:00
Aaron Pham
2bb97f8ba2 chore: update discord link (#838)
* Update pyproject.toml

Signed-off-by: Aaron Pham <29749331+aarnphm@users.noreply.github.com>

* Update pyproject.toml

Signed-off-by: Aaron Pham <29749331+aarnphm@users.noreply.github.com>

* Update pyproject.toml

Signed-off-by: Aaron Pham <29749331+aarnphm@users.noreply.github.com>

* Update pyproject.toml

Signed-off-by: Aaron Pham <29749331+aarnphm@users.noreply.github.com>

---------

Signed-off-by: Aaron Pham <29749331+aarnphm@users.noreply.github.com>
2024-01-08 19:09:51 -05:00
Aaron Pham
79da419d87 chore(deps): bump vllm to 0.2.7 (#837)
* chore(deps): bump vllm to 0.2.7

Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>

* chore: update changelog

Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>

---------

Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>
2024-01-08 14:41:58 -05:00
Aaron Pham
7e0c9180fe chore(script): run vendored scripts (#808)
Signed-off-by: Aaron Pham <29749331+aarnphm@users.noreply.github.com>
2023-12-22 10:46:15 -05:00
Aaron Pham
b09bd20750 infra: prepare for release 0.4.41 [generated] [skip ci]
Signed-off-by: Aaron Pham <29749331+aarnphm@users.noreply.github.com>
2023-12-18 18:08:46 +00:00
Aaron Pham
8d63afc9ce feat(vllm): support GPTQ with 0.2.6 (#797)
* feat(vllm): GPTQ support passthrough

Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>

* chore: run scripts

Signed-off-by: Aaron Pham <29749331+aarnphm@users.noreply.github.com>

* fix(install): set order of xformers before vllm

Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>

* feat: support GPTQ with vLLM

Signed-off-by: Aaron Pham <29749331+aarnphm@users.noreply.github.com>

---------

Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>
Signed-off-by: Aaron Pham <29749331+aarnphm@users.noreply.github.com>
2023-12-18 12:41:19 -05:00
Aaron Pham
5d27337e82 fix(cli): avoid runtime __origin__ check for older Python (#798)
fix(cli): avoid runtime __origin__ on older Python

Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>
2023-12-18 12:33:36 -05:00
Aaron Pham
2e8fc284f5 infra: prepare for release 0.4.40 [generated] [skip ci]
Signed-off-by: Aaron Pham <29749331+aarnphm@users.noreply.github.com>
2023-12-15 16:46:12 +00:00
Aaron Pham
88b6d3d6de perf: upgrade mixtral to use expert parallelism (#783)
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>
2023-12-15 11:45:08 -05:00
Aaron Pham
c8c9663d06 fix(infra): conform ruff to 150 LL (#781)
Generally correctly format it with ruff format and manual style

Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>
2023-12-14 17:27:32 -05:00
Aaron Pham
d4fbbcee34 infra: prepare for release 0.4.39 [generated] [skip ci]
Signed-off-by: Aaron Pham <29749331+aarnphm@users.noreply.github.com>
2023-12-14 19:20:01 +00:00
Aaron Pham
44383528b5 fix(logprobs): correct check logprobs (#779)
* fix(logprobs): correct check logprobs

Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>

* chore: update changlog

Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>

---------

Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>
2023-12-14 14:19:01 -05:00
Aaron Pham
1dbae67172 infra: prepare for release 0.4.38 [generated] [skip ci]
Signed-off-by: Aaron Pham <29749331+aarnphm@users.noreply.github.com>
2023-12-13 23:27:41 +00:00
Aaron Pham
0d83cefcb6 fix(mixtral): setup hack atm to load weights from pt specifically instead of safetensors (#776)
fix(mixtral): setup hack atm to load weights from pt specifically
instead of safetensors

Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>
2023-12-13 18:18:51 -05:00
Aaron Pham
2dbcfa8a0c fix(cli): correct set arguments for openllm import and openllm build (#775)
* fix(cli): correct set arguments for `openllm import` and `openllm build`

Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>

* chore: update changelog

Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>

---------

Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>
2023-12-13 15:52:59 -05:00
Aaron Pham
8d9d212d61 infra: prepare for release 0.4.37 [generated] [skip ci]
Signed-off-by: Aaron Pham <29749331+aarnphm@users.noreply.github.com>
2023-12-13 14:07:33 +00:00
Aaron Pham
3ab78cd105 feat(mixtral): correct support for mixtral (#772)
feat(mixtral): support inference with pt

Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>
2023-12-13 09:03:56 -05:00
Aaron Pham
9cd1e44b1e infra: prepare for release 0.4.36 [generated] [skip ci]
Signed-off-by: Aaron Pham <29749331+aarnphm@users.noreply.github.com>
2023-12-12 06:34:39 +00:00
Aaron Pham
d3328343d7 feat: mixtral support (#770)
Signed-off-by: Aaron Pham <29749331+aarnphm@users.noreply.github.com>
2023-12-12 01:33:13 -05:00
Aaron
59e8ef93dc chore(deps): lock vLLM to 0.2.4
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>
2023-12-12 00:17:18 -05:00
Aaron Pham
08114410bc fix(openai): logprobs when echo is enabled (#761)
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>
2023-12-10 18:09:25 -05:00
Aaron Pham
c3a0b5c39f feat(openai): supports echo (#760)
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>
2023-12-10 13:19:40 -05:00
Aaron
bb4ed8b53c fix(llm): correct annotations definitions
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>
2023-12-09 09:59:02 -05:00
Aaron Pham
8019fd84c8 infra: prepare for release 0.4.35 [generated] [skip ci]
Signed-off-by: Aaron Pham <29749331+aarnphm@users.noreply.github.com>
2023-12-07 08:38:13 +00:00
Aaron
9a7e0cecf0 fix(types): makes sures mypy is running strict
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>
2023-11-30 09:42:24 -05:00
Aaron
55a0b2f825 fix(style): setup correct block format
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>
2023-11-30 07:58:35 -05:00
Aaron
b53559de6f fix(setter): correct item with the same kwargs with stubs
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>
2023-11-30 07:36:34 -05:00
Aaron Pham
81688e0949 infra: prepare for release 0.4.34 [generated] [skip ci]
Signed-off-by: Aaron Pham <29749331+aarnphm@users.noreply.github.com>
2023-11-30 12:17:48 +00:00