paperspace
|
6726f6ae3e
|
fix: make sure to add cpu to number
Signed-off-by: paperspace <29749331+aarnphm@users.noreply.github.com>
|
2024-05-09 00:06:10 +00:00 |
|
Aaron Pham
|
d02f267fc7
|
infra: prepare for release 0.5.0-alpha.5 [generated] [skip ci]
Signed-off-by: Aaron Pham <29749331+aarnphm@users.noreply.github.com>
|
2024-05-08 23:37:42 +00:00 |
|
Aaron Pham
|
0a1bcacbc4
|
infra: prepare for release 0.5.0-alpha.4 [generated] [skip ci]
Signed-off-by: Aaron Pham <29749331+aarnphm@users.noreply.github.com>
|
2024-05-08 23:22:33 +00:00 |
|
paperspace
|
526a770a06
|
chore: update base requirements to 0.4.2
Signed-off-by: paperspace <29749331+aarnphm@users.noreply.github.com>
|
2024-05-08 18:46:13 +00:00 |
|
Aaron Pham
|
42417dbdbf
|
fix: make sure to respect additional parameters parse (#981)
|
2024-05-08 13:53:56 -04:00 |
|
Aaron Pham
|
43b635fbfd
|
fix: update correct CompletionOutput object (#973)
* fix: update correct CompletionOutput object
Signed-off-by: paperspace <29749331+aarnphm@users.noreply.github.com>
* fix: revert to correct version
Signed-off-by: paperspace <29749331+aarnphm@users.noreply.github.com>
---------
Signed-off-by: paperspace <29749331+aarnphm@users.noreply.github.com>
|
2024-04-30 15:06:46 -04:00 |
|
Aaron
|
66de54eae7
|
chore: update default params as pydantic fields
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>
|
2024-04-03 15:50:37 -04:00 |
|
Aaron Pham
|
135503017d
|
infra: prepare for release 0.5.0-alpha.3 [generated] [skip ci]
Signed-off-by: Aaron Pham <29749331+aarnphm@users.noreply.github.com>
|
2024-04-02 04:36:56 +00:00 |
|
Aaron Pham
|
32f4dff83b
|
fix: explicitly pass only non-null value
Signed-off-by: Aaron Pham <29749331+aarnphm@users.noreply.github.com>
|
2024-04-02 04:35:47 +00:00 |
|
Aaron Pham
|
1d817a7e01
|
fix: add support for min_tokens
Signed-off-by: Aaron Pham <29749331+aarnphm@users.noreply.github.com>
|
2024-04-02 04:22:14 +00:00 |
|
Aaron Pham
|
5c0d2787c0
|
feat: add dbrx support
Signed-off-by: Aaron Pham <29749331+aarnphm@users.noreply.github.com>
|
2024-04-02 04:10:19 +00:00 |
|
Aaron Pham
|
e9e6434012
|
infra: prepare for release 0.5.0-alpha.2 [generated] [skip ci]
Signed-off-by: Aaron Pham <29749331+aarnphm@users.noreply.github.com>
|
2024-04-02 03:49:09 +00:00 |
|
Aaron Pham
|
4661838964
|
chore: move out the template to separate files
Signed-off-by: Aaron Pham <29749331+aarnphm@users.noreply.github.com>
|
2024-04-02 03:24:26 +00:00 |
|
Aaron Pham
|
67ab9b5762
|
fix: swagger showing for subpath
Signed-off-by: Aaron Pham <29749331+aarnphm@users.noreply.github.com>
|
2024-03-22 02:24:22 +00:00 |
|
Aaron Pham
|
3ef93fe371
|
chore: update support development_mode as DEBUG and support for RELOAD envvar
Signed-off-by: Aaron Pham <29749331+aarnphm@users.noreply.github.com>
|
2024-03-22 01:19:32 +00:00 |
|
Aaron Pham
|
80b35f0d72
|
revert: correct type for openapi schema generation
Signed-off-by: Aaron Pham <29749331+aarnphm@users.noreply.github.com>
|
2024-03-21 07:51:00 +00:00 |
|
Aaron Pham
|
51bec78ee9
|
fix(load): make sure to respect MAX_MODEL_LEN from env
Signed-off-by: Aaron Pham <29749331+aarnphm@users.noreply.github.com>
|
2024-03-21 07:44:49 +00:00 |
|
Aaron
|
295a3b1061
|
chore(codegen): update generated var to read from envvar
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>
|
2024-03-20 21:51:39 -04:00 |
|
Aaron Pham
|
12ac99867f
|
infra: prepare for release 0.5.0-alpha.1 [generated] [skip ci]
Signed-off-by: Aaron Pham <29749331+aarnphm@users.noreply.github.com>
|
2024-03-21 01:37:56 +00:00 |
|
Aaron Pham
|
f0ab6d44fa
|
fix: make sure to include new implementation in bundle build
Signed-off-by: Aaron Pham <29749331+aarnphm@users.noreply.github.com>
|
2024-03-20 22:11:53 +00:00 |
|
Aaron
|
5c8c30a70b
|
fix: uses --pre for alpha releases for now
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>
|
2024-03-20 13:38:10 -04:00 |
|
Aaron
|
2ddbe4eb22
|
fix(service): remove mounting ASGI app
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>
|
2024-03-20 11:51:09 -04:00 |
|
Aaron Pham
|
824ff68818
|
chore: update local script and update service
Signed-off-by: Aaron Pham <29749331+aarnphm@users.noreply.github.com>
|
2024-03-15 20:29:49 +00:00 |
|
Aaron
|
727361ced7
|
chore: running updated ruff formatter [skip ci]
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>
|
2024-03-15 05:35:24 -04:00 |
|
Aaron
|
c34db550a6
|
fix(build): explicit set to use alpha version
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>
|
2024-03-15 05:33:18 -04:00 |
|
Aaron
|
0274fb4c11
|
fix: don't lock openllm to support alpha release
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>
|
2024-03-15 05:29:35 -04:00 |
|
Aaron Pham
|
58c741c5aa
|
infra: prepare for release 0.5.0-alpha [generated] [skip ci]
Signed-off-by: Aaron Pham <29749331+aarnphm@users.noreply.github.com>
|
2024-03-15 08:46:18 +00:00 |
|
Aaron Pham
|
072b3e97ec
|
feat: 1.2 APIs (#821)
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>
Signed-off-by: Aaron Pham <29749331+aarnphm@users.noreply.github.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
|
2024-03-15 03:49:19 -04:00 |
|
Aaron
|
e3392476be
|
revert: "ci: pre-commit autoupdate [pre-commit.ci] (#931)"
This reverts commit 7b00c84c2a.
|
2024-03-15 03:47:23 -04:00 |
|
pre-commit-ci[bot]
|
7b00c84c2a
|
ci: pre-commit autoupdate [pre-commit.ci] (#931)
* ci: pre-commit autoupdate [pre-commit.ci]
updates:
- [github.com/astral-sh/ruff-pre-commit: v0.2.2 → v0.3.2](https://github.com/astral-sh/ruff-pre-commit/compare/v0.2.2...v0.3.2)
- [github.com/pre-commit/mirrors-eslint: v9.0.0-beta.0 → v9.0.0-beta.2](https://github.com/pre-commit/mirrors-eslint/compare/v9.0.0-beta.0...v9.0.0-beta.2)
* ci: auto fixes from pre-commit.ci
For more information, see https://pre-commit.ci
---------
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
|
2024-03-15 03:46:28 -04:00 |
|
Aaron Pham
|
1b54d64eb0
|
infra: prepare for release 0.4.44 [generated] [skip ci]
Signed-off-by: Aaron Pham <29749331+aarnphm@users.noreply.github.com>
|
2024-02-06 03:07:09 +00:00 |
|
Zhao Shenyang
|
3299f463a6
|
fix: remove vllm dependency for pytorch bento (#893)
|
2024-02-05 18:36:14 -05:00 |
|
Aaron Pham
|
fe44c843ec
|
infra: prepare for release 0.4.43 [generated] [skip ci]
Signed-off-by: Aaron Pham <29749331+aarnphm@users.noreply.github.com>
|
2024-02-05 10:48:05 +00:00 |
|
Zhao Shenyang
|
16d8caf2ee
|
chore: bump up bentoml version to 1.1.11 (#883)
|
2024-02-04 21:31:14 +08:00 |
|
Zhao Shenyang
|
9d0e292076
|
fix: limit BentoML version range (#881)
|
2024-02-04 16:59:21 +08:00 |
|
Aaron Pham
|
d1583cc1bb
|
infra: prepare for release 0.4.42 [generated] [skip ci]
Signed-off-by: Aaron Pham <29749331+aarnphm@users.noreply.github.com>
|
2024-02-02 12:21:09 +00:00 |
|
Zhao Shenyang
|
9f9195f74b
|
fix: all runners sse output (#880)
|
2024-02-02 20:08:31 +08:00 |
|
Zhao Shenyang
|
6c909aabdb
|
chore: set stop to empty list by default (#878)
|
2024-02-02 19:28:49 +08:00 |
|
Zhao Shenyang
|
aff5dc8ff2
|
fix: proper SSE handling for vllm (#877)
fix: proper SSE handling
|
2024-02-02 17:25:58 +08:00 |
|
Fazli Sapuan
|
1c0ff115a4
|
docs: update README.md telemetry code link (#842)
|
2024-01-15 11:41:49 -05:00 |
|
Fazli Sapuan
|
6b3a1bd708
|
chore: fix typo in list_models pydoc (#847)
|
2024-01-15 08:26:48 -05:00 |
|
Zhao Shenyang
|
8baaf122ae
|
improv(package): use python slim base image and let pytorch install cuda (#807)
|
2024-01-11 23:23:03 -05:00 |
|
Aaron Pham
|
2bb97f8ba2
|
chore: update discord link (#838)
* Update pyproject.toml
Signed-off-by: Aaron Pham <29749331+aarnphm@users.noreply.github.com>
* Update pyproject.toml
Signed-off-by: Aaron Pham <29749331+aarnphm@users.noreply.github.com>
* Update pyproject.toml
Signed-off-by: Aaron Pham <29749331+aarnphm@users.noreply.github.com>
* Update pyproject.toml
Signed-off-by: Aaron Pham <29749331+aarnphm@users.noreply.github.com>
---------
Signed-off-by: Aaron Pham <29749331+aarnphm@users.noreply.github.com>
|
2024-01-08 19:09:51 -05:00 |
|
Aaron Pham
|
79da419d87
|
chore(deps): bump vllm to 0.2.7 (#837)
* chore(deps): bump vllm to 0.2.7
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>
* chore: update changelog
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>
---------
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>
|
2024-01-08 14:41:58 -05:00 |
|
Aaron Pham
|
7e0c9180fe
|
chore(script): run vendored scripts (#808)
Signed-off-by: Aaron Pham <29749331+aarnphm@users.noreply.github.com>
|
2023-12-22 10:46:15 -05:00 |
|
Aaron Pham
|
b09bd20750
|
infra: prepare for release 0.4.41 [generated] [skip ci]
Signed-off-by: Aaron Pham <29749331+aarnphm@users.noreply.github.com>
|
2023-12-18 18:08:46 +00:00 |
|
Aaron Pham
|
8d63afc9ce
|
feat(vllm): support GPTQ with 0.2.6 (#797)
* feat(vllm): GPTQ support passthrough
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>
* chore: run scripts
Signed-off-by: Aaron Pham <29749331+aarnphm@users.noreply.github.com>
* fix(install): set order of xformers before vllm
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>
* feat: support GPTQ with vLLM
Signed-off-by: Aaron Pham <29749331+aarnphm@users.noreply.github.com>
---------
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>
Signed-off-by: Aaron Pham <29749331+aarnphm@users.noreply.github.com>
|
2023-12-18 12:41:19 -05:00 |
|
Aaron Pham
|
5d27337e82
|
fix(cli): avoid runtime __origin__ check for older Python (#798)
fix(cli): avoid runtime __origin__ on older Python
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>
|
2023-12-18 12:33:36 -05:00 |
|
Aaron Pham
|
2e8fc284f5
|
infra: prepare for release 0.4.40 [generated] [skip ci]
Signed-off-by: Aaron Pham <29749331+aarnphm@users.noreply.github.com>
|
2023-12-15 16:46:12 +00:00 |
|
Aaron Pham
|
88b6d3d6de
|
perf: upgrade mixtral to use expert parallelism (#783)
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>
|
2023-12-15 11:45:08 -05:00 |
|