Aaron Pham
|
d7e99c2827
|
fix: correctly set quantise for non quantise options
Signed-off-by: Aaron Pham <contact@aarnphm.xyz>
|
2024-06-14 02:20:19 +00:00 |
|
Aaron Pham
|
7d501d6778
|
chore: export machine id and ip correspondingly
Signed-off-by: Aaron Pham <29749331+aarnphm@users.noreply.github.com>
|
2024-06-11 10:09:02 -04:00 |
|
Aaron Pham
|
c70f85992c
|
ci: reduce machine type to more available options
Signed-off-by: Aaron Pham <29749331+aarnphm@users.noreply.github.com>
|
2024-06-11 09:45:12 -04:00 |
|
Aaron Pham
|
6cdcb0d20a
|
chore(infra): add masks
Signed-off-by: Aaron Pham <29749331+aarnphm@users.noreply.github.com>
|
2024-06-11 09:32:16 -04:00 |
|
Aaron Pham
|
eaf5dafca9
|
feat(infra): add support for autogenerate CI runners
Signed-off-by: Aaron Pham <29749331+aarnphm@users.noreply.github.com>
|
2024-06-11 09:27:30 -04:00 |
|
Aaron Pham
|
4f64649fe4
|
feat: add support for tests mode on daemon.
TODO: uv tool run once it becomes stable
Signed-off-by: Aaron Pham <29749331+aarnphm@users.noreply.github.com>
|
2024-06-11 08:01:21 +00:00 |
|
Aaron Pham
|
ca306adde7
|
infra: update installation deps
Signed-off-by: Aaron Pham <29749331+aarnphm@users.noreply.github.com>
|
2024-06-11 05:44:32 +00:00 |
|
Aaron Pham
|
3c7362289a
|
chore(infra): clean up bash script and respect .envrc [skip ci]
Signed-off-by: Aaron Pham <29749331+aarnphm@users.noreply.github.com>
|
2024-06-08 01:08:43 -04:00 |
|
Aaron Pham
|
9d3ddae520
|
fix(client): remove circular dependency
Signed-off-by: Aaron Pham <29749331+aarnphm@users.noreply.github.com>
|
2024-06-02 12:31:53 -04:00 |
|
paperspace
|
a93da12084
|
chore: upgrade to new vLLM schema
Signed-off-by: paperspace <29749331+aarnphm@users.noreply.github.com>
|
2024-06-02 15:52:45 +00:00 |
|
Aaron Pham
|
bf28f977bc
|
feat(models): command-r (#1005)
* feat(models): add support for command-r
Signed-off-by: paperspace <29749331+aarnphm@users.noreply.github.com>
* feat(models): support command-r and remove deadcode and extensions
Signed-off-by: paperspace <29749331+aarnphm@users.noreply.github.com>
* chore: update local.sh script
Signed-off-by: paperspace <29749331+aarnphm@users.noreply.github.com>
---------
Signed-off-by: paperspace <29749331+aarnphm@users.noreply.github.com>
|
2024-06-02 10:16:08 -04:00 |
|
Aaron Pham (mbp16)
|
ed868bad5c
|
chore: update releases script
Signed-off-by: Aaron Pham (mbp16) <29749331+aarnphm@users.noreply.github.com>
|
2024-05-27 14:07:46 -04:00 |
|
Aaron Pham (mbp16)
|
da42c269c9
|
fix(ci): remove checking for hatch
since we don't use it on the script any longer.
Signed-off-by: Aaron Pham (mbp16) <29749331+aarnphm@users.noreply.github.com>
|
2024-05-27 12:12:40 -04:00 |
|
Aaron Pham
|
f248ea25cd
|
feat(ci): running CI on paperspace (#998)
* chore: update tiny script
Signed-off-by: paperspace <29749331+aarnphm@users.noreply.github.com>
* feat(ci): running on paperspace machines
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>
* chore: update models and increase timeout readiness
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>
* fix: schema validation for inputs and update client supporting stop
Signed-off-by: paperspace <29749331+aarnphm@users.noreply.github.com>
* chore: update coverage config
Signed-off-by: paperspace <29749331+aarnphm@users.noreply.github.com>
* chore: remove some non-essentials
Signed-off-by: paperspace <29749331+aarnphm@users.noreply.github.com>
* chore: update locks
Signed-off-by: paperspace <29749331+aarnphm@users.noreply.github.com>
---------
Signed-off-by: paperspace <29749331+aarnphm@users.noreply.github.com>
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>
|
2024-05-26 13:14:54 -04:00 |
|
Aaron Pham
|
3f048d8a5b
|
chore(qol): update CLI options and performance upgrade for build cache (#997)
* chore(qol): update CLI options and performance upgrade for build cache
Signed-off-by: paperspace <29749331+aarnphm@users.noreply.github.com>
* chore: update default python version for dev
Signed-off-by: paperspace <29749331+aarnphm@users.noreply.github.com>
* fix: install custom tar.gz models
Signed-off-by: paperspace <29749331+aarnphm@users.noreply.github.com>
---------
Signed-off-by: paperspace <29749331+aarnphm@users.noreply.github.com>
|
2024-05-26 04:17:23 -04:00 |
|
paperspace
|
db523e2940
|
chore(infra): add support for dry running versioning for releases
Signed-off-by: paperspace <29749331+aarnphm@users.noreply.github.com>
|
2024-05-23 14:06:08 +00:00 |
|
Aaron
|
97279f797c
|
fix(ci): make sure to remove alpha version for minor
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>
|
2024-05-12 00:37:18 -04:00 |
|
paperspace
|
526a770a06
|
chore: update base requirements to 0.4.2
Signed-off-by: paperspace <29749331+aarnphm@users.noreply.github.com>
|
2024-05-08 18:46:13 +00:00 |
|
Aaron Pham
|
5c0d2787c0
|
feat: add dbrx support
Signed-off-by: Aaron Pham <29749331+aarnphm@users.noreply.github.com>
|
2024-04-02 04:10:19 +00:00 |
|
Aaron
|
08ccc65863
|
ci: simplify release cycle to include alpha [skip ci]
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>
|
2024-03-15 04:26:19 -04:00 |
|
Aaron Pham
|
072b3e97ec
|
feat: 1.2 APIs (#821)
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>
Signed-off-by: Aaron Pham <29749331+aarnphm@users.noreply.github.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
|
2024-03-15 03:49:19 -04:00 |
|
Zhao Shenyang
|
4dc4c45c4a
|
Bump BentoML version in tools (#884)
|
2024-02-05 05:24:02 +08:00 |
|
Aaron Pham
|
79da419d87
|
chore(deps): bump vllm to 0.2.7 (#837)
* chore(deps): bump vllm to 0.2.7
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>
* chore: update changelog
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>
---------
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>
|
2024-01-08 14:41:58 -05:00 |
|
Aaron Pham
|
8d63afc9ce
|
feat(vllm): support GPTQ with 0.2.6 (#797)
* feat(vllm): GPTQ support passthrough
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>
* chore: run scripts
Signed-off-by: Aaron Pham <29749331+aarnphm@users.noreply.github.com>
* fix(install): set order of xformers before vllm
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>
* feat: support GPTQ with vLLM
Signed-off-by: Aaron Pham <29749331+aarnphm@users.noreply.github.com>
---------
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>
Signed-off-by: Aaron Pham <29749331+aarnphm@users.noreply.github.com>
|
2023-12-18 12:41:19 -05:00 |
|
Aaron Pham
|
88b6d3d6de
|
perf: upgrade mixtral to use expert parallelism (#783)
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>
|
2023-12-15 11:45:08 -05:00 |
|
Aaron Pham
|
3ab78cd105
|
feat(mixtral): correct support for mixtral (#772)
feat(mixtral): support inference with pt
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>
|
2023-12-13 09:03:56 -05:00 |
|
Aaron Pham
|
d3328343d7
|
feat: mixtral support (#770)
Signed-off-by: Aaron Pham <29749331+aarnphm@users.noreply.github.com>
|
2023-12-12 01:33:13 -05:00 |
|
Aaron
|
59e8ef93dc
|
chore(deps): lock vLLM to 0.2.4
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>
|
2023-12-12 00:17:18 -05:00 |
|
Aaron
|
9d1b16395e
|
infra: remove redundant mypy config
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>
|
2023-11-30 09:33:52 -05:00 |
|
yansheng
|
3cb7f14fc1
|
feat(models): Support qwen (#742)
* support qwen
* support qwen
* ci: auto fixes from pre-commit.ci
For more information, see https://pre-commit.ci
* Update openllm-core/src/openllm_core/config/configuration_qwen.py
Signed-off-by: Aaron Pham <29749331+aarnphm@users.noreply.github.com>
* chore: update correct readme and supports qwen models
Signed-off-by: Aaron Pham <29749331+aarnphm@users.noreply.github.com>
---------
Signed-off-by: Aaron Pham <29749331+aarnphm@users.noreply.github.com>
Co-authored-by: root <yansheng105@gmail.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: Aaron Pham <29749331+aarnphm@users.noreply.github.com>
|
2023-11-30 06:54:17 -05:00 |
|
Aaron
|
39ecc73a50
|
infra: bump to dev version of 0.4.28.dev0 [generated] [skip ci]
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>
|
2023-11-24 01:54:46 -05:00 |
|
Aaron Pham
|
aab173cd99
|
refactor: focus (#730)
* perf: remove based images
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>
* chore: update changelog
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>
* chore: move dockerfile to run on release only
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>
* chore: cleanup unused types
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>
---------
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>
|
2023-11-24 01:11:31 -05:00 |
|
Aaron Pham
|
5442d9cd10
|
fix(trust_remote_code): handle args correctly (#727)
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>
|
2023-11-22 17:03:13 -05:00 |
|
Aaron Pham
|
79c9608735
|
infra: reduce wait time to around 7 mins (#726)
Seems like the release process for PyPI usually takes from 4-7 minutes
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>
|
2023-11-22 07:28:36 -05:00 |
|
Aaron Pham
|
f83f64ffd7
|
fix(infra): setup higher timer for building container images (#723)
* fix(infra): setup higher timer for building container images
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>
* chore: remove invalid tests
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>
---------
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>
|
2023-11-22 05:00:33 -05:00 |
|
Aaron Pham
|
38b7c44df0
|
fix(base-image): update base image to include cuda for now (#720)
* fix(base-image): update base image to include cuda for now
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>
* fix: build core and client on release images
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>
* chore: cleanup style changes
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>
---------
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>
|
2023-11-22 01:15:19 -05:00 |
|
Aaron Pham
|
6505abdb44
|
chore: update lower bound version of bentoml to avoid breakage (#703)
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>
|
2023-11-19 23:09:14 -05:00 |
|
Aaron Pham
|
44f05da845
|
infra: update generate notes and better local handle (#701)
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>
|
2023-11-19 17:50:23 -05:00 |
|
Aaron
|
cb4386b013
|
fix(release): remove unnecessary check for client dependencies [skip ci]
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>
|
2023-11-19 10:39:38 -05:00 |
|
Aaron Pham
|
816c1ee80e
|
feat(engine): CTranslate2 (#698)
* chore: update instruction for dependencies
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>
* feat(experimental): CTranslate2
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>
---------
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>
|
2023-11-19 10:25:08 -05:00 |
|
Aaron Pham
|
539f250c0f
|
feat(vllm): bump to 0.2.2 (#695)
* feat(vllm): bump to 0.2.2
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>
* chore: update changelog
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>
* chore: move up to CUDA 12.1
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>
* fix: remove auto-gptq installation
since the builder image doesn't have access to GPU
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>
* fix: update containerization warning
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>
---------
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>
|
2023-11-19 02:52:32 -05:00 |
|
Aaron Pham
|
206521e02d
|
feat(ctranslate): initial infrastructure support (#694)
* perf: compact and improve speed and agility
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>
* --wip--
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>
* chore: cleanup infrastructure
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>
* chore: update styles notes and autogen mypy configuration
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>
---------
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>
|
2023-11-19 01:48:33 -05:00 |
|
Aaron Pham
|
099cc22a94
|
chore: update documentation (#693)
* chore: update documentation
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>
* chore: update readme
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>
* chore: update documentations for configuration
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>
---------
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>
|
2023-11-18 19:44:52 -05:00 |
|
Aaron Pham
|
c03e3bebb3
|
fix(infra): prepare correct dependencies for release [skip ci] (#687)
fix(infra): prepare correct dependencies for release
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>
|
2023-11-17 16:05:46 -05:00 |
|
Aaron Pham
|
80ed400646
|
fix(build): lock lower version based on each release and update infra (#686)
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>
|
2023-11-17 15:57:31 -05:00 |
|
Aaron Pham
|
21a308538e
|
fix: correct set item for attrs >23.1 (#678)
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>
|
2023-11-17 09:16:52 -05:00 |
|
Aaron Pham
|
c850d76ccd
|
feat(models): Phi 1.5 (#672)
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>
|
2023-11-16 17:48:10 -05:00 |
|
Aaron Pham
|
6102a67a83
|
infra: makes huggingface-hub requirements on fine-tune (#665)
infra: makes huggingface-hub core deps
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>
|
2023-11-16 03:12:52 -05:00 |
|
Aaron Pham
|
4a6f13ddd2
|
feat(type): provide structured annotations stubs (#663)
* feat(type): provide client stubs
separation of concerns for a more concise code base
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>
* docs: update changelog
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>
---------
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>
|
2023-11-16 02:58:45 -05:00 |
|
Aaron Pham
|
9e6df0df89
|
chore: update requirements in README.md (#659)
chore: update requirements
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>
|
2023-11-15 02:32:36 -05:00 |
|