19 Commits

Author SHA1 Message Date
Aaron Pham
0d83cefcb6 fix(mixtral): setup hack atm to load weights from pt specifically instead of safetensors (#776)
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>
2023-12-13 18:18:51 -05:00
Aaron
ce6efc2a9e chore(style): cleanup bytes
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>
2023-11-28 01:27:27 -05:00
Aaron Pham
a58d947bc8 perf: improve build logics and cleanup speed (#657)
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>
2023-11-15 00:18:31 -05:00
Aaron Pham
126e6c9d63 fix(ruff): correct consistency between isort and formatter (#624)
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>
2023-11-12 21:12:50 -05:00
Aaron Pham
fa2038f4e2 fix: loading correct local models (#599)
* fix(model): loading local correctly

Signed-off-by: Aaron Pham <29749331+aarnphm@users.noreply.github.com>

* chore: update repr and correct bentomodel processor

Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>

* ci: auto fixes from pre-commit.ci

For more information, see https://pre-commit.ci

* chore: cleanup transformers implementation

Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>

* fix: ruff to ignore I001 on all stubs

Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>

---------

Signed-off-by: Aaron Pham <29749331+aarnphm@users.noreply.github.com>
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
2023-11-10 02:36:12 -05:00
Aaron Pham
ac377fe490 infra: using ruff formatter (#594)
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>
2023-11-09 12:44:05 -05:00
Aaron Pham
b8a2e8cf91 refactor(cli): cleanup API (#592)
* chore: remove unused imports

Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>

* refactor(cli): update to only need model_id

Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>

* feat: `openllm start model-id`

Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>

* chore: add changelog

Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>

* chore: update changelog notice

Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>

* chore: update correct config and running tools

Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>

* chore: update backward compat options and treat JSON outputs
correspondingly

Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>

---------

Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>
2023-11-09 11:40:17 -05:00
Aaron Pham
e2029c934b perf: unify LLM interface (#518)
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>
Signed-off-by: paperspace <29749331+aarnphm@users.noreply.github.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
2023-11-06 20:39:43 -05:00
Aaron Pham
35e6945e86 fix(serialisation): vLLM safetensors support (#324)
* fix(serialisation): vllm support for safetensors

Signed-off-by: aarnphm-ec2-dev <29749331+aarnphm@users.noreply.github.com>

* chore: running tools

Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>

* chore: generalize one shot generation

Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>

* chore: add changelog [skip ci]

Signed-off-by: paperspace <29749331+aarnphm@users.noreply.github.com>

---------

Signed-off-by: aarnphm-ec2-dev <29749331+aarnphm@users.noreply.github.com>
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>
Signed-off-by: paperspace <29749331+aarnphm@users.noreply.github.com>
2023-09-12 17:44:01 -04:00
Aaron Pham
608de0b667 fix(serving): vllm distributed size (#285)
* chore(weights): ignore gguf pattern for non GGML backend

Signed-off-by: aarnphm-ec2-dev <29749331+aarnphm@users.noreply.github.com>

* chore: correct fix num_gpus to be divisible by 2

This depends on the attention_heads from given models

Signed-off-by: aarnphm-ec2-dev <29749331+aarnphm@users.noreply.github.com>

---------

Signed-off-by: aarnphm-ec2-dev <29749331+aarnphm@users.noreply.github.com>
2023-09-01 12:37:10 -04:00
Aaron Pham
3e45530abd refactor(breaking): unify LLM API (#283)
Signed-off-by: aarnphm-ec2-dev <29749331+aarnphm@users.noreply.github.com>
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>
2023-09-01 05:15:19 -04:00
Aaron
b545ad2ad1 style: google
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>
2023-08-30 13:52:35 -04:00
Aaron Pham
c9cef1d773 fix: persistent styling between ruff and yapf (#279)
2023-08-30 11:37:41 -04:00
aarnphm-ec2-dev
806a663e4a chore(style): add one blank line
to conform with Google style

Signed-off-by: aarnphm-ec2-dev <29749331+aarnphm@users.noreply.github.com>
2023-08-26 11:36:57 +00:00
Aaron Pham
46c8904806 cron(style): run formatter [generated] [skip ci] (#257)
2023-08-25 06:38:59 -04:00
Aaron
787ce1b3b6 chore(style): synchronized style across packages [skip ci]
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>
2023-08-23 08:46:22 -04:00
aarnphm-ec2-dev
1488fbb167 chore(style): enable yapf to match with style guidelines
Signed-off-by: aarnphm-ec2-dev <29749331+aarnphm@users.noreply.github.com>
2023-08-22 14:03:06 +00:00
Aaron Pham
3ffb25a872 refactor: packages (#249)
2023-08-22 08:55:46 -04:00
Aaron Pham
cd872ef631 refactor: monorepo (#203)
2023-08-15 02:11:14 -04:00