Aaron Pham
|
e0632a85ed
|
refactor(cli): move out to its own packages (#619)
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>
|
2023-11-12 18:25:44 -05:00 |
|
Aaron Pham
|
7e1fb35a71
|
chore(llm): expose quantise and lazy load heavy imports (#617)
* chore(llm): expose quantise and lazy load heavy imports
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>
* chore: move transformers to TYPE_CHECKING block
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>
---------
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>
|
2023-11-12 14:55:37 -05:00 |
|
Aaron Pham
|
5e45245457
|
package: add openllm core dependencies to labels (#600)
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>
|
2023-11-10 02:33:55 -05:00 |
|
Aaron Pham
|
ac377fe490
|
infra: using ruff formatter (#594)
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>
|
2023-11-09 12:44:05 -05:00 |
|
Aaron Pham
|
b8a2e8cf91
|
refactor(cli): cleanup API (#592)
* chore: remove unused imports
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>
* refactor(cli): update to only need model_id
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>
* feat: `openllm start model-id`
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>
* chore: add changelog
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>
* chore: update changelog notice
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>
* chore: update correct config and running tools
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>
* chore: update backward compat options and treat JSON outputs
corespondingly
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>
---------
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>
|
2023-11-09 11:40:17 -05:00 |
|
Aaron Pham
|
e87830ef0a
|
container: update tracing dependencies (#591)
* chore: update build message
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>
* chore: add tracing dependencies to container
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>
---------
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>
|
2023-11-08 08:08:40 -05:00 |
|
Aaron Pham
|
ff8b6377c8
|
fix(awq): correct awq detection for support (#586)
* fix(awq): correct detection for awq
Signed-off-by: Aaron Pham <29749331+aarnphm@users.noreply.github.com>
* chore: update base docker to work
Signed-off-by: Aaron Pham <29749331+aarnphm@users.noreply.github.com>
* chore: disable awq on pytorch for now
Signed-off-by: Aaron Pham <29749331+aarnphm@users.noreply.github.com>
* ci: auto fixes from pre-commit.ci
For more information, see https://pre-commit.ci
---------
Signed-off-by: Aaron Pham <29749331+aarnphm@users.noreply.github.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
|
2023-11-08 06:57:11 -05:00 |
|
Aaron Pham
|
85a7243ac3
|
fix: device imports using strategies (#584)
* fix: device imports using strategies
Signed-off-by: Aaron Pham <29749331+aarnphm@users.noreply.github.com>
* chore: support trust_remote_code for vLLM runners
Signed-off-by: Aaron Pham <29749331+aarnphm@users.noreply.github.com>
---------
Signed-off-by: Aaron Pham <29749331+aarnphm@users.noreply.github.com>
|
2023-11-08 05:10:50 -05:00 |
|
Aaron Pham
|
97d7c38fea
|
refactor: cleanup typing to expose correct API (#576)
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>
|
2023-11-08 01:24:03 -05:00 |
|
Aaron Pham
|
dc27b0e727
|
fix: update build dependencies and format chat prompt (#569)
chore: update correct check and format prompt
Signed-off-by: Aaron Pham <29749331+aarnphm@users.noreply.github.com>
|
2023-11-07 16:42:20 -05:00 |
|
Aaron Pham
|
e2029c934b
|
perf: unify LLM interface (#518)
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>
Signed-off-by: paperspace <29749331+aarnphm@users.noreply.github.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
|
2023-11-06 20:39:43 -05:00 |
|
Zhao Shenyang
|
bf96570eab
|
fix: do not reply on env var for built bento/docker (#477)
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
|
2023-10-10 12:29:20 -04:00 |
|
Aaron
|
625b82a0fc
|
fix(style): remove weird break on split item
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>
|
2023-10-07 02:21:31 -04:00 |
|
MingLiangDai
|
a0e0f81306
|
feat: PromptTemplate and system prompt support (#407)
Signed-off-by: Aaron Pham <29749331+aarnphm@users.noreply.github.com>
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>
Co-authored-by: Aaron Pham <29749331+aarnphm@users.noreply.github.com>
|
2023-10-03 09:53:37 -04:00 |
|
Aaron Pham
|
5a1fcc9cd5
|
fix: set default serialisation methods (#355)
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>
|
2023-09-18 02:26:53 -04:00 |
|
Aaron Pham
|
956b3a53bc
|
fix(gptq): use upstream integration (#297)
* wip
Signed-off-by: aarnphm-ec2-dev <29749331+aarnphm@users.noreply.github.com>
* feat: GPTQ transformers integration
Signed-off-by: aarnphm-ec2-dev <29749331+aarnphm@users.noreply.github.com>
* fix: only load if variable is available and add changelog
Signed-off-by: aarnphm-ec2-dev <29749331+aarnphm@users.noreply.github.com>
* chore: remove boilerplate check
Signed-off-by: aarnphm-ec2-dev <29749331+aarnphm@users.noreply.github.com>
---------
Signed-off-by: aarnphm-ec2-dev <29749331+aarnphm@users.noreply.github.com>
|
2023-09-04 14:05:50 -04:00 |
|
aarnphm-ec2-dev
|
7d893e6cd2
|
chore: ignore new lines split [skip ci]
Signed-off-by: aarnphm-ec2-dev <29749331+aarnphm@users.noreply.github.com>
|
2023-09-01 17:00:49 +00:00 |
|
Aaron Pham
|
b7af7765d4
|
fix(yapf): align weird new lines break [generated] [skip ci] (#284)
fix(yapf): align weird new lines break
Signed-off-by: aarnphm-ec2-dev <29749331+aarnphm@users.noreply.github.com>
|
2023-09-01 05:34:22 -04:00 |
|
Aaron Pham
|
3e45530abd
|
refactor(breaking): unify LLM API (#283)
Signed-off-by: aarnphm-ec2-dev <29749331+aarnphm@users.noreply.github.com>
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>
|
2023-09-01 05:15:19 -04:00 |
|
Aaron
|
b545ad2ad1
|
style: google
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>
|
2023-08-30 13:52:35 -04:00 |
|
Aaron Pham
|
c9cef1d773
|
fix: persistent styling between ruff and yapf (#279)
|
2023-08-30 11:37:41 -04:00 |
|
Aaron Pham
|
2036d4e015
|
chore(build): use latest vllm pre-built kernel (#261)
|
2023-08-26 09:02:52 -04:00 |
|
aarnphm-ec2-dev
|
806a663e4a
|
chore(style): add one blank line
to conform with Google style
Signed-off-by: aarnphm-ec2-dev <29749331+aarnphm@users.noreply.github.com>
|
2023-08-26 11:36:57 +00:00 |
|
Aaron Pham
|
46c8904806
|
cron(style): run formatter [generated] [skip ci] (#257)
|
2023-08-25 06:38:59 -04:00 |
|
Aaron
|
787ce1b3b6
|
chore(style): synchronized style across packages [skip ci]
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>
|
2023-08-23 08:46:22 -04:00 |
|
aarnphm-ec2-dev
|
eddbc06374
|
chore(style): reduce line length and truncate compression
Signed-off-by: aarnphm-ec2-dev <29749331+aarnphm@users.noreply.github.com>
|
2023-08-22 17:02:00 +00:00 |
|
aarnphm-ec2-dev
|
1488fbb167
|
chore(style): enable yapf to match with style guidelines
Signed-off-by: aarnphm-ec2-dev <29749331+aarnphm@users.noreply.github.com>
|
2023-08-22 14:03:06 +00:00 |
|
Aaron Pham
|
3ffb25a872
|
refactor: packages (#249)
|
2023-08-22 08:55:46 -04:00 |
|
Aaron Pham
|
4140d160b8
|
feat(embedding): Adding generic endpoint (#227)
|
2023-08-17 15:17:00 -04:00 |
|
aarnphm-ec2-dev
|
3363ee158b
|
fix(container): set correct PyTorch version not to override cuda
wheels
Signed-off-by: aarnphm-ec2-dev <29749331+aarnphm@users.noreply.github.com>
|
2023-08-16 10:46:49 +00:00 |
|
Aaron Pham
|
3a73aacb01
|
chore(ci): add dependabot and fix vllm release container (#217)
|
2023-08-16 05:43:41 -04:00 |
|
Aaron
|
af8cb73832
|
fix: latest vllm build
sync changelog with monorepo for sdist installation
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>
|
2023-08-16 04:03:34 -04:00 |
|
Aaron
|
6b0ab17018
|
chore: remove unnecessary headers
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>
|
2023-08-15 18:15:54 -04:00 |
|
Aaron Pham
|
cd872ef631
|
refactor: monorepo (#203)
|
2023-08-15 02:11:14 -04:00 |
|