Aaron
|
aaa8ec433c
|
chore(ci): running pyright last
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>
|
2023-08-08 22:19:08 -04:00 |
|
Aaron
|
21143fdfab
|
fix(brew): set correct url for release
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>
|
2023-08-08 22:18:26 -04:00 |
|
Aaron Pham
|
b9dd54f634
|
feat: homebrew tap (#190)
|
2023-08-08 22:11:48 -04:00 |
|
aarnphm-ec2-dev
|
deaee67b47
|
fix(loading): make sure to cast the model to cuda if PyTorch
Signed-off-by: aarnphm-ec2-dev <29749331+aarnphm@users.noreply.github.com>
|
2023-08-09 01:42:11 +00:00 |
|
aarnphm-ec2-dev
|
ae35ee8115
|
fix(build): set legacy serialisation for vllm on Bento
Signed-off-by: aarnphm-ec2-dev <29749331+aarnphm@users.noreply.github.com>
|
2023-08-08 20:10:49 +00:00 |
|
Aaron Pham
|
2d47a54efd
|
feat(strategy): spawn one runner instance (#189)
|
2023-08-08 05:47:11 -04:00 |
|
Aaron Pham
|
9c3019d236
|
infra: bump to dev version of 0.2.18.dev0 [generated] [skip ci]
Signed-off-by: Aaron Pham <29749331+aarnphm@users.noreply.github.com>
|
2023-08-08 05:44:51 +00:00 |
|
Aaron Pham
|
126491f272
|
infra: prepare for release 0.2.17 [generated] [skip ci]
Signed-off-by: Aaron Pham <29749331+aarnphm@users.noreply.github.com>
v0.2.17
|
2023-08-08 05:34:43 +00:00 |
|
Aaron Pham
|
cb6f3aa48e
|
feat: --force-push to allow force push to bentocloud (#188)
|
2023-08-08 01:06:59 -04:00 |
|
Aaron
|
371a7c896c
|
fix: loading models within k8s API server
remove a logic where the API server tries to load the model when it is
not available locally
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>
|
2023-08-08 00:22:48 -04:00 |
|
pre-commit-ci[bot]
|
0139613f3c
|
ci: pre-commit autoupdate [pre-commit.ci] [skip ci] (#187)
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
|
2023-08-07 18:02:26 -04:00 |
|
Aaron Pham
|
21ea7e493f
|
feat(generation): initial work for generating tokens (#186)
|
2023-08-06 20:04:40 -04:00 |
|
Aaron Pham
|
2d5be909cd
|
fix(models): setup xformers and loading PyTorch meta weights (#185)
|
2023-08-06 03:25:02 -04:00 |
|
Aaron
|
96b25842d1
|
chore(docs): update security notes obsidian style [skip ci]
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>
|
2023-08-06 02:02:08 -04:00 |
|
Aaron Pham
|
74a928f6f3
|
chore: CODE_OF_CONDUCT.md [skip ci]
Signed-off-by: Aaron Pham <29749331+aarnphm@users.noreply.github.com>
|
2023-08-06 01:56:08 -04:00 |
|
Aaron Pham
|
752de09626
|
fix(ci): update version correctly [skip ci] (#184)
|
2023-08-06 01:18:33 -04:00 |
|
Aaron Pham
|
4875c3a109
|
feat: optimize model saving and loading on single GPU (#183)
|
2023-08-06 01:00:49 -04:00 |
|
aarnphm-ec2-dev
|
8bba90f611
|
fix: add release for correct CI version [skip ci]
Signed-off-by: aarnphm-ec2-dev <29749331+aarnphm@users.noreply.github.com>
|
2023-08-05 07:41:50 +00:00 |
|
Aaron Pham
|
82d7ab67f3
|
infra: bump to dev version of ..1.dev0 [generated] [skip ci]
Signed-off-by: Aaron Pham <29749331+aarnphm@users.noreply.github.com>
|
2023-08-04 16:56:48 +00:00 |
|
Aaron Pham
|
f68beb5ccb
|
infra: prepare for release 0.2.16 [generated] [skip ci]
Signed-off-by: Aaron Pham <29749331+aarnphm@users.noreply.github.com>
v0.2.16
|
2023-08-04 16:43:29 +00:00 |
|
Aaron
|
90072ec5ee
|
fix(regression): setting quantize only if it is not None
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>
|
2023-08-04 12:40:55 -04:00 |
|
Aaron
|
ba07205156
|
fix: disable building xformers from source
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>
|
2023-08-04 12:14:04 -04:00 |
|
Aaron
|
794719670e
|
chore: update README [skip ci]
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>
|
2023-08-04 12:10:21 -04:00 |
|
Aaron Pham
|
cdc6bae0e9
|
infra: bump to dev version of ..1.dev0 [generated] [skip ci]
Signed-off-by: Aaron Pham <29749331+aarnphm@users.noreply.github.com>
|
2023-08-04 15:47:20 +00:00 |
|
Aaron Pham
|
9d1476e360
|
infra: prepare for release 0.2.15 [generated] [skip ci]
Signed-off-by: Aaron Pham <29749331+aarnphm@users.noreply.github.com>
v0.2.15
|
2023-08-04 15:32:47 +00:00 |
|
Aaron
|
287b7f9ab2
|
fix: releases issue when building new container [skip ci]
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>
|
2023-08-04 11:31:02 -04:00 |
|
Aaron
|
20deb3354d
|
infra: bump to dev version of 0.2.15.dev0 [generated] [skip ci]
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>
|
2023-08-04 11:11:14 -04:00 |
|
Aaron Pham
|
cb05446760
|
infra: prepare for release 0.2.14 [generated] [skip ci]
Signed-off-by: Aaron Pham <29749331+aarnphm@users.noreply.github.com>
v0.2.14
|
2023-08-04 14:50:59 +00:00 |
|
Aaron
|
975a1d0349
|
fix: remove tokens for release [skip ci]
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>
|
2023-08-04 10:49:06 -04:00 |
|
Aaron
|
1e74e967d1
|
fix(container): correct cache directory
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>
|
2023-08-04 10:31:06 -04:00 |
|
Aaron Pham
|
2541a0f8dc
|
infra: initial work on compiling mypyc wheels (#182)
|
2023-08-04 10:20:03 -04:00 |
|
Aaron Pham
|
2cc264aa72
|
fix(vllm): correctly load given model id from envvar (#181)
|
2023-08-03 16:34:35 -04:00 |
|
Aaron
|
db8e47bc5b
|
fix(build): correct module type for stubs and strip assert [skip ci]
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>
|
2023-08-03 04:15:55 -04:00 |
|
Aaron
|
8f74e24c2f
|
fix: clone all for nightly strategy
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>
|
2023-08-03 03:17:18 -04:00 |
|
Aaron
|
b949106daf
|
fix(ci): rename runner name [skip ci]
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>
|
2023-08-03 02:24:45 -04:00 |
|
Aaron Pham
|
e9eff70978
|
infra: bump to dev version of 0.2.14.dev0 [generated] [skip ci]
Signed-off-by: Aaron Pham <29749331+aarnphm@users.noreply.github.com>
|
2023-08-03 06:18:57 +00:00 |
|
Aaron Pham
|
8428692d45
|
infra: prepare for release 0.2.13 [generated] [skip ci]
Signed-off-by: Aaron Pham <29749331+aarnphm@users.noreply.github.com>
v0.2.13
|
2023-08-03 06:06:09 +00:00 |
|
Aaron
|
cac7a19be9
|
fix(build): to run on tags [skip ci]
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>
|
2023-08-03 02:00:13 -04:00 |
|
aarnphm-ec2-dev
|
29ca9f398f
|
fix: add arch_list for cross compiling
Signed-off-by: aarnphm-ec2-dev <29749331+aarnphm@users.noreply.github.com>
|
2023-08-03 04:33:48 +00:00 |
|
Aaron
|
f5eb21ede0
|
revert: "chore(aws): use g4dn for more availability"
This reverts commit a06464bdc7.
|
2023-08-02 23:55:29 -04:00 |
|
aarnphm-ec2-dev
|
a01d867bc7
|
chore(base): add auto-gptq CUDA kernel
Signed-off-by: aarnphm-ec2-dev <29749331+aarnphm@users.noreply.github.com>
|
2023-08-03 02:40:06 +00:00 |
|
aarnphm-ec2-dev
|
820b4991fa
|
chore(stubs): add generated for auto-gptq and vllm [skip ci]
This is to help with working on CPU machine
Signed-off-by: aarnphm-ec2-dev <29749331+aarnphm@users.noreply.github.com>
|
2023-08-03 02:28:24 +00:00 |
|
aarnphm-ec2-dev
|
a06464bdc7
|
chore(aws): use g4dn for more availability
Signed-off-by: aarnphm-ec2-dev <29749331+aarnphm@users.noreply.github.com>
|
2023-08-03 02:17:37 +00:00 |
|
Aaron
|
af64a6dfd5
|
chore(docs): update to obsidian README format
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>
|
2023-08-02 21:49:33 -04:00 |
|
aarnphm-ec2-dev
|
b349820429
|
fix(build): add `--device` into envvar
Signed-off-by: aarnphm-ec2-dev <29749331+aarnphm@users.noreply.github.com>
|
2023-08-03 00:44:40 +00:00 |
|
Aaron Pham
|
cfc7f3888d
|
chore(vllm): add all supported models (#179)
|
2023-08-02 17:42:02 -04:00 |
|
Aaron Pham
|
72337410cf
|
fix: nightly resolver for correct tag (#177)
|
2023-08-02 13:10:50 -04:00 |
|
Aaron
|
d4fbfa5e5c
|
fix: custom release strategy for correct naming
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>
|
2023-08-02 03:03:21 -04:00 |
|
Aaron Pham
|
acb81a6e1a
|
fix(build): dispatch container via workflow calls (#174)
add OPENLLM_USE_LOCAL_LATEST as default behaviour within container
|
2023-08-02 01:54:10 -04:00 |
|
Aaron
|
f989ebd4b9
|
infra: bump to dev version of 0.2.13.dev0 [generated] [skip ci]
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>
|
2023-08-01 19:52:56 -04:00 |
|