Commit Graph

637 Commits

Author SHA1 Message Date
Aaron Pham
40e56eeb60 infra: bump to dev version of 0.2.22.dev0 [generated] [skip ci]
Signed-off-by: Aaron Pham <29749331+aarnphm@users.noreply.github.com>
2023-08-11 10:17:14 +00:00
Aaron Pham
f558d6cf67 infra: prepare for release 0.2.21 [generated] [skip ci]
Signed-off-by: Aaron Pham <29749331+aarnphm@users.noreply.github.com>
v0.2.21
2023-08-11 09:57:52 +00:00
Aaron
4b03c8b848 fix(infra): typo [skip ci]
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>
2023-08-11 05:55:02 -04:00
Aaron Pham
5329853b10 perf: compiled modules and enable lazyeval (#200)
2023-08-11 05:53:45 -04:00
Aaron Pham
c083990edd infra: migrate to initial openllm-node library (#199)
2023-08-10 18:54:00 -04:00
Aaron Pham
8c93b781b8 fix(release): fix exclude options within compiled wheels (#197)
2023-08-10 18:48:58 -04:00
aarnphm-ec2-dev
034610b6b0 fix(embeddings): correct imports
Signed-off-by: aarnphm-ec2-dev <29749331+aarnphm@users.noreply.github.com>
2023-08-10 22:47:38 +00:00
aarnphm-ec2-dev
689b83bbe3 fix(loading): make sure not to load to cuda with kbit quantisation
Signed-off-by: aarnphm-ec2-dev <29749331+aarnphm@users.noreply.github.com>
2023-08-10 19:39:01 +00:00
Aaron Pham
7c3646bb89 infra: bump to homebrew tap release to 0.2.20 [generated] [skip ci]
Signed-off-by: Aaron Pham <29749331+aarnphm@users.noreply.github.com>
2023-08-10 03:26:20 +00:00
Aaron Pham
d99e342d88 infra: bump to dev version of 0.2.21.dev0 [generated] [skip ci]
Signed-off-by: Aaron Pham <29749331+aarnphm@users.noreply.github.com>
2023-08-10 03:23:24 +00:00
Aaron Pham
78912a314c infra: prepare for release 0.2.20 [generated] [skip ci]
Signed-off-by: Aaron Pham <29749331+aarnphm@users.noreply.github.com>
v0.2.20
2023-08-10 03:04:19 +00:00
Aaron
bc13b6f137 fix: update dependencies with brew tap
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>
2023-08-09 22:54:37 -04:00
Aaron
e0daea6e78 fix(compile): absolute import for compiled wheels
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>
2023-08-09 22:51:35 -04:00
Aaron Pham
6fbacecaf6 infra: bump to dev version of 0.2.20.dev0 [generated] [skip ci]
Signed-off-by: Aaron Pham <29749331+aarnphm@users.noreply.github.com>
2023-08-10 02:49:10 +00:00
Aaron Pham
99c8f299ce infra: prepare for release 0.2.19 [generated] [skip ci]
Signed-off-by: Aaron Pham <29749331+aarnphm@users.noreply.github.com>
v0.2.19
2023-08-10 02:26:15 +00:00
Aaron Pham
6143521547 fix: release compiled wheels and 0.2.18 tap (#193)
2023-08-09 21:14:34 -04:00
Aaron Pham
f0783420a2 infra: bump to dev version of 0.2.19.dev0 [generated] [skip ci]
Signed-off-by: Aaron Pham <29749331+aarnphm@users.noreply.github.com>
2023-08-10 00:10:35 +00:00
Aaron Pham
221d959a46 infra: prepare for release 0.2.18 [generated] [skip ci]
Signed-off-by: Aaron Pham <29749331+aarnphm@users.noreply.github.com>
v0.2.18
2023-08-09 23:50:37 +00:00
Aaron
a44a317825 fix(ci): running test tap [skip ci]
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>
2023-08-09 19:48:29 -04:00
Aaron
0d55d74868 fix: remove invalid token on dispatch [skip ci]
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>
2023-08-09 19:44:42 -04:00
aarnphm-ec2-dev
dfc4b489c5 feat(build): notes on compiled wheels for Bento
Signed-off-by: aarnphm-ec2-dev <29749331+aarnphm@users.noreply.github.com>
2023-08-09 21:52:34 +00:00
aarnphm-ec2-dev
0640af026c chore(docs): add instruction for compiled module development [skip ci]
Signed-off-by: aarnphm-ec2-dev <29749331+aarnphm@users.noreply.github.com>
2023-08-09 21:30:43 +00:00
Aaron Pham
b1445c6516 refactor(cli): compiled wheels and extension modules (#191)
2023-08-09 17:10:15 -04:00
Aaron
ae11e487d9 fix(brew): specific installation from gzip [skip ci]
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>
2023-08-08 22:32:11 -04:00
Aaron
aaa8ec433c chore(ci): running pyright last
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>
2023-08-08 22:19:08 -04:00
Aaron
21143fdfab fix(brew): set correct url for release
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>
2023-08-08 22:18:26 -04:00
Aaron Pham
b9dd54f634 feat: homebrew tap (#190)
2023-08-08 22:11:48 -04:00
aarnphm-ec2-dev
deaee67b47 fix(loading): make sure to cast the model to cuda if PyTorch
Signed-off-by: aarnphm-ec2-dev <29749331+aarnphm@users.noreply.github.com>
2023-08-09 01:42:11 +00:00
aarnphm-ec2-dev
ae35ee8115 fix(build): set legacy serialisation for vllm on Bento
Signed-off-by: aarnphm-ec2-dev <29749331+aarnphm@users.noreply.github.com>
2023-08-08 20:10:49 +00:00
Aaron Pham
2d47a54efd feat(strategy): spawn one runner instance (#189)
2023-08-08 05:47:11 -04:00
Aaron Pham
9c3019d236 infra: bump to dev version of 0.2.18.dev0 [generated] [skip ci]
Signed-off-by: Aaron Pham <29749331+aarnphm@users.noreply.github.com>
2023-08-08 05:44:51 +00:00
Aaron Pham
126491f272 infra: prepare for release 0.2.17 [generated] [skip ci]
Signed-off-by: Aaron Pham <29749331+aarnphm@users.noreply.github.com>
v0.2.17
2023-08-08 05:34:43 +00:00
Aaron Pham
cb6f3aa48e feat: --force-push to allow force push to bentocloud (#188)
2023-08-08 01:06:59 -04:00
Aaron
371a7c896c fix: loading models within k8s API server
remove the logic where the API server tries to load the model when it is
not available locally

Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>
2023-08-08 00:22:48 -04:00
pre-commit-ci[bot]
0139613f3c ci: pre-commit autoupdate [pre-commit.ci] [skip ci] (#187)
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
2023-08-07 18:02:26 -04:00
Aaron Pham
21ea7e493f feat(generation): initial work for generating tokens (#186)
2023-08-06 20:04:40 -04:00
Aaron Pham
2d5be909cd fix(models): setup xformers and loading PyTorch meta weights (#185)
2023-08-06 03:25:02 -04:00
Aaron
96b25842d1 chore(docs): update security notes obsidian style [skip ci]
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>
2023-08-06 02:02:08 -04:00
Aaron Pham
74a928f6f3 chore: CODE_OF_CONDUCT.md [skip ci]
Signed-off-by: Aaron Pham <29749331+aarnphm@users.noreply.github.com>
2023-08-06 01:56:08 -04:00
Aaron Pham
752de09626 fix(ci): update version correctly [skip ci] (#184)
2023-08-06 01:18:33 -04:00
Aaron Pham
4875c3a109 feat: optimize model saving and loading on single GPU (#183)
2023-08-06 01:00:49 -04:00
aarnphm-ec2-dev
8bba90f611 fix: add release for correct CI version [skip ci]
Signed-off-by: aarnphm-ec2-dev <29749331+aarnphm@users.noreply.github.com>
2023-08-05 07:41:50 +00:00
Aaron Pham
82d7ab67f3 infra: bump to dev version of ..1.dev0 [generated] [skip ci]
Signed-off-by: Aaron Pham <29749331+aarnphm@users.noreply.github.com>
2023-08-04 16:56:48 +00:00
Aaron Pham
f68beb5ccb infra: prepare for release 0.2.16 [generated] [skip ci]
Signed-off-by: Aaron Pham <29749331+aarnphm@users.noreply.github.com>
v0.2.16
2023-08-04 16:43:29 +00:00
Aaron
90072ec5ee fix(regression): setting quantize only if it is not None
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>
2023-08-04 12:40:55 -04:00
Aaron
ba07205156 fix: disable building xformers from source
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>
2023-08-04 12:14:04 -04:00
Aaron
794719670e chore: update README [skip ci]
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>
2023-08-04 12:10:21 -04:00
Aaron Pham
cdc6bae0e9 infra: bump to dev version of ..1.dev0 [generated] [skip ci]
Signed-off-by: Aaron Pham <29749331+aarnphm@users.noreply.github.com>
2023-08-04 15:47:20 +00:00
Aaron Pham
9d1476e360 infra: prepare for release 0.2.15 [generated] [skip ci]
Signed-off-by: Aaron Pham <29749331+aarnphm@users.noreply.github.com>
v0.2.15
2023-08-04 15:32:47 +00:00
Aaron
287b7f9ab2 fix: releases issue when building new container [skip ci]
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>
2023-08-04 11:31:02 -04:00