Commit Graph

  • 689b83bbe3 fix(loading): make sure not to load to cuda with kbit quantisation aarnphm-ec2-dev 2023-08-10 19:39:01 +00:00
  • 7c3646bb89 infra: bump to homebrew tap release to 0.2.20 [generated] [skip ci] Aaron Pham 2023-08-10 03:26:20 +00:00
  • d99e342d88 infra: bump to dev version of 0.2.21.dev0 [generated] [skip ci] Aaron Pham 2023-08-10 03:23:24 +00:00
  • 78912a314c infra: prepare for release 0.2.20 [generated] [skip ci] v0.2.20 Aaron Pham 2023-08-10 03:04:19 +00:00
  • bc13b6f137 fix: update dependencies with brew tap Aaron 2023-08-09 22:54:37 -04:00
  • e0daea6e78 fix(compile): absolute import for compiled wheels Aaron 2023-08-09 22:46:17 -04:00
  • 6fbacecaf6 infra: bump to dev version of 0.2.20.dev0 [generated] [skip ci] Aaron Pham 2023-08-10 02:49:10 +00:00
  • 99c8f299ce infra: prepare for release 0.2.19 [generated] [skip ci] v0.2.19 Aaron Pham 2023-08-10 02:26:15 +00:00
  • 6143521547 fix: release compiled wheels and 0.2.18 tap (#193) Aaron Pham 2023-08-09 21:14:34 -04:00
  • f0783420a2 infra: bump to dev version of 0.2.19.dev0 [generated] [skip ci] Aaron Pham 2023-08-10 00:10:35 +00:00
  • 221d959a46 infra: prepare for release 0.2.18 [generated] [skip ci] v0.2.18 Aaron Pham 2023-08-09 23:50:37 +00:00
  • a44a317825 fix(ci): running test tap [skip ci] Aaron 2023-08-09 19:48:29 -04:00
  • 0d55d74868 fix: remove invalid token on dispatch [skip ci] Aaron 2023-08-09 19:44:42 -04:00
  • dfc4b489c5 feat(build): notes on compiled wheels for Bento aarnphm-ec2-dev 2023-08-09 21:52:34 +00:00
  • 0640af026c chore(docs): add instruction for compiled module development [skip ci] aarnphm-ec2-dev 2023-08-09 21:30:43 +00:00
  • b1445c6516 refactor(cli): compiled wheels and extension modules (#191) Aaron Pham 2023-08-09 17:10:15 -04:00
  • ae11e487d9 fix(brew): specific installation from gzip [skip ci] Aaron 2023-08-08 22:32:11 -04:00
  • aaa8ec433c chore(ci): running pyright last Aaron 2023-08-08 22:19:08 -04:00
  • 21143fdfab fix(brew): set correct url for release Aaron 2023-08-08 22:15:23 -04:00
  • b9dd54f634 feat: homebrew tap (#190) Aaron Pham 2023-08-08 22:11:48 -04:00
  • deaee67b47 fix(loading): make sure to cast the model to cuda if PyTorch aarnphm-ec2-dev 2023-08-09 01:42:11 +00:00
  • ae35ee8115 fix(build): set legacy serialisation for vllm on Bento aarnphm-ec2-dev 2023-08-08 10:17:48 +00:00
  • 2d47a54efd feat(strategy): spawn one runner instance (#189) Aaron Pham 2023-08-08 05:47:11 -04:00
  • 9c3019d236 infra: bump to dev version of 0.2.18.dev0 [generated] [skip ci] Aaron Pham 2023-08-08 05:44:51 +00:00
  • 126491f272 infra: prepare for release 0.2.17 [generated] [skip ci] v0.2.17 Aaron Pham 2023-08-08 05:34:43 +00:00
  • cb6f3aa48e feat: --force-push to allow force push to bentocloud (#188) Aaron Pham 2023-08-08 01:06:59 -04:00
  • 371a7c896c fix: loading models within k8s API server Aaron 2023-08-08 00:22:48 -04:00
  • 0139613f3c ci: pre-commit autoupdate [pre-commit.ci] [skip ci] (#187) pre-commit-ci[bot] 2023-08-07 18:02:26 -04:00
  • 21ea7e493f feat(generation): initial work for generating tokens (#186) Aaron Pham 2023-08-06 20:04:40 -04:00
  • 2d5be909cd fix(models): setup xformers and loading PyTorch meta weights (#185) Aaron Pham 2023-08-06 03:25:02 -04:00
  • 96b25842d1 chore(docs): update security notes obsidian style [skip ci] Aaron 2023-08-06 02:00:02 -04:00
  • 74a928f6f3 chore: CODE_OF_CONDUCT.md [skip ci] Aaron Pham 2023-08-06 01:56:08 -04:00
  • 752de09626 fix(ci): update version correctly [skip ci] (#184) Aaron Pham 2023-08-06 01:18:33 -04:00
  • 4875c3a109 feat: optimize model saving and loading on single GPU (#183) Aaron Pham 2023-08-06 01:00:49 -04:00
  • 8bba90f611 fix: add release for correct CI version [skip ci] aarnphm-ec2-dev 2023-08-05 07:41:50 +00:00
  • 82d7ab67f3 infra: bump to dev version of ..1.dev0 [generated] [skip ci] Aaron Pham 2023-08-04 16:56:48 +00:00
  • f68beb5ccb infra: prepare for release 0.2.16 [generated] [skip ci] v0.2.16 Aaron Pham 2023-08-04 16:43:29 +00:00
  • 90072ec5ee fix(regression): setting quantize only if it is not None Aaron 2023-08-04 12:40:55 -04:00
  • ba07205156 fix: disable building xformers from source Aaron 2023-08-04 12:14:04 -04:00
  • 794719670e chore: update README [skip ci] Aaron 2023-08-04 12:10:21 -04:00
  • cdc6bae0e9 infra: bump to dev version of ..1.dev0 [generated] [skip ci] Aaron Pham 2023-08-04 15:47:20 +00:00
  • 9d1476e360 infra: prepare for release 0.2.15 [generated] [skip ci] v0.2.15 Aaron Pham 2023-08-04 15:32:47 +00:00
  • 287b7f9ab2 fix: releases issue when building new container [skip ci] Aaron 2023-08-04 11:31:02 -04:00
  • 20deb3354d infra: bump to dev version of 0.2.15.dev0 [generated] [skip ci] Aaron 2023-08-04 11:11:14 -04:00
  • cb05446760 infra: prepare for release 0.2.14 [generated] [skip ci] v0.2.14 Aaron Pham 2023-08-04 14:50:59 +00:00
  • 975a1d0349 fix: remove tokens for release [skip ci] Aaron 2023-08-04 10:49:06 -04:00
  • 1e74e967d1 fix(container): correct cache directory Aaron 2023-08-04 10:31:06 -04:00
  • 2541a0f8dc infra: initial work on compiling mypyc wheels (#182) Aaron Pham 2023-08-04 10:20:03 -04:00
  • 2cc264aa72 fix(vllm): correctly load given model id from envvar (#181) Aaron Pham 2023-08-03 16:34:35 -04:00
  • db8e47bc5b fix(build): correct module type for stubs and strip assert [skip ci] Aaron 2023-08-03 04:15:55 -04:00
  • 8f74e24c2f fix: clone all for nightly strategy Aaron 2023-08-03 03:17:18 -04:00
  • b949106daf fix(ci): rename runner name [skip ci] Aaron 2023-08-03 02:24:45 -04:00
  • e9eff70978 infra: bump to dev version of 0.2.14.dev0 [generated] [skip ci] Aaron Pham 2023-08-03 06:18:57 +00:00
  • 8428692d45 infra: prepare for release 0.2.13 [generated] [skip ci] v0.2.13 Aaron Pham 2023-08-03 06:06:09 +00:00
  • cac7a19be9 fix(build): to run on tags [skip ci] Aaron 2023-08-03 01:59:18 -04:00
  • 29ca9f398f fix: add arch_list for cross compiling aarnphm-ec2-dev 2023-08-03 04:33:48 +00:00
  • f5eb21ede0 revert: "chore(aws): use g4dn for more availability" Aaron 2023-08-02 23:55:29 -04:00
  • a01d867bc7 chore(base): add auto-gptq CUDA kernel aarnphm-ec2-dev 2023-08-03 02:40:06 +00:00
  • 820b4991fa chore(stubs): add generated for auto-gptq and vllm [skip ci] aarnphm-ec2-dev 2023-08-03 02:28:24 +00:00
  • a06464bdc7 chore(aws): use g4dn for more availability aarnphm-ec2-dev 2023-08-03 02:17:37 +00:00
  • af64a6dfd5 chore(docs): update to obsidian README format Aaron 2023-08-02 21:49:33 -04:00
  • b349820429 fix(build): add `--device` into envvar aarnphm-ec2-dev 2023-08-03 00:44:40 +00:00
  • cfc7f3888d chore(vllm): add all supported models (#179) Aaron Pham 2023-08-02 17:42:02 -04:00
  • 72337410cf fix: nightly resolver for correct tag (#177) Aaron Pham 2023-08-02 13:10:50 -04:00
  • d4fbfa5e5c fix: custom release strategy for correct naming Aaron 2023-08-02 03:03:21 -04:00
  • acb81a6e1a fix(build): dispatch container via workflow calls (#174) Aaron Pham 2023-08-02 01:54:10 -04:00
  • f989ebd4b9 infra: bump to dev version of 0.2.13.dev0 [generated] [skip ci] Aaron 2023-08-01 19:52:56 -04:00
  • 57fdbda192 infra: prepare for release 0.2.12 [generated] [skip ci] v0.2.12 Aaron Pham 2023-08-01 23:27:01 +00:00
  • af54ff299f fix(ec2): increase subnet availability to all available zone with g5 instances Aaron 2023-08-01 16:07:41 -04:00
  • c2ed1d56da chore(release): update base container restriction (#173) pre-commit-ci[bot] 2023-08-01 15:25:17 -04:00
  • 6ba8899743 fix: remove invalid OPENLLMDEVDEBUG envvar Aaron 2023-08-01 01:52:08 -04:00
  • 961455c762 fix(cli): always --force on --push Aaron 2023-07-31 23:39:56 -04:00
  • ca5e3c7ae5 fix: correct setup property for envvar instance Aaron 2023-07-31 23:34:42 -04:00
  • 16f032417e revert: "infra: reduce instance type for more lenient" Aaron 2023-07-31 21:34:56 -04:00
  • 4a1d849203 infra: reduce instance type for more lenient Aaron 2023-07-31 21:25:59 -04:00
  • 23c5aa5958 revert: remove unreleased changelog Aaron 2023-07-31 21:07:00 -04:00
  • fa0e947dd0 chore: add editorconfig [skip ci] Aaron 2023-07-31 20:21:22 -04:00
  • 729e423b17 chore(bnb): filter warnings message on CPU (#170) Aaron Pham 2023-07-31 15:48:59 -04:00
  • 19d88d4cb8 infra: ignore rev that update styling [skip ci] Aaron 2023-07-31 09:07:58 -04:00
  • e01853a81c chore(infra): disable update-changelog for now [skip ci] Aaron 2023-07-31 09:05:50 -04:00
  • ec3c381e8c infra: add instruction for using docker images from release notes (#169) Aaron Pham 2023-07-31 08:39:10 -04:00
  • 8c2867d26d style: define experimental guidelines (#168) Aaron Pham 2023-07-31 07:54:26 -04:00
  • 2c2070f69f chore(deps): bump docker/setup-qemu-action from 2.1.0 to 2.2.0 [skip ci] (#165) dependabot[bot] 2023-07-31 07:52:16 -04:00
  • 94c949c22c chore(deps): bump aws-actions/configure-aws-credentials from 1 to 2 [skip ci] (#167) dependabot[bot] 2023-07-31 07:45:50 -04:00
  • 9592ca02fb chore(deps): bump docker/setup-buildx-action from 2.5.0 to 2.9.1 [skip ci] (#164) dependabot[bot] 2023-07-31 07:45:26 -04:00
  • 4d566fee09 chore(deps): bump peter-evans/create-pull-request from 4 to 5 [skip ci] (#166) dependabot[bot] 2023-07-31 07:45:05 -04:00
  • b5652e7d66 fix(ci): agree with signing Aaron 2023-07-31 06:40:14 -04:00
  • 431b326dd3 chore(deps): bump docker/login-action from 2.1.0 to 2.2.0 (#163) dependabot[bot] 2023-07-31 09:26:06 +00:00
  • ae17322b73 fix(ci): correct set digest for signing images Aaron 2023-07-31 04:27:15 -04:00
  • 4fbfb363bf infra: update changelog and added readme badges [generated] (#162) Aaron Pham 2023-07-31 04:02:02 -04:00
  • fec68d732b fix(ci): Correctly set signing for pushing container images (#161) Aaron Pham 2023-07-31 03:43:07 -04:00
  • ef94c6b98a feat(container): vLLM build and base image strategies (#142) Aaron Pham 2023-07-31 02:44:52 -04:00
  • 001ff6b5ac docs: update README.md typos (#155) RichardScottOZ 2023-07-29 19:10:40 +09:30
  • 0c79fabd1a chore(release): add darwin binary to release notes (#154) Aaron Pham 2023-07-28 15:00:42 -04:00
  • 4de0ca8a13 infra: bump to dev version of 0.2.12.dev0 [generated] [skip ci] Aaron Pham 2023-07-28 00:14:52 +00:00
  • 7d9dcb5d40 infra: prepare for release 0.2.11 [generated] [skip ci] v0.2.11 Aaron Pham 2023-07-28 00:04:32 +00:00
  • fc66ff275b fix: make sure to add torch to dependencies aarnphm-ec2-dev 2023-07-28 00:01:52 +00:00
  • 15640a85cd feat: supports embeddings for T5 and ChatGLM family generation (#153) Aaron Pham 2023-07-27 16:43:43 -04:00
  • e075bd25ea chore: add NousResearch's as non-gated Llama (#152) Aaron Pham 2023-07-27 15:30:56 -04:00
  • eacd8d9f46 fix(pre-commit): disable auto fixes (#151) Aaron Pham 2023-07-27 13:37:09 -04:00