Aaron Pham
|
fddd0bf95e
|
feat: bootstrap documentation site (#252)
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>
Signed-off-by: Aaron Pham <29749331+aarnphm@users.noreply.github.com>
Signed-off-by: GutZuFusss <leon.ikinger@googlemail.com>
Co-authored-by: GutZuFusss <leon.ikinger@googlemail.com>
|
2023-09-12 12:28:29 -04:00 |
|
aarnphm-ec2-dev
|
8530a067ea
|
chore(serialisation): dump quantization_config.json to conform with optimum load
Signed-off-by: aarnphm-ec2-dev <29749331+aarnphm@users.noreply.github.com>
|
2023-09-07 16:50:50 +00:00 |
|
Aaron
|
0d50aa00b9
|
chore: add openllm-core as meta dependencies
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>
|
2023-09-07 10:31:40 -04:00 |
|
Aaron Pham
|
7e2c8428bb
|
infra: prepare for release 0.3.3 [generated] [skip ci]
Signed-off-by: Aaron Pham <29749331+aarnphm@users.noreply.github.com>
|
2023-09-07 01:51:18 +00:00 |
|
aarnphm-ec2-dev
|
8173cb09a5
|
fix(quantize): dyn quant for int8 and int4
only set tokenizer when it is gptq
Signed-off-by: aarnphm-ec2-dev <29749331+aarnphm@users.noreply.github.com>
|
2023-09-07 01:48:45 +00:00 |
|
Aaron Pham
|
fd18c8be01
|
infra: prepare for release 0.3.2 [generated] [skip ci]
Signed-off-by: Aaron Pham <29749331+aarnphm@users.noreply.github.com>
|
2023-09-06 19:03:40 +00:00 |
|
Aaron
|
675b372981
|
fix: synchronize device for inference
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>
|
2023-09-06 14:08:52 -04:00 |
|
Aaron Pham
|
b61005424b
|
infra: prepare for release 0.3.1 [generated] [skip ci]
Signed-off-by: Aaron Pham <29749331+aarnphm@users.noreply.github.com>
|
2023-09-06 17:49:58 +00:00 |
|
Aaron
|
887ffa9aa0
|
chore: cleanup pre-commit jobs and update usage
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>
|
2023-09-05 10:06:36 -04:00 |
|
aarnphm-ec2-dev
|
f43c721579
|
chore: only add bentomodel branch during generated service with OpenLLM
Signed-off-by: aarnphm-ec2-dev <29749331+aarnphm@users.noreply.github.com>
|
2023-09-05 01:08:23 +00:00 |
|
Aaron Pham
|
06a68ade7d
|
infra: prepare for release 0.3.0 [generated] [skip ci]
Signed-off-by: Aaron Pham <29749331+aarnphm@users.noreply.github.com>
|
2023-09-04 19:03:41 +00:00 |
|
Aaron
|
5eea40a599
|
chore(readme): update README for release [generated] [skip ci]
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>
|
2023-09-04 15:02:01 -04:00 |
|
Aaron Pham
|
956b3a53bc
|
fix(gptq): use upstream integration (#297)
* wip
Signed-off-by: aarnphm-ec2-dev <29749331+aarnphm@users.noreply.github.com>
* feat: GPTQ transformers integration
Signed-off-by: aarnphm-ec2-dev <29749331+aarnphm@users.noreply.github.com>
* fix: only load if variable is available and add changelog
Signed-off-by: aarnphm-ec2-dev <29749331+aarnphm@users.noreply.github.com>
* chore: remove boilerplate check
Signed-off-by: aarnphm-ec2-dev <29749331+aarnphm@users.noreply.github.com>
---------
Signed-off-by: aarnphm-ec2-dev <29749331+aarnphm@users.noreply.github.com>
|
2023-09-04 14:05:50 -04:00 |
|
aarnphm-ec2-dev
|
7d893e6cd2
|
chore: ignore new lines split [skip ci]
Signed-off-by: aarnphm-ec2-dev <29749331+aarnphm@users.noreply.github.com>
|
2023-09-01 17:00:49 +00:00 |
|
Aaron Pham
|
608de0b667
|
fix(serving): vllm distributed size (#285)
* chore(weights): ignore gguf pattern for non GGML backend
Signed-off-by: aarnphm-ec2-dev <29749331+aarnphm@users.noreply.github.com>
* chore: correct fix num_gpus to be divisible by 2
This depends on the attention_heads from given models
Signed-off-by: aarnphm-ec2-dev <29749331+aarnphm@users.noreply.github.com>
---------
Signed-off-by: aarnphm-ec2-dev <29749331+aarnphm@users.noreply.github.com>
|
2023-09-01 12:37:10 -04:00 |
|
Aaron Pham
|
b7af7765d4
|
fix(yapf): align weird new lines break [generated] [skip ci] (#284)
fix(yapf): align weird new lines break
Signed-off-by: aarnphm-ec2-dev <29749331+aarnphm@users.noreply.github.com>
|
2023-09-01 05:34:22 -04:00 |
|
Aaron Pham
|
3e45530abd
|
refactor(breaking): unify LLM API (#283)
Signed-off-by: aarnphm-ec2-dev <29749331+aarnphm@users.noreply.github.com>
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>
|
2023-09-01 05:15:19 -04:00 |
|
Aaron
|
b545ad2ad1
|
style: google
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>
|
2023-08-30 13:52:35 -04:00 |
|
Aaron Pham
|
c9cef1d773
|
fix: persistent styling between ruff and yapf (#279)
|
2023-08-30 11:37:41 -04:00 |
|
Aaron Pham
|
2036d4e015
|
chore(build): use latest vllm pre-built kernel (#261)
|
2023-08-26 09:02:52 -04:00 |
|
aarnphm-ec2-dev
|
806a663e4a
|
chore(style): add one blank line
to conform with Google style
Signed-off-by: aarnphm-ec2-dev <29749331+aarnphm@users.noreply.github.com>
|
2023-08-26 11:36:57 +00:00 |
|
Aaron Pham
|
938fd362bb
|
feat(vllm): streaming (#260)
|
2023-08-26 07:27:32 -04:00 |
|
Aaron Pham
|
46c8904806
|
cron(style): run formatter [generated] [skip ci] (#257)
|
2023-08-25 06:38:59 -04:00 |
|
Aaron Pham
|
816bfdcc19
|
infra: prepare for release 0.2.27 [generated] [skip ci]
Signed-off-by: Aaron Pham <29749331+aarnphm@users.noreply.github.com>
|
2023-08-25 09:28:49 +00:00 |
|
aarnphm-ec2-dev
|
dae38cdba1
|
chore: update external dependencies [skip ci]
Signed-off-by: aarnphm-ec2-dev <29749331+aarnphm@users.noreply.github.com>
|
2023-08-25 09:27:26 +00:00 |
|
Aaron Pham
|
08dc6ed2ba
|
chore: ignore peft and fix adapter loading issue (#255)
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
|
2023-08-25 04:36:35 -04:00 |
|
Aaron
|
787ce1b3b6
|
chore(style): synchronized style across packages [skip ci]
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>
|
2023-08-23 08:46:22 -04:00 |
|
Aaron Pham
|
bbd9aa7646
|
refactor(contrib): similar namespace [clojure-ui build] (#251)
|
2023-08-23 00:21:59 -04:00 |
|
aarnphm-ec2-dev
|
eddbc06374
|
chore(style): reduce line length and truncate compression
Signed-off-by: aarnphm-ec2-dev <29749331+aarnphm@users.noreply.github.com>
|
2023-08-22 17:02:00 +00:00 |
|
aarnphm-ec2-dev
|
1488fbb167
|
chore(style): enable yapf to match with style guidelines
Signed-off-by: aarnphm-ec2-dev <29749331+aarnphm@users.noreply.github.com>
|
2023-08-22 14:03:06 +00:00 |
|
Aaron Pham
|
3ffb25a872
|
refactor: packages (#249)
|
2023-08-22 08:55:46 -04:00 |
|
aarnphm-ec2-dev
|
9e371d2ead
|
fix(generate): Correctly set batch output for generate from iterator
Signed-off-by: aarnphm-ec2-dev <29749331+aarnphm@users.noreply.github.com>
|
2023-08-21 12:02:35 +00:00 |
|
Aaron Pham
|
9e205b4963
|
feat: token streaming and SSE support (#240)
|
2023-08-20 07:32:49 -04:00 |
|
Aaron Pham
|
ec3852b5b8
|
infra: prepare for release 0.2.26 [generated] [skip ci]
Signed-off-by: Aaron Pham <29749331+aarnphm@users.noreply.github.com>
|
2023-08-17 19:53:56 +00:00 |
|
Aaron Pham
|
4140d160b8
|
feat(embedding): Adding generic endpoint (#227)
|
2023-08-17 15:17:00 -04:00 |
|
aarnphm-ec2-dev
|
d5c4066ff4
|
fix(generation): unnecessary casting [ec2 build] [wheel build]
This breaks when compiled to cfunc
Signed-off-by: aarnphm-ec2-dev <29749331+aarnphm@users.noreply.github.com>
|
2023-08-17 16:34:59 +00:00 |
|
Aaron Pham
|
3ca1fde9ff
|
fix(binary): correct folders when building standalone installer (#228)
|
2023-08-17 09:56:08 -04:00 |
|
Aaron Pham
|
017e57653b
|
infra: prepare for release 0.2.25 [generated] [skip ci]
Signed-off-by: Aaron Pham <29749331+aarnphm@users.noreply.github.com>
|
2023-08-16 11:42:47 +00:00 |
|
aarnphm-ec2-dev
|
3363ee158b
|
fix(container): set correct PyTorch version not to override cuda wheels
Signed-off-by: aarnphm-ec2-dev <29749331+aarnphm@users.noreply.github.com>
|
2023-08-16 10:46:49 +00:00 |
|
Aaron Pham
|
8796d0d63d
|
feat(models): add vLLM support for Falcon (#223)
|
2023-08-16 05:57:42 -04:00 |
|
Aaron Pham
|
3a73aacb01
|
chore(ci): add dependabot and fix vllm release container (#217)
|
2023-08-16 05:43:41 -04:00 |
|
Aaron Pham
|
ccca49af04
|
fix(ci): remove broken build hooks (#216)
|
2023-08-16 04:49:12 -04:00 |
|
Aaron
|
af8cb73832
|
fix: latest vllm build
sync changelog with monorepo for sdist installation
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>
|
2023-08-16 04:03:34 -04:00 |
|
GutZuFusss
|
4cad367ab5
|
feat(contrib): ClojureScript UI (#89)
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: aarnphm-ec2-dev <29749331+aarnphm@users.noreply.github.com>
|
2023-08-16 03:30:44 -04:00 |
|
Aaron
|
6b0ab17018
|
chore: remove unnecessary headers
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>
|
2023-08-15 18:15:54 -04:00 |
|
Aaron
|
78ae2b3843
|
fix(metadata): hooks for metadata pypi [skip ci]
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>
|
2023-08-15 06:15:51 -04:00 |
|
Aaron
|
21ce11aaa8
|
chore: add symlink for changelog [skip ci]
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>
|
2023-08-15 05:27:14 -04:00 |
|
Aaron
|
43740aca8b
|
fix(metadata): include hatch-fancy-pypi-readme into subdir [skip ci]
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>
|
2023-08-15 05:06:48 -04:00 |
|
Aaron
|
accc8d0d15
|
fix: editable install
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>
|
2023-08-15 03:57:53 -04:00 |
|
Aaron Pham
|
cd872ef631
|
refactor: monorepo (#203)
|
2023-08-15 02:11:14 -04:00 |
|