Aaron Pham | 8c2867d26d | style: define experimental guidelines (#168) | 2023-07-31 07:54:26 -04:00
Aaron Pham | ef94c6b98a | feat(container): vLLM build and base image strategies (#142) | 2023-07-31 02:44:52 -04:00
Aaron Pham | 60c725a21f | ci: release PyPI before building binary (#138) | 2023-07-24 16:39:51 -04:00
Aaron Pham | 7eabcd4355 | feat: vLLM integration for PagedAttention (#134) | 2023-07-24 15:42:17 -04:00
Aaron Pham | c1ddb9ed7c | feat: GPTQ + vLLM and LlaMA (#113) | 2023-07-19 18:12:12 -04:00
    Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Aaron Pham | c7f4dc7bb2 | feat(test): snapshot testing (#107) | 2023-07-10 17:23:19 -04:00
    Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Aaron Pham | d6303d306a | perf: fixing import custom paths and cleanup serialisation (#102) | 2023-07-04 12:49:14 -04:00
Aaron Pham | 3593c764f0 | fix(test): robustness (#64) | 2023-06-24 11:10:07 -04:00
Aaron Pham | 98328be394 | peft(models): improve implementation (#60) | 2023-06-24 05:22:18 -04:00
    If you have a local Dolly-V2 version, please do `openllm prune`
Aaron Pham | 1435478f6c | fix(cli): ensure we parse tag for download (#58) | 2023-06-23 21:24:53 -04:00
Aaron Pham | 03758a5487 | fix(tools): adhere to style guidelines (#31) | 2023-06-18 20:03:17 -04:00
Aaron Pham | ded8a9f809 | feat: quantization (#27) | 2023-06-16 18:10:50 -04:00
Aaron Pham | f8ebb36e15 | tests: fastpath (#17) | 2023-06-12 14:18:26 -04:00
    added fastpath cases for configuration and Flan-T5
    fixes respecting model_id into lifecycle hooks.
    update CLI to cleanup models info