Aaron Pham | 2541a0f8dc | infra: initial work on compiling mypyc wheels (#182) | 2023-08-04 10:20:03 -04:00
Aaron Pham | 8c2867d26d | style: define experimental guidelines (#168) | 2023-07-31 07:54:26 -04:00
Aaron Pham | ef94c6b98a | feat(container): vLLM build and base image strategies (#142) | 2023-07-31 02:44:52 -04:00
Aaron Pham | c1ddb9ed7c | feat: GPTQ + vLLM and LlaMA (#113); Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> | 2023-07-19 18:12:12 -04:00
Aaron Pham | c7f4dc7bb2 | feat(test): snapshot testing (#107); Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> | 2023-07-10 17:23:19 -04:00
Aaron Pham | 03758a5487 | fix(tools): adhere to style guidelines (#31) | 2023-06-18 20:03:17 -04:00
Aaron Pham | ded8a9f809 | feat: quantization (#27) | 2023-06-16 18:10:50 -04:00
aarnphm-ec2-dev | 81d46ca211 | feat(type): support annotations; openllm.LLM now supports fully typed-strict; openllm.LLM[ModelType, TokenizerType] -> self.model, self.tokenizer; Signed-off-by: aarnphm-ec2-dev <29749331+aarnphm@users.noreply.github.com> | 2023-06-11 14:58:17 +00:00
Aaron | aa50b5279e | fix(falcon): loading based on model registration; remove duplicate events; Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com> | 2023-06-06 22:42:28 -04:00
Aaron | 52d65f999f | feat(telemetry): add support for usage tracking; Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com> | 2023-05-27 20:39:13 -07:00