aarnphm-ec2-dev
a056365d48
fix(ci): always run create coverage
This is to stop evergreen from failing on main
Signed-off-by: aarnphm-ec2-dev <29749331+aarnphm@users.noreply.github.com>
2023-07-20 13:01:32 +00:00
aarnphm-ec2-dev
e319a2977f
fix(ci): editable install
Signed-off-by: aarnphm-ec2-dev <29749331+aarnphm@users.noreply.github.com>
2023-07-19 22:29:29 +00:00
Aaron Pham
c1ddb9ed7c
feat: GPTQ + vLLM and LLaMA (#113)
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
2023-07-19 18:12:12 -04:00
dependabot[bot]
9833d2f46f
fix(ci): correct setup tests and auto-bot (#118)
Bump pypa/gh-action-pypi-publish from 1.8.7 to 1.8.8
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: Aaron <29749331+aarnphm@users.noreply.github.com>
2023-07-17 14:37:46 -04:00
aarnphm-ec2-dev
d37d14e52b
fix(tests): mark package on CI to xfail
XXX: @aarnphm to solve build isolation when there is bandwidth. Currently
this is not a problem when running locally.
`openllm build` just works, whereas `openllm.build` won't work
sequentially.
Address some type stubs for jupytext
Signed-off-by: aarnphm-ec2-dev <29749331+aarnphm@users.noreply.github.com>
2023-07-15 12:48:28 +00:00
Aaron Pham
b2dba6143f
fix(resource): correctly parse CUDA_VISIBLE_DEVICES (#114)
2023-07-15 07:19:35 -04:00
Aaron Pham
c7f4dc7bb2
feat(test): snapshot testing (#107)
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
2023-07-10 17:23:19 -04:00
Aaron
ec4293091d
ci: wait for auto-bot to run check
only run evergreen on PR
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>
2023-07-05 10:53:45 -04:00
Aaron Pham
59b1d89971
feat: custom dockerfile templates (#95)
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
2023-06-30 13:44:11 -04:00
Aaron
0c39435ed9
chore: remove cron for tests
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>
2023-06-25 22:38:53 -04:00
Aaron Pham
74fdd5e259
feat: release binary distribution (#66)
2023-06-25 10:38:03 -04:00
Aaron Pham
f0773f2d01
chore: add more test matrices (#70)
2023-06-25 03:44:03 -04:00
Aaron Pham
acb6a3cb32
fix: converting envvar to string (#68)
2023-06-25 03:40:45 -04:00
Aaron Pham
3593c764f0
fix(test): robustness (#64)
2023-06-24 11:10:07 -04:00
Aaron Pham
03758a5487
fix(tools): adhere to style guidelines (#31)
2023-06-18 20:03:17 -04:00
Aaron Pham
6f724416c0
perf: build quantization and better transformer behaviour (#28)
Make quantization_config and low_cpu_mem_usage available on the PyTorch implementation only
See changelog for more details on #28
2023-06-17 08:56:14 -04:00
Aaron Pham
ded8a9f809
feat: quantization (#27)
2023-06-16 18:10:50 -04:00
Aaron Pham
f8ebb36e15
tests: fastpath (#17)
Added fastpath cases for configuration and Flan-T5
Fix lifecycle hooks so they respect model_id
Update CLI to clean up models info
2023-06-12 14:18:26 -04:00
Aaron
ec941c95d5
chore: add license header
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>
2023-06-04 16:22:37 -07:00
Aaron
ac4ada243b
fix(ci): make sure to set correct hatch env
Add a quick note about running release from gh
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>
2023-06-03 23:24:31 -07:00
Aaron
4990cf40c2
chore(ci): update automatic release note
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>
2023-06-03 22:56:00 -07:00
dependabot[bot]
a440bea184
build(deps): Bump bufbuild/buf-setup-action from 1.19.0 to 1.20.0 (#8)
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2023-05-31 22:18:46 -07:00
dependabot[bot]
9cdc3545aa
build(deps): Bump bufbuild/buf-setup-action from 1.17.0 to 1.19.0 (#2)
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2023-05-25 16:23:04 -07:00
Chaoyu
dd8b6050b2
feat: FLAN-T5 support
- add infrastructure, to be implemented: cache, chat history
- Base Runnable Implementation that fits the LangChain API
- Added a Prompt descriptor and utils.
feat: license headers and auto factory impl and CLI
Auto-construct args from pydantic config
Add auto factory for ease of use
only provide `/generate` to streamline the UX
CLI > envvar > input contract for configuration
fix: serve from a thread
fix CLI args
chore: cleanup names and refactor imports
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>
2023-05-03 17:50:14 -07:00