Aaron Pham
ef94c6b98a
feat(container): vLLM build and base image strategies (#142)
2023-07-31 02:44:52 -04:00
aarnphm-ec2-dev
6dc0bf0b12
fix: remove breakpoint on CLI
...
Signed-off-by: aarnphm-ec2-dev <29749331+aarnphm@users.noreply.github.com>
2023-07-25 16:30:16 +00:00
aarnphm-ec2-dev
b23b59e1c9
fix(embeddings): correctly set JSON data via CLI client
...
Signed-off-by: aarnphm-ec2-dev <29749331+aarnphm@users.noreply.github.com>
2023-07-25 16:26:01 +00:00
Aaron Pham
1940086bec
feat(client): embeddings (#146)
2023-07-25 05:44:21 -04:00
Aaron Pham
693631958a
feat(service): provisional API (#133)
2023-07-23 02:15:39 -04:00
Aaron Pham
c1ddb9ed7c
feat: GPTQ + vLLM and LLaMA (#113)
...
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
2023-07-19 18:12:12 -04:00
Aaron Pham
c7f4dc7bb2
feat(test): snapshot testing (#107)
...
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
2023-07-10 17:23:19 -04:00
Aaron Pham
9f6b254086
qa: improvements and agents log (#105)
2023-07-05 08:39:31 -04:00
Aaron Pham
01db504e7d
feat: MPT (#91)
...
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
2023-06-28 23:12:15 -04:00
Aaron Pham
1435478f6c
fix(cli): ensure we parse tag for download (#58)
2023-06-23 21:24:53 -04:00
Aaron Pham
dfca956fad
feat: serve adapter layers (#52)
2023-06-23 10:07:15 -04:00
Aaron
1ed0ae7787
fix(log): make sure to configure OpenLLM logs correctly
...
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>
2023-06-19 06:19:06 -04:00
Aaron Pham
03758a5487
fix(tools): adhere to style guidelines (#31)
2023-06-18 20:03:17 -04:00
Aaron Pham
4fcd7c8ac9
integration: HuggingFace Agent (#29)
...
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
2023-06-18 00:13:53 -04:00
Aaron Pham
6f724416c0
perf: build quantization and better transformer behaviour (#28)
...
Fixes quantization_config and low_cpu_mem_usage to be available on PyTorch implementation only
See changelog for more details on #28
2023-06-17 08:56:14 -04:00
Aaron Pham
19bc7e3116
feat: fine-tuning [part 1] (#23)
2023-06-16 00:19:01 -04:00
Aaron
528f76e1d0
fix(client): using httpx for running calls within async context
...
This is so that client.query works within an async context
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>
2023-06-15 01:58:49 -04:00
Aaron
50d59cdf8d
types: rename interface
...
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>
2023-06-14 02:45:34 -04:00
Aaron
cb76a894cf
feat(metadata): add configuration to metadata endpoint
...
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>
2023-06-13 07:09:31 -04:00
Aaron
71070b90b4
chore(metadata): fix model_id to be respected on service.py
...
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>
2023-06-12 16:04:52 -04:00
aarnphm-ec2-dev
81d46ca211
feat(type): support annotations
...
openllm.LLM now fully supports strict typing
openllm.LLM[ModelType, TokenizerType] -> self.model, self.tokenizer
Signed-off-by: aarnphm-ec2-dev <29749331+aarnphm@users.noreply.github.com>
2023-06-11 14:58:17 +00:00
aarnphm-ec2-dev
4db141c649
feat(gpu): support passing GPU per LLM
...
respect CUDA_VISIBLE_DEVICES and optionally --device
Signed-off-by: aarnphm-ec2-dev <29749331+aarnphm@users.noreply.github.com>
2023-06-10 09:47:16 +00:00
Aaron
afddaed08c
fix(perf): respect per request information
...
remove use_default_prompt_template options
add pretrained to list of start help docstring
fix flax generation config
improve flax and tensorflow implementation
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>
2023-06-10 02:14:13 -04:00
aarnphm-ec2-dev
c960b3edff
feat(client): add postprocess for processing client output call
...
Signed-off-by: aarnphm-ec2-dev <29749331+aarnphm@users.noreply.github.com>
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>
2023-06-07 00:24:20 -04:00
Aaron
f840222d12
feat(service): add timeout to metadata
...
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>
2023-06-05 00:46:02 -07:00
Aaron
8ef4c9cb19
fix(types): broken import and add hints for client
...
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>
2023-06-05 00:10:44 -07:00
Aaron
ec941c95d5
chore: add license header
...
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>
2023-06-04 16:22:37 -07:00
Aaron
5a09b11519
refactor: implement a new interface for processing parameters
...
add documentation for fields
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>
2023-06-03 21:46:37 -07:00
Aaron
9441917749
feat(runner): dynamically generate runner class
...
fix a client dump
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>
2023-06-03 18:48:22 -07:00
Aaron
0df8d8b9a6
perf: reduce unnecessary object creation for config class
...
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>
2023-05-28 05:22:22 -07:00
Aaron
e0fc37e47f
fix(docs): update docs about saving custom fine-tuned
...
and update annotations for client
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>
2023-05-27 21:15:44 -07:00
Aaron
52d65f999f
feat(telemetry): add support for usage tracking
...
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>
2023-05-27 20:39:13 -07:00
Aaron
cf4e55c36f
fix(client): implement per client framework and model_name getters
...
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>
2023-05-27 05:20:07 -07:00
aarnphm-ec2-dev
8ee5b048f3
feat(client): Async and Sync client
...
Signed-off-by: aarnphm-ec2-dev <29749331+aarnphm@users.noreply.github.com>
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>
2023-05-26 22:51:21 -07:00
aarnphm-ec2-dev
4127961c5c
feat: openllm.client
...
Signed-off-by: aarnphm-ec2-dev <29749331+aarnphm@users.noreply.github.com>
2023-05-26 07:17:28 +00:00
aarnphm-ec2-dev
20b3a0260f
refactor: move Prompt object to client specific attributes
...
Signed-off-by: aarnphm-ec2-dev <29749331+aarnphm@users.noreply.github.com>
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>
2023-05-25 16:22:06 -07:00
Aaron
d31d450526
feat: Adding central service definition and init openllm_client
...
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>
2023-05-15 00:33:05 -07:00
Aaron
426a61713f
feat: start and start_grpc API
...
with_options reads from the environment variables for said models.
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>
2023-05-05 11:07:52 -07:00
Aaron
3e32b24194
feat: initial openllm_client implementation
...
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>
2023-05-05 02:43:35 -07:00
Chaoyu
dd8b6050b2
feat: FLAN-T5 support
...
- add infrastructure, to be implemented: cache, chat history
- Base Runnable Implementation, that fits LangChain API
- Added a Prompt descriptor and utils.
feat: license headers and auto factory impl and CLI
Auto construct args from pydantic config
Add auto factory for ease of use
only provide `/generate` to streamline the UX
CLI > envvar > input contract for configuration
fix: serve from a thread
fix CLI args
chore: cleanup names and refactor imports
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>
2023-05-03 17:50:14 -07:00