40 Commits

Author SHA1 Message Date
Aaron Pham
ef94c6b98a feat(container): vLLM build and base image strategies (#142) 2023-07-31 02:44:52 -04:00
aarnphm-ec2-dev
6dc0bf0b12 fix: remove breakpoint on CLI
Signed-off-by: aarnphm-ec2-dev <29749331+aarnphm@users.noreply.github.com>
2023-07-25 16:30:16 +00:00
aarnphm-ec2-dev
b23b59e1c9 fix(embeddings): correctly set JSON data via CLI client
Signed-off-by: aarnphm-ec2-dev <29749331+aarnphm@users.noreply.github.com>
2023-07-25 16:26:01 +00:00
Aaron Pham
1940086bec feat(client): embeddings (#146) 2023-07-25 05:44:21 -04:00
Aaron Pham
693631958a feat(service): provisional API (#133) 2023-07-23 02:15:39 -04:00
Aaron Pham
c1ddb9ed7c feat: GPTQ + vLLM and LLaMA (#113)
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
2023-07-19 18:12:12 -04:00
Aaron Pham
c7f4dc7bb2 feat(test): snapshot testing (#107)
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
2023-07-10 17:23:19 -04:00
Aaron Pham
9f6b254086 qa: improvements and agents log (#105) 2023-07-05 08:39:31 -04:00
Aaron Pham
01db504e7d feat: MPT (#91)
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
2023-06-28 23:12:15 -04:00
Aaron Pham
1435478f6c fix(cli): ensure we parse tag for download (#58) 2023-06-23 21:24:53 -04:00
Aaron Pham
dfca956fad feat: serve adapter layers (#52) 2023-06-23 10:07:15 -04:00
Aaron
1ed0ae7787 fix(log): make sure to configure OpenLLM logs correctly
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>
2023-06-19 06:19:06 -04:00
Aaron Pham
03758a5487 fix(tools): adhere to style guidelines (#31) 2023-06-18 20:03:17 -04:00
Aaron Pham
4fcd7c8ac9 integration: HuggingFace Agent (#29)
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
2023-06-18 00:13:53 -04:00
Aaron Pham
6f724416c0 perf: build quantization and better transformer behaviour (#28)
Fixes quantization_config and low_cpu_mem_usage to be available on PyTorch implementation only

See changelog for more details on #28
2023-06-17 08:56:14 -04:00
Aaron Pham
19bc7e3116 feat: fine-tuning [part 1] (#23) 2023-06-16 00:19:01 -04:00
Aaron
528f76e1d0 fix(client): using httpx for running calls within async context
This is so that client.query works within an async context

Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>
2023-06-15 01:58:49 -04:00
Aaron
50d59cdf8d types: rename interface
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>
2023-06-14 02:45:34 -04:00
Aaron
cb76a894cf feat(metadata): add configuration to metadata endpoint
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>
2023-06-13 07:09:31 -04:00
Aaron
71070b90b4 chore(metadata): fix model_id to be respected on service.py
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>
2023-06-12 16:04:52 -04:00
aarnphm-ec2-dev
81d46ca211 feat(type): support annotations
openllm.LLM now fully supports strict typing

openllm.LLM[ModelType, TokenizerType] -> self.model, self.tokenizer

Signed-off-by: aarnphm-ec2-dev <29749331+aarnphm@users.noreply.github.com>
2023-06-11 14:58:17 +00:00
aarnphm-ec2-dev
4db141c649 feat(gpu): support passing GPU per LLM
respect CUDA_VISIBLE_DEVICES and optionally --device

Signed-off-by: aarnphm-ec2-dev <29749331+aarnphm@users.noreply.github.com>
2023-06-10 09:47:16 +00:00
Aaron
afddaed08c fix(perf): respect per request information
remove use_default_prompt_template options

add pretrained to list of start help docstring

fix flax generation config

improve flax and tensorflow implementation

Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>
2023-06-10 02:14:13 -04:00
aarnphm-ec2-dev
c960b3edff feat(client): add postprocess for processing client output call
Signed-off-by: aarnphm-ec2-dev <29749331+aarnphm@users.noreply.github.com>
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>
2023-06-07 00:24:20 -04:00
Aaron
f840222d12 feat(service): add timeout to metadata
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>
2023-06-05 00:46:02 -07:00
Aaron
8ef4c9cb19 fix(types): broken import and add hints for client
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>
2023-06-05 00:10:44 -07:00
Aaron
ec941c95d5 chore: add license header
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>
2023-06-04 16:22:37 -07:00
Aaron
5a09b11519 refactor: implement a new interface for processing parameters
add documentation for fields

Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>
2023-06-03 21:46:37 -07:00
Aaron
9441917749 feat(runner): dynamic generate runner class
fix a client dump

Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>
2023-06-03 18:48:22 -07:00
Aaron
0df8d8b9a6 perf: reduce unnecessary object creation for config class
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>
2023-05-28 05:22:22 -07:00
Aaron
e0fc37e47f fix(docs): update docs about saving custom fine-tuned
and update annotations for client

Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>
2023-05-27 21:15:44 -07:00
Aaron
52d65f999f feat(telemetry): add support for usage tracking
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>
2023-05-27 20:39:13 -07:00
Aaron
cf4e55c36f fix(client): implement per client framework and model_name getters
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>
2023-05-27 05:20:07 -07:00
aarnphm-ec2-dev
8ee5b048f3 feat(client): Async and Sync client
Signed-off-by: aarnphm-ec2-dev <29749331+aarnphm@users.noreply.github.com>
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>
2023-05-26 22:51:21 -07:00
aarnphm-ec2-dev
4127961c5c feat: openllm.client
Signed-off-by: aarnphm-ec2-dev <29749331+aarnphm@users.noreply.github.com>
2023-05-26 07:17:28 +00:00
aarnphm-ec2-dev
20b3a0260f refactor: move Prompt object to client specific attributes
Signed-off-by: aarnphm-ec2-dev <29749331+aarnphm@users.noreply.github.com>
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>
2023-05-25 16:22:06 -07:00
Aaron
d31d450526 feat: Adding central service definition and init openllm_client
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>
2023-05-15 00:33:05 -07:00
Aaron
426a61713f feat: start and start_grpc API
with_options listen from environment variable for said models.

Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>
2023-05-05 11:07:52 -07:00
Aaron
3e32b24194 feat: initial openllm_client implementation
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>
2023-05-05 02:43:35 -07:00
Chaoyu
dd8b6050b2 feat: FLAN-T5 support
- add infrastructure, to be implemented: cache, chat history

- Base Runnable Implementation, that fits LangChain API

- Added a Prompt descriptor and utils.

feat: license headers and auto factory impl and CLI

Auto construct args from pydantic config

Add auto factory for ease of use

only provide `/generate` to streamline the UX

CLI > envvar > input contract for configuration

fix: serve from a thread

fix CLI args

chore: cleanup names and refactor imports

Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>
2023-05-03 17:50:14 -07:00