OpenLLM

mirror of https://github.com/bentoml/OpenLLM.git synced 2026-05-04 22:02:45 -04:00

Author	SHA1	Message	Date
Aaron Pham	ef94c6b98a	feat(container): vLLM build and base image strategies (#142)	2023-07-31 02:44:52 -04:00
aarnphm-ec2-dev	6dc0bf0b12	fix: remove breakpoint on CLI Signed-off-by: aarnphm-ec2-dev <29749331+aarnphm@users.noreply.github.com>	2023-07-25 16:30:16 +00:00
aarnphm-ec2-dev	b23b59e1c9	fix(embeddings): correctly set JSON data via CLI client Signed-off-by: aarnphm-ec2-dev <29749331+aarnphm@users.noreply.github.com>	2023-07-25 16:26:01 +00:00
Aaron Pham	1940086bec	feat(client): embeddings (#146)	2023-07-25 05:44:21 -04:00
Aaron Pham	693631958a	feat(service): provisional API (#133)	2023-07-23 02:15:39 -04:00
Aaron Pham	c1ddb9ed7c	feat: GPTQ + vLLM and LlaMA (#113) Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>	2023-07-19 18:12:12 -04:00
Aaron Pham	c7f4dc7bb2	feat(test): snapshot testing (#107) Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>	2023-07-10 17:23:19 -04:00
Aaron Pham	9f6b254086	qa: improvements and agents log (#105)	2023-07-05 08:39:31 -04:00
Aaron Pham	01db504e7d	feat: MPT (#91) Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>	2023-06-28 23:12:15 -04:00
Aaron Pham	1435478f6c	fix(cli): ensure we parse tag for download (#58)	2023-06-23 21:24:53 -04:00
Aaron Pham	dfca956fad	feat: serve adapter layers (#52)	2023-06-23 10:07:15 -04:00
Aaron	1ed0ae7787	fix(log): make sure to configure OpenLLM logs correctly Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>	2023-06-19 06:19:06 -04:00
Aaron Pham	03758a5487	fix(tools): adhere to style guidelines (#31)	2023-06-18 20:03:17 -04:00
Aaron Pham	4fcd7c8ac9	integration: HuggingFace Agent (#29) Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>	2023-06-18 00:13:53 -04:00
Aaron Pham	6f724416c0	perf: build quantization and better transformer behaviour (#28) Fixes quantization_config and low_cpu_mem_usage to be available on PyTorch implementation only See changelog for more details on #28	2023-06-17 08:56:14 -04:00
Aaron Pham	19bc7e3116	feat: fine-tuning [part 1] (#23)	2023-06-16 00:19:01 -04:00
Aaron	528f76e1d0	fix(client): using httpx for running calls within async context This is so that client.query works within a async context Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>	2023-06-15 01:58:49 -04:00
Aaron	50d59cdf8d	types: rename interface Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>	2023-06-14 02:45:34 -04:00
Aaron	cb76a894cf	feat(metadata): add configuration to metadata endpoint Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>	2023-06-13 07:09:31 -04:00
Aaron	71070b90b4	chore(metadata): fix model_id to be respected on service.py Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>	2023-06-12 16:04:52 -04:00
aarnphm-ec2-dev	81d46ca211	feat(type): support annotations openllm.LLM now supports fully typed-strict openllm.LLM[ModelType, TokenizerType] -> self.model, self.tokenizer Signed-off-by: aarnphm-ec2-dev <29749331+aarnphm@users.noreply.github.com>	2023-06-11 14:58:17 +00:00
aarnphm-ec2-dev	4db141c649	feat(gpu): support passing GPU per LLM respect CUDA_VISIBLE_DEVICES and optionally --device Signed-off-by: aarnphm-ec2-dev <29749331+aarnphm@users.noreply.github.com>	2023-06-10 09:47:16 +00:00
Aaron	afddaed08c	fix(perf): respect per request information remove use_default_prompt_template options add pretrained to list of start help docstring fix flax generation config improve flax and tensorflow implementation Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>	2023-06-10 02:14:13 -04:00
aarnphm-ec2-dev	c960b3edff	feat(client): add postprocess for processing client output call Signed-off-by: aarnphm-ec2-dev <29749331+aarnphm@users.noreply.github.com> Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>	2023-06-07 00:24:20 -04:00
Aaron	f840222d12	feat(service): add timeout to metadata Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>	2023-06-05 00:46:02 -07:00
Aaron	8ef4c9cb19	fix(types): broken import and add hints for client Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>	2023-06-05 00:10:44 -07:00
Aaron	ec941c95d5	chore: add license header Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>	2023-06-04 16:22:37 -07:00
Aaron	5a09b11519	refactor: implement a new interface for processing parameters add documentation for fields Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>	2023-06-03 21:46:37 -07:00
Aaron	9441917749	feat(runner): dynamic generate runner class fix a client dump Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>	2023-06-03 18:48:22 -07:00
Aaron	0df8d8b9a6	perf: reduce unecessary object creation for config class Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>	2023-05-28 05:22:22 -07:00
Aaron	e0fc37e47f	fix(docs): update docs about saving custom fine-tuned and update annotations for client Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>	2023-05-27 21:15:44 -07:00
Aaron	52d65f999f	feat(telemetry): add support for usage tracking Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>	2023-05-27 20:39:13 -07:00
Aaron	cf4e55c36f	fix(client): implement per client framework and model_name getters Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>	2023-05-27 05:20:07 -07:00
aarnphm-ec2-dev	8ee5b048f3	feat(client): Async and Sync client Signed-off-by: aarnphm-ec2-dev <29749331+aarnphm@users.noreply.github.com> Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>	2023-05-26 22:51:21 -07:00
aarnphm-ec2-dev	4127961c5c	feat: openllm.client Signed-off-by: aarnphm-ec2-dev <29749331+aarnphm@users.noreply.github.com>	2023-05-26 07:17:28 +00:00
aarnphm-ec2-dev	20b3a0260f	refactor: move Prompt object to client specific attributes Signed-off-by: aarnphm-ec2-dev <29749331+aarnphm@users.noreply.github.com> Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>	2023-05-25 16:22:06 -07:00
Aaron	d31d450526	feat: Adding central service definition and init openllm_client Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>	2023-05-15 00:33:05 -07:00
Aaron	426a61713f	feat: start and start_grpc API with_options listen from environment variable for said models. Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>	2023-05-05 11:07:52 -07:00
Aaron	3e32b24194	feat: initial openllm_client implementation Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>	2023-05-05 02:43:35 -07:00
Chaoyu	dd8b6050b2	feat: FLAN-T5 supports - add infrastructure, to be implemented: cache, chat history - Base Runnable Implementation, that fits LangChain API - Added a Prompt descriptor and utils. feat: license headers and auto factory impl and CLI Auto construct args from pydantic config Add auto factory for ease of use only provide `/generate` to streamline UX experience CLI > envvar > input contract for configuration fix: serve from a thread fix CLI args chore: cleanup names and refactor imports Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>	2023-05-03 17:50:14 -07:00

40 Commits