aarnphm-ec2-dev
4127961c5c
feat: openllm.client
Signed-off-by: aarnphm-ec2-dev <29749331+aarnphm@users.noreply.github.com>
2023-05-26 07:17:28 +00:00
aarnphm-ec2-dev
ac933d60f1
fix(cli): make sure to skip models that only run on GPU
Signed-off-by: aarnphm-ec2-dev <29749331+aarnphm@users.noreply.github.com>
2023-05-26 00:43:14 +00:00
aarnphm-ec2-dev
fed17fafdc
migrate(configuration): remove deprecated max_length in favor of max_new_tokens
Preparation for transformers 5
Signed-off-by: aarnphm-ec2-dev <29749331+aarnphm@users.noreply.github.com>
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>
2023-05-25 16:22:07 -07:00
aarnphm-ec2-dev
b502703f67
fix(chatglm): make sure to check for the required dependency cpm_kernels
Signed-off-by: aarnphm-ec2-dev <29749331+aarnphm@users.noreply.github.com>
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>
2023-05-25 16:22:07 -07:00
aarnphm-ec2-dev
5c416fa218
feat: StarCoder
Signed-off-by: aarnphm-ec2-dev <29749331+aarnphm@users.noreply.github.com>
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>
2023-05-25 16:22:07 -07:00
aarnphm-ec2-dev
3fe6b14dbf
fix(cli): make sure __main__ is not convoluted
The CLI should live under openllm.cli, and the actual click.Group can be
created lazily from create_cli
Signed-off-by: aarnphm-ec2-dev <29749331+aarnphm@users.noreply.github.com>
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>
2023-05-25 16:22:06 -07:00
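The lazy-CLI pattern this commit describes can be sketched as follows. This is a minimal illustration of the idea, not OpenLLM's actual API: `create_cli` matches the name in the commit message, but the group and subcommand bodies here are assumptions.

```python
# Hypothetical sketch: __main__ stays trivial, and the click.Group is only
# built when create_cli() is called, not at import time.
import click


def create_cli() -> click.Group:
    """Build the click.Group on demand instead of at import time."""

    @click.group()
    def cli() -> None:
        """Root command group (illustrative)."""

    @cli.command()
    @click.argument("model")
    def start(model: str) -> None:
        """Start a server for MODEL (illustrative subcommand)."""
        click.echo(f"starting {model}")

    return cli


if __name__ == "__main__":
    create_cli()()
```

Keeping construction inside `create_cli` means importing the package never pays the cost of building the command tree, and `__main__` reduces to a single call.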
aarnphm-ec2-dev
765c1a6e5c
feat: requires_gpu for specific LLM.
This will determine the behaviour of SUPPORTED_RESOURCES
TODO: Support TPU
Support requirements for specific LLMs
Signed-off-by: aarnphm-ec2-dev <29749331+aarnphm@users.noreply.github.com>
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>
2023-05-25 16:22:06 -07:00
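One way a per-LLM `requires_gpu` flag could drive `SUPPORTED_RESOURCES`, as this commit describes: a class-level flag narrows which resources a model may be scheduled on. The class and attribute names below are illustrative assumptions, not OpenLLM's real implementation.

```python
# Hypothetical sketch: requires_gpu narrows the supported resources.
class LLMConfig:
    requires_gpu: bool = False  # default: CPU is acceptable

    @property
    def supported_resources(self) -> tuple[str, ...]:
        # TODO: Support TPU (mirroring the commit's note).
        if self.requires_gpu:
            return ("nvidia.com/gpu",)  # GPU-only models
        return ("cpu", "nvidia.com/gpu")


class ChatGLMConfig(LLMConfig):
    requires_gpu = True  # e.g. a model that only runs on GPU
```

A scheduler (or the `start` CLI from the later fix that skips GPU-only models) can then consult `supported_resources` before placing the model.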
aarnphm-ec2-dev
83a8a7cb4f
docs(codegen): make sure the generated docstring is correct
Signed-off-by: aarnphm-ec2-dev <29749331+aarnphm@users.noreply.github.com>
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>
2023-05-25 16:22:06 -07:00
aarnphm-ec2-dev
20b3a0260f
refactor: move Prompt object to client specific attributes
Signed-off-by: aarnphm-ec2-dev <29749331+aarnphm@users.noreply.github.com>
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>
2023-05-25 16:22:06 -07:00
aarnphm-ec2-dev
73d152fc77
feat(gpu): Make sure that we run models on GPU if available
Signed-off-by: aarnphm-ec2-dev <29749331+aarnphm@users.noreply.github.com>
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>
2023-05-24 19:31:22 -07:00
Aaron
135bafacaf
fix(chatglm): support macOS deployment
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>
2023-05-24 05:06:06 -07:00
Aaron
2676085b59
feat: chatglm and configuration naming type
By default, the name is dasherized, but for cases like chatglm it can be
lowercase as well
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>
2023-05-24 04:20:24 -07:00
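The naming behaviour this commit describes (dasherize by default, plain lowercase for names like chatglm) might look roughly like this. The helper names and signatures are assumptions for illustration, not OpenLLM's real code.

```python
import re


def dasherize(name: str) -> str:
    """CamelCase -> dash-case, e.g. 'FlanT5' -> 'flan-t5'."""
    return re.sub(r"(?<=[a-z0-9])(?=[A-Z])", "-", name).lower()


def model_name(class_name: str, name_type: str = "dasherize") -> str:
    """Resolve a model's display name from its config class name.

    'lowercase' keeps e.g. 'ChatGLM' as 'chatglm' instead of 'chat-glm'.
    """
    if name_type == "lowercase":
        return class_name.lower()
    return dasherize(class_name)
```

The lowercase escape hatch matters because naive dasherization would split ChatGLM into chat-glm, which does not match how the upstream model is named.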
Aaron
162c021cae
feat(timeout): support server_timeout and LLM timeout
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>
2023-05-23 16:48:01 -07:00
Aaron
b1c07946c1
feat: dolly-v2 and general cleanup
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>
2023-05-23 14:27:27 -07:00
Aaron
a63cec8fa3
improve(flan-t5): update default generation config
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>
2023-05-16 12:20:38 -07:00
Aaron
602294b782
fix(start): silence error logs for now
Respect BENTOML_HOME and BENTOML_DO_NOT_TRACK
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>
2023-05-16 12:19:23 -07:00
Aaron
549b0c54e9
feat: codegen and bundle build
Fix configuration generation for runnable
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>
2023-05-15 18:22:25 -07:00
Aaron
d31d450526
feat: add central service definition and init openllm_client
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>
2023-05-15 00:33:05 -07:00
Aaron
2a53faee9c
infra: add structure and cleanup separation of tokenizer
Since tokenizers are relatively light, all default LLMs will bundle the
tokenizer with themselves.
Maybe we can move the tokenizer into its own runner in the future
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>
2023-05-05 11:57:39 -07:00
Aaron
426a61713f
feat: start and start_grpc API
with_options listens to environment variables for said models.
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>
2023-05-05 11:07:52 -07:00
Aaron
3e32b24194
feat: initial openllm_client implementation
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>
2023-05-05 02:43:35 -07:00
Chaoyu
dd8b6050b2
feat: FLAN-T5 support
- Add infrastructure; to be implemented: cache, chat history
- Base Runnable implementation that fits the LangChain API
- Added a Prompt descriptor and utils.
feat: license headers and auto factory impl and CLI
Auto construct args from pydantic config
Add auto factory for ease of use
Only provide `/generate` to streamline the UX
CLI > envvar > input contract for configuration
fix: serve from a thread
Fix CLI args
chore: cleanup names and refactor imports
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>
2023-05-03 17:50:14 -07:00
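The "CLI > envvar > input contract" precedence mentioned in the last commit could be resolved like this. A minimal sketch under assumptions: the function name and the example environment-variable names are illustrative, not OpenLLM's actual configuration API.

```python
import os


def resolve_option(cli_value, env_key: str, default):
    """Resolve one config value with CLI > environment variable > default."""
    if cli_value is not None:  # an explicit CLI flag wins
        return cli_value
    if env_key in os.environ:  # then the environment
        return os.environ[env_key]
    return default  # finally the input contract's default
```

Checking `cli_value is not None` rather than truthiness matters so that explicit falsy flags (e.g. `--temperature 0`) still override the environment.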