Commit Graph

22 Commits

Author SHA1 Message Date
aarnphm-ec2-dev
4127961c5c feat: openllm.client
Signed-off-by: aarnphm-ec2-dev <29749331+aarnphm@users.noreply.github.com>
2023-05-26 07:17:28 +00:00
aarnphm-ec2-dev
ac933d60f1 fix(cli): make sure to skip models that only run on GPU
Signed-off-by: aarnphm-ec2-dev <29749331+aarnphm@users.noreply.github.com>
2023-05-26 00:43:14 +00:00
aarnphm-ec2-dev
fed17fafdc migrate(configuration): remove deprecated max_length in favor of max_new_tokens

Preparation for transformers 5

Signed-off-by: aarnphm-ec2-dev <29749331+aarnphm@users.noreply.github.com>
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>
2023-05-25 16:22:07 -07:00
aarnphm-ec2-dev
b502703f67 fix(chatglm): make sure to check for the required dependency cpm_kernels
Signed-off-by: aarnphm-ec2-dev <29749331+aarnphm@users.noreply.github.com>
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>
2023-05-25 16:22:07 -07:00
aarnphm-ec2-dev
5c416fa218 feat: StarCoder
Signed-off-by: aarnphm-ec2-dev <29749331+aarnphm@users.noreply.github.com>
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>
2023-05-25 16:22:07 -07:00
aarnphm-ec2-dev
3fe6b14dbf fix(cli): make sure __main__ is not convoluted
The CLI should live under openllm.cli, and the actual click.Group can be
created lazily from create_cli

Signed-off-by: aarnphm-ec2-dev <29749331+aarnphm@users.noreply.github.com>
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>
2023-05-25 16:22:06 -07:00
aarnphm-ec2-dev
765c1a6e5c feat: requires_gpu for specific LLM.
This will determine the behaviour of SUPPORTED_RESOURCES

TODO: Support TPU

Support requirements for specific LLMs

Signed-off-by: aarnphm-ec2-dev <29749331+aarnphm@users.noreply.github.com>
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>
2023-05-25 16:22:06 -07:00
aarnphm-ec2-dev
83a8a7cb4f docs(codegen): make sure the generated docstring is correct
Signed-off-by: aarnphm-ec2-dev <29749331+aarnphm@users.noreply.github.com>
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>
2023-05-25 16:22:06 -07:00
aarnphm-ec2-dev
20b3a0260f refactor: move Prompt object to client specific attributes
Signed-off-by: aarnphm-ec2-dev <29749331+aarnphm@users.noreply.github.com>
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>
2023-05-25 16:22:06 -07:00
aarnphm-ec2-dev
73d152fc77 feat(gpu): Make sure that we run models on GPU if available
Signed-off-by: aarnphm-ec2-dev <29749331+aarnphm@users.noreply.github.com>
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>
2023-05-24 19:31:22 -07:00
Aaron
135bafacaf fix(chatglm): support MacOS deployment
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>
2023-05-24 05:06:06 -07:00
Aaron
2676085b59 feat: chatglm and configuration naming type
By default, the name is dasherized, but for cases like chatglm it can be
lowercase as well

Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>
2023-05-24 04:20:24 -07:00
Aaron
162c021cae feat(timeout): support server_timeout and LLM timeout
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>
2023-05-23 16:48:01 -07:00
Aaron
b1c07946c1 feat: dolly-v2 and general cleanup
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>
2023-05-23 14:27:27 -07:00
Aaron
a63cec8fa3 improve(flan-t5): update default generation config
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>
2023-05-16 12:20:38 -07:00
Aaron
602294b782 fix(start): silence error logs for now
respect BENTOML_HOME and BENTOML_DO_NOT_TRACK

Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>
2023-05-16 12:19:23 -07:00
Aaron
549b0c54e9 feat: codegen and bundle build
fix configuration generation for runnable

Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>
2023-05-15 18:22:25 -07:00
Aaron
d31d450526 feat: Adding central service definition and init openllm_client
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>
2023-05-15 00:33:05 -07:00
Aaron
2a53faee9c infra: add structure and cleanup separation of tokenizer
Since tokenizers are relatively light, every default LLM will bundle the
tokenizer with itself.

Maybe we can put the tokenizer in its own runner in the future

Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>
2023-05-05 11:57:39 -07:00
Aaron
426a61713f feat: start and start_grpc API
with_options reads from the environment variables for said models.

Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>
2023-05-05 11:07:52 -07:00
Aaron
3e32b24194 feat: initial openllm_client implementation
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>
2023-05-05 02:43:35 -07:00
Chaoyu
dd8b6050b2 feat: FLAN-T5 support
- add infrastructure, to be implemented: cache, chat history

- Base Runnable implementation that fits the LangChain API

- Added a Prompt descriptor and utils.

feat: license headers and auto factory impl and CLI

Auto construct args from pydantic config

Add auto factory for ease of use

only provide `/generate` to streamline the UX

CLI > envvar > input contract for configuration

fix: serve from a thread

fix CLI args

chore: cleanup names and refactor imports

Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>
2023-05-03 17:50:14 -07:00
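
Note on commit dd8b6050b2: the line "CLI > envvar > input contract for configuration" appears to describe a precedence order, where a CLI flag overrides an environment variable, which in turn overrides a value from the request payload. The sketch below illustrates that reading only. It is not taken from the repository: `resolve_option`, its parameters, and the `EXAMPLE_` prefix are hypothetical names chosen for illustration.

```python
from typing import Any, Mapping, Optional


def resolve_option(
    name: str,
    cli_args: Mapping[str, Any],
    environ: Mapping[str, str],
    request: Mapping[str, Any],
    env_prefix: str = "EXAMPLE_",  # hypothetical prefix, not the project's
) -> Optional[Any]:
    """Return the first value found: CLI flag, then env var, then request."""
    # 1. An explicitly passed CLI flag wins over everything else.
    if cli_args.get(name) is not None:
        return cli_args[name]
    # 2. Otherwise fall back to an environment variable, if one is set.
    #    Environment values are always strings; type coercion is left out.
    env_key = env_prefix + name.upper()
    if env_key in environ:
        return environ[env_key]
    # 3. Finally use the value from the request payload (None if absent).
    return request.get(name)


cli = {"temperature": None, "max_new_tokens": 128}
env = {"EXAMPLE_TEMPERATURE": "0.7", "EXAMPLE_MAX_NEW_TOKENS": "64"}
req = {"temperature": 0.9, "top_k": 50}

print(resolve_option("max_new_tokens", cli, env, req))  # 128 (from CLI)
print(resolve_option("temperature", cli, env, req))     # 0.7 (from env, as a string)
print(resolve_option("top_k", cli, env, req))           # 50 (from request)
```

Passing the three sources in explicitly, rather than reading `os.environ` inside the function, keeps the lookup order easy to test.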