aarnphm-ec2-dev
4127961c5c
feat: openllm.client
Signed-off-by: aarnphm-ec2-dev <29749331+aarnphm@users.noreply.github.com>
2023-05-26 07:17:28 +00:00
aarnphm-ec2-dev
ac933d60f1
fix(cli): make sure to skip models that only run on GPU
Signed-off-by: aarnphm-ec2-dev <29749331+aarnphm@users.noreply.github.com>
2023-05-26 00:43:14 +00:00
aarnphm-ec2-dev
fed17fafdc
migrate(configuration): remove deprecated max_length in favor of max_new_tokens
Preparation for transformers 5
Signed-off-by: aarnphm-ec2-dev <29749331+aarnphm@users.noreply.github.com>
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>
2023-05-25 16:22:07 -07:00
aarnphm-ec2-dev
b502703f67
fix(chatglm): make sure to check for the required dependency cpm_kernels
Signed-off-by: aarnphm-ec2-dev <29749331+aarnphm@users.noreply.github.com>
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>
2023-05-25 16:22:07 -07:00
aarnphm-ec2-dev
5c416fa218
feat: StarCoder
Signed-off-by: aarnphm-ec2-dev <29749331+aarnphm@users.noreply.github.com>
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>
2023-05-25 16:22:07 -07:00
aarnphm-ec2-dev
3fe6b14dbf
fix(cli): make sure __main__ is not convoluted
The CLI should live under openllm.cli, and the actual click.Group can be
created lazily from create_cli
Signed-off-by: aarnphm-ec2-dev <29749331+aarnphm@users.noreply.github.com>
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>
2023-05-25 16:22:06 -07:00
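The lazy-CLI pattern this commit describes can be sketched as follows. This is a minimal illustration of the idea, not OpenLLM's actual API: `create_cli` matches the name in the commit message, but the group and subcommand bodies here are assumptions.

```python
# Hypothetical sketch: __main__ stays trivial, and the click.Group is only
# built when create_cli() is called, not at import time.
import click


def create_cli() -> click.Group:
    """Build the click.Group on demand instead of at import time."""

    @click.group()
    def cli() -> None:
        """Root command group (illustrative)."""

    @cli.command()
    @click.argument("model")
    def start(model: str) -> None:
        """Start a server for MODEL (illustrative subcommand)."""
        click.echo(f"starting {model}")

    return cli


if __name__ == "__main__":
    create_cli()()
```

Keeping construction inside `create_cli` means importing the package never pays the cost of building the command tree, and `__main__` reduces to a single call.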
aarnphm-ec2-dev
765c1a6e5c
feat: requires_gpu for specific LLM.
This will determine the behaviour of SUPPORTED_RESOURCES
TODO: Support TPU
Support requirements for specific LLMs
Signed-off-by: aarnphm-ec2-dev <29749331+aarnphm@users.noreply.github.com>
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>
2023-05-25 16:22:06 -07:00
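One way a per-LLM `requires_gpu` flag could drive `SUPPORTED_RESOURCES`, as this commit describes: a class-level flag narrows which resources a model may be scheduled on. The class and attribute names below are illustrative assumptions, not OpenLLM's real implementation.

```python
# Hypothetical sketch: requires_gpu narrows the supported resources.
class LLMConfig:
    requires_gpu: bool = False  # default: CPU is acceptable

    @property
    def supported_resources(self) -> tuple[str, ...]:
        # TODO: Support TPU (mirroring the commit's note).
        if self.requires_gpu:
            return ("nvidia.com/gpu",)  # GPU-only models
        return ("cpu", "nvidia.com/gpu")


class ChatGLMConfig(LLMConfig):
    requires_gpu = True  # e.g. a model that only runs on GPU
```

A scheduler (or the `start` CLI from the later fix that skips GPU-only models) can then consult `supported_resources` before placing the model.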
aarnphm-ec2-dev
83a8a7cb4f
docs(codegen): make sure the generated docstring is correct
Signed-off-by: aarnphm-ec2-dev <29749331+aarnphm@users.noreply.github.com>
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>
2023-05-25 16:22:06 -07:00
aarnphm-ec2-dev
20b3a0260f
refactor: move Prompt object to client specific attributes
Signed-off-by: aarnphm-ec2-dev <29749331+aarnphm@users.noreply.github.com>
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>
2023-05-25 16:22:06 -07:00
aarnphm-ec2-dev
73d152fc77
feat(gpu): Make sure that we run models on GPU if available
Signed-off-by: aarnphm-ec2-dev <29749331+aarnphm@users.noreply.github.com>
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>
2023-05-24 19:31:22 -07:00
Aaron
135bafacaf
fix(chatglm): support macOS deployment
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>
2023-05-24 05:06:06 -07:00
Aaron
2676085b59
feat: chatglm and configuration naming type
By default, the name is dasherized, but for cases like chatglm it can be
lowercase as well
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>
2023-05-24 04:20:24 -07:00
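The naming behaviour this commit describes (dasherize by default, plain lowercase for names like chatglm) might look roughly like this. The helper names and signatures are assumptions for illustration, not OpenLLM's real code.

```python
import re


def dasherize(name: str) -> str:
    """CamelCase -> dash-case, e.g. 'FlanT5' -> 'flan-t5'."""
    return re.sub(r"(?<=[a-z0-9])(?=[A-Z])", "-", name).lower()


def model_name(class_name: str, name_type: str = "dasherize") -> str:
    """Resolve a model's display name from its config class name.

    'lowercase' keeps e.g. 'ChatGLM' as 'chatglm' instead of 'chat-glm'.
    """
    if name_type == "lowercase":
        return class_name.lower()
    return dasherize(class_name)
```

The lowercase escape hatch matters because naive dasherization would split ChatGLM into chat-glm, which does not match how the upstream model is named.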
Aaron
162c021cae
feat(timeout): support server_timeout and LLM timeout
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>
2023-05-23 16:48:01 -07:00
Aaron
b1c07946c1
feat: dolly-v2 and general cleanup
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>
2023-05-23 14:27:27 -07:00
Aaron
a63cec8fa3
improve(flan-t5): update default generation config
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>
2023-05-16 12:20:38 -07:00
Aaron
602294b782
fix(start): silence error logs for now
Respect BENTOML_HOME and BENTOML_DO_NOT_TRACK
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>
2023-05-16 12:19:23 -07:00
Aaron
549b0c54e9
feat: codegen and bundle build
Fix configuration generation for runnable
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>
2023-05-15 18:22:25 -07:00
Aaron
d31d450526
feat: add central service definition and init openllm_client
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>
2023-05-15 00:33:05 -07:00
Aaron
2a53faee9c
infra: add structure and cleanup separation of tokenizer
Since tokenizers are relatively light, all default LLMs will bundle the
tokenizer with themselves.
Maybe we can move the tokenizer into its own runner in the future
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>
2023-05-05 11:57:39 -07:00
Aaron
426a61713f
feat: start and start_grpc API
with_options listens to environment variables for said models.
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>
2023-05-05 11:07:52 -07:00
Aaron
3e32b24194
feat: initial openllm_client implementation
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>
2023-05-05 02:43:35 -07:00
Chaoyu
dd8b6050b2
feat: FLAN-T5 support
- Add infrastructure; to be implemented: cache, chat history
- Base Runnable implementation that fits the LangChain API
- Added a Prompt descriptor and utils.
feat: license headers and auto factory impl and CLI
Auto construct args from pydantic config
Add auto factory for ease of use
Only provide `/generate` to streamline the UX
CLI > envvar > input contract for configuration
fix: serve from a thread
Fix CLI args
chore: cleanup names and refactor imports
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>
2023-05-03 17:50:14 -07:00
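The "CLI > envvar > input contract" precedence mentioned in the last commit could be resolved like this. A minimal sketch under assumptions: the function name and the example environment-variable names are illustrative, not OpenLLM's actual configuration API.

```python
import os


def resolve_option(cli_value, env_key: str, default):
    """Resolve one config value with CLI > environment variable > default."""
    if cli_value is not None:  # an explicit CLI flag wins
        return cli_value
    if env_key in os.environ:  # then the environment
        return os.environ[env_key]
    return default  # finally the input contract's default
```

Checking `cli_value is not None` rather than truthiness matters so that explicit falsy flags (e.g. `--temperature 0`) still override the environment.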