aarnphm-ec2-dev
83a8a7cb4f
docs(codegen): make sure the generated docstring is correct
...
Signed-off-by: aarnphm-ec2-dev <29749331+aarnphm@users.noreply.github.com>
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>
2023-05-25 16:22:06 -07:00
aarnphm-ec2-dev
20b3a0260f
refactor: move Prompt object to client specific attributes
...
Signed-off-by: aarnphm-ec2-dev <29749331+aarnphm@users.noreply.github.com>
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>
2023-05-25 16:22:06 -07:00
aarnphm-ec2-dev
545515c01f
infra: Install BentoML from main and its auxiliary dependencies
...
Signed-off-by: aarnphm-ec2-dev <29749331+aarnphm@users.noreply.github.com>
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>
2023-05-25 16:22:06 -07:00
aarnphm-ec2-dev
73d152fc77
feat(gpu): Make sure that we run models on GPU if available
...
Signed-off-by: aarnphm-ec2-dev <29749331+aarnphm@users.noreply.github.com>
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>
2023-05-24 19:31:22 -07:00
Aaron
135bafacaf
fix(chatglm): support macOS deployment
...
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>
2023-05-24 05:06:06 -07:00
aarnphm-ec2-dev
9139360426
fix(coverage): Make sure to exclude the correct TYPE_CHECKING in openllm
...
Signed-off-by: aarnphm-ec2-dev <29749331+aarnphm@users.noreply.github.com>
2023-05-24 11:32:06 +00:00
Aaron
2676085b59
feat: chatglm and configuration naming type
...
By default, the configuration name is dasherized, but for cases like
chatglm it can be lowercase as well.
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>
2023-05-24 04:20:24 -07:00
Aaron Pham
11c7783a0e
fix(infra): feature_request.yml missing title
2023-05-23 16:54:38 -07:00
Aaron Pham
427106df98
fix(infra): bug_report.yml missing title
2023-05-23 16:54:09 -07:00
Aaron
162c021cae
feat(timeout): support server_timeout and LLM timeout
...
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>
2023-05-23 16:48:01 -07:00
Aaron
b1c07946c1
feat: dolly-v2 and general cleanup
...
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>
2023-05-23 14:27:27 -07:00
Aaron
a63cec8fa3
improve(flan-t5): update default generation config
...
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>
2023-05-16 12:20:38 -07:00
Aaron
602294b782
fix(start): silence error logs for now
...
respect BENTOML_HOME and BENTOML_DO_NOT_TRACK
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>
2023-05-16 12:19:23 -07:00
Aaron
549b0c54e9
feat: codegen and bundle build
...
fix configuration generation for runnable
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>
2023-05-15 18:22:25 -07:00
Aaron
d31d450526
feat: Adding central service definition and init openllm_client
...
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>
2023-05-15 00:33:05 -07:00
Aaron
2a53faee9c
infra: add structure and cleanup separation of tokenizer
...
Since tokenizers are relatively light, every default LLM will bundle
its tokenizer with itself.
Maybe we can put the tokenizer in its own runner in the future.
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>
2023-05-05 11:57:39 -07:00
Aaron
426a61713f
feat: start and start_grpc API
...
with_options listens to environment variables for said models.
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>
2023-05-05 11:07:52 -07:00
Aaron
3e32b24194
feat: initial openllm_client implementation
...
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>
2023-05-05 02:43:35 -07:00
Chaoyu
dd8b6050b2
feat: FLAN-T5 support
...
- add infrastructure, to be implemented: cache, chat history
- Base Runnable Implementation, that fits LangChain API
- Added a Prompt descriptor and utils.
feat: license headers and auto factory impl and CLI
Auto construct args from pydantic config
Add auto factory for ease of use
only provide `/generate` to streamline the UX
CLI > envvar > input contract for configuration
fix: serve from a thread
fix CLI args
chore: cleanup names and refactor imports
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>
2023-05-03 17:50:14 -07:00