Commit Graph

10 Commits

Author SHA1 Message Date
Aaron
162c021cae feat(timeout): support server_timeout and LLM timeout
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>
2023-05-23 16:48:01 -07:00
Aaron
b1c07946c1 feat: dolly-v2 and general cleanup
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>
2023-05-23 14:27:27 -07:00
Aaron
a63cec8fa3 improve(flan-t5): update default generation config
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>
2023-05-16 12:20:38 -07:00
Aaron
602294b782 fix(start): silence error logs for now
respect BENTOML_HOME and BENTOML_DO_NOT_TRACK

Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>
2023-05-16 12:19:23 -07:00
Aaron
549b0c54e9 feat: codegen and bundle build
fix configuration generation for runnable

Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>
2023-05-15 18:22:25 -07:00
Aaron
d31d450526 feat: Adding central service definition and init openllm_client
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>
2023-05-15 00:33:05 -07:00
Aaron
2a53faee9c infra: add structure and clean up separation of tokenizer
Since tokenizers are relatively light, every default LLM bundles its
own tokenizer.

Maybe we can put the tokenizer in its own runner in the future

Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>
2023-05-05 11:57:39 -07:00
Aaron
426a61713f feat: start and start_grpc API
with_options reads from environment variables for said models.

Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>
2023-05-05 11:07:52 -07:00
Aaron
3e32b24194 feat: initial openllm_client implementation
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>
2023-05-05 02:43:35 -07:00
Chaoyu
dd8b6050b2 feat: FLAN-T5 support
- add infrastructure, to be implemented: cache, chat history

- Base Runnable Implementation, that fits LangChain API

- Added a Prompt descriptor and utils.

feat: license headers and auto factory impl and CLI

Auto construct args from pydantic config

Add auto factory for ease of use

only provide `/generate` to streamline the UX

CLI > envvar > input contract for configuration

fix: serve from a thread

fix CLI args

chore: cleanup names and refactor imports

Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>
2023-05-03 17:50:14 -07:00
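
The "CLI > envvar > input contract for configuration" note in the commit above can be read as a precedence order: a value given on the command line wins over an environment variable, which in turn wins over a value in the request input. The sketch below illustrates that reading only. The function name `resolve_option`, the `EXAMPLE_` prefix, and the dictionary shapes are hypothetical and are not taken from the repository.

```python
import os
from typing import Any, Mapping, Optional


def resolve_option(
    name: str,
    cli_args: Mapping[str, Any],
    request_input: Mapping[str, Any],
    env: Optional[Mapping[str, str]] = None,
    env_prefix: str = "EXAMPLE_",  # hypothetical prefix, chosen for illustration
) -> Any:
    """Resolve one option, assuming the precedence CLI > envvar > input."""
    env = os.environ if env is None else env

    # 1. An explicit CLI value takes priority over everything else.
    cli_value = cli_args.get(name)
    if cli_value is not None:
        return cli_value

    # 2. Otherwise fall back to the environment variable, if it is set.
    #    Environment values are always strings; no type coercion is done here.
    env_key = f"{env_prefix}{name.upper()}"
    if env_key in env:
        return env[env_key]

    # 3. Finally, use the value from the request input (None if absent).
    return request_input.get(name)
```

For example, with `cli_args={}`, `env={"EXAMPLE_TEMPERATURE": "0.5"}`, and `request_input={"temperature": 0.9}`, the call `resolve_option("temperature", cli_args, request_input, env=env)` returns the string `"0.5"`, because no CLI value is present and the environment variable outranks the request input.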