Commit Graph

19 Commits

Author SHA1 Message Date
aarnphm-ec2-dev
83a8a7cb4f docs(codegen): make sure the generated docstring is correct
Signed-off-by: aarnphm-ec2-dev <29749331+aarnphm@users.noreply.github.com>
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>
2023-05-25 16:22:06 -07:00
aarnphm-ec2-dev
20b3a0260f refactor: move Prompt object to client specific attributes
Signed-off-by: aarnphm-ec2-dev <29749331+aarnphm@users.noreply.github.com>
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>
2023-05-25 16:22:06 -07:00
aarnphm-ec2-dev
545515c01f infra: Install BentoML from main and its auxiliary dependencies
Signed-off-by: aarnphm-ec2-dev <29749331+aarnphm@users.noreply.github.com>
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>
2023-05-25 16:22:06 -07:00
aarnphm-ec2-dev
73d152fc77 feat(gpu): Make sure that we run models on GPU if available
Signed-off-by: aarnphm-ec2-dev <29749331+aarnphm@users.noreply.github.com>
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>
2023-05-24 19:31:22 -07:00
Aaron
135bafacaf fix(chatglm): support MacOS deployment
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>
2023-05-24 05:06:06 -07:00
aarnphm-ec2-dev
9139360426 fix(coverage): Make sure to exclude the correct TYPE_CHECKING in openllm

Signed-off-by: aarnphm-ec2-dev <29749331+aarnphm@users.noreply.github.com>
2023-05-24 11:32:06 +00:00
Aaron
2676085b59 feat: chatglm and configuration naming type
by default, names are dasherized, but for cases like chatglm, they can be
lowercase as well

Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>
2023-05-24 04:20:24 -07:00
Aaron Pham
11c7783a0e fix(infra): feature_request.yml missing title
2023-05-23 16:54:38 -07:00
Aaron Pham
427106df98 fix(infra): bug_report.yml missing title
2023-05-23 16:54:09 -07:00
Aaron
162c021cae feat(timeout): support server_timeout and LLM timeout
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>
2023-05-23 16:48:01 -07:00
Aaron
b1c07946c1 feat: dolly-v2 and general cleanup
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>
2023-05-23 14:27:27 -07:00
Aaron
a63cec8fa3 improve(flan-t5): update default generation config
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>
2023-05-16 12:20:38 -07:00
Aaron
602294b782 fix(start): silence error logs for now
respect BENTOML_HOME and BENTOML_DO_NOT_TRACK

Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>
2023-05-16 12:19:23 -07:00
Aaron
549b0c54e9 feat: codegen and bundle build
fix configuration generation for runnable

Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>
2023-05-15 18:22:25 -07:00
Aaron
d31d450526 feat: Adding central service definition and init openllm_client
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>
2023-05-15 00:33:05 -07:00
Aaron
2a53faee9c infra: add structure and cleanup separation of tokenizer
since tokenizers are relatively light, every default LLM will bundle the
tokenizer with itself.

Maybe we can put the tokenizer in its own runner in the future

Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>
2023-05-05 11:57:39 -07:00
Aaron
426a61713f feat: start and start_grpc API
with_options reads from the environment variables for said models.

Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>
2023-05-05 11:07:52 -07:00
Aaron
3e32b24194 feat: initial openllm_client implementation
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>
2023-05-05 02:43:35 -07:00
Chaoyu
dd8b6050b2 feat: FLAN-T5 support
- add infrastructure, to be implemented: cache, chat history

- Base Runnable Implementation, that fits LangChain API

- Added a Prompt descriptor and utils.

feat: license headers and auto factory impl and CLI

Auto construct args from pydantic config

Add auto factory for ease of use

only provide `/generate` to streamline the UX

CLI > envvar > input contract for configuration

fix: serve from a thread

fix CLI args

chore: cleanup names and refactor imports

Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>
2023-05-03 17:50:14 -07:00