Commit Graph

51 Commits

Author SHA1 Message Date
Chaoyu
e2b26adf2f chore(docs): update README.md
See #12
2023-06-10 00:21:21 -04:00
aarnphm-ec2-dev
0f7840626d fix(cli): make sure to allow user to pass endpoint
--endpoint http://0.0.0.0:3000

Signed-off-by: aarnphm-ec2-dev <29749331+aarnphm@users.noreply.github.com>
2023-06-08 19:23:04 +00:00
Aaron
a84661142c chore(cli): remove --local for query
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>
2023-06-08 14:53:11 -04:00
Aaron
c0418b76ec feat(infra): add tools for managing optional-dependencies
based on llm config

Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>
2023-06-08 08:57:19 -04:00
aarnphm-ec2-dev
e9e12a66a8 fix(falcon): custom load
This is needed because pipeline loading is pretty magical and broken
on transformers

Signed-off-by: aarnphm-ec2-dev <29749331+aarnphm@users.noreply.github.com>
2023-06-08 09:03:34 +00:00
Aaron
f2771bfe49 chore(cli): move back --version
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>
2023-06-07 03:41:50 -04:00
aarnphm-ec2-dev
170be0ebc8 fix(cli): make sure make_tag respects config trust_remote_code
Signed-off-by: aarnphm-ec2-dev <29749331+aarnphm@users.noreply.github.com>
2023-06-07 04:35:15 +00:00
Aaron
d6d2de6748 feat(cli): prune
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>
2023-06-06 23:24:50 -04:00
Aaron
aa50b5279e fix(falcon): loading based on model registration
remove duplicate events

Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>
2023-06-06 22:42:28 -04:00
Aaron
8823c70e5a chore: rename variants to pretrained for consistency
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>
2023-06-06 18:45:45 -04:00
Aaron
f78d55f0fd fix(cli): type handling for specific container types
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>
2023-06-06 17:18:25 -04:00
Aaron
b446b65642 chore(cli): remove alias and use build to be consistent with BentoML
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>
2023-06-06 15:51:13 -04:00
Aaron
a0749d0a80 chore: update version message
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>
2023-06-06 08:31:40 -04:00
Aaron
1707beb7aa feat(cli): openllm query
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>
2023-06-06 08:05:13 -04:00
Aaron
64d783107d chore(cli): update namespace and show better traceback
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>
2023-06-03 06:39:01 -07:00
Aaron
49cb02d2f2 perf(cli): improve printing speed while respecting terminal_size
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>
2023-06-02 06:58:11 -07:00
aarnphm-ec2-dev
c3aeb43997 fix: generation serde
Signed-off-by: aarnphm-ec2-dev <29749331+aarnphm@users.noreply.github.com>
2023-06-02 07:06:04 +00:00
aarnphm-ec2-dev
07d42daaec fix: make sure we evolve the attribute from CLI
Signed-off-by: aarnphm-ec2-dev <29749331+aarnphm@users.noreply.github.com>
2023-06-02 05:52:04 +00:00
aarnphm-ec2-dev
a94294bc65 fix: generate attrs class internally to conform with interface
Signed-off-by: aarnphm-ec2-dev <29749331+aarnphm@users.noreply.github.com>
2023-06-01 19:06:06 +00:00
Aaron
84358b28cd chore: handle KeyboardInterrupt correctly
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>
2023-06-01 01:23:44 -07:00
Aaron
e86dc35ec5 chore: migrate service to use JSON
until we have attrs io descriptor, this should do it

Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>
2023-06-01 00:28:43 -07:00
Aaron
4e2d5e330c refactor(cli): move CLI to address anti-pattern
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>
2023-05-31 13:53:40 -07:00
Aaron
33e7004e66 format: consistent CLI outputs
vendored type-related module from bentoml._internal.types

Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>
2023-05-30 14:56:11 -07:00
Aaron
fa16c67131 fix(cli): remove debug print
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>
2023-05-30 12:02:00 -07:00
Aaron Pham
01517e37c6 migration: attrs (#7)
Move configuration to attrs

Depends on https://github.com/bentoml/BentoML/pull/3906
2023-05-30 11:59:21 -07:00
Aaron
ac710dfd54 revert(perf): remove group alias
There is no need for this feature

Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>
2023-05-28 10:04:33 -07:00
Aaron
435129372e perf(cli): lazily cached start commands
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>
2023-05-28 09:59:43 -07:00
aarnphm-ec2-dev
9c1c4ca0bf perf(cli): using click instead of rich console
Signed-off-by: aarnphm-ec2-dev <29749331+aarnphm@users.noreply.github.com>
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>
2023-05-28 09:32:56 -07:00
aarnphm-ec2-dev
3f36d81744 infra: docs and normalize formatting
Signed-off-by: aarnphm-ec2-dev <29749331+aarnphm@users.noreply.github.com>
2023-05-28 15:00:17 +00:00
aarnphm-ec2-dev
8ca488d8fc fix(stablelm): Ensure passing EOS_TOKEN_ID for generation
Signed-off-by: aarnphm-ec2-dev <29749331+aarnphm@users.noreply.github.com>
2023-05-28 14:43:00 +00:00
Aaron
b4403c24b0 fix(model): Make sure we download the model before starting the
service

This will ensure we don't deadlock among processes

Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>
Signed-off-by: aarnphm-ec2-dev <29749331+aarnphm@users.noreply.github.com>
2023-05-28 14:01:49 +00:00
Aaron
0df8d8b9a6 perf: reduce unnecessary object creation for config class
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>
2023-05-28 05:22:22 -07:00
Aaron
3fb1e5338a feat(dependencies): add optional for model
pretty print failed models loading due to missing dependencies

Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>
2023-05-28 00:11:36 -07:00
Aaron
c84f653b77 feat(cli): add output for build and bundle
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>
2023-05-27 22:37:10 -07:00
Aaron
f24f13e6e4 chore(cli): consistency between table and json format
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>
2023-05-27 22:29:54 -07:00
Aaron
186658be63 docs: delay GPU to model check,
allow users to package and interact with models that require a GPU even on devices without a GPU

Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>
2023-05-27 22:24:18 -07:00
Aaron
52d65f999f feat(telemetry): add support for usage tracking
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>
2023-05-27 20:39:13 -07:00
Aaron
a55817d647 feat(cli): update nicely formatted commands with shared output logics
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>
2023-05-27 15:51:03 -07:00
Aaron
fa895c329c feat: pre-commit setup
also sync JS release with Python version

Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>
2023-05-27 06:54:22 -07:00
Aaron
b7f3a10910 refactor: migrate __init_subclass__ to Metaclass
LLMMetaclass is now responsible for generating internal attributes

add llm_type and identifying_params to Runnable class

subclasses of openllm.LLM can now set a class attribute
__openllm_internal__ to let openllm know that this is an internal class
implementation, instead of providing a _internal in the class
initialization.

support for preprocess_parameters and postprocess_parameters on client
side for better client UX

Signed-off-by: aarnphm-ec2-dev <29749331+aarnphm@users.noreply.github.com>
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>
Signed-off-by: aarnphm-ec2-dev <29749331+aarnphm@users.noreply.github.com>
2023-05-27 03:09:45 +00:00
Aaron
85252f13c4 fix(cli): simplify register code for start
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>
2023-05-26 01:45:03 -07:00
aarnphm-ec2-dev
4127961c5c feat: openllm.client
Signed-off-by: aarnphm-ec2-dev <29749331+aarnphm@users.noreply.github.com>
2023-05-26 07:17:28 +00:00
aarnphm-ec2-dev
ac933d60f1 fix(cli): Make sure to skip models that only run on GPU
Signed-off-by: aarnphm-ec2-dev <29749331+aarnphm@users.noreply.github.com>
2023-05-26 00:43:14 +00:00
aarnphm-ec2-dev
3fe6b14dbf fix(cli): make sure __main__ is not convoluted
CLI should live under openllm.cli, and the actual click.Group can be
created from create_cli lazily

Signed-off-by: aarnphm-ec2-dev <29749331+aarnphm@users.noreply.github.com>
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>
2023-05-25 16:22:06 -07:00
Aaron
162c021cae feat(timeout): support server_timeout and LLM timeout
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>
2023-05-23 16:48:01 -07:00
Aaron
b1c07946c1 feat: dolly-v2 and general cleanup
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>
2023-05-23 14:27:27 -07:00
Aaron
602294b782 fix(start): silence error logs for now
respect BENTOML_HOME and BENTOML_DO_NOT_TRACK

Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>
2023-05-16 12:19:23 -07:00
Aaron
d31d450526 feat: Adding central service definition and init openllm_client
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>
2023-05-15 00:33:05 -07:00
Aaron
426a61713f feat: start and start_grpc API
with_options listens to environment variables for said models.

Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>
2023-05-05 11:07:52 -07:00
Aaron
3e32b24194 feat: initial openllm_client implementation
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>
2023-05-05 02:43:35 -07:00