aarnphm-ec2-dev
ef40fdf5c8
fix(build): quote environment variables
...
Make sure that the config is quoted properly in the generated Dockerfile
Signed-off-by: aarnphm-ec2-dev <29749331+aarnphm@users.noreply.github.com>
2023-06-21 11:28:37 +00:00
Aaron
de665def5c
fix(cli): support loading model-id from local path
...
The SDK should already support loading from a local path, but on the CLI
there was a bug in start that restricted the choices for model-id to
only the pretrained set of model-ids
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>
2023-06-21 07:26:12 -04:00
Aaron Pham
ca802d9d1a
fix: agent log (#37)
2023-06-19 14:11:39 -04:00
aarnphm-ec2-dev
70c7c0a9b7
fix(cli): use correct API for client
...
Signed-off-by: aarnphm-ec2-dev <29749331+aarnphm@users.noreply.github.com>
2023-06-19 18:04:27 +00:00
aarnphm-ec2-dev
6d43bdbcdb
fix(instruct): remove breakpoint
...
Signed-off-by: aarnphm-ec2-dev <29749331+aarnphm@users.noreply.github.com>
2023-06-19 18:00:00 +00:00
Aaron
1ed0ae7787
fix(log): make sure to configure OpenLLM logs correctly
...
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>
2023-06-19 06:19:06 -04:00
Aaron Pham
03758a5487
fix(tools): adhere to style guidelines (#31)
2023-06-18 20:03:17 -04:00
Aaron Pham
4fcd7c8ac9
integration: HuggingFace Agent (#29)
...
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
2023-06-18 00:13:53 -04:00
Aaron Pham
6f724416c0
perf: build quantization and better transformer behaviour (#28)
...
Make quantization_config and low_cpu_mem_usage available on the PyTorch implementation only
See changelog for more details on #28
2023-06-17 08:56:14 -04:00
Aaron Pham
ded8a9f809
feat: quantization (#27)
2023-06-16 18:10:50 -04:00
Aaron Pham
19bc7e3116
feat: fine-tuning [part 1] (#23)
2023-06-16 00:19:01 -04:00
Aaron Pham
5e1445218b
refactor: toplevel CLI (#26)
...
Move the CLI out of the factory function to simplify the workflow
2023-06-15 02:32:46 -04:00
aarnphm-ec2-dev
dfe71d7867
chore(cli): redirect download models into subcontext
...
utilise click subcontext for nicer CLI interaction
Signed-off-by: aarnphm-ec2-dev <29749331+aarnphm@users.noreply.github.com>
2023-06-14 11:44:39 +00:00
Aaron
d7e92ae525
feat(cli): --device all --workers-per-resource
...
synonymous with the configuration arguments
add support for --device all
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>
2023-06-14 06:36:54 -04:00
Aaron
111d205f63
perf: faster LLM loading
...
use attrs for faster class creation, as opposed to a metaclass
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>
2023-06-14 01:36:42 -04:00
Aaron Pham
dd20941050
chore: metadata (#19)
2023-06-13 04:09:33 -04:00
Aaron
71070b90b4
chore(metadata): fix model_id to be respected on service.py
...
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>
2023-06-12 16:04:52 -04:00
Aaron
4717989384
fix(tokenizers): allow forking by default
...
address message about forking in tokenizers
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>
2023-06-12 15:47:51 -04:00
Aaron Pham
f8ebb36e15
tests: fastpath (#17)
...
add fastpath cases for configuration and Flan-T5
fix lifecycle hooks to respect model_id
update CLI to clean up models info
2023-06-12 14:18:26 -04:00
Aaron
f8e99dd8f5
chore(configuration): clean house implementation
...
Using Attrs implementation
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>
2023-06-11 18:46:15 -04:00
aarnphm-ec2-dev
1847209489
feat(cli): --workers
...
provide workers-per-resource configuration on CLI for build and start
Signed-off-by: aarnphm-ec2-dev <29749331+aarnphm@users.noreply.github.com>
2023-06-11 16:22:13 +00:00
aarnphm-ec2-dev
81d46ca211
feat(type): support annotations
...
openllm.LLM now supports fully strict typing:
openllm.LLM[ModelType, TokenizerType] -> self.model, self.tokenizer
Signed-off-by: aarnphm-ec2-dev <29749331+aarnphm@users.noreply.github.com>
2023-06-11 14:58:17 +00:00
aarnphm-ec2-dev
2e453fb005
refactor(configuration): __config__ and perf
...
move model_ids and default_id to config class declaration,
cleanup dependencies between config and LLM implementation
lazy load module during LLM creation to llm_post_init
fix post_init hooks to run load_in_mha.
Signed-off-by: aarnphm-ec2-dev <29749331+aarnphm@users.noreply.github.com>
2023-06-11 12:53:15 +00:00
aarnphm-ec2-dev
17241292da
feat(cli): show runtime implementation
...
Signed-off-by: aarnphm-ec2-dev <29749331+aarnphm@users.noreply.github.com>
2023-06-11 05:29:40 +00:00
aarnphm-ec2-dev
8762a56093
revert: broken KeyboardInterrupt change
...
Signed-off-by: aarnphm-ec2-dev <29749331+aarnphm@users.noreply.github.com>
2023-06-11 04:20:07 +00:00
aarnphm-ec2-dev
512cd0715c
feat(service): implementing with lifecycle hooks
...
Signed-off-by: aarnphm-ec2-dev <29749331+aarnphm@users.noreply.github.com>
2023-06-11 04:14:18 +00:00
Aaron
6a937d8b51
feat(scheduling): custom GPU offload strategy
...
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>
2023-06-10 22:57:54 -04:00
Aaron
b22468e8c4
feat(cli): openllm models --show-available
...
show available models locally
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>
2023-06-10 20:46:11 -04:00
aarnphm-ec2-dev
bb37f7e238
feat(utils): lazy load modules and fix typo
...
Signed-off-by: aarnphm-ec2-dev <29749331+aarnphm@users.noreply.github.com>
2023-06-10 22:18:37 +00:00
Aaron
05fa34f9e6
refactor: pretrained => model_id
...
I think model_id makes more sense than calling it pretrained
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>
2023-06-10 17:36:02 -04:00
aarnphm-ec2-dev
4db141c649
feat(gpu): support passing GPU per LLM
...
respect CUDA_VISIBLE_DEVICES and optionally --device
Signed-off-by: aarnphm-ec2-dev <29749331+aarnphm@users.noreply.github.com>
2023-06-10 09:47:16 +00:00
aarnphm-ec2-dev
8fbf352ec6
docs: add more information about pretrained weights
...
Signed-off-by: aarnphm-ec2-dev <29749331+aarnphm@users.noreply.github.com>
2023-06-10 06:58:36 +00:00
Aaron
afddaed08c
fix(perf): respect per request information
...
remove use_default_prompt_template options
add pretrained to list of start help docstring
fix flax generation config
improve flax and tensorflow implementation
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>
2023-06-10 02:14:13 -04:00
Chaoyu
e2b26adf2f
chore(docs): update README.md
...
See #12
2023-06-10 00:21:21 -04:00
aarnphm-ec2-dev
0f7840626d
fix(cli): make sure to allow user to pass endpoint
...
--endpoint http://0.0.0.0:3000
Signed-off-by: aarnphm-ec2-dev <29749331+aarnphm@users.noreply.github.com>
2023-06-08 19:23:04 +00:00
Aaron
a84661142c
chore(cli): remove --local for query
...
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>
2023-06-08 14:53:11 -04:00
Aaron
c0418b76ec
feat(infra): add tools for managing optional-dependencies
...
based on llm config
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>
2023-06-08 08:57:19 -04:00
aarnphm-ec2-dev
e9e12a66a8
fix(falcon): custom load
...
This has to do with pipeline loading being pretty magical and broken
in transformers
Signed-off-by: aarnphm-ec2-dev <29749331+aarnphm@users.noreply.github.com>
2023-06-08 09:03:34 +00:00
Aaron
f2771bfe49
chore(cli): move back --version
...
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>
2023-06-07 03:41:50 -04:00
aarnphm-ec2-dev
170be0ebc8
fix(cli): make sure make_tag respects config trust_remote_code
...
Signed-off-by: aarnphm-ec2-dev <29749331+aarnphm@users.noreply.github.com>
2023-06-07 04:35:15 +00:00
Aaron
d6d2de6748
feat(cli): prune
...
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>
2023-06-06 23:24:50 -04:00
Aaron
aa50b5279e
fix(falcon): loading based on model registration
...
remove duplicate events
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>
2023-06-06 22:42:28 -04:00
Aaron
8823c70e5a
chore: rename variants to pretrained for consistency
...
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>
2023-06-06 18:45:45 -04:00
Aaron
f78d55f0fd
fix(cli): type handling for specific container types
...
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>
2023-06-06 17:18:25 -04:00
Aaron
b446b65642
chore(cli): remove alias and use build to be consistent with BentoML
...
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>
2023-06-06 15:51:13 -04:00
Aaron
a0749d0a80
chore: update version message
...
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>
2023-06-06 08:31:40 -04:00
Aaron
1707beb7aa
feat(cli): openllm query
...
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>
2023-06-06 08:05:13 -04:00
Aaron
64d783107d
chore(cli): update namespace and show better traceback
...
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>
2023-06-03 06:39:01 -07:00
Aaron
49cb02d2f2
perf(cli): improve printing speed while respecting terminal_size
...
Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>
2023-06-02 06:58:11 -07:00
aarnphm-ec2-dev
c3aeb43997
fix: generation serde
...
Signed-off-by: aarnphm-ec2-dev <29749331+aarnphm@users.noreply.github.com>
2023-06-02 07:06:04 +00:00