feat(tooling): add script to auto update readme table of supported models

Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>
Aaron
2023-06-08 08:22:55 -04:00
parent 0680059a21
commit 23d98a2729
11 changed files with 188 additions and 40 deletions

View File

@@ -1,44 +1,76 @@
# Adding a New Model
OpenLLM encourages contributions by welcoming users to incorporate their custom
Large Language Models (LLMs) into the ecosystem. You can set up your development
environment by referring to our
[Developer Guide](https://github.com/bentoml/OpenLLM/blob/main/DEVELOPMENT.md).
## Procedure
All the relevant code for incorporating a new model resides within
`src/openllm/models`. Start by creating a new folder named after your
`model_name` in snake_case. Here's your roadmap:
- [ ] Generate model configuration file:
`src/openllm/models/{model_name}/configuration_{model_name}.py`
- [ ] Establish model implementation files:
`src/openllm/models/{model_name}/modeling_{runtime}_{model_name}.py`
- [ ] Create module's `__init__.py`:
`src/openllm/models/{model_name}/__init__.py`
- [ ] Adjust the entrypoints for files at `src/openllm/models/auto/*`
- [ ] Modify the main `__init__.py`: `src/openllm/models/__init__.py`
- [ ] Develop or adjust dummy objects for dependencies, a task exclusive to the
`utils` directory: `src/openllm/utils/*`
For a working example, check out any pre-implemented model.
> We are developing a CLI command and helper script to generate these files,
> which would further streamline the process. Until then, manual creation is
> necessary.
### Model Configuration
File Name: `configuration_{model_name}.py`
This file is dedicated to specifying docstrings, default prompt templates,
default parameters, as well as additional fields for the models.
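For orientation, below is a minimal, hypothetical sketch of such a file, modeled
on the configuration classes touched in this commit; the class name, prompt
template, and URL are placeholders, so copy an existing configuration (e.g.
`configuration_flan_t5.py`) for the full set of fields.

```python
# Hypothetical configuration_my_model.py, modeled on the configs touched in
# this commit (ChatGLMConfig, DollyV2Config, ...). The class name, prompt
# template, and URL are placeholders.
from __future__ import annotations

import openllm

DEFAULT_PROMPT_TEMPLATE = """Answer the following question:\nQuestion: {instruction}\nAnswer:"""


class MyModelConfig(
    openllm.LLMConfig,
    default_timeout=3600000,
    trust_remote_code=True,
    requires_gpu=True,
    url="https://example.com/my-model",
):
    """Placeholder docstring describing the model, its license, and its origin."""
```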
### Model Implementation
File Name: `modeling_{runtime}_{model_name}.py`
For each runtime, i.e., torch (default, no prefix), TensorFlow (`tf`), and Flax
(`flax`), it is necessary to implement a class that adheres to the
`openllm.LLM` interface. The conventional class name follows the
`RuntimeModelName` pattern, e.g., `FlaxFlanT5`.
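The following is only a rough, hypothetical skeleton; the exact members
required by `openllm.LLM` are defined by the base class, so mirror an existing
implementation (e.g. the FLAN-T5 one) for the real contract.

```python
# Hypothetical modeling_flax_my_model.py skeleton. The method shown here is
# illustrative only; consult an existing implementation (e.g. FlaxFlanT5) for
# the exact members the openllm.LLM interface expects.
from __future__ import annotations

import openllm


class FlaxMyModel(openllm.LLM):
    """Flax implementation of the hypothetical my_model."""

    def generate(self, prompt: str, **generation_kwargs):
        # Load the model/tokenizer and run inference here.
        raise NotImplementedError
```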
### Initialization Files
The `__init__.py` files facilitate intelligent imports, type checking, and
auto-completions for the OpenLLM codebase and CLIs.
### Entrypoint
After establishing the model config and implementation class, register them in
the `auto` folder files. There are four entrypoint files:
- `configuration_auto.py`: Registers `ModelConfig` classes
- `modeling_auto.py`: Registers a model's PyTorch implementation
- `modeling_tf_auto.py`: Registers a model's TensorFlow implementation
- `modeling_flax_auto.py`: Registers a model's Flax implementation
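Once the entrypoints are updated, the new model should be discoverable through
the public mappings. A quick, hypothetical sanity check (the model name below
is a placeholder; `openllm.CONFIG_MAPPING` is the same mapping that
`tools/update-readme.py` iterates over):

```python
# "my_model" is a placeholder snake_case name for the newly added model.
import openllm

assert "my_model" in openllm.CONFIG_MAPPING
print(openllm.CONFIG_MAPPING["my_model"].__openllm_url__)
```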
### Dummy Objects
In the `src/openllm/utils` directory, dummy objects are created for each model
and runtime implementation. These specify the dependencies required for each
model.
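Conceptually, a dummy object is a stand-in that raises a helpful error when a
backend is missing. The sketch below is only illustrative; the real dummy
objects in `src/openllm/utils/*` are generated with the project's own helpers,
which may differ.

```python
# Conceptual sketch only: the real dummy objects live in src/openllm/utils/*
# and are produced by the project's own tooling.
class FlaxMyModel:
    """Stand-in exposed when Flax is not installed."""

    def __init__(self, *args, **kwargs):
        raise ImportError(
            "FlaxMyModel requires Flax; install the model's optional dependencies."
        )
```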
### Updating README.md
Run `./tools/update-readme.py` to update the README.md file with the new model.
## Raise a Pull Request
Once you have completed the checklist above, raise a PR and the OpenLLM
maintainers will review it ASAP. Once the PR is merged, you should be able to
see your model in the next release! 🎉 🎊

View File

@@ -97,6 +97,13 @@ start interacting with the model:
>>> client.query('Explain to me the difference between "further" and "farther"')
```
You can also use the `openllm query` command to query the model from the
terminal:
```bash
openllm query --local 'Explain to me the difference between "further" and "farther"'
```
## 🚀 Deploying to Production
To deploy your LLMs into production:
@@ -131,27 +138,23 @@ To deploy your LLMs into production:
OpenLLM currently supports the following:
<!-- update-readme.py: start -->
### Model-specific Dependencies
| Model | CPU | GPU | Optional |
| --------------------------------------------------------------------- | --- | --- | -------------------------------- |
| [flan-t5](https://huggingface.co/docs/transformers/model_doc/flan-t5) | ✅ | ✅ | `pip install openllm[flan-t5]` |
| [dolly-v2](https://github.com/databrickslabs/dolly) | ✅ | ✅ | 👾 (not needed) |
| [chatglm](https://github.com/THUDM/ChatGLM-6B) | ❌ | ✅ | `pip install openllm[chatglm]` |
| [starcoder](https://github.com/bigcode-project/starcoder) | ❌ | ✅ | `pip install openllm[starcoder]` |
| [falcon](https://falconllm.tii.ae/) | ❌ | ✅ | `pip install openllm[falcon]` |
| [stablelm](https://github.com/Stability-AI/StableLM) | ✅ | ✅ | 👾 (not needed) |
> NOTE: We respect users' system disk space. Hence, OpenLLM doesn't force you to
> install the dependencies for all models. If you wish to use any of the
> aforementioned models, make sure to install the optional dependencies
> mentioned above.
To enable support for a specific model, you'll need to install its corresponding
dependencies. You can do this by using `pip install "openllm[model_name]"`. For
example, to use **chatglm**:
```bash
pip install "openllm[chatglm]"
```
This will additionally install `cpm_kernels` and `sentencepiece`.
<!-- update-readme.py: stop -->
### Runtime Implementations

View File

@@ -98,6 +98,7 @@ dependencies = [
"pytest-randomly",
"pytest-rerunfailures",
"pre-commit",
"tomlkit",
]
[tool.hatch.envs.default.scripts]
cov = ["test-cov", "cov-report"]

View File

@@ -649,6 +649,9 @@ class LLMConfig:
__openllm_hints__: dict[str, t.Any] = Field(None, init=False)
"""An internal cache of resolved types for this LLMConfig."""
__openllm_url__: str = Field(None, init=False)
"""The resolved url for this LLMConfig."""
GenerationConfig: type = type
"""Users can override this subclass of any given LLMConfig to provide GenerationConfig
default value. For example:
@@ -678,6 +681,7 @@ class LLMConfig:
default_timeout: int | None = None,
trust_remote_code: bool = False,
requires_gpu: bool = False,
url: str | None = None,
):
if name_type == "dasherize":
model_name = inflection.underscore(cls.__name__.replace("Config", ""))
@@ -694,6 +698,7 @@ class LLMConfig:
cls.__openllm_model_name__ = model_name
cls.__openllm_start_name__ = start_name
cls.__openllm_env__ = openllm.utils.ModelEnv(model_name)
cls.__openllm_url__ = url or "(not set)"
# NOTE: Since we want to enable a pydantic-like experience
# this means we will have to hide the attr abstraction, and generate
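In effect, any config subclass can now pass `url` as a class keyword, and it is
stored on the class (falling back to `"(not set)"` when omitted). A small,
hypothetical illustration:

```python
# Hypothetical illustration of the new ``url`` class keyword: it is stored as
# ``__openllm_url__``, which tools/update-readme.py later renders as the model
# link in the README table.
import openllm


class ExampleConfig(openllm.LLMConfig, url="https://example.com/example-model"):
    """Placeholder config used only to illustrate the keyword."""


assert ExampleConfig.__openllm_url__ == "https://example.com/example-model"
```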

View File

@@ -22,6 +22,7 @@ class ChatGLMConfig(
trust_remote_code=True,
default_timeout=3600000,
requires_gpu=True,
url="https://github.com/THUDM/ChatGLM-6B",
):
"""
ChatGLM is an open bilingual language model based on

View File

@@ -20,7 +20,12 @@ from __future__ import annotations
import openllm
class DollyV2Config(
openllm.LLMConfig,
default_timeout=3600000,
trust_remote_code=True,
url="https://github.com/databrickslabs/dolly",
):
"""Databricks Dolly is an instruction-following large language model trained on the Databricks
machine learning platform that is licensed for commercial use.

View File

@@ -22,6 +22,7 @@ class FalconConfig(
trust_remote_code=True,
requires_gpu=True,
default_timeout=3600000,
url="https://falconllm.tii.ae/",
):
"""Falcon-7B is a 7B parameters causal decoder-only model built by
TII and trained on 1,500B tokens of [RefinedWeb](https://huggingface.co/datasets/tiiuae/falcon-refinedweb)

View File

@@ -40,7 +40,7 @@ saved pretrained, or a fine-tune FLAN-T5, provide ``OPENLLM_FLAN_T5_PRETRAINED='
DEFAULT_PROMPT_TEMPLATE = """Answer the following question:\nQuestion: {instruction}\nAnswer:"""
class FlanT5Config(openllm.LLMConfig, url="https://huggingface.co/docs/transformers/model_doc/flan-t5"):
"""FLAN-T5 was released in the paper [Scaling Instruction-Finetuned Language Models](https://arxiv.org/pdf/2210.11416.pdf)
- it is an enhanced version of T5 that has been finetuned in a mixture of tasks.

View File

@@ -16,7 +16,7 @@ from __future__ import annotations
import openllm
class StableLMConfig(openllm.LLMConfig, name_type="lowercase"):
class StableLMConfig(openllm.LLMConfig, name_type="lowercase", url="https://github.com/Stability-AI/StableLM"):
"""StableLM-Base-Alpha is a suite of 3B and 7B parameter decoder-only language models
pre-trained on a diverse collection of English datasets with a sequence
length of 4096 to push beyond the context window limitations of existing open-source language models.

View File

@@ -16,7 +16,12 @@ from __future__ import annotations
import openllm
class StarCoderConfig(openllm.LLMConfig, name_type="lowercase", requires_gpu=True):
class StarCoderConfig(
openllm.LLMConfig,
name_type="lowercase",
requires_gpu=True,
url="https://github.com/bigcode-project/starcoder",
):
"""The StarCoder models are 15.5B parameter models trained on 80+ programming languages from
[The Stack (v1.2)](https://huggingface.co/datasets/bigcode/the-stack), with opt-out requests excluded.

tools/update-readme.py (new executable file, 95 lines)
View File

@@ -0,0 +1,95 @@
#!/usr/bin/env python3
from __future__ import annotations

import os
import typing as t

import inflection
import tomlkit

import openllm

# Sentinel comments in README.md; everything between them is regenerated.
START_COMMENT = f"<!-- {os.path.basename(__file__)}: start -->\n"
END_COMMENT = f"<!-- {os.path.basename(__file__)}: stop -->\n"

ROOT = os.path.dirname(os.path.dirname(os.path.abspath(__file__)))


def main() -> int:
    with open(os.path.join(ROOT, "pyproject.toml"), "r") as f:
        deps = tomlkit.parse(f.read()).value["project"]["optional-dependencies"]
    with open(os.path.join(ROOT, "README.md"), "r") as f:
        readme = f.readlines()

    start_index, stop_index = readme.index(START_COMMENT), readme.index(END_COMMENT)

    formatted: dict[t.Literal["Model", "CPU", "GPU", "Optional"], list[str]] = {
        "Model": [],
        "CPU": [],
        "GPU": [],
        "Optional": [],
    }
    max_name_len_div = 0
    max_install_len_div = 0
    does_not_need_custom_installation: list[str] = []

    for name, config in openllm.CONFIG_MAPPING.items():
        dashed = inflection.dasherize(name)
        model_name = f"[{dashed}]({config.__openllm_url__})"
        if len(model_name) > max_name_len_div:
            max_name_len_div = len(model_name)
        formatted["Model"].append(model_name)
        # NOTE: every model runs on GPU; CPU support depends on requires_gpu.
        formatted["GPU"].append("✅")
        formatted["CPU"].append("✅" if not config.__openllm_requires_gpu__ else "❌")
        instruction = "👾 (not needed)"
        if dashed in deps:
            instruction = f"`pip install openllm[{dashed}]`"
        else:
            does_not_need_custom_installation.append(model_name)
        if len(instruction) > max_install_len_div:
            max_install_len_div = len(instruction)
        formatted["Optional"].append(instruction)

    meta: list[str] = ["\n"]
    # NOTE: headers
    meta.append(
        f"| Model {' ' * (max_name_len_div - 6)} | CPU | GPU | Optional {' ' * (max_install_len_div - 8)}|\n"
    )
    # NOTE: divs
    meta.append(f"| {'-' * max_name_len_div}" + " | --- | --- | " + f"{'-' * max_install_len_div} |\n")
    # NOTE: rows
    for links, cpu, gpu, custom_installation in t.cast("tuple[str, str, str, str]", zip(*formatted.values())):
        meta.append(
            "| "
            + links
            + " " * (max_name_len_div - len(links))
            + f" | {cpu} | {gpu} | "
            + custom_installation
            + " "
            * (
                max_install_len_div
                - len(custom_installation)
                # NOTE: compensate for the extra display width of the 👾 emoji.
                - (0 if links not in does_not_need_custom_installation else 1)
            )
            + " |\n"
        )
    meta.append("\n")
    # NOTE: adding notes
    meta.append(
        """\
> NOTE: We respect users' system disk space. Hence, OpenLLM doesn't force you to
> install the dependencies for all models. If you wish to use any of the
> aforementioned models, make sure to install the optional dependencies
> mentioned above.
"""
    )
    # Splice the regenerated block back between the sentinel comments.
    readme = readme[:start_index] + [START_COMMENT] + meta + [END_COMMENT] + readme[stop_index + 1 :]
    with open(os.path.join(ROOT, "README.md"), "w") as f:
        f.writelines(readme)
    return 0


if __name__ == "__main__":
    raise SystemExit(main())