diff --git a/ADDING_NEW_MODEL.md b/ADDING_NEW_MODEL.md
index da062c49..b128d084 100644
--- a/ADDING_NEW_MODEL.md
+++ b/ADDING_NEW_MODEL.md
@@ -1,44 +1,76 @@
 # Adding a New Model
-OpenLLM encourages contributions by welcoming users to incorporate their custom Large Language Models (LLMs) into the ecosystem. You can set up your development environment by referring to our [Developer Guide](https://github.com/bentoml/OpenLLM/blob/main/DEVELOPMENT.md).
+OpenLLM encourages contributions by welcoming users to incorporate their custom
+Large Language Models (LLMs) into the ecosystem. You can set up your development
+environment by referring to our
+[Developer Guide](https://github.com/bentoml/OpenLLM/blob/main/DEVELOPMENT.md).
 ## Procedure
-All the relevant code for incorporating a new model resides within `src/openllm/models`. Start by creating a new folder named after your `model_name` in snake_case. Here's your roadmap:
+All the relevant code for incorporating a new model resides within
+`src/openllm/models`. Start by creating a new folder named after your
+`model_name` in snake_case. Here's your roadmap:
-- [ ] Generate model configuration file: `src/openllm/models/{model_name}/configuration_{model_name}.py`
-- [ ] Establish model implementation files: `src/openllm/models/{model_name}/modeling_{runtime}_{model_name}.py`
-- [ ] Create module's `__init__.py`: `src/openllm/models/{model_name}/__init__.py`
+- [ ] Generate model configuration file:
+  `src/openllm/models/{model_name}/configuration_{model_name}.py`
+- [ ] Establish model implementation files:
+  `src/openllm/models/{model_name}/modeling_{runtime}_{model_name}.py`
+- [ ] Create module's `__init__.py`:
+  `src/openllm/models/{model_name}/__init__.py`
 - [ ] Adjust the entrypoints for files at `src/openllm/models/auto/*`
 - [ ] Modify the main `__init__.py`: `src/openllm/models/__init__.py`
-- [ ] Develop or adjust dummy objects for dependencies, a task exclusive to the `utils` directory: `src/openllm/utils/*`
+- [ ] Develop or adjust dummy objects for dependencies, a task exclusive to the
+  `utils` directory: `src/openllm/utils/*`
 For a working example, check out any pre-implemented model.
-> We are developing a CLI command and helper script to generate these files, which would further streamline the process. Until then, manual creation is necessary.
+> We are developing a CLI command and helper script to generate these files,
+> which would further streamline the process. Until then, manual creation is
+> necessary.
 ### Model Configuration
+
 File Name: `configuration_{model_name}.py`
-This file is dedicated to specifying docstrings, default prompt templates, default parameters, as well as additional fields for the models.
+This file is dedicated to specifying docstrings, default prompt templates,
+default parameters, as well as additional fields for the models.
 ### Model Implementation
+
 File Name: `modeling_{runtime}_{model_name}.py`
-For each runtime, i.e., torch (default with no prefix), TensorFlow - `tf`, Flax - `flax`, it is necessary to implement a class that adheres to the `openllm.LLM` interface. The conventional class name follows the `RuntimeModelName` pattern, e.g., `FlaxFlanT5`.
+For each runtime, i.e., PyTorch (the default, with no file name prefix),
+TensorFlow (`tf`), and Flax (`flax`), it is necessary to implement a class that
+adheres to the `openllm.LLM` interface. The conventional class name follows the
+`RuntimeModelName` pattern, e.g., `FlaxFlanT5`.
-### Initialization Files
-The `__init__.py` files facilitate intelligent imports, type checking, and auto-completions for the OpenLLM codebase and CLIs.
+### Initialization Files
+
+The `__init__.py` files facilitate intelligent imports, type checking, and
+auto-completions for the OpenLLM codebase and CLIs.
 ### Entrypoint
-After establishing the model config and implementation class, register them in the `auto` folder files. There are four entrypoint files:
-* `configuration_auto.py`: Registers `ModelConfig` classes
-* `modeling_auto.py`: Registers a model's PyTorch implementation
-* `modeling_tf_auto.py`: Registers a model's TensorFlow implementation
-* `modeling_flax_auto.py`: Registers a model's Flax implementation
+
+After establishing the model config and implementation class, register them in
+the `auto` folder files. There are four entrypoint files:
+
+- `configuration_auto.py`: Registers `ModelConfig` classes
+- `modeling_auto.py`: Registers a model's PyTorch implementation
+- `modeling_tf_auto.py`: Registers a model's TensorFlow implementation
+- `modeling_flax_auto.py`: Registers a model's Flax implementation
 ### Dummy Objects
-In the `src/openllm/utils` directory, dummy objects are created for each model and runtime implementation. These specify the dependencies required for each model.
+
+In the `src/openllm/utils` directory, dummy objects are created for each model
+and runtime implementation. These specify the dependencies required for each
+model.
+
+### Updating README.md
+
+Run `./tools/update-readme.py` to update the README.md file with the new model.
 ## Raise a Pull Request
-Once you have completed the checklist above, raise a PR and the OpenLLMs maintainer will review it ASAP. Once the PR is merged, you should be able to see your model in the next release! πŸŽ‰ 🎊
\ No newline at end of file
+
+Once you have completed the checklist above, raise a PR and an OpenLLM
+maintainer will review it as soon as possible. Once the PR is merged, you should
+be able to see your model in the next release! πŸŽ‰ 🎊
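For orientation, below is a minimal sketch of the configuration file the guide above asks for, written for a hypothetical model. `MyModelConfig`, the URL, and the `GenerationConfig` fields are illustrative placeholders, not part of OpenLLM; the class keyword arguments (`requires_gpu`, `default_timeout`, `trust_remote_code`, `url`) are the ones exercised by the configuration changes later in this patch.

```python
# Sketch of src/openllm/models/my_model/configuration_my_model.py (hypothetical model).
from __future__ import annotations

import openllm

# Default prompt template, as in configuration_flan_t5.py elsewhere in this patch.
DEFAULT_PROMPT_TEMPLATE = """Answer the following question:\nQuestion: {instruction}\nAnswer:"""


class MyModelConfig(
    openllm.LLMConfig,
    requires_gpu=False,
    default_timeout=3600000,
    trust_remote_code=False,
    url="https://example.com/my-model",  # stored as __openllm_url__ and surfaced in the README table
):
    """One-paragraph description of the model, used for docs and CLI help."""

    class GenerationConfig:
        # Assumed default generation fields; adjust to whatever the model needs.
        max_new_tokens: int = 128
        temperature: float = 0.75
```

The matching `modeling_{runtime}_{model_name}.py` files then provide the runtime-specific classes (e.g., `MyModel`, `TFMyModel`, `FlaxMyModel`) implementing the `openllm.LLM` interface.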
diff --git a/README.md b/README.md
index ffba36b5..f7f39c78 100644
--- a/README.md
+++ b/README.md
@@ -97,6 +97,13 @@ start interacting with the model:
 >>> client.query('Explain to me the difference between "further" and "farther"')
 ```
+You can also use the `openllm query` command to query the model from the
+terminal:
+
+```bash
+openllm query --local 'Explain to me the difference between "further" and "farther"'
+```
+
 ## πŸš€ Deploying to Production
 To deploy your LLMs into production:
@@ -131,27 +138,23 @@ To deploy your LLMs into production:
 OpenLLM currently supports the following:
-- [dolly-v2](https://github.com/databrickslabs/dolly)
-- [flan-t5](https://huggingface.co/docs/transformers/model_doc/flan-t5)
-- [chatglm](https://github.com/THUDM/ChatGLM-6B)
-- [falcon](https://falconllm.tii.ae/)
-- [starcoder](https://github.com/bigcode-project/starcoder)
+<!-- update-readme.py: start -->
-### Model-specific Dependencies
+| Model                                                                  | CPU | GPU | Optional                         |
+| ---------------------------------------------------------------------- | --- | --- | -------------------------------- |
+| [flan-t5](https://huggingface.co/docs/transformers/model_doc/flan-t5)  | βœ… | βœ… | `pip install openllm[flan-t5]`   |
+| [dolly-v2](https://github.com/databrickslabs/dolly)                    | βœ… | βœ… | πŸ‘Ύ (not needed)                  |
+| [chatglm](https://github.com/THUDM/ChatGLM-6B)                         | ❌ | βœ… | `pip install openllm[chatglm]`   |
+| [starcoder](https://github.com/bigcode-project/starcoder)              | ❌ | βœ… | `pip install openllm[starcoder]` |
+| [falcon](https://falconllm.tii.ae/)                                    | ❌ | βœ… | `pip install openllm[falcon]`    |
+| [stablelm](https://github.com/Stability-AI/StableLM)                   | βœ… | βœ… | πŸ‘Ύ (not needed)                  |
-We respect your system's space and efficiency. That's why we don't force users
-to install dependencies for all models. By default, you can run `dolly-v2` and
-`flan-t5` without installing any additional packages.
+> NOTE: To respect your disk space, OpenLLM does not install the dependencies
+> for every model by default. To run one of the models above, install its
+> optional dependencies as listed in the table.
-To enable support for a specific model, you'll need to install its corresponding
-dependencies. You can do this by using `pip install "openllm[model_name]"`. For
-example, to use **chatglm**:
-
-```bash
-pip install "openllm[chatglm]"
-```
-
-This will install `cpm_kernels` and `sentencepiece` additionally
+<!-- update-readme.py: stop -->
 ### Runtime Implementations
diff --git a/pyproject.toml b/pyproject.toml
index 1282f7c4..dd7048e2 100644
--- a/pyproject.toml
+++ b/pyproject.toml
@@ -98,6 +98,7 @@ dependencies = [
   "pytest-randomly",
   "pytest-rerunfailures",
   "pre-commit",
+  "tomlkit",
 ]
 [tool.hatch.envs.default.scripts]
 cov = ["test-cov", "cov-report"]
diff --git a/src/openllm/_configuration.py b/src/openllm/_configuration.py
index 5d1fa75e..355fcfec 100644
--- a/src/openllm/_configuration.py
+++ b/src/openllm/_configuration.py
@@ -649,6 +649,9 @@ class LLMConfig:
     __openllm_hints__: dict[str, t.Any] = Field(None, init=False)
     """An internal cache of resolved types for this LLMConfig."""
+    __openllm_url__: str = Field(None, init=False)
+    """The resolved url for this LLMConfig."""
+
     GenerationConfig: type = type
     """Users can override this subclass of any given LLMConfig to provide GenerationConfig default value.
 For example:
@@ -678,6 +681,7 @@ class LLMConfig:
         default_timeout: int | None = None,
         trust_remote_code: bool = False,
         requires_gpu: bool = False,
+        url: str | None = None,
     ):
         if name_type == "dasherize":
             model_name = inflection.underscore(cls.__name__.replace("Config", ""))
@@ -694,6 +698,7 @@ class LLMConfig:
         cls.__openllm_model_name__ = model_name
         cls.__openllm_start_name__ = start_name
         cls.__openllm_env__ = openllm.utils.ModelEnv(model_name)
+        cls.__openllm_url__ = url or "(not set)"
         # NOTE: Since we want to enable a pydantic-like experience
         # this means we will have to hide the attr abstraction, and generate
diff --git a/src/openllm/models/chatglm/configuration_chatglm.py b/src/openllm/models/chatglm/configuration_chatglm.py
index 2e06afdb..8c742729 100644
--- a/src/openllm/models/chatglm/configuration_chatglm.py
+++ b/src/openllm/models/chatglm/configuration_chatglm.py
@@ -22,6 +22,7 @@ class ChatGLMConfig(
     trust_remote_code=True,
     default_timeout=3600000,
     requires_gpu=True,
+    url="https://github.com/THUDM/ChatGLM-6B",
 ):
     """
     ChatGLM is an open bilingual language model based on
diff --git a/src/openllm/models/dolly_v2/configuration_dolly_v2.py b/src/openllm/models/dolly_v2/configuration_dolly_v2.py
index 5faaa599..38aa9881 100644
--- a/src/openllm/models/dolly_v2/configuration_dolly_v2.py
+++ b/src/openllm/models/dolly_v2/configuration_dolly_v2.py
@@ -20,7 +20,12 @@ from __future__ import annotations
 import openllm
-class DollyV2Config(openllm.LLMConfig, default_timeout=3600000, trust_remote_code=True):
+class DollyV2Config(
+    openllm.LLMConfig,
+    default_timeout=3600000,
+    trust_remote_code=True,
+    url="https://github.com/databrickslabs/dolly",
+):
     """Databricks’ Dolly is an instruction-following large language model trained on the
     Databricks machine learning platform that is licensed for commercial use.
diff --git a/src/openllm/models/falcon/configuration_falcon.py b/src/openllm/models/falcon/configuration_falcon.py
index 3d5a63b2..36818860 100644
--- a/src/openllm/models/falcon/configuration_falcon.py
+++ b/src/openllm/models/falcon/configuration_falcon.py
@@ -22,6 +22,7 @@ class FalconConfig(
     trust_remote_code=True,
     requires_gpu=True,
     default_timeout=3600000,
+    url="https://falconllm.tii.ae/",
 ):
     """Falcon-7B is a 7B parameters causal decoder-only model built by TII and trained on
     1,500B tokens of [RefinedWeb](https://huggingface.co/datasets/tiiuae/falcon-refinedweb)
diff --git a/src/openllm/models/flan_t5/configuration_flan_t5.py b/src/openllm/models/flan_t5/configuration_flan_t5.py
index ca5354ea..9f0584e8 100644
--- a/src/openllm/models/flan_t5/configuration_flan_t5.py
+++ b/src/openllm/models/flan_t5/configuration_flan_t5.py
@@ -40,7 +40,7 @@ saved pretrained, or a fine-tune FLAN-T5, provide ``OPENLLM_FLAN_T5_PRETRAINED='
 DEFAULT_PROMPT_TEMPLATE = """Answer the following question:\nQuestion: {instruction}\nAnswer:"""
-class FlanT5Config(openllm.LLMConfig):
+class FlanT5Config(openllm.LLMConfig, url="https://huggingface.co/docs/transformers/model_doc/flan-t5"):
     """FLAN-T5 was released in the paper [Scaling Instruction-Finetuned Language Models](https://arxiv.org/pdf/2210.11416.pdf)
     - it is an enhanced version of T5 that has been finetuned in a mixture of tasks.
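The `url` keyword being added across these configuration files flows through `LLMConfig.__init_subclass__` into the `__openllm_url__` class attribute, which the new README tool reads. A small, hypothetical sanity check of that wiring (not part of the patch) could iterate the same registry the tool uses:

```python
# Quick check that each registered config now carries a URL.
import openllm

for name, config in openllm.CONFIG_MAPPING.items():
    # Configs defined without `url=` fall back to the "(not set)" sentinel.
    print(f"{name}: {config.__openllm_url__}")
```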
diff --git a/src/openllm/models/stablelm/configuration_stablelm.py b/src/openllm/models/stablelm/configuration_stablelm.py
index fbab48a0..05089715 100644
--- a/src/openllm/models/stablelm/configuration_stablelm.py
+++ b/src/openllm/models/stablelm/configuration_stablelm.py
@@ -16,7 +16,7 @@ from __future__ import annotations
 import openllm
-class StableLMConfig(openllm.LLMConfig, name_type="lowercase"):
+class StableLMConfig(openllm.LLMConfig, name_type="lowercase", url="https://github.com/Stability-AI/StableLM"):
     """StableLM-Base-Alpha is a suite of 3B and 7B parameter decoder-only language models
     pre-trained on a diverse collection of English datasets with a sequence length of 4096
     to push beyond the context window limitations of existing open-source language models.
diff --git a/src/openllm/models/starcoder/configuration_starcoder.py b/src/openllm/models/starcoder/configuration_starcoder.py
index 259146b4..c210392f 100644
--- a/src/openllm/models/starcoder/configuration_starcoder.py
+++ b/src/openllm/models/starcoder/configuration_starcoder.py
@@ -16,7 +16,12 @@ from __future__ import annotations
 import openllm
-class StarCoderConfig(openllm.LLMConfig, name_type="lowercase", requires_gpu=True):
+class StarCoderConfig(
+    openllm.LLMConfig,
+    name_type="lowercase",
+    requires_gpu=True,
+    url="https://github.com/bigcode-project/starcoder",
+):
     """The StarCoder models are 15.5B parameter models trained on 80+ programming languages from
     [The Stack (v1.2)](https://huggingface.co/datasets/bigcode/the-stack), with opt-out requests excluded.
diff --git a/tools/update-readme.py b/tools/update-readme.py
new file mode 100755
index 00000000..7c75daf2
--- /dev/null
+++ b/tools/update-readme.py
@@ -0,0 +1,95 @@
+#!/usr/bin/env python3
+
+from __future__ import annotations
+
+import os
+import typing as t
+
+import inflection
+import tomlkit
+
+import openllm
+
+START_COMMENT = "<!-- update-readme.py: start -->\n"
+END_COMMENT = "<!-- update-readme.py: stop -->\n"
+
+ROOT = os.path.dirname(os.path.dirname(os.path.abspath(__file__)))
+
+
+def main() -> int:
+    with open(os.path.join(ROOT, "pyproject.toml"), "r") as f:
+        deps = tomlkit.parse(f.read()).value["project"]["optional-dependencies"]
+
+    with open(os.path.join(ROOT, "README.md"), "r") as f:
+        readme = f.readlines()
+
+    start_index, stop_index = readme.index(START_COMMENT), readme.index(END_COMMENT)
+    formatted: dict[t.Literal["Model", "CPU", "GPU", "Optional"], list[str]] = {
+        "Model": [],
+        "CPU": [],
+        "GPU": [],
+        "Optional": [],
+    }
+    max_name_len_div = 0
+    max_install_len_div = 0
+    does_not_need_custom_installation: list[str] = []
+    for name, config in openllm.CONFIG_MAPPING.items():
+        dashed = inflection.dasherize(name)
+        model_name = f"[{dashed}]({config.__openllm_url__})"
+        if len(model_name) > max_name_len_div:
+            max_name_len_div = len(model_name)
+        formatted["Model"].append(model_name)
+        formatted["GPU"].append("βœ…")
+        formatted["CPU"].append("βœ…" if not config.__openllm_requires_gpu__ else "❌")
+        instruction = "πŸ‘Ύ (not needed)"
+        if dashed in deps:
+            instruction = f"`pip install openllm[{dashed}]`"
+        else:
+            does_not_need_custom_installation.append(model_name)
+        if len(instruction) > max_install_len_div:
+            max_install_len_div = len(instruction)
+        formatted["Optional"].append(instruction)
+
+    meta = ["\n"]
+
+    # NOTE: headers (use .append so each table line stays a single list element)
+    meta.append(f"| Model {' ' * (max_name_len_div - 6)} | CPU | GPU | Optional {' ' * (max_install_len_div - 8)}|\n")
+    # NOTE: divs
+    meta.append(f"| {'-' * max_name_len_div}" + " | --- | --- | " + f"{'-' * max_install_len_div} |\n")
+    # NOTE: rows
+    for links, cpu, gpu, custom_installation in t.cast("tuple[str, str, str, str]", zip(*formatted.values())):
+        meta.append(
+            "| "
+            + links
+            + " " * (max_name_len_div - len(links))
+            + f" | {cpu} | {gpu} | "
+            + custom_installation
+            + " "
+            * (
+                max_install_len_div
+                - len(custom_installation)
+                - (0 if links not in does_not_need_custom_installation else 1)
+            )
+            + " |\n"
+        )
+    meta.append("\n")
+
+    # NOTE: adding notes
+    meta.append(
+        """\
+> NOTE: To respect your disk space, OpenLLM does not install the dependencies
+> for every model by default. To run one of the models above, install its
+> optional dependencies as listed in the table.
+
+"""
+    )
+
+    readme = readme[:start_index] + [START_COMMENT] + meta + [END_COMMENT] + readme[stop_index + 1 :]
+
+    with open(os.path.join(ROOT, "README.md"), "w") as f:
+        f.writelines(readme)
+
+    return 0
+
+
+if __name__ == "__main__":
+    raise SystemExit(main())
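To make the row-building arithmetic above concrete, here is a standalone rendering of one row with made-up column widths; the actual script derives `max_name_len_div` and `max_install_len_div` from the widest model link and install instruction it encounters, and trims one extra space on the πŸ‘Ύ rows to offset the double-width emoji.

```python
# Illustration only: pad each cell to its (assumed) column width, as the script does per row.
links = "[chatglm](https://github.com/THUDM/ChatGLM-6B)"
custom_installation = "`pip install openllm[chatglm]`"
max_name_len_div = 69     # assumed width of the widest model link
max_install_len_div = 32  # assumed width of the widest install instruction

row = (
    "| "
    + links + " " * (max_name_len_div - len(links))
    + " | ❌ | βœ… | "
    + custom_installation + " " * (max_install_len_div - len(custom_installation))
    + " |"
)
print(row)
```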