diff --git a/openllm-python/README.md b/openllm-python/README.md
index cf7afaab..8ba82b92 100644
--- a/openllm-python/README.md
+++ b/openllm-python/README.md
@@ -22,7 +22,7 @@
 Hatch
-code style
+code style
 Ruff
@@ -407,17 +407,19 @@ pip install "openllm[baichuan]"
 ### Runtime Implementations (Experimental)
 
 Different LLMs may have multiple runtime implementations. For instance, they
-might use Pytorch (`pt`), Tensorflow (`tf`), or Flax (`flax`).
+might use Pytorch (`pt`), Tensorflow (`tf`), Flax (`flax`), or vLLM (`vllm`).
 
 If you wish to specify a particular runtime for a model, you can do so by
-setting the `OPENLLM_{MODEL_NAME}_FRAMEWORK={runtime}` environment variable
+setting the `OPENLLM_BACKEND={runtime}` environment variable
 before running `openllm start`.
 
 For example, if you want to use the Tensorflow (`tf`) implementation for the
 `flan-t5` model, you can use the following command:
 
 ```bash
-OPENLLM_FLAN_T5_FRAMEWORK=tf openllm start flan-t5
+OPENLLM_BACKEND=tf openllm start flan-t5
+
+openllm start flan-t5 --backend tf
 ```
 
 > [!NOTE]
@@ -425,6 +427,9 @@ OPENLLM_FLAN_T5_FRAMEWORK=tf openllm start flan-t5
 > [Jax's installation](https://github.com/google/jax#pip-installation-gpu-cuda-installed-via-pip-easier)
 > to make sure that you have Jax support for the corresponding CUDA version.
 
+> [!IMPORTANT]
+> To use the vLLM backend, a GPU with Ampere or newer architecture and CUDA 11.8 are required.
+
 ### Quantisation
 
 OpenLLM supports quantisation with
@@ -543,10 +548,10 @@ client.embed("I like to eat apples")
 The following UIs are currently available for OpenLLM:
 
-| UI                                                                                        | Owner                                        | Type                 | Progress |
-|-------------------------------------------------------------------------------------------|----------------------------------------------|----------------------|----------|
-| [Clojure](https://github.com/bentoml/OpenLLM/blob/main/openllm-contrib/clojure/README.md) | [@GutZuFusss](https://github.com/GutZuFusss) | Community-maintained | 🔧       |
-| TS                                                                                        | BentoML Team                                 |                      | 🚧       |
+| UI                                                                                | Owner                                        | Type                 | Progress |
+|-----------------------------------------------------------------------------------|----------------------------------------------|----------------------|----------|
+| [Clojure](https://github.com/bentoml/OpenLLM/blob/main/contrib/clojure/README.md) | [@GutZuFusss](https://github.com/GutZuFusss) | Community-maintained | 🔧       |
+| TS                                                                                | BentoML Team                                 |                      | 🚧       |
 
 ## ⚙️ Integrations
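
The backend-selection flow this diff documents (pick a runtime via `OPENLLM_BACKEND` or `--backend`, with vLLM gated on Ampere-or-newer GPUs) can be sketched in Python. This is a minimal illustration only: `select_backend` and its parameters are hypothetical helpers, not part of the OpenLLM API; only the `OPENLLM_BACKEND` variable name comes from the README.

```python
import os

def select_backend(preferred="vllm", cuda_available=False, compute_capability=(0, 0)):
    """Hypothetical helper: fall back from vLLM to PyTorch when the
    hardware requirement (Ampere or newer, i.e. CUDA compute capability
    >= 8.0) is not met."""
    if preferred == "vllm" and (not cuda_available or compute_capability < (8, 0)):
        return "pt"  # PyTorch backend as the fallback
    return preferred

# Export the choice so a subsequent `openllm start` picks it up.
os.environ["OPENLLM_BACKEND"] = select_backend(
    "vllm", cuda_available=True, compute_capability=(8, 6)
)
```

Passing `--backend` on the command line, as the second example in the diff shows, is equivalent to setting the environment variable.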