From 5e97329bcb1ef1fbe254f01fbe3a070f3c04c0b6 Mon Sep 17 00:00:00 2001 From: Aaron Pham <29749331+aarnphm@users.noreply.github.com> Date: Thu, 23 May 2024 12:50:01 -0400 Subject: [PATCH] infra: prepare 0.5 releases (#996) * chore: prepare for 0.5 Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com> * chore: update changelogs Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com> * chore: fix to lowest python version supported Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com> * chore: update scripts Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com> --------- Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com> --- .python-version-default | 2 +- README.md | 125 ++---------------------------------- changelog.d/996.breaking.md | 7 ++ openllm-python/README.md | 125 ++---------------------------------- 4 files changed, 22 insertions(+), 237 deletions(-) create mode 100644 changelog.d/996.breaking.md diff --git a/.python-version-default b/.python-version-default index 2c073331..bd28b9c5 100644 --- a/.python-version-default +++ b/.python-version-default @@ -1 +1 @@ -3.11 +3.9 diff --git a/README.md b/README.md index 412c99c1..040f9ea2 100644 --- a/README.md +++ b/README.md @@ -23,13 +23,11 @@ OpenLLM helps developers **run any open-source LLMs**, such as Llama 2 and Mistral, as **OpenAI-compatible API endpoints**, locally and in the cloud, optimized for serving throughput and production deployment. - - πŸš‚ Support a wide range of open-source LLMs including LLMs fine-tuned with your own data - ⛓️ OpenAI compatible API endpoints for seamless transition from your LLM app to open-source LLMs - πŸ”₯ State-of-the-art serving and inference performance - 🎯 Simplified cloud deployment via [BentoML](https://www.bentoml.com) - ![Gif showing OpenLLM Intro](/.github/assets/output.gif) @@ -46,29 +44,13 @@ For starter, we provide two ways to quickly try out OpenLLM: Try this [OpenLLM tutorial in Google Colab: Serving Llama 2 with OpenLLM](https://colab.research.google.com/github/bentoml/OpenLLM/blob/main/examples/llama2.ipynb). -### Docker - -We provide a docker container that helps you start running OpenLLM: - -```bash -docker run --rm -it -p 3000:3000 ghcr.io/bentoml/openllm start facebook/opt-1.3b --backend pt -``` - -> [!NOTE] -> Given you have access to GPUs and have setup [nvidia-docker](https://github.com/NVIDIA/nvidia-container-toolkit), you can additionally pass in `--gpus` -> to use GPU for faster inference and optimization ->```bash -> docker run --rm --gpus all -p 3000:3000 -it ghcr.io/bentoml/openllm start HuggingFaceH4/zephyr-7b-beta --backend vllm -> ``` - - ## πŸƒ Get started The following provides instructions for how to get started with OpenLLM locally. ### Prerequisites -You have installed Python 3.8 (or later) andΒ `pip`. We highly recommend using a [Virtual Environment](https://docs.python.org/3/library/venv.html) to prevent package conflicts. +You have installed Python 3.9 (or later) andΒ `pip`. We highly recommend using a [Virtual Environment](https://docs.python.org/3/library/venv.html) to prevent package conflicts. ### Install OpenLLM @@ -82,65 +64,23 @@ To verify the installation, run: ```bash $ openllm -h - -Usage: openllm [OPTIONS] COMMAND [ARGS]... 
- - β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ•— β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ•— β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ•—β–ˆβ–ˆβ–ˆβ•— β–ˆβ–ˆβ•—β–ˆβ–ˆβ•— β–ˆβ–ˆβ•— β–ˆβ–ˆβ–ˆβ•— β–ˆβ–ˆβ–ˆβ•— - β–ˆβ–ˆβ•”β•β•β•β–ˆβ–ˆβ•—β–ˆβ–ˆβ•”β•β•β–ˆβ–ˆβ•—β–ˆβ–ˆβ•”β•β•β•β•β•β–ˆβ–ˆβ–ˆβ–ˆβ•— β–ˆβ–ˆβ•‘β–ˆβ–ˆβ•‘ β–ˆβ–ˆβ•‘ β–ˆβ–ˆβ–ˆβ–ˆβ•— β–ˆβ–ˆβ–ˆβ–ˆβ•‘ - β–ˆβ–ˆβ•‘ β–ˆβ–ˆβ•‘β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ•”β•β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ•— β–ˆβ–ˆβ•”β–ˆβ–ˆβ•— β–ˆβ–ˆβ•‘β–ˆβ–ˆβ•‘ β–ˆβ–ˆβ•‘ β–ˆβ–ˆβ•”β–ˆβ–ˆβ–ˆβ–ˆβ•”β–ˆβ–ˆβ•‘ - β–ˆβ–ˆβ•‘ β–ˆβ–ˆβ•‘β–ˆβ–ˆβ•”β•β•β•β• β–ˆβ–ˆβ•”β•β•β• β–ˆβ–ˆβ•‘β•šβ–ˆβ–ˆβ•—β–ˆβ–ˆβ•‘β–ˆβ–ˆβ•‘ β–ˆβ–ˆβ•‘ β–ˆβ–ˆβ•‘β•šβ–ˆβ–ˆβ•”β•β–ˆβ–ˆβ•‘ - β•šβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ•”β•β–ˆβ–ˆβ•‘ β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ•—β–ˆβ–ˆβ•‘ β•šβ–ˆβ–ˆβ–ˆβ–ˆβ•‘β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ•—β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ•—β–ˆβ–ˆβ•‘ β•šβ•β• β–ˆβ–ˆβ•‘ - β•šβ•β•β•β•β•β• β•šβ•β• β•šβ•β•β•β•β•β•β•β•šβ•β• β•šβ•β•β•β•β•šβ•β•β•β•β•β•β•β•šβ•β•β•β•β•β•β•β•šβ•β• β•šβ•β•. - - An open platform for operating large language models in production. - Fine-tune, serve, deploy, and monitor any LLMs with ease. - -Options: - -v, --version Show the version and exit. - -h, --help Show this message and exit. - -Commands: - build Package a given models into a BentoLLM. - import Setup LLM interactively. - models List all supported models. - prune Remove all saved models, (and optionally bentos) built with OpenLLM locally. - query Query a LLM interactively, from a terminal. - start Start a LLMServer for any supported LLM. - -Extensions: - build-base-container Base image builder for BentoLLM. - dive-bentos Dive into a BentoLLM. - get-containerfile Return Containerfile of any given Bento. - get-prompt Get the default prompt used by OpenLLM. - list-bentos List available bentos built by OpenLLM. - list-models This is equivalent to openllm models... - playground OpenLLM Playground. ``` ### Start a LLM server -OpenLLM allows you to quickly spin up an LLM server using `openllm start`. For example, to start aΒ [phi-2](https://huggingface.co/microsoft/phi-2)Β server, run the following: +OpenLLM allows you to quickly spin up an LLM server using `openllm start`. For example, to start aΒ [Llama 3 8B](https://huggingface.co/meta-llama/Meta-Llama-3-8B)Β server, run the following: ```bash -TRUST_REMOTE_CODE=True openllm start microsoft/phi-2 +openllm start meta-llama/Meta-Llama-3-8B ``` -This starts the server atΒ [http://0.0.0.0:3000/](http://0.0.0.0:3000/). OpenLLM downloads the model to the BentoML local Model Store if it has not been registered before. To view your local models, run `bentoml models list`. - To interact with the server, you can visit the web UI atΒ [http://0.0.0.0:3000/](http://0.0.0.0:3000/) or send a request usingΒ `curl`. You can also use OpenLLM’s built-in Python client to interact with the server: ```python import openllm -client = openllm.client.HTTPClient('http://localhost:3000') -client.query('Explain to me the difference between "further" and "farther"') -``` - -Alternatively, use theΒ `openllm query`Β command to query the model: - -```bash -export OPENLLM_ENDPOINT=http://localhost:3000 -openllm query 'Explain to me the difference between "further" and "farther"' +client = openllm.HTTPClient('http://localhost:3000') +client.generate('Explain to me the difference between "further" and "farther"') ``` OpenLLM seamlessly supports many models and their variants. You can specify different variants of the model to be served. For example: @@ -155,15 +95,6 @@ openllm start -- > architecture. 
Use theΒ `openllm models`Β command to see the complete list of supported > models, their architectures, and their variants. -> [!IMPORTANT] -> If you are testing OpenLLM on CPU, you might want to pass in `DTYPE=float32`. By default, -> OpenLLM will set model `dtype` to `bfloat16` for the best performance. -> ```bash -> DTYPE=float32 openllm start microsoft/phi-2 -> ``` -> This will also applies to older GPUs. If your GPUs doesn't support `bfloat16`, then you also -> want to set `DTYPE=float16`. - ## 🧩 Supported models OpenLLM currently supports the following models. By default, OpenLLM doesn't include dependencies to run all models. The extra model-specific dependencies can be installed with the instructions below. @@ -1097,7 +1028,6 @@ openllm build facebook/opt-6.7b --adapter-id ./path/to/adapter_id --build-ctx . > [!IMPORTANT] > Fine-tuning support is still experimental and currently only works with PyTorch backend. vLLM support is coming soon. - ## βš™οΈ Integrations OpenLLM is not just a standalone product; it's a building block designed to @@ -1115,11 +1045,9 @@ specify the base_url to `llm-endpoint/v1` and you are good to go: ```python import openai -client = openai.OpenAI( - base_url='http://localhost:3000/v1', api_key='na' -) # Here the server is running on localhost:3000 +client = openai.OpenAI(base_url='http://localhost:3000/v1', api_key='na') # Here the server is running on 0.0.0.0:3000 -completions = client.completions.create( +completions = client.chat.completions.create( prompt='Write me a tag line for an ice cream shop.', model=model, max_tokens=64, stream=stream ) ``` @@ -1130,7 +1058,6 @@ The compatible endpoints supports `/completions`, `/chat/completions`, and `/mod > You can find out OpenAI example clients under the > [examples](https://github.com/bentoml/OpenLLM/tree/main/examples) folder. - ### [LlamaIndex](https://docs.llamaindex.ai/en/stable/examples/llm/openllm/) To start a local LLM with `llama_index`, simply use `llama_index.llms.openllm.OpenLLM`: @@ -1172,24 +1099,6 @@ llm = OpenLLM(server_url='http://44.23.123.1:3000', server_type='http') llm('What is the difference between a duck and a goose? And why there are so many Goose in Canada?') ``` -### Transformers Agents - -OpenLLM seamlessly integrates with -[Transformers Agents](https://huggingface.co/docs/transformers/transformers_agents). - -> [!WARNING] -> The Transformers Agent is still at an experimental stage. It is -> recommended to install OpenLLM with `pip install -r nightly-requirements.txt` -> to get the latest API update for HuggingFace agent. - -```python -import transformers - -agent = transformers.HfAgent('http://localhost:3000/hf/agent') # URL that runs the OpenLLM server - -agent.run('Is the following `text` positive or negative?', text="I don't like how this models is generate inputs") -``` - ![Gif showing Agent integration](/.github/assets/agent.gif) @@ -1280,26 +1189,6 @@ Checkout our [Developer Guide](https://github.com/bentoml/OpenLLM/blob/main/DEVELOPMENT.md) if you wish to contribute to OpenLLM's codebase. -## πŸ‡ Telemetry - -OpenLLM collects usage data to enhance user experience and improve the product. -We only report OpenLLM's internal API calls and ensure maximum privacy by -excluding sensitive information. We will never collect user code, model data, or -stack traces. For usage tracking, check out the -[code](https://github.com/bentoml/OpenLLM/blob/main/openllm-core/src/openllm_core/utils/analytics.py). 
- -You can opt out of usage tracking by using the `--do-not-track` CLI option: - -```bash -openllm [command] --do-not-track -``` - -Or by setting the environment variable `OPENLLM_DO_NOT_TRACK=True`: - -```bash -export OPENLLM_DO_NOT_TRACK=True -``` - ## πŸ“” Citation If you use OpenLLM in your research, we provide a [citation](./CITATION.cff) to diff --git a/changelog.d/996.breaking.md b/changelog.d/996.breaking.md new file mode 100644 index 00000000..e091084f --- /dev/null +++ b/changelog.d/996.breaking.md @@ -0,0 +1,7 @@ +OpenLLM is now compatible with the BentoML 1.2 (and above) architecture. + +Additionally, the `openllm` CLI now only offers `start` and `build` to simplify the workflow. + +OpenLLM now also requires vLLM by default, and CPU support is currently turned off. We will look into supporting CPUs in a later version, as our main focus is on accelerators. + +The Python API is also considered deprecated and internal-only. If you are using it in an existing service, set the `IMPLEMENTATION=deprecated` environment variable to avoid breaking changes. We recommend users upgrade to BentoML 1.2. diff --git a/openllm-python/README.md b/openllm-python/README.md index 412c99c1..040f9ea2 100644 --- a/openllm-python/README.md +++ b/openllm-python/README.md @@ -23,13 +23,11 @@ OpenLLM helps developers **run any open-source LLMs**, such as Llama 2 and Mistral, as **OpenAI-compatible API endpoints**, locally and in the cloud, optimized for serving throughput and production deployment. - - πŸš‚ Support a wide range of open-source LLMs including LLMs fine-tuned with your own data - ⛓️ OpenAI compatible API endpoints for seamless transition from your LLM app to open-source LLMs - πŸ”₯ State-of-the-art serving and inference performance - 🎯 Simplified cloud deployment via [BentoML](https://www.bentoml.com) - ![Gif showing OpenLLM Intro](/.github/assets/output.gif) @@ -46,29 +44,13 @@ For starter, we provide two ways to quickly try out OpenLLM: Try this [OpenLLM tutorial in Google Colab: Serving Llama 2 with OpenLLM](https://colab.research.google.com/github/bentoml/OpenLLM/blob/main/examples/llama2.ipynb). -### Docker - -We provide a docker container that helps you start running OpenLLM: - -```bash -docker run --rm -it -p 3000:3000 ghcr.io/bentoml/openllm start facebook/opt-1.3b --backend pt -``` - -> [!NOTE] -> Given you have access to GPUs and have setup [nvidia-docker](https://github.com/NVIDIA/nvidia-container-toolkit), you can additionally pass in `--gpus` -> to use GPU for faster inference and optimization ->```bash -> docker run --rm --gpus all -p 3000:3000 -it ghcr.io/bentoml/openllm start HuggingFaceH4/zephyr-7b-beta --backend vllm -> ``` - - ## πŸƒ Get started The following provides instructions for how to get started with OpenLLM locally. ### Prerequisites -You have installed Python 3.8 (or later) andΒ `pip`. We highly recommend using a [Virtual Environment](https://docs.python.org/3/library/venv.html) to prevent package conflicts. +You have installed Python 3.9 (or later) andΒ `pip`. We highly recommend using a [Virtual Environment](https://docs.python.org/3/library/venv.html) to prevent package conflicts. ### Install OpenLLM @@ -82,65 +64,23 @@ To verify the installation, run: ```bash $ openllm -h - -Usage: openllm [OPTIONS] COMMAND [ARGS]...
- - β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ•— β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ•— β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ•—β–ˆβ–ˆβ–ˆβ•— β–ˆβ–ˆβ•—β–ˆβ–ˆβ•— β–ˆβ–ˆβ•— β–ˆβ–ˆβ–ˆβ•— β–ˆβ–ˆβ–ˆβ•— - β–ˆβ–ˆβ•”β•β•β•β–ˆβ–ˆβ•—β–ˆβ–ˆβ•”β•β•β–ˆβ–ˆβ•—β–ˆβ–ˆβ•”β•β•β•β•β•β–ˆβ–ˆβ–ˆβ–ˆβ•— β–ˆβ–ˆβ•‘β–ˆβ–ˆβ•‘ β–ˆβ–ˆβ•‘ β–ˆβ–ˆβ–ˆβ–ˆβ•— β–ˆβ–ˆβ–ˆβ–ˆβ•‘ - β–ˆβ–ˆβ•‘ β–ˆβ–ˆβ•‘β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ•”β•β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ•— β–ˆβ–ˆβ•”β–ˆβ–ˆβ•— β–ˆβ–ˆβ•‘β–ˆβ–ˆβ•‘ β–ˆβ–ˆβ•‘ β–ˆβ–ˆβ•”β–ˆβ–ˆβ–ˆβ–ˆβ•”β–ˆβ–ˆβ•‘ - β–ˆβ–ˆβ•‘ β–ˆβ–ˆβ•‘β–ˆβ–ˆβ•”β•β•β•β• β–ˆβ–ˆβ•”β•β•β• β–ˆβ–ˆβ•‘β•šβ–ˆβ–ˆβ•—β–ˆβ–ˆβ•‘β–ˆβ–ˆβ•‘ β–ˆβ–ˆβ•‘ β–ˆβ–ˆβ•‘β•šβ–ˆβ–ˆβ•”β•β–ˆβ–ˆβ•‘ - β•šβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ•”β•β–ˆβ–ˆβ•‘ β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ•—β–ˆβ–ˆβ•‘ β•šβ–ˆβ–ˆβ–ˆβ–ˆβ•‘β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ•—β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ•—β–ˆβ–ˆβ•‘ β•šβ•β• β–ˆβ–ˆβ•‘ - β•šβ•β•β•β•β•β• β•šβ•β• β•šβ•β•β•β•β•β•β•β•šβ•β• β•šβ•β•β•β•β•šβ•β•β•β•β•β•β•β•šβ•β•β•β•β•β•β•β•šβ•β• β•šβ•β•. - - An open platform for operating large language models in production. - Fine-tune, serve, deploy, and monitor any LLMs with ease. - -Options: - -v, --version Show the version and exit. - -h, --help Show this message and exit. - -Commands: - build Package a given models into a BentoLLM. - import Setup LLM interactively. - models List all supported models. - prune Remove all saved models, (and optionally bentos) built with OpenLLM locally. - query Query a LLM interactively, from a terminal. - start Start a LLMServer for any supported LLM. - -Extensions: - build-base-container Base image builder for BentoLLM. - dive-bentos Dive into a BentoLLM. - get-containerfile Return Containerfile of any given Bento. - get-prompt Get the default prompt used by OpenLLM. - list-bentos List available bentos built by OpenLLM. - list-models This is equivalent to openllm models... - playground OpenLLM Playground. ``` ### Start a LLM server -OpenLLM allows you to quickly spin up an LLM server using `openllm start`. For example, to start aΒ [phi-2](https://huggingface.co/microsoft/phi-2)Β server, run the following: +OpenLLM allows you to quickly spin up an LLM server using `openllm start`. For example, to start aΒ [Llama 3 8B](https://huggingface.co/meta-llama/Meta-Llama-3-8B)Β server, run the following: ```bash -TRUST_REMOTE_CODE=True openllm start microsoft/phi-2 +openllm start meta-llama/Meta-Llama-3-8B ``` -This starts the server atΒ [http://0.0.0.0:3000/](http://0.0.0.0:3000/). OpenLLM downloads the model to the BentoML local Model Store if it has not been registered before. To view your local models, run `bentoml models list`. - To interact with the server, you can visit the web UI atΒ [http://0.0.0.0:3000/](http://0.0.0.0:3000/) or send a request usingΒ `curl`. You can also use OpenLLM’s built-in Python client to interact with the server: ```python import openllm -client = openllm.client.HTTPClient('http://localhost:3000') -client.query('Explain to me the difference between "further" and "farther"') -``` - -Alternatively, use theΒ `openllm query`Β command to query the model: - -```bash -export OPENLLM_ENDPOINT=http://localhost:3000 -openllm query 'Explain to me the difference between "further" and "farther"' +client = openllm.HTTPClient('http://localhost:3000') +client.generate('Explain to me the difference between "further" and "farther"') ``` OpenLLM seamlessly supports many models and their variants. You can specify different variants of the model to be served. For example: @@ -155,15 +95,6 @@ openllm start -- > architecture. 
Use theΒ `openllm models`Β command to see the complete list of supported > models, their architectures, and their variants. -> [!IMPORTANT] -> If you are testing OpenLLM on CPU, you might want to pass in `DTYPE=float32`. By default, -> OpenLLM will set model `dtype` to `bfloat16` for the best performance. -> ```bash -> DTYPE=float32 openllm start microsoft/phi-2 -> ``` -> This will also applies to older GPUs. If your GPUs doesn't support `bfloat16`, then you also -> want to set `DTYPE=float16`. - ## 🧩 Supported models OpenLLM currently supports the following models. By default, OpenLLM doesn't include dependencies to run all models. The extra model-specific dependencies can be installed with the instructions below. @@ -1097,7 +1028,6 @@ openllm build facebook/opt-6.7b --adapter-id ./path/to/adapter_id --build-ctx . > [!IMPORTANT] > Fine-tuning support is still experimental and currently only works with PyTorch backend. vLLM support is coming soon. - ## βš™οΈ Integrations OpenLLM is not just a standalone product; it's a building block designed to @@ -1115,11 +1045,9 @@ specify the base_url to `llm-endpoint/v1` and you are good to go: ```python import openai -client = openai.OpenAI( - base_url='http://localhost:3000/v1', api_key='na' -) # Here the server is running on localhost:3000 +client = openai.OpenAI(base_url='http://localhost:3000/v1', api_key='na') # Here the server is running on 0.0.0.0:3000 -completions = client.completions.create( +completions = client.chat.completions.create( prompt='Write me a tag line for an ice cream shop.', model=model, max_tokens=64, stream=stream ) ``` @@ -1130,7 +1058,6 @@ The compatible endpoints supports `/completions`, `/chat/completions`, and `/mod > You can find out OpenAI example clients under the > [examples](https://github.com/bentoml/OpenLLM/tree/main/examples) folder. - ### [LlamaIndex](https://docs.llamaindex.ai/en/stable/examples/llm/openllm/) To start a local LLM with `llama_index`, simply use `llama_index.llms.openllm.OpenLLM`: @@ -1172,24 +1099,6 @@ llm = OpenLLM(server_url='http://44.23.123.1:3000', server_type='http') llm('What is the difference between a duck and a goose? And why there are so many Goose in Canada?') ``` -### Transformers Agents - -OpenLLM seamlessly integrates with -[Transformers Agents](https://huggingface.co/docs/transformers/transformers_agents). - -> [!WARNING] -> The Transformers Agent is still at an experimental stage. It is -> recommended to install OpenLLM with `pip install -r nightly-requirements.txt` -> to get the latest API update for HuggingFace agent. - -```python -import transformers - -agent = transformers.HfAgent('http://localhost:3000/hf/agent') # URL that runs the OpenLLM server - -agent.run('Is the following `text` positive or negative?', text="I don't like how this models is generate inputs") -``` - ![Gif showing Agent integration](/.github/assets/agent.gif) @@ -1280,26 +1189,6 @@ Checkout our [Developer Guide](https://github.com/bentoml/OpenLLM/blob/main/DEVELOPMENT.md) if you wish to contribute to OpenLLM's codebase. -## πŸ‡ Telemetry - -OpenLLM collects usage data to enhance user experience and improve the product. -We only report OpenLLM's internal API calls and ensure maximum privacy by -excluding sensitive information. We will never collect user code, model data, or -stack traces. For usage tracking, check out the -[code](https://github.com/bentoml/OpenLLM/blob/main/openllm-core/src/openllm_core/utils/analytics.py). 
- -You can opt out of usage tracking by using the `--do-not-track` CLI option: - -```bash -openllm [command] --do-not-track -``` - -Or by setting the environment variable `OPENLLM_DO_NOT_TRACK=True`: - -```bash -export OPENLLM_DO_NOT_TRACK=True -``` - ## πŸ“” Citation If you use OpenLLM in your research, we provide a [citation](./CITATION.cff) to