From 4443d3ce9ac507c2d3fa48a239d3198f9478f92b Mon Sep 17 00:00:00 2001
From: Alex Cheema
Date: Wed, 2 Oct 2024 16:39:19 +0400
Subject: [PATCH] update README with docs on exo run command

---
 README.md | 15 +++++++++++++--
 1 file changed, 13 insertions(+), 2 deletions(-)

diff --git a/README.md b/README.md
index d729e16e..f8430e5d 100644
--- a/README.md
+++ b/README.md
@@ -130,13 +130,13 @@ exo starts a ChatGPT-like WebUI (powered by [tinygrad tinychat](https://github.c
 
 For developers, exo also starts a ChatGPT-compatible API endpoint on http://localhost:8000/v1/chat/completions. Examples with curl:
 
-#### Llama 3.1 8B:
+#### Llama 3.2 3B:
 
 ```sh
 curl http://localhost:8000/v1/chat/completions \
   -H "Content-Type: application/json" \
   -d '{
-     "model": "llama-3.1-8b",
+     "model": "llama-3.2-3b",
      "messages": [{"role": "user", "content": "What is the meaning of exo?"}],
      "temperature": 0.7
    }'
@@ -201,6 +201,17 @@ Linux devices will automatically default to using the **tinygrad** inference eng
 
 You can read about tinygrad-specific env vars [here](https://docs.tinygrad.org/env_vars/). For example, you can configure tinygrad to use the cpu by specifying `CLANG=1`.
 
+### Example Usage on a single device with "exo run" command
+
+```sh
+exo run llama-3.2-3b
+```
+
+With a custom prompt:
+
+```sh
+exo run llama-3.2-3b --prompt "What is the meaning of exo?"
+```
 
 ## Debugging
 