Examples with OpenLLM
The following examples show how to interact with OpenLLM features.
OpenAI-compatible endpoints
The openai_completion_client.py script demonstrates how to use the OpenAI-compatible /v1/completions endpoint to generate text.
export OPENLLM_ENDPOINT=https://api.openllm.com
python openai_completion_client.py
# For streaming set STREAM=True
STREAM=True python openai_completion_client.py
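For reference, here is a minimal sketch of the kind of call the script makes, using the official `openai` Python client. The model id is a placeholder (any model the server is running works; you can list them with `client.models.list()`), and the actual script may differ:

```python
import os

from openai import OpenAI

# Point the client at the OpenLLM server set via OPENLLM_ENDPOINT above.
client = OpenAI(
    base_url=os.environ["OPENLLM_ENDPOINT"] + "/v1",
    api_key="na",  # placeholder key; the server may not validate it
)

# "facebook/opt-1.3b" is an illustrative model id, not a requirement.
completion = client.completions.create(
    model="facebook/opt-1.3b",
    prompt="Write a tagline for an ice cream shop:",
    max_tokens=64,
)
print(completion.choices[0].text)
```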
The openai_chat_completion_client.py script demonstrates how to use the OpenAI-compatible /v1/chat/completions endpoint to chat with a model.
export OPENLLM_ENDPOINT=https://api.openllm.com
python openai_chat_completion_client.py
# For streaming set STREAM=True
STREAM=True python openai_chat_completion_client.py
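A minimal sketch of the equivalent chat call, including the streaming path toggled by the STREAM environment variable (model id again a placeholder, and the real script may be structured differently):

```python
import os

from openai import OpenAI

client = OpenAI(
    base_url=os.environ["OPENLLM_ENDPOINT"] + "/v1",
    api_key="na",  # placeholder key; the server may not validate it
)

stream = os.environ.get("STREAM", "False") == "True"

response = client.chat.completions.create(
    model="facebook/opt-1.3b",  # illustrative model id
    messages=[{"role": "user", "content": "What is deep learning?"}],
    stream=stream,
)

if stream:
    # Streaming responses arrive as incremental chunks with delta content.
    for chunk in response:
        print(chunk.choices[0].delta.content or "", end="", flush=True)
    print()
else:
    print(response.choices[0].message.content)
```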
TinyLLM
The api_server.py script demonstrates how to write a production-ready BentoML service with OpenLLM and vLLM.
Install requirements:
pip install -U "openllm[vllm]"
To serve the Bento (assuming you have access to a GPU):
bentoml serve api_server:svc
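To give a rough idea of the shape of such a service, here is a sketch using BentoML's legacy `bentoml.Service` API (which matches the `api_server:svc` target above) and vLLM's offline `LLM` engine. This is not the actual api_server.py: the model id, service name, and sampling settings are all illustrative assumptions.

```python
import bentoml
from bentoml.io import Text
from vllm import LLM, SamplingParams

# Illustrative model id; vLLM loads it onto the available GPU.
engine = LLM(model="facebook/opt-125m")

svc = bentoml.Service("tinyllm")

@svc.api(input=Text(), output=Text())
def generate(prompt: str) -> str:
    # Sampling settings are placeholders; tune for your use case.
    params = SamplingParams(max_tokens=128, temperature=0.7)
    outputs = engine.generate([prompt], params)
    return outputs[0].outputs[0].text
```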
To build the Bento, run:
bentoml build -f bentofile.yaml .