From 0fdfe786f340574a00b1564dc9bbd019086d17aa Mon Sep 17 00:00:00 2001
From: Aaron Pham <29749331+aarnphm@users.noreply.github.com>
Date: Thu, 16 Nov 2023 16:24:43 -0500
Subject: [PATCH] docs: add LlamaIndex integration (#646)

* docs: add LlamaIndex integration

Signed-off-by: Aaron Pham <29749331+aarnphm@users.noreply.github.com>

* ci: auto fixes from pre-commit.ci

For more information, see https://pre-commit.ci

---------

Signed-off-by: Aaron Pham <29749331+aarnphm@users.noreply.github.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
---
 README.md | 29 ++++++++++++++++++++++++++++-
 1 file changed, 28 insertions(+), 1 deletion(-)

diff --git a/README.md b/README.md
index d05a6e33..6cb1d6dd 100644
--- a/README.md
+++ b/README.md
@@ -1296,6 +1296,7 @@
 OpenLLM is not just a standalone product; it's a building block designed to
 integrate with other powerful tools easily. We currently offer integration with
 [BentoML](https://github.com/bentoml/BentoML),
 [OpenAI's Compatible Endpoints](https://platform.openai.com/docs/api-reference/completions/object),
+[LlamaIndex](https://www.llamaindex.ai/),
 [LangChain](https://github.com/hwchase17/langchain), and
 [Transformers Agents](https://huggingface.co/docs/transformers/transformers_agents).
@@ -1340,6 +1341,33 @@ async def prompt(input_text: str) -> str:
     return generation.outputs[0].text
 ```
 
+### [LlamaIndex](https://docs.llamaindex.ai/en/stable/module_guides/models/llms/modules.html#openllm)
+
+To start a local LLM with `llama_index`, simply use `llama_index.llms.openllm.OpenLLM`:
+
+```python
+import asyncio
+from llama_index.llms.openllm import OpenLLM
+
+llm = OpenLLM('HuggingFaceH4/zephyr-7b-alpha')
+
+llm.complete("The meaning of life is")
+
+async def main(prompt, **kwargs):
+  async for it in llm.astream_chat(prompt, **kwargs): print(it)
+
+asyncio.run(main("The time at San Francisco is"))
+```
+
+If a remote LLM server is running elsewhere, you can use `llama_index.llms.openllm.OpenLLMAPI`:
+
+```python
+from llama_index.llms.openllm import OpenLLMAPI
+```
+
+> [!NOTE]
+> All synchronous and asynchronous APIs from `llama_index.llms.LLM` are supported.
+
 ### [LangChain](https://python.langchain.com/docs/ecosystem/integrations/openllm)
 
 To quickly start a local LLM with `langchain`, simply do the following:
@@ -1372,7 +1400,6 @@ To integrate a LangChain agent with BentoML, you can do the following:
 
 ```python
 llm = OpenLLM(
-    model_name='flan-t5',
     model_id='google/flan-t5-large',
     embedded=False,
     serialisation="legacy"
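The example added by the patch consumes `astream_chat` with an `async for` loop. That consumption pattern can be sketched in isolation, without a model or a running server, using a stub async generator in place of the LLM; the `fake_stream` helper and its canned tokens below are hypothetical stand-ins, for illustration only:

```python
import asyncio

# Hypothetical stand-in for llm.astream_chat: an async generator that
# yields partial completions ("deltas") one chunk at a time.
async def fake_stream(prompt):
    for token in ["The", " time", " in", " San Francisco", " is..."]:
        await asyncio.sleep(0)  # yield control, as a real network stream would
        yield token

async def main(prompt):
    chunks = []
    # Same pattern as `async for it in llm.astream_chat(prompt, **kwargs)`:
    # each iteration receives the next streamed chunk as it arrives.
    async for delta in fake_stream(prompt):
        chunks.append(delta)
    return "".join(chunks)

result = asyncio.run(main("The time at San Francisco is"))
print(result)
```

With a real `OpenLLM` instance, the loop body would typically print or accumulate each delta the same way; only the source of the stream changes.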