|
|
|
|
|
|
|
|
<br>
|
|
|
|
|
</h1>
|
|
|
|
|
|
|
|
|
|
> :warning: This project has been renamed from `llama-cli` to `LocalAI` to reflect the fact that we are focusing on a fast drop-in OpenAI API rather than on the CLI interface. We think that there are already many projects that can be used as a CLI interface, for instance [llama.cpp](https://github.com/ggerganov/llama.cpp) and [gpt4all](https://github.com/nomic-ai/gpt4all). If you are using `llama-cli` for CLI interactions and want to keep using it, use older versions or please open an issue - contributions are welcome!
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
[](https://github.com/go-skynet/LocalAI/actions/workflows/test.yml) [](https://github.com/go-skynet/LocalAI/actions/workflows/image.yml)
|
|
|
|
|
|
|
|
|
|
[](https://discord.gg/uJAeKSAGDy)
|
|
|
|
|
|
|
|
|
- Support for prompt templates
|
|
|
|
|
- Doesn't shell out, but uses C bindings for faster inference and better performance. Uses [go-llama.cpp](https://github.com/go-skynet/go-llama.cpp) and [go-gpt4all-j.cpp](https://github.com/go-skynet/go-gpt4all-j.cpp).
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
LocalAI is a community-driven project, focused on making AI accessible to anyone. Contributions, feedback and PRs are welcome! It was initially created by [mudler](https://github.com/mudler/) at the [SpectroCloud OSS Office](https://github.com/spectrocloud).
|
|
|
|
|
|
|
|
|
|
### Socials and community chatter
|
|
|
|
|
- Follow [@LocalAI_API](https://twitter.com/LocalAI_API) on Twitter.
|
|
|
|
|
|
|
|
|
|
- [Reddit post](https://www.reddit.com/r/selfhosted/comments/12w4p2f/localai_openai_compatible_api_to_run_llm_models/) about LocalAI.
|
|
|
|
|
|
|
|
|
|
- [Hacker news post](https://news.ycombinator.com/item?id=35726934) - help us out by voting if you like this project.
|
|
|
|
|
|
|
|
|
|
- [Tutorial to use k8sgpt with LocalAI](https://medium.com/@tyler_97636/k8sgpt-localai-unlock-kubernetes-superpowers-for-free-584790de9b65) - an excellent use case for LocalAI, using AI to analyse Kubernetes clusters.
|
|
|
|
|
|
|
|
|
|
## Model compatibility
|
|
|
|
|
|
|
|
|
To build locally, run `make build` (see below).
|
|
|
|
|
|
|
|
|
|
## Other examples
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|

|
|
|
|
|
|
|
|
|
|
To see other examples of how to integrate with other projects, for instance chatbot-ui, see: [examples](https://github.com/go-skynet/LocalAI/tree/master/examples/).
|
|
|
|
|
|
|
|
|
|
## Prompt templates
|
|
|
|
|
|
|
|
|
See the [prompt-templates](https://github.com/go-skynet/LocalAI/tree/master/prompt-templates) directory for examples.
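For illustration, a typical template follows the alpaca style used by several models; the server substitutes `{{.Input}}` with the user's prompt (the exact files in the directory may differ):

```
Below is an instruction that describes a task. Write a response that appropriately completes the request.

### Instruction:
{{.Input}}

### Response:
```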
|
|
|
|
|
|
|
|
|
|
</details>
|
|
|
|
|
|
|
|
|
|
## Installation
|
|
|
|
|
|
|
|
|
|
Currently LocalAI comes as container images and can be used with Docker or a container engine of choice.
|
|
|
|
|
|
|
|
|
|
### Run LocalAI in Kubernetes
|
|
|
|
|
|
|
|
|
|
LocalAI can be installed inside Kubernetes with Helm.
|
|
|
|
|
|
|
|
|
|
<details>
|
|
|
|
|
The local-ai Helm chart supports two options for the LocalAI server's models directory:
|
|
|
|
|
1. Basic deployment with no persistent volume. You must manually update the Deployment to configure your own models directory.
|
|
|
|
|
|
|
|
|
|
Install the chart with `.Values.deployment.volumes.enabled == false` and `.Values.dataVolume.enabled == false`.
|
|
|
|
|
|
|
|
|
|
2. Advanced, two-phase deployment to provision the models directory using a DataVolume. Requires [Containerized Data Importer (CDI)](https://github.com/kubevirt/containerized-data-importer) to be pre-installed in your cluster.
|
|
|
|
|
|
|
|
|
|
First, install the chart with `.Values.deployment.volumes.enabled == false` and `.Values.dataVolume.enabled == true`:
|
|
|
|
|
```bash
helm install local-ai charts/local-ai -n local-ai --create-namespace
```
|
|
|
|
|
Wait for CDI to create an importer Pod for the DataVolume and for the importer Pod to finish provisioning the model archive inside the PV.
|
|
|
|
|
|
|
|
|
|
Once the PV is provisioned and the importer Pod removed, set `.Values.deployment.volumes.enabled == true` and `.Values.dataVolume.enabled == false` and upgrade the chart:
|
|
|
|
|
```bash
helm upgrade local-ai -n local-ai charts/local-ai
```
|
|
|
|
|
This will update the local-ai Deployment to mount the PV that was provisioned by the DataVolume.
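Equivalently, the two phases can be driven from a values file; the value paths below are taken from the options above, and this is a sketch rather than the chart's defaults:

```yaml
# Phase 1: provision the models PV via the DataVolume importer.
# For phase 2, flip both values and run `helm upgrade`.
deployment:
  volumes:
    enabled: false
dataVolume:
  enabled: true
```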
|
|
|
|
|
|
|
|
|
|
</details>
|
|
|
|
|
|
|
|
|
|
## API
|
|
|
|
|
|
|
|
|
|
`LocalAI` provides an API for running text generation as a service that follows the OpenAI API reference and can be used as a drop-in replacement. Models are loaded into memory on first use and kept there for subsequent requests.
|
|
|
|
The API takes the following parameters:

| Parameter | Environment variable | Default | Description |
|---|---|---|---|
| address | ADDRESS | :8080 | The address and port to listen on. |
| context-size | CONTEXT_SIZE | 512 | Default token context size. |
| debug | DEBUG | false | Enable debug mode. |
| config-file | CONFIG_FILE | empty | Path to a LocalAI config file. |
|
|
|
|
|
|
|
|
|
|
Once the server is running, you can start making requests to it over HTTP using the OpenAI API format.
|
|
|
|
|
|
|
|
|
|
</details>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
### Supported OpenAI API endpoints
|
|
|
|
|
|
|
|
|
|
You can check out the [OpenAI API reference](https://platform.openai.com/docs/api-reference/chat/create).
|
|
|
|
Available additional parameters: `top_p`, `top_k`, `max_tokens`
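As a sketch of a chat request, the payload follows the OpenAI chat format. The model name here is an assumption (use one available in your models directory), and the final `curl` is commented out since it needs a running server:

```shell
# Build an OpenAI-style chat completion payload (model name is an example).
BODY='{"model": "ggml-gpt4all-j", "messages": [{"role": "user", "content": "How are you?"}], "temperature": 0.9}'

# Check the payload is well-formed JSON before sending it.
echo "$BODY" | python3 -m json.tool > /dev/null && echo "payload ok"

# Send it to a running LocalAI instance:
# curl http://localhost:8080/v1/chat/completions -H "Content-Type: application/json" -d "$BODY"
```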
|
|
|
|
|
#### Completions
|
|
|
|
|
|
|
|
|
|
<details>
|
|
|
|
|
|
|
|
|
|
To generate a completion, send a POST request to the `/v1/completions` endpoint with the prompt in the request body:
|
|
|
|
|
|
|
|
|
|
```
curl http://localhost:8080/v1/completions -H "Content-Type: application/json" -d '{
  "model": "ggml-koala-7b-model-q4_0-r2.bin",
  "prompt": "A long time ago in a galaxy far, far away",
  "temperature": 0.7
}'
```

You can list all the models available to the API with `curl http://localhost:8080/v1/models`.
|
|
|
|
|
|
|
|
|
|
</details>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
## Advanced configuration
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
LocalAI can be configured to serve user-defined models with a set of default parameters and templates.
|
|
|
|
|
|
|
|
|
|
<details>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
You can either create multiple YAML files in the models path or specify a single YAML configuration file.
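As a hypothetical layout, a models directory mixing a model file, a prompt template, and a per-model YAML config might look like this (file names are illustrative):

```
models/
├── gpt-3.5-turbo.yaml
├── ggml-gpt4all-j.tmpl
└── testmodel
```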
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
For instance, a configuration file (`gpt-3.5-turbo.yaml`) can declare a "gpt-3.5-turbo" model backed by the "testmodel" model file:
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
```yaml
name: gpt-3.5-turbo
parameters:
  model: testmodel
context_size: 512
threads: 10
stopwords:
- "HUMAN:"
- "### Response:"
roles:
  user: "HUMAN:"
  system: "GPT:"
template:
  completion: completion
  chat: ggml-gpt4all-j
```
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
Specifying a `config-file` via the CLI allows you to declare models in a single file as a list, for instance:
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
```yaml
- name: list1
  parameters:
    model: testmodel
  context_size: 512
  threads: 10
  stopwords:
  - "HUMAN:"
  - "### Response:"
  roles:
    user: "HUMAN:"
    system: "GPT:"
  template:
    completion: completion
    chat: ggml-gpt4all-j
- name: list2
  parameters:
    model: testmodel
  context_size: 512
  threads: 10
  stopwords:
  - "HUMAN:"
  - "### Response:"
  roles:
    user: "HUMAN:"
    system: "GPT:"
  template:
    completion: completion
    chat: ggml-gpt4all-j
```
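To try the list format quickly, the sketch below writes a minimal two-model config and counts the declared entries; the commented last line starts the server against it, with the `--config-file` flag name assumed from the parameters table above:

```shell
# Write a minimal multi-model config (model names are illustrative).
cat > models.yaml <<'EOF'
- name: list1
  parameters:
    model: testmodel
- name: list2
  parameters:
    model: testmodel
EOF

# Count the declared model entries.
grep -c '^- name:' models.yaml   # prints 2

# Start LocalAI against it (requires the local-ai binary):
# local-ai --config-file ./models.yaml
```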
|
|
|
|
|
|
|
|
|
|
See also [chatbot-ui](https://github.com/go-skynet/LocalAI/tree/master/examples/chatbot-ui) as an example on how to use config files.
|
|
|
|
|
|
|
|
|
|
</details>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
## Blog posts and other articles
|
|
|
|
|
|
|
|
|
|
- https://medium.com/@tyler_97636/k8sgpt-localai-unlock-kubernetes-superpowers-for-free-584790de9b65
|
|
|
|
|
- https://kairos.io/docs/examples/localai/
|
|
|
|
|
|
|
|
|
|
## Windows compatibility
|
|
|
|
|
|
|
|
|
Feel free to open up a PR to get your project listed!
|
|
|
|
|
- [Kairos](https://github.com/kairos-io/kairos)
|
|
|
|
|
- [k8sgpt](https://github.com/k8sgpt-ai/k8sgpt#running-local-models)
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
## Short-term roadmap
|
|
|
|
|
|
|
|
|
|
- [x] Mimic OpenAI API (https://github.com/go-skynet/LocalAI/issues/10)
|
|
|
|
|
- [ ] Binary releases (https://github.com/go-skynet/LocalAI/issues/6)
|
|
|
|
|
|
|
|
|
|
- [ ] Upstream our golang bindings to llama.cpp (https://github.com/ggerganov/llama.cpp/issues/351) and [gpt4all](https://github.com/go-skynet/LocalAI/issues/85)
|
|
|
|
|
- [x] Multi-model support
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
- [x] Have a webUI!
|
|
|
|
|
- [x] Allow configuration of defaults for models.
|
|
|
|
|
- [ ] Enable automatic downloading of models from a curated gallery, with only free-licensed models, directly from the webui.
|
|
|
|
|
|
|
|
|
|
## Star history
|
|
|
|
|
|
|
|
|
|
[](https://star-history.com/#go-skynet/LocalAI&Date)
|
|
|
|
|
|
|
|
|
|
## License
|
|
|
|
|
|
|
|
|
|
LocalAI is a community-driven project. It was initially created by [mudler](https://github.com/mudler/) at the [SpectroCloud OSS Office](https://github.com/spectrocloud).
|
|
|
|
|
|
|
|
|
|
MIT
|
|
|
|
|
|
|
|
|
|
## Acknowledgements
|
|
|
|
|
|
|
|
|
- https://github.com/tatsu-lab/stanford_alpaca
|
|
|
|
|
- https://github.com/cornelk/llama-go for the initial ideas
|
|
|
|
|
- https://github.com/antimatter15/alpaca.cpp for the light model version (this is compatible and tested only with that checkpoint model!)
|
|
|
|
|
|
|
|
|
|
## Contributors
|
|
|
|
|
|
|
|
|
|
<a href="https://github.com/go-skynet/LocalAI/graphs/contributors">
|
|
|
|
|
<img src="https://contrib.rocks/image?repo=go-skynet/LocalAI" />
|
|
|
|
|
</a>
|
|
|
|
|