Compare commits

..

218 Commits

Author SHA1 Message Date
renovate[bot]
274ace2898 fix(deps): update github.com/tmc/langchaingo digest to 2c309cf (#1097)
[![Mend
Renovate](https://app.renovatebot.com/images/banner.svg)](https://renovatebot.com)

This PR contains the following updates:

| Package | Type | Update | Change |
|---|---|---|---|
| [github.com/tmc/langchaingo](https://togithub.com/tmc/langchaingo) |
require | digest | `9c8845b` -> `2c309cf` |

---

### Configuration

📅 **Schedule**: Branch creation - At any time (no schedule defined),
Automerge - At any time (no schedule defined).

🚦 **Automerge**: Disabled by config. Please merge this manually once you
are satisfied.

♻ **Rebasing**: Whenever PR becomes conflicted, or you tick the
rebase/retry checkbox.

🔕 **Ignore**: Close this PR and you won't be reminded about this update
again.

---

- [ ] <!-- rebase-check -->If you want to rebase/retry this PR, check
this box

---

This PR has been generated by [Mend
Renovate](https://www.mend.io/free-developer-tools/renovate/). View
repository job log
[here](https://developer.mend.io/github/go-skynet/LocalAI).

<!--renovate-debug:eyJjcmVhdGVkSW5WZXIiOiIzNi45Ny4xIiwidXBkYXRlZEluVmVyIjoiMzYuOTcuMSIsInRhcmdldEJyYW5jaCI6Im1hc3RlciJ9-->

Co-authored-by: renovate[bot] <29139614+renovate[bot]@users.noreply.github.com>
2023-09-24 14:55:17 +02:00
Aisuko
a8cc3709c6 Add the CONTRIBUTING.md (#1098)
**Description**

This PR is related to  #105 

**Notes for Reviewers**


**[Signed
commits](../CONTRIBUTING.md#signing-off-on-commits-developer-certificate-of-origin)**
- [x] Yes, I signed my commits.
 
<!--
Thank you for contributing to LocalAI! 

Contributing Conventions
-------------------------

The draft above helps to give a quick overview of your PR.

Remember to remove this comment and to at least:

1. Include descriptive PR titles with [<component-name>] prepended. We
use [conventional
commits](https://www.conventionalcommits.org/en/v1.0.0/).
2. Build and test your changes before submitting a PR (`make build`). 
3. Sign your commits
4. **Tag maintainer:** for a quicker response, tag the relevant
maintainer (see below).
5. **X/Twitter handle:** we announce bigger features on X/Twitter. If
your PR gets announced, and you'd like a mention, we'll gladly shout you
out!

By following the community's contribution conventions upfront, the
review process will
be accelerated and your PR merged more quickly.

If no one reviews your PR within a few days, please @-mention @mudler.
-->

---------

Signed-off-by: GitHub <noreply@github.com>
Signed-off-by: Aisuko <urakiny@gmail.com>
Co-authored-by: Ettore Di Giacinto <mudler@users.noreply.github.com>
2023-09-24 14:54:55 +02:00
Ettore Di Giacinto
a28ab18987 feat(vllm): Allow to set quantization (#1094)
This particularly useful to set AWQ

**Description**

Follow up of #1015 

**Notes for Reviewers**


**[Signed
commits](../CONTRIBUTING.md#signing-off-on-commits-developer-certificate-of-origin)**
- [ ] Yes, I signed my commits.
 

<!--
Thank you for contributing to LocalAI! 

Contributing Conventions:

1. Include descriptive PR titles with [<component-name>] prepended.
2. Build and test your changes before submitting a PR. 
3. Sign your commits

By following the community's contribution conventions upfront, the
review process will
be accelerated and your PR merged more quickly.
-->

---------

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2023-09-22 15:52:38 +02:00
lunamidori5
048b81373d Requested Changes from GPT4ALL to Luna-AI-Llama2 (#1092)
**Description**

This PR fixes #na

**Notes for Reviewers**
n/a

**[Signed
commits](../CONTRIBUTING.md#signing-off-on-commits-developer-certificate-of-origin)**
- [x ] Yes, I signed my commits.
 

<!--
Thank you for contributing to LocalAI! 

Contributing Conventions:

1. Include descriptive PR titles with [<component-name>] prepended.
2. Build and test your changes before submitting a PR. 
3. Sign your commits

By following the community's contribution conventions upfront, the
review process will
be accelerated and your PR merged more quickly.
-->

---------

Signed-off-by: lunamidori5 <118759930+lunamidori5@users.noreply.github.com>
2023-09-22 11:22:17 +02:00
renovate[bot]
aea1d62ae6 fix(deps): update module google.golang.org/grpc to v1.58.2 (#1090)
[![Mend
Renovate](https://app.renovatebot.com/images/banner.svg)](https://renovatebot.com)

This PR contains the following updates:

| Package | Type | Update | Change |
|---|---|---|---|
| [google.golang.org/grpc](https://togithub.com/grpc/grpc-go) | require
| patch | `v1.58.1` -> `v1.58.2` |

---

### Release Notes

<details>
<summary>grpc/grpc-go (google.golang.org/grpc)</summary>

### [`v1.58.2`](https://togithub.com/grpc/grpc-go/releases/tag/v1.58.2):
Release 1.58.2

[Compare
Source](https://togithub.com/grpc/grpc-go/compare/v1.58.1...v1.58.2)

### Bug Fixes

-   balancer/weighted_round_robin: fix ticker leak on update

A new ticker is created every time there is an update of addresses or
configuration, but was not properly stopped. This change stops the
ticker when it is no longer needed.

</details>

---

### Configuration

📅 **Schedule**: Branch creation - At any time (no schedule defined),
Automerge - At any time (no schedule defined).

🚦 **Automerge**: Disabled by config. Please merge this manually once you
are satisfied.

♻ **Rebasing**: Whenever PR becomes conflicted, or you tick the
rebase/retry checkbox.

🔕 **Ignore**: Close this PR and you won't be reminded about this update
again.

---

- [ ] <!-- rebase-check -->If you want to rebase/retry this PR, check
this box

---

This PR has been generated by [Mend
Renovate](https://www.mend.io/free-developer-tools/renovate/). View
repository job log
[here](https://developer.mend.io/github/go-skynet/LocalAI).

<!--renovate-debug:eyJjcmVhdGVkSW5WZXIiOiIzNi45Ny4xIiwidXBkYXRlZEluVmVyIjoiMzYuOTcuMSIsInRhcmdldEJyYW5jaCI6Im1hc3RlciJ9-->

Co-authored-by: renovate[bot] <29139614+renovate[bot]@users.noreply.github.com>
2023-09-22 08:44:45 +02:00
Ettore Di Giacinto
601e54000d fix(llama.cpp): update, run go mod tidy (#1088)
**Description**

This PR supersedes #1086

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2023-09-22 00:45:02 +02:00
ci-robbot [bot]
7bdf707dd3 ⬆️ Update go-skynet/go-llama.cpp (#1084)
Bump of go-skynet/go-llama.cpp version

Signed-off-by: GitHub <noreply@github.com>
Co-authored-by: mudler <mudler@users.noreply.github.com>
2023-09-20 19:48:38 +02:00
Ettore Di Giacinto
4a7e7e9fdb fix(vall-e-x): copy vall-e-x next to the local-ai binary in the container image (#1082)
**Description**

This PR fixes vall-e-x in the container image

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2023-09-19 21:30:51 +02:00
Ettore Di Giacinto
bdf3f95346 feat(python-grpc): allow to set max workers with PYTHON_GRPC_MAX_WORKERS (#1081)
**Description**

this allows to customize the maximum number of grpc workers for python
backends

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2023-09-19 21:30:39 +02:00
Ettore Di Giacinto
453e9c5da9 fix(vllm): set default top_p with vllm (#1078)
**Description**

This PR fixes vllm when called with a request with an empty top_p

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2023-09-19 18:10:23 +02:00
Ettore Di Giacinto
3a69bd3ef5 Update README.md
Signed-off-by: Ettore Di Giacinto <mudler@users.noreply.github.com>
2023-09-19 11:23:20 +02:00
renovate[bot]
a69c0f765e fix(deps): update github.com/nomic-ai/gpt4all/gpt4all-bindings/golang digest to e86c637 (#1059)
[![Mend
Renovate](https://app.renovatebot.com/images/banner.svg)](https://renovatebot.com)

This PR contains the following updates:

| Package | Type | Update | Change |
|---|---|---|---|
|
[github.com/nomic-ai/gpt4all/gpt4all-bindings/golang](https://togithub.com/nomic-ai/gpt4all)
| require | digest | `cf4eb53` -> `e86c637` |

---

### Configuration

📅 **Schedule**: Branch creation - At any time (no schedule defined),
Automerge - At any time (no schedule defined).

🚦 **Automerge**: Disabled by config. Please merge this manually once you
are satisfied.

♻ **Rebasing**: Whenever PR becomes conflicted, or you tick the
rebase/retry checkbox.

🔕 **Ignore**: Close this PR and you won't be reminded about this update
again.

---

- [ ] <!-- rebase-check -->If you want to rebase/retry this PR, check
this box

---

This PR has been generated by [Mend
Renovate](https://www.mend.io/free-developer-tools/renovate/). View
repository job log
[here](https://developer.mend.io/github/go-skynet/LocalAI).

<!--renovate-debug:eyJjcmVhdGVkSW5WZXIiOiIzNi44My4wIiwidXBkYXRlZEluVmVyIjoiMzYuODMuMCIsInRhcmdldEJyYW5jaCI6Im1hc3RlciJ9-->

Co-authored-by: renovate[bot] <29139614+renovate[bot]@users.noreply.github.com>
2023-09-18 17:10:23 +02:00
renovate[bot]
97d1367764 fix(deps): update github.com/go-skynet/go-llama.cpp digest to b471eb7 (#1050)
[![Mend
Renovate](https://app.renovatebot.com/images/banner.svg)](https://renovatebot.com)

This PR contains the following updates:

| Package | Type | Update | Change |
|---|---|---|---|
|
[github.com/go-skynet/go-llama.cpp](https://togithub.com/go-skynet/go-llama.cpp)
| require | digest | `cc8a123` -> `b471eb7` |

---

### Configuration

📅 **Schedule**: Branch creation - At any time (no schedule defined),
Automerge - At any time (no schedule defined).

🚦 **Automerge**: Disabled by config. Please merge this manually once you
are satisfied.

♻ **Rebasing**: Whenever PR becomes conflicted, or you tick the
rebase/retry checkbox.

🔕 **Ignore**: Close this PR and you won't be reminded about this update
again.

---

- [ ] <!-- rebase-check -->If you want to rebase/retry this PR, check
this box

---

This PR has been generated by [Mend
Renovate](https://www.mend.io/free-developer-tools/renovate/). View
repository job log
[here](https://developer.mend.io/github/go-skynet/LocalAI).

<!--renovate-debug:eyJjcmVhdGVkSW5WZXIiOiIzNi44My4wIiwidXBkYXRlZEluVmVyIjoiMzYuODMuMCIsInRhcmdldEJyYW5jaCI6Im1hc3RlciJ9-->

Co-authored-by: renovate[bot] <29139614+renovate[bot]@users.noreply.github.com>
2023-09-18 17:09:51 +02:00
renovate[bot]
880e21288e fix(deps): update module github.com/valyala/fasthttp to v1.50.0 (#1060)
[![Mend
Renovate](https://app.renovatebot.com/images/banner.svg)](https://renovatebot.com)

This PR contains the following updates:

| Package | Type | Update | Change |
|---|---|---|---|
| [github.com/valyala/fasthttp](https://togithub.com/valyala/fasthttp) |
require | minor | `v1.49.0` -> `v1.50.0` |

---

### Release Notes

<details>
<summary>valyala/fasthttp (github.com/valyala/fasthttp)</summary>

###
[`v1.50.0`](https://togithub.com/valyala/fasthttp/releases/tag/v1.50.0)

[Compare
Source](https://togithub.com/valyala/fasthttp/compare/v1.49.0...v1.50.0)

- [`8cc5539`](https://togithub.com/valyala/fasthttp/commit/8cc5539) Fix
various request timeout issues (Erik Dubbelboer)
- [`34e7da1`](https://togithub.com/valyala/fasthttp/commit/34e7da1)
Allow connection close for custom streams
([#&#8203;1603](https://togithub.com/valyala/fasthttp/issues/1603))
(Armin Becher)
- [`8236f8d`](https://togithub.com/valyala/fasthttp/commit/8236f8d)
fasthttpproxy: fix doc examples (Oleksandr Redko)
- [`4ec5c5a`](https://togithub.com/valyala/fasthttp/commit/4ec5c5a)
docs: fix typos in comments and tests (Oleksandr Redko)
- [`9aa666e`](https://togithub.com/valyala/fasthttp/commit/9aa666e)
Enable gocritic linter; fix lint issues
([#&#8203;1612](https://togithub.com/valyala/fasthttp/issues/1612))
(Oleksandr Redko)

</details>

---

### Configuration

📅 **Schedule**: Branch creation - At any time (no schedule defined),
Automerge - At any time (no schedule defined).

🚦 **Automerge**: Disabled by config. Please merge this manually once you
are satisfied.

♻ **Rebasing**: Whenever PR becomes conflicted, or you tick the
rebase/retry checkbox.

🔕 **Ignore**: Close this PR and you won't be reminded about this update
again.

---

- [ ] <!-- rebase-check -->If you want to rebase/retry this PR, check
this box

---

This PR has been generated by [Mend
Renovate](https://www.mend.io/free-developer-tools/renovate/). View
repository job log
[here](https://developer.mend.io/github/go-skynet/LocalAI).

<!--renovate-debug:eyJjcmVhdGVkSW5WZXIiOiIzNi44My4wIiwidXBkYXRlZEluVmVyIjoiMzYuODMuMCIsInRhcmdldEJyYW5jaCI6Im1hc3RlciJ9-->

Co-authored-by: renovate[bot] <29139614+renovate[bot]@users.noreply.github.com>
2023-09-18 16:43:24 +02:00
James Braza
2ba9762255 Cleaned up chatbot-ui READMEs (#1075)
This PR cleans up the `chatbot-ui`/`-manual` examples:
- Fixes `Dockerfile` vs `docker-compose` confusion
- Makes it clear where to view the web UI in `## Run` sections

---------

Signed-off-by: James Braza <jamesbraza@gmail.com>
2023-09-18 16:43:06 +02:00
renovate[bot]
30f120ee6a fix(deps): update module github.com/gofiber/fiber/v2 to v2.49.2 (#1049)
[![Mend
Renovate](https://app.renovatebot.com/images/banner.svg)](https://renovatebot.com)

This PR contains the following updates:

| Package | Type | Update | Change |
|---|---|---|---|
| [github.com/gofiber/fiber/v2](https://togithub.com/gofiber/fiber) |
require | patch | `v2.49.1` -> `v2.49.2` |

---

### Release Notes

<details>
<summary>gofiber/fiber (github.com/gofiber/fiber/v2)</summary>

### [`v2.49.2`](https://togithub.com/gofiber/fiber/releases/tag/v2.49.2)

[Compare
Source](https://togithub.com/gofiber/fiber/compare/v2.49.1...v2.49.2)

#### 🧹 Updates

- Middleware/logger: Enabling color changes padding for some fields
[#&#8203;2604](https://togithub.com/gofiber/fiber/issues/2604)
([#&#8203;2616](https://togithub.com/gofiber/fiber/issues/2616))
- Bump actions/checkout from 3 to 4
([#&#8203;2618](https://togithub.com/gofiber/fiber/issues/2618))
- Bump golang.org/x/sys from 0.11.0 to 0.12.0
([#&#8203;2617](https://togithub.com/gofiber/fiber/issues/2617))

#### 🐛 Fixes

-   Vulnerability in Ctx.IsFromLocal()

#### 📚 Documentation

- Replaced double quotes with backticks in all route parameter strings
([#&#8203;2591](https://togithub.com/gofiber/fiber/issues/2591))

**Full Changelog**:
https://github.com/gofiber/fiber/compare/v2.49.1...v2.49.2

Thank you [@&#8203;11-aryan](https://togithub.com/11-aryan) and
[@&#8203;AKARSHITJOSHI](https://togithub.com/AKARSHITJOSHI) for making
this update possible.

</details>

---

### Configuration

📅 **Schedule**: Branch creation - At any time (no schedule defined),
Automerge - At any time (no schedule defined).

🚦 **Automerge**: Disabled by config. Please merge this manually once you
are satisfied.

♻ **Rebasing**: Whenever PR becomes conflicted, or you tick the
rebase/retry checkbox.

🔕 **Ignore**: Close this PR and you won't be reminded about this update
again.

---

- [ ] <!-- rebase-check -->If you want to rebase/retry this PR, check
this box

---

This PR has been generated by [Mend
Renovate](https://www.mend.io/free-developer-tools/renovate/). View
repository job log
[here](https://developer.mend.io/github/go-skynet/LocalAI).

<!--renovate-debug:eyJjcmVhdGVkSW5WZXIiOiIzNi44My4wIiwidXBkYXRlZEluVmVyIjoiMzYuODMuMCIsInRhcmdldEJyYW5jaCI6Im1hc3RlciJ9-->

Co-authored-by: renovate[bot] <29139614+renovate[bot]@users.noreply.github.com>
2023-09-17 08:39:06 +02:00
renovate[bot]
28a36e20aa fix(deps): update module google.golang.org/grpc to v1.58.1 (#1020)
[![Mend
Renovate](https://app.renovatebot.com/images/banner.svg)](https://renovatebot.com)

This PR contains the following updates:

| Package | Type | Update | Change |
|---|---|---|---|
| [google.golang.org/grpc](https://togithub.com/grpc/grpc-go) | require
| minor | `v1.57.0` -> `v1.58.1` |

---

### Release Notes

<details>
<summary>grpc/grpc-go (google.golang.org/grpc)</summary>

### [`v1.58.1`](https://togithub.com/grpc/grpc-go/releases/tag/v1.58.1):
Release 1.58.1

[Compare
Source](https://togithub.com/grpc/grpc-go/compare/v1.58.0...v1.58.1)

### Bug Fixes

- grpc: fix a bug that was decrementing active RPC count too early for
streaming RPCs; leading to channel moving to IDLE even though it had
open streams
- grpc: fix a bug where transports were not being closed upon channel
entering IDLE

### [`v1.58.0`](https://togithub.com/grpc/grpc-go/releases/tag/v1.58.0):
Release 1.58.0

[Compare
Source](https://togithub.com/grpc/grpc-go/compare/v1.57.0...v1.58.0)

### API Changes

See [#&#8203;6472](https://togithub.com/grpc/grpc-go/issues/6472) for
details about these changes.

- balancer: add `StateListener` to `NewSubConnOptions` for `SubConn`
state updates and deprecate `Balancer.UpdateSubConnState`
([#&#8203;6481](https://togithub.com/grpc/grpc-go/issues/6481))
    -   `UpdateSubConnState` will be deleted in the future.
- balancer: add `SubConn.Shutdown` and deprecate
`Balancer.RemoveSubConn`
([#&#8203;6493](https://togithub.com/grpc/grpc-go/issues/6493))
    -   `RemoveSubConn` will be deleted in the future.
- resolver: remove deprecated `AddressType`
([#&#8203;6451](https://togithub.com/grpc/grpc-go/issues/6451))
- This was previously used as a signal to enable the "grpclb" load
balancing policy, and to pass LB addresses to the policy. Instead,
`balancer/grpclb/state.Set()` should be used to add these addresses to
the name resolver's output. The built-in "dns" name resolver already
does this.
- resolver: add new field `Endpoints` to `State` and deprecate
`Addresses`
([#&#8203;6471](https://togithub.com/grpc/grpc-go/issues/6471))
    -   `Addresses` will be deleted in the future.

### New Features

- balancer/leastrequest: Add experimental support for least request LB
policy and least request configured as a custom xDS policy
([#&#8203;6510](https://togithub.com/grpc/grpc-go/issues/6510),
[#&#8203;6517](https://togithub.com/grpc/grpc-go/issues/6517))
    -   Set `GRPC_EXPERIMENTAL_ENABLE_LEAST_REQUEST=true` to enable
- stats: Add an RPC event for blocking caused by the LB policy's picker
([#&#8203;6422](https://togithub.com/grpc/grpc-go/issues/6422))

### Bug Fixes

- clusterresolver: fix deadlock when dns resolver responds inline with
update or error at build time
([#&#8203;6563](https://togithub.com/grpc/grpc-go/issues/6563))
- grpc: fix a bug where the channel could erroneously report
`TRANSIENT_FAILURE` when actually moving to `IDLE`
([#&#8203;6497](https://togithub.com/grpc/grpc-go/issues/6497))
- balancergroup: do not cache closed sub-balancers by default; affects
`rls`, `weightedtarget` and `clustermanager` LB policies
([#&#8203;6523](https://togithub.com/grpc/grpc-go/issues/6523))
- client: fix a bug that prevented detection of RPC status in
trailers-only RPC responses when using `ClientStream.Header()`, and
prevented retry of the RPC
([#&#8203;6557](https://togithub.com/grpc/grpc-go/issues/6557))

### Performance Improvements

- client & server: Add experimental `[With]SharedWriteBuffer` to improve
performance by reducing allocations when sending RPC messages. (Disabled
by default.)
([#&#8203;6309](https://togithub.com/grpc/grpc-go/issues/6309))
- Special Thanks:
[@&#8203;s-matyukevich](https://togithub.com/s-matyukevich)

</details>

---

### Configuration

📅 **Schedule**: Branch creation - At any time (no schedule defined),
Automerge - At any time (no schedule defined).

🚦 **Automerge**: Disabled by config. Please merge this manually once you
are satisfied.

♻ **Rebasing**: Whenever PR becomes conflicted, or you tick the
rebase/retry checkbox.

🔕 **Ignore**: Close this PR and you won't be reminded about this update
again.

---

- [ ] <!-- rebase-check -->If you want to rebase/retry this PR, check
this box

---

This PR has been generated by [Mend
Renovate](https://www.mend.io/free-developer-tools/renovate/). View
repository job log
[here](https://developer.mend.io/github/go-skynet/LocalAI).

<!--renovate-debug:eyJjcmVhdGVkSW5WZXIiOiIzNi44My4wIiwidXBkYXRlZEluVmVyIjoiMzYuODMuMCIsInRhcmdldEJyYW5jaCI6Im1hc3RlciJ9-->

Co-authored-by: renovate[bot] <29139614+renovate[bot]@users.noreply.github.com>
2023-09-17 08:38:52 +02:00
ci-robbot [bot]
a8fb4d23f8 ⬆️ Update go-skynet/go-llama.cpp (#1062)
Bump of go-skynet/go-llama.cpp version

Signed-off-by: GitHub <noreply@github.com>
Co-authored-by: mudler <mudler@users.noreply.github.com>
2023-09-17 08:38:28 +02:00
Manohar Joshi
f37a4ec9c8 1038 - Streamlit bot with LocalAI (#1072)
**Description**

This PR fixes #1038

Added Streamlit example and also updated readme for examples.


**[Signed
commits](../CONTRIBUTING.md#signing-off-on-commits-developer-certificate-of-origin)**
- [X] Yes, I signed my commits.
 

<!--
Thank you for contributing to LocalAI! 

Contributing Conventions:

1. Include descriptive PR titles with [<component-name>] prepended.
2. Build and test your changes before submitting a PR. 
3. Sign your commits

By following the community's contribution conventions upfront, the
review process will
be accelerated and your PR merged more quickly.
-->
2023-09-17 08:33:23 +02:00
Ettore Di Giacinto
31ed13094b Update README.md
Signed-off-by: Ettore Di Giacinto <mudler@users.noreply.github.com>
2023-09-16 23:00:42 +02:00
Ettore Di Giacinto
8ccf5b2044 feat(speculative-sampling): allow to specify a draft model in the model config (#1052)
**Description**

This PR fixes #1013.

It adds `draft_model` and `n_draft` to the model YAML config in order to
load models with speculative sampling. This should be compatible as well
with grammars.

example:

```yaml
backend: llama                                                                                                                                                                   
context_size: 1024                                                                                                                                                                        
name: my-model-name
parameters:
  model: foo-bar
n_draft: 16                                                                                                                                                                      
draft_model: model-name
```

---------

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2023-09-14 17:44:16 +02:00
renovate[bot]
247d85b523 fix(deps): update github.com/nomic-ai/gpt4all/gpt4all-bindings/golang digest to cf4eb53 (#1047)
[![Mend
Renovate](https://app.renovatebot.com/images/banner.svg)](https://renovatebot.com)

This PR contains the following updates:

| Package | Type | Update | Change |
|---|---|---|---|
|
[github.com/nomic-ai/gpt4all/gpt4all-bindings/golang](https://togithub.com/nomic-ai/gpt4all)
| require | digest | `f0735ef` -> `cf4eb53` |

---

### Configuration

📅 **Schedule**: Branch creation - At any time (no schedule defined),
Automerge - At any time (no schedule defined).

🚦 **Automerge**: Disabled by config. Please merge this manually once you
are satisfied.

♻ **Rebasing**: Whenever PR becomes conflicted, or you tick the
rebase/retry checkbox.

🔕 **Ignore**: Close this PR and you won't be reminded about this update
again.

---

- [ ] <!-- rebase-check -->If you want to rebase/retry this PR, check
this box

---

This PR has been generated by [Mend
Renovate](https://www.mend.io/free-developer-tools/renovate/). View
repository job log
[here](https://developer.mend.io/github/go-skynet/LocalAI).

<!--renovate-debug:eyJjcmVhdGVkSW5WZXIiOiIzNi44My4wIiwidXBkYXRlZEluVmVyIjoiMzYuODMuMCIsInRhcmdldEJyYW5jaCI6Im1hc3RlciJ9-->

Co-authored-by: renovate[bot] <29139614+renovate[bot]@users.noreply.github.com>
2023-09-14 10:41:07 +02:00
renovate[bot]
54688db994 chore(deps): update docker/metadata-action action to v5 (#1045)
[![Mend
Renovate](https://app.renovatebot.com/images/banner.svg)](https://renovatebot.com)

This PR contains the following updates:

| Package | Type | Update | Change |
|---|---|---|---|
| [docker/metadata-action](https://togithub.com/docker/metadata-action)
| action | major | `v4` -> `v5` |

---

### Release Notes

<details>
<summary>docker/metadata-action (docker/metadata-action)</summary>

### [`v5`](https://togithub.com/docker/metadata-action/compare/v4...v5)

[Compare
Source](https://togithub.com/docker/metadata-action/compare/v4...v5)

</details>

---

### Configuration

📅 **Schedule**: Branch creation - At any time (no schedule defined),
Automerge - At any time (no schedule defined).

🚦 **Automerge**: Disabled by config. Please merge this manually once you
are satisfied.

♻ **Rebasing**: Whenever PR becomes conflicted, or you tick the
rebase/retry checkbox.

🔕 **Ignore**: Close this PR and you won't be reminded about this update
again.

---

- [ ] <!-- rebase-check -->If you want to rebase/retry this PR, check
this box

---

This PR has been generated by [Mend
Renovate](https://www.mend.io/free-developer-tools/renovate/). View
repository job log
[here](https://developer.mend.io/github/go-skynet/LocalAI).

<!--renovate-debug:eyJjcmVhdGVkSW5WZXIiOiIzNi44My4wIiwidXBkYXRlZEluVmVyIjoiMzYuODMuMCIsInRhcmdldEJyYW5jaCI6Im1hc3RlciJ9-->

Co-authored-by: renovate[bot] <29139614+renovate[bot]@users.noreply.github.com>
2023-09-14 10:40:51 +02:00
ci-robbot [bot]
8590f5a599 ⬆️ Update go-skynet/go-llama.cpp (#1048)
Bump of go-skynet/go-llama.cpp version

Signed-off-by: GitHub <noreply@github.com>
Co-authored-by: mudler <mudler@users.noreply.github.com>
2023-09-14 10:40:36 +02:00
renovate[bot]
289d51c049 fix(deps): update github.com/go-skynet/go-llama.cpp digest to cc8a123 (#1041)
[![Mend
Renovate](https://app.renovatebot.com/images/banner.svg)](https://renovatebot.com)

This PR contains the following updates:

| Package | Type | Update | Change |
|---|---|---|---|
|
[github.com/go-skynet/go-llama.cpp](https://togithub.com/go-skynet/go-llama.cpp)
| require | digest | `4145bd5` -> `cc8a123` |

---

### Configuration

📅 **Schedule**: Branch creation - At any time (no schedule defined),
Automerge - At any time (no schedule defined).

🚦 **Automerge**: Disabled by config. Please merge this manually once you
are satisfied.

♻ **Rebasing**: Whenever PR becomes conflicted, or you tick the
rebase/retry checkbox.

🔕 **Ignore**: Close this PR and you won't be reminded about this update
again.

---

- [ ] <!-- rebase-check -->If you want to rebase/retry this PR, check
this box

---

This PR has been generated by [Mend
Renovate](https://www.mend.io/free-developer-tools/renovate/). View
repository job log
[here](https://developer.mend.io/github/go-skynet/LocalAI).

<!--renovate-debug:eyJjcmVhdGVkSW5WZXIiOiIzNi44My4wIiwidXBkYXRlZEluVmVyIjoiMzYuODMuMCIsInRhcmdldEJyYW5jaCI6Im1hc3RlciJ9-->

Co-authored-by: renovate[bot] <29139614+renovate[bot]@users.noreply.github.com>
2023-09-13 10:49:40 +00:00
renovate[bot]
813eaa867c chore(deps): update docker/login-action action to v3 (#1040)
[![Mend
Renovate](https://app.renovatebot.com/images/banner.svg)](https://renovatebot.com)

This PR contains the following updates:

| Package | Type | Update | Change |
|---|---|---|---|
| [docker/login-action](https://togithub.com/docker/login-action) |
action | major | `v2` -> `v3` |

---

### Release Notes

<details>
<summary>docker/login-action (docker/login-action)</summary>

### [`v3`](https://togithub.com/docker/login-action/compare/v2...v3)

[Compare
Source](https://togithub.com/docker/login-action/compare/v2...v3)

</details>

---

### Configuration

📅 **Schedule**: Branch creation - At any time (no schedule defined),
Automerge - At any time (no schedule defined).

🚦 **Automerge**: Disabled by config. Please merge this manually once you
are satisfied.

♻ **Rebasing**: Whenever PR becomes conflicted, or you tick the
rebase/retry checkbox.

🔕 **Ignore**: Close this PR and you won't be reminded about this update
again.

---

- [ ] <!-- rebase-check -->If you want to rebase/retry this PR, check
this box

---

This PR has been generated by [Mend
Renovate](https://www.mend.io/free-developer-tools/renovate/). View
repository job log
[here](https://developer.mend.io/github/go-skynet/LocalAI).

<!--renovate-debug:eyJjcmVhdGVkSW5WZXIiOiIzNi44My4wIiwidXBkYXRlZEluVmVyIjoiMzYuODMuMCIsInRhcmdldEJyYW5jaCI6Im1hc3RlciJ9-->

Co-authored-by: renovate[bot] <29139614+renovate[bot]@users.noreply.github.com>
2023-09-13 09:17:50 +02:00
renovate[bot]
abffb16292 chore(deps): update docker/build-push-action action to v5 (#1039)
[![Mend
Renovate](https://app.renovatebot.com/images/banner.svg)](https://renovatebot.com)

This PR contains the following updates:

| Package | Type | Update | Change |
|---|---|---|---|
|
[docker/build-push-action](https://togithub.com/docker/build-push-action)
| action | major | `v4` -> `v5` |

---

### Release Notes

<details>
<summary>docker/build-push-action (docker/build-push-action)</summary>

###
[`v5`](https://togithub.com/docker/build-push-action/compare/v4...v5)

[Compare
Source](https://togithub.com/docker/build-push-action/compare/v4...v5)

</details>

---

### Configuration

📅 **Schedule**: Branch creation - At any time (no schedule defined),
Automerge - At any time (no schedule defined).

🚦 **Automerge**: Disabled by config. Please merge this manually once you
are satisfied.

♻ **Rebasing**: Whenever PR becomes conflicted, or you tick the
rebase/retry checkbox.

🔕 **Ignore**: Close this PR and you won't be reminded about this update
again.

---

- [ ] <!-- rebase-check -->If you want to rebase/retry this PR, check
this box

---

This PR has been generated by [Mend
Renovate](https://www.mend.io/free-developer-tools/renovate/). View
repository job log
[here](https://developer.mend.io/github/go-skynet/LocalAI).

<!--renovate-debug:eyJjcmVhdGVkSW5WZXIiOiIzNi44My4wIiwidXBkYXRlZEluVmVyIjoiMzYuODMuMCIsInRhcmdldEJyYW5jaCI6Im1hc3RlciJ9-->

Co-authored-by: renovate[bot] <29139614+renovate[bot]@users.noreply.github.com>
2023-09-13 09:17:28 +02:00
renovate[bot]
50e439f633 fix(deps): update module github.com/sashabaranov/go-openai to v1.15.3 (#1035)
[![Mend
Renovate](https://app.renovatebot.com/images/banner.svg)](https://renovatebot.com)

This PR contains the following updates:

| Package | Type | Update | Change |
|---|---|---|---|
|
[github.com/sashabaranov/go-openai](https://togithub.com/sashabaranov/go-openai)
| require | patch | `v1.15.2` -> `v1.15.3` |

---

### Release Notes

<details>
<summary>sashabaranov/go-openai
(github.com/sashabaranov/go-openai)</summary>

###
[`v1.15.3`](https://togithub.com/sashabaranov/go-openai/releases/tag/v1.15.3)

[Compare
Source](https://togithub.com/sashabaranov/go-openai/compare/v1.15.2...v1.15.3)

#### What's Changed

- Chore Support base64 embedding format by
[@&#8203;henomis](https://togithub.com/henomis) in
[https://github.com/sashabaranov/go-openai/pull/485](https://togithub.com/sashabaranov/go-openai/pull/485)

**Full Changelog**:
https://github.com/sashabaranov/go-openai/compare/v1.15.2...v1.15.3

</details>

---

### Configuration

📅 **Schedule**: Branch creation - At any time (no schedule defined),
Automerge - At any time (no schedule defined).

🚦 **Automerge**: Disabled by config. Please merge this manually once you
are satisfied.

♻ **Rebasing**: Whenever PR becomes conflicted, or you tick the
rebase/retry checkbox.

🔕 **Ignore**: Close this PR and you won't be reminded about this update
again.

---

- [ ] <!-- rebase-check -->If you want to rebase/retry this PR, check
this box

---

This PR has been generated by [Mend
Renovate](https://www.mend.io/free-developer-tools/renovate/). View
repository job log
[here](https://developer.mend.io/github/go-skynet/LocalAI).

<!--renovate-debug:eyJjcmVhdGVkSW5WZXIiOiIzNi44My4wIiwidXBkYXRlZEluVmVyIjoiMzYuODMuMCIsInRhcmdldEJyYW5jaCI6Im1hc3RlciJ9-->

Co-authored-by: renovate[bot] <29139614+renovate[bot]@users.noreply.github.com>
2023-09-13 09:17:11 +02:00
renovate[bot]
25eb1415df fix(deps): update github.com/nomic-ai/gpt4all/gpt4all-bindings/golang digest to f0735ef (#1034)
[![Mend
Renovate](https://app.renovatebot.com/images/banner.svg)](https://renovatebot.com)

This PR contains the following updates:

| Package | Type | Update | Change |
|---|---|---|---|
|
[github.com/nomic-ai/gpt4all/gpt4all-bindings/golang](https://togithub.com/nomic-ai/gpt4all)
| require | digest | `b6e38d6` -> `f0735ef` |

---

### Configuration

📅 **Schedule**: Branch creation - At any time (no schedule defined),
Automerge - At any time (no schedule defined).

🚦 **Automerge**: Disabled by config. Please merge this manually once you
are satisfied.

♻ **Rebasing**: Whenever PR becomes conflicted, or you tick the
rebase/retry checkbox.

🔕 **Ignore**: Close this PR and you won't be reminded about this update
again.

---

- [ ] <!-- rebase-check -->If you want to rebase/retry this PR, check
this box

---

This PR has been generated by [Mend
Renovate](https://www.mend.io/free-developer-tools/renovate/). View
repository job log
[here](https://developer.mend.io/github/go-skynet/LocalAI).

<!--renovate-debug:eyJjcmVhdGVkSW5WZXIiOiIzNi44My4wIiwidXBkYXRlZEluVmVyIjoiMzYuODMuMCIsInRhcmdldEJyYW5jaCI6Im1hc3RlciJ9-->

Co-authored-by: renovate[bot] <29139614+renovate[bot]@users.noreply.github.com>
2023-09-13 09:16:52 +02:00
ci-robbot [bot]
0b28220f2b ⬆️ Update go-skynet/go-llama.cpp (#1043)
Bump of go-skynet/go-llama.cpp version

Signed-off-by: GitHub <noreply@github.com>
Co-authored-by: mudler <mudler@users.noreply.github.com>
2023-09-13 09:16:33 +02:00
renovate[bot]
5661740990 fix(deps): update github.com/tmc/langchaingo digest to 9c8845b (#1029)
[![Mend
Renovate](https://app.renovatebot.com/images/banner.svg)](https://renovatebot.com)

This PR contains the following updates:

| Package | Type | Update | Change |
|---|---|---|---|
| [github.com/tmc/langchaingo](https://togithub.com/tmc/langchaingo) |
require | digest | `c85d396` -> `9c8845b` |

---

### Configuration

📅 **Schedule**: Branch creation - At any time (no schedule defined),
Automerge - At any time (no schedule defined).

🚦 **Automerge**: Disabled by config. Please merge this manually once you
are satisfied.

♻ **Rebasing**: Whenever PR becomes conflicted, or you tick the
rebase/retry checkbox.

🔕 **Ignore**: Close this PR and you won't be reminded about this update
again.

---

- [ ] <!-- rebase-check -->If you want to rebase/retry this PR, check
this box

---

This PR has been generated by [Mend
Renovate](https://www.mend.io/free-developer-tools/renovate/). View
repository job log
[here](https://developer.mend.io/github/go-skynet/LocalAI).

<!--renovate-debug:eyJjcmVhdGVkSW5WZXIiOiIzNi44My4wIiwidXBkYXRlZEluVmVyIjoiMzYuODMuMCIsInRhcmdldEJyYW5jaCI6Im1hc3RlciJ9-->

Co-authored-by: renovate[bot] <29139614+renovate[bot]@users.noreply.github.com>
2023-09-11 09:43:11 +02:00
ci-robbot [bot]
255c31bddf ⬆️ Update go-skynet/go-llama.cpp (#1027)
Bump of go-skynet/go-llama.cpp version

Signed-off-by: GitHub <noreply@github.com>
Co-authored-by: mudler <mudler@users.noreply.github.com>
2023-09-11 09:42:54 +02:00
Ettore Di Giacinto
7888fefeea docs: Update README 2023-09-10 09:21:47 +02:00
renovate[bot]
0937835802 fix(deps): update github.com/go-skynet/go-llama.cpp digest to 4145bd5 (#1025)
[![Mend
Renovate](https://app.renovatebot.com/images/banner.svg)](https://renovatebot.com)

This PR contains the following updates:

| Package | Type | Update | Change |
|---|---|---|---|
|
[github.com/go-skynet/go-llama.cpp](https://togithub.com/go-skynet/go-llama.cpp)
| require | digest | `05dc4b6` -> `4145bd5` |

---

### Configuration

📅 **Schedule**: Branch creation - At any time (no schedule defined),
Automerge - At any time (no schedule defined).

🚦 **Automerge**: Disabled by config. Please merge this manually once you
are satisfied.

♻ **Rebasing**: Whenever PR becomes conflicted, or you tick the
rebase/retry checkbox.

🔕 **Ignore**: Close this PR and you won't be reminded about this update
again.

---

- [ ] <!-- rebase-check -->If you want to rebase/retry this PR, check
this box

---

This PR has been generated by [Mend
Renovate](https://www.mend.io/free-developer-tools/renovate/). View
repository job log
[here](https://developer.mend.io/github/go-skynet/LocalAI).

<!--renovate-debug:eyJjcmVhdGVkSW5WZXIiOiIzNi44My4wIiwidXBkYXRlZEluVmVyIjoiMzYuODMuMCIsInRhcmdldEJyYW5jaCI6Im1hc3RlciJ9-->

Co-authored-by: renovate[bot] <29139614+renovate[bot]@users.noreply.github.com>
2023-09-10 09:19:03 +02:00
renovate[bot]
ea806b37ac fix(deps): update module github.com/sashabaranov/go-openai to v1.15.2 (#1022)
[![Mend
Renovate](https://app.renovatebot.com/images/banner.svg)](https://renovatebot.com)

This PR contains the following updates:

| Package | Type | Update | Change |
|---|---|---|---|
|
[github.com/sashabaranov/go-openai](https://togithub.com/sashabaranov/go-openai)
| require | patch | `v1.15.1` -> `v1.15.2` |

---

### Release Notes

<details>
<summary>sashabaranov/go-openai
(github.com/sashabaranov/go-openai)</summary>

###
[`v1.15.2`](https://togithub.com/sashabaranov/go-openai/releases/tag/v1.15.2)

[Compare
Source](https://togithub.com/sashabaranov/go-openai/compare/v1.15.1...v1.15.2)

#### What's Changed

- Update OpenAPI file return struct by
[@&#8203;NullpointerW](https://togithub.com/NullpointerW) in
[https://github.com/sashabaranov/go-openai/pull/486](https://togithub.com/sashabaranov/go-openai/pull/486)

**Full Changelog**:
https://github.com/sashabaranov/go-openai/compare/v1.15.1...v1.15.2

</details>

---

### Configuration

📅 **Schedule**: Branch creation - At any time (no schedule defined),
Automerge - At any time (no schedule defined).

🚦 **Automerge**: Disabled by config. Please merge this manually once you
are satisfied.

♻ **Rebasing**: Whenever PR becomes conflicted, or you tick the
rebase/retry checkbox.

🔕 **Ignore**: Close this PR and you won't be reminded about this update
again.

---

- [ ] <!-- rebase-check -->If you want to rebase/retry this PR, check
this box

---

This PR has been generated by [Mend
Renovate](https://www.mend.io/free-developer-tools/renovate/). View
repository job log
[here](https://developer.mend.io/github/go-skynet/LocalAI).

<!--renovate-debug:eyJjcmVhdGVkSW5WZXIiOiIzNi44My4wIiwidXBkYXRlZEluVmVyIjoiMzYuODMuMCIsInRhcmdldEJyYW5jaCI6Im1hc3RlciJ9-->

Co-authored-by: renovate[bot] <29139614+renovate[bot]@users.noreply.github.com>
2023-09-10 09:18:28 +02:00
Ettore Di Giacinto
d6614f3149 feat(vllm): Initial vllm backend implementation (#1026)
Related to: https://github.com/go-skynet/LocalAI/issues/1015
2023-09-10 09:17:55 +02:00
Ettore Di Giacinto
9a50a39848 doc(README): update 2023-09-09 19:28:07 +02:00
Ettore Di Giacinto
2793e8f327 doc(citation): Add citation block
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2023-09-09 19:17:19 +02:00
Ettore Di Giacinto
c0bb5c4bf6 feat(vllm): Initial vllm backend implementation
Related to: https://github.com/go-skynet/LocalAI/issues/1015

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2023-09-09 17:03:23 +02:00
Ettore Di Giacinto
cc74fc93b4 feat(llama.cpp): update (#1024)
**Description**

This PR fixes #

**Notes for Reviewers**


**[Signed
commits](../CONTRIBUTING.md#signing-off-on-commits-developer-certificate-of-origin)**
- [ ] Yes, I signed my commits.
 

<!--
Thank you for contributing to LocalAI! 

Contributing Conventions:

1. Include descriptive PR titles with [<component-name>] prepended.
2. Build and test your changes before submitting a PR. 
3. Sign your commits

By following the community's contribution conventions upfront, the
review process will
be accelerated and your PR merged more quickly.
-->

---------

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2023-09-08 18:38:22 +02:00
renovate[bot]
44b39195d6 fix(deps): update github.com/go-skynet/go-llama.cpp digest to 05dc4b6 (#1004)
[![Mend
Renovate](https://app.renovatebot.com/images/banner.svg)](https://renovatebot.com)

This PR contains the following updates:

| Package | Type | Update | Change |
|---|---|---|---|
|
[github.com/go-skynet/go-llama.cpp](https://togithub.com/go-skynet/go-llama.cpp)
| require | digest | `d8c8547` -> `05dc4b6` |

---

### Configuration

📅 **Schedule**: Branch creation - At any time (no schedule defined),
Automerge - At any time (no schedule defined).

🚦 **Automerge**: Disabled by config. Please merge this manually once you
are satisfied.

♻ **Rebasing**: Whenever PR becomes conflicted, or you tick the
rebase/retry checkbox.

🔕 **Ignore**: Close this PR and you won't be reminded about this update
again.

---

- [ ] <!-- rebase-check -->If you want to rebase/retry this PR, check
this box

---

This PR has been generated by [Mend
Renovate](https://www.mend.io/free-developer-tools/renovate/). View
repository job log
[here](https://developer.mend.io/github/go-skynet/LocalAI).

<!--renovate-debug:eyJjcmVhdGVkSW5WZXIiOiIzNi42OC4xIiwidXBkYXRlZEluVmVyIjoiMzYuODMuMCIsInRhcmdldEJyYW5jaCI6Im1hc3RlciJ9-->

Co-authored-by: renovate[bot] <29139614+renovate[bot]@users.noreply.github.com>
2023-09-08 12:39:17 +02:00
Robert Deaton
2454110d81 Update README to reflect changes in Continue's config file (#1014)
**Description**

OpenAIServerInfo no longer exists, and api_base has been moved up.
Changes were made here
8967e2d53f (diff-98e147eaa7c9936befdddabb16c72447fbf8ad2df6b680c5176c24813169858e)

Signed-off-by: Robert Deaton <rdeaton@platipy.org>
2023-09-07 16:29:07 +02:00
Ettore Di Giacinto
ee59e7d45f fix(vall-e-x): make audiopath relative to models (#1012)
**Description**

This PR fixes #

**Notes for Reviewers**


**[Signed
commits](../CONTRIBUTING.md#signing-off-on-commits-developer-certificate-of-origin)**
- [ ] Yes, I signed my commits.
 

<!--
Thank you for contributing to LocalAI! 

Contributing Conventions:

1. Include descriptive PR titles with [<component-name>] prepended.
2. Build and test your changes before submitting a PR. 
3. Sign your commits

By following the community's contribution conventions upfront, the
review process will
be accelerated and your PR merged more quickly.
-->

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2023-09-05 19:33:36 +02:00
Ettore Di Giacinto
605c319157 feat(diffusers): don't set seed in params and respect device (#1010)
**Description**

Follow up of #998 - respect the device used to load the model and do not
specify a seed in the parameters, but rather just configure the
generator as described in
https://huggingface.co/docs/diffusers/using-diffusers/reusing_seeds

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2023-09-04 19:38:38 +02:00
Ettore Di Giacinto
dc307a1cc0 feat: add vall-e-x (#1007)
**Description**

This PR fixes #985 

**Notes for Reviewers**


**[Signed
commits](../CONTRIBUTING.md#signing-off-on-commits-developer-certificate-of-origin)**
- [ ] Yes, I signed my commits.
 

<!--
Thank you for contributing to LocalAI! 

Contributing Conventions:

1. Include descriptive PR titles with [<component-name>] prepended.
2. Build and test your changes before submitting a PR. 
3. Sign your commits

By following the community's contribution conventions upfront, the
review process will
be accelerated and your PR merged more quickly.
-->

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2023-09-04 19:25:23 +02:00
quoing
e7981152b2 [query_data example] max_chunk_overlap in PromptHelper must be in 0..1 range (#1000)
**Description**

Simple fix, percentage value is expected to be float in range 0..1

**Notes for Reviewers**


**[Signed
commits](../CONTRIBUTING.md#signing-off-on-commits-developer-certificate-of-origin)**
- [x] Yes, I signed my commits.
 

<!--
Thank you for contributing to LocalAI! 

Contributing Conventions:

1. Include descriptive PR titles with [<component-name>] prepended.
2. Build and test your changes before submitting a PR. 
3. Sign your commits

By following the community's contribution conventions upfront, the
review process will
be accelerated and your PR merged more quickly.
-->
2023-09-04 19:12:53 +02:00
ci-robbot [bot]
b3eb5c860b ⬆️ Update go-skynet/go-llama.cpp (#1005)
Bump of go-skynet/go-llama.cpp version

Signed-off-by: GitHub <noreply@github.com>
Co-authored-by: mudler <mudler@users.noreply.github.com>
2023-09-04 19:11:41 +02:00
Bo-Yi Wu
1c2f7409e3 chore(deps): remove unused package (#1003)
**Description**

Just remove Golang unused package and update the format in Makefile

Signed-off-by: appleboy <appleboy.tw@gmail.com>
2023-09-04 19:11:28 +02:00
renovate[bot]
57d41a3f94 fix(deps): update module github.com/gofiber/fiber/v2 to v2.49.1 (#1001)
[![Mend
Renovate](https://app.renovatebot.com/images/banner.svg)](https://renovatebot.com)

This PR contains the following updates:

| Package | Type | Update | Change |
|---|---|---|---|
| [github.com/gofiber/fiber/v2](https://togithub.com/gofiber/fiber) |
require | patch | `v2.49.0` -> `v2.49.1` |

---

### Release Notes

<details>
<summary>gofiber/fiber (github.com/gofiber/fiber/v2)</summary>

### [`v2.49.1`](https://togithub.com/gofiber/fiber/releases/tag/v2.49.1)

[Compare
Source](https://togithub.com/gofiber/fiber/compare/v2.49.0...v2.49.1)

#### 🧹 Updates

- Bump github.com/valyala/fasthttp from 1.48.0 to 1.49.0
([#&#8203;2615](https://togithub.com/gofiber/fiber/issues/2615))

#### 🐛 Fixes

- Rollback changes to go.mod file
([#&#8203;2614](https://togithub.com/gofiber/fiber/issues/2614))

#### 📚 Documentation

- Add Polish translation - README_pl.md
([#&#8203;2613](https://togithub.com/gofiber/fiber/issues/2613))
- Update README_ko.md
([#&#8203;2605](https://togithub.com/gofiber/fiber/issues/2605))

**Full Changelog**:
https://github.com/gofiber/fiber/compare/v2.49.0...v2.49.1

Thank you [@&#8203;KompocikDot](https://togithub.com/KompocikDot),
[@&#8203;LimJiAn](https://togithub.com/LimJiAn) and
[@&#8203;gaby](https://togithub.com/gaby) for making this update
possible.

</details>

---

### Configuration

📅 **Schedule**: Branch creation - At any time (no schedule defined),
Automerge - At any time (no schedule defined).

🚦 **Automerge**: Disabled by config. Please merge this manually once you
are satisfied.

♻ **Rebasing**: Whenever PR becomes conflicted, or you tick the
rebase/retry checkbox.

🔕 **Ignore**: Close this PR and you won't be reminded about this update
again.

---

- [ ] <!-- rebase-check -->If you want to rebase/retry this PR, check
this box

---

This PR has been generated by [Mend
Renovate](https://www.mend.io/free-developer-tools/renovate/). View
repository job log
[here](https://developer.mend.io/github/go-skynet/LocalAI).

<!--renovate-debug:eyJjcmVhdGVkSW5WZXIiOiIzNi42OC4xIiwidXBkYXRlZEluVmVyIjoiMzYuNzguOCIsInRhcmdldEJyYW5jaCI6Im1hc3RlciJ9-->

Co-authored-by: renovate[bot] <29139614+renovate[bot]@users.noreply.github.com>
2023-09-04 19:11:09 +02:00
Max Cohen
f9d2bd24eb Allow to manually set the seed for the SD pipeline (#998)
**Description**

Enable setting the seed for the stable diffusion pipeline. This is done
through an additional `seed` parameter in the request, such as:

```bash
curl http://localhost:8080/v1/images/generations \
    -H "Content-Type: application/json" \
    -d '{"model": "stablediffusion", "prompt": "prompt", "n": 1, "step": 51, "size": "512x512", "seed": 3}'
```

**Notes for Reviewers**
When the `seed` parameter is not sent, `request.seed` defaults to `0`,
making it difficult to detect an actual seed of `0`. Is there a way to
change the default to `-1` for instance ?

**[Signed
commits](../CONTRIBUTING.md#signing-off-on-commits-developer-certificate-of-origin)**
- [x] Yes, I signed my commits.
 

<!--
Thank you for contributing to LocalAI! 

Contributing Conventions:

1. Include descriptive PR titles with [<component-name>] prepended.
2. Build and test your changes before submitting a PR. 
3. Sign your commits

By following the community's contribution conventions upfront, the
review process will
be accelerated and your PR merged more quickly.
-->
2023-09-04 19:10:55 +02:00
ci-robbot [bot]
0e7e8eec53 ⬆️ Update go-skynet/go-llama.cpp (#1002)
Bump of go-skynet/go-llama.cpp version

Signed-off-by: GitHub <noreply@github.com>
Co-authored-by: mudler <mudler@users.noreply.github.com>
2023-09-03 10:00:01 +02:00
renovate[bot]
9a30a246d8 fix(deps): update github.com/go-skynet/go-llama.cpp digest to d8c8547 (#997)
[![Mend
Renovate](https://app.renovatebot.com/images/banner.svg)](https://renovatebot.com)

This PR contains the following updates:

| Package | Type | Update | Change |
|---|---|---|---|
|
[github.com/go-skynet/go-llama.cpp](https://togithub.com/go-skynet/go-llama.cpp)
| require | digest | `c5622a8` -> `d8c8547` |

---

### Configuration

📅 **Schedule**: Branch creation - At any time (no schedule defined),
Automerge - At any time (no schedule defined).

🚦 **Automerge**: Disabled by config. Please merge this manually once you
are satisfied.

♻ **Rebasing**: Whenever PR becomes conflicted, or you tick the
rebase/retry checkbox.

🔕 **Ignore**: Close this PR and you won't be reminded about this update
again.

---

- [ ] <!-- rebase-check -->If you want to rebase/retry this PR, check
this box

---

This PR has been generated by [Mend
Renovate](https://www.mend.io/free-developer-tools/renovate/). View
repository job log
[here](https://developer.mend.io/github/go-skynet/LocalAI).

<!--renovate-debug:eyJjcmVhdGVkSW5WZXIiOiIzNi42OC4xIiwidXBkYXRlZEluVmVyIjoiMzYuNjguMSIsInRhcmdldEJyYW5jaCI6Im1hc3RlciJ9-->

Co-authored-by: renovate[bot] <29139614+renovate[bot]@users.noreply.github.com>
2023-09-02 12:31:12 +00:00
ci-robbot [bot]
c332499252 ⬆️ Update go-skynet/go-llama.cpp (#996)
Bump of go-skynet/go-llama.cpp version

Signed-off-by: GitHub <noreply@github.com>
Co-authored-by: mudler <mudler@users.noreply.github.com>
2023-09-02 09:54:50 +02:00
Dave
005f289632 feat: Model Gallery Endpoint Refactor / Mutable Galleries Endpoints (#991)
refactor for model gallery endpoints - bundle up resources into a
struct, make galleries mutable with some crud endpoints. This is
groundwork required for making efficient use of the new scraper - while
that PR isn't _quite_ ready yet, the goal is to have more, individually
smaller gallery files. Therefore, rather than requiring a full localai
service restart, these new endpoints have been added to make life
easier.

- Adds endpoints to add, list and remove model galleries at runtime
- Adds these endpoints to the Insomnia config
- Minor fix: loading file urls follows symbolic links now
2023-09-02 09:00:44 +02:00
renovate[bot]
3d7553317f fix(deps): update github.com/go-skynet/go-llama.cpp digest to c5622a8 (#992)
[![Mend
Renovate](https://app.renovatebot.com/images/banner.svg)](https://renovatebot.com)

This PR contains the following updates:

| Package | Type | Update | Change |
|---|---|---|---|
|
[github.com/go-skynet/go-llama.cpp](https://togithub.com/go-skynet/go-llama.cpp)
| require | digest | `bf3f946` -> `c5622a8` |

---

### Configuration

📅 **Schedule**: Branch creation - At any time (no schedule defined),
Automerge - At any time (no schedule defined).

🚦 **Automerge**: Disabled by config. Please merge this manually once you
are satisfied.

♻ **Rebasing**: Whenever PR becomes conflicted, or you tick the
rebase/retry checkbox.

🔕 **Ignore**: Close this PR and you won't be reminded about this update
again.

---

- [ ] <!-- rebase-check -->If you want to rebase/retry this PR, check
this box

---

This PR has been generated by [Mend
Renovate](https://www.mend.io/free-developer-tools/renovate/). View
repository job log
[here](https://developer.mend.io/github/go-skynet/LocalAI).

<!--renovate-debug:eyJjcmVhdGVkSW5WZXIiOiIzNi42OC4xIiwidXBkYXRlZEluVmVyIjoiMzYuNjguMSIsInRhcmdldEJyYW5jaCI6Im1hc3RlciJ9-->

Co-authored-by: renovate[bot] <29139614+renovate[bot]@users.noreply.github.com>
2023-09-02 08:58:16 +02:00
renovate[bot]
8e4f6b2ee5 fix(deps): update github.com/nomic-ai/gpt4all/gpt4all-bindings/golang digest to b6e38d6 (#988)
[![Mend
Renovate](https://app.renovatebot.com/images/banner.svg)](https://renovatebot.com)

This PR contains the following updates:

| Package | Type | Update | Change |
|---|---|---|---|
|
[github.com/nomic-ai/gpt4all/gpt4all-bindings/golang](https://togithub.com/nomic-ai/gpt4all)
| require | digest | `27a8b02` -> `b6e38d6` |

---

### Configuration

📅 **Schedule**: Branch creation - At any time (no schedule defined),
Automerge - At any time (no schedule defined).

🚦 **Automerge**: Disabled by config. Please merge this manually once you
are satisfied.

♻ **Rebasing**: Whenever PR becomes conflicted, or you tick the
rebase/retry checkbox.

🔕 **Ignore**: Close this PR and you won't be reminded about this update
again.

---

- [ ] <!-- rebase-check -->If you want to rebase/retry this PR, check
this box

---

This PR has been generated by [Mend
Renovate](https://www.mend.io/free-developer-tools/renovate/). View
repository job log
[here](https://developer.mend.io/github/go-skynet/LocalAI).

<!--renovate-debug:eyJjcmVhdGVkSW5WZXIiOiIzNi42OC4xIiwidXBkYXRlZEluVmVyIjoiMzYuNjguMSIsInRhcmdldEJyYW5jaCI6Im1hc3RlciJ9-->

Co-authored-by: renovate[bot] <29139614+renovate[bot]@users.noreply.github.com>
2023-09-02 08:56:43 +02:00
renovate[bot]
d5cad7d3ae fix(deps): update module github.com/shirou/gopsutil/v3 to v3.23.8 (#989)
[![Mend
Renovate](https://app.renovatebot.com/images/banner.svg)](https://renovatebot.com)

This PR contains the following updates:

| Package | Type | Update | Change |
|---|---|---|---|
| [github.com/shirou/gopsutil/v3](https://togithub.com/shirou/gopsutil)
| require | patch | `v3.23.7` -> `v3.23.8` |

---

### Release Notes

<details>
<summary>shirou/gopsutil (github.com/shirou/gopsutil/v3)</summary>

###
[`v3.23.8`](https://togithub.com/shirou/gopsutil/releases/tag/v3.23.8)

[Compare
Source](https://togithub.com/shirou/gopsutil/compare/v3.23.7...v3.23.8)

<!-- Release notes generated using configuration in .github/release.yml
at v3.23.8 -->

#### What's Changed

[#&#8203;1514](https://togithub.com/shirou/gopsutil/issues/1514)
improves `Processes()` performance 6% or more. Thank you
[@&#8203;atoulme](https://togithub.com/atoulme) !

##### cpu

- Enable setting of vendor and related information for all Power
versions by [@&#8203;kishen-v](https://togithub.com/kishen-v) in
[https://github.com/shirou/gopsutil/pull/1495](https://togithub.com/shirou/gopsutil/pull/1495)
- chore: change CIRCLECI environment variable to CI. by
[@&#8203;shirou](https://togithub.com/shirou) in
[https://github.com/shirou/gopsutil/pull/1518](https://togithub.com/shirou/gopsutil/pull/1518)

##### disk

- fix: fixed windows disk package leaks by
[@&#8203;ozanh](https://togithub.com/ozanh) in
[https://github.com/shirou/gopsutil/pull/1501](https://togithub.com/shirou/gopsutil/pull/1501)
- fix IOCounters() SerialNumber enumeration by
[@&#8203;gdvalle](https://togithub.com/gdvalle) in
[https://github.com/shirou/gopsutil/pull/1508](https://togithub.com/shirou/gopsutil/pull/1508)

##### host

- \[host]\[linux]: remove double quote from lsb release info by
[@&#8203;shirou](https://togithub.com/shirou) in
[https://github.com/shirou/gopsutil/pull/1504](https://togithub.com/shirou/gopsutil/pull/1504)

##### mem

- mem: linux: fix vmstat field names by
[@&#8203;chouquette](https://togithub.com/chouquette) in
[https://github.com/shirou/gopsutil/pull/1498](https://togithub.com/shirou/gopsutil/pull/1498)

##### process

- Fix Processes() calls with many cores by
[@&#8203;atoulme](https://togithub.com/atoulme) in
[https://github.com/shirou/gopsutil/pull/1514](https://togithub.com/shirou/gopsutil/pull/1514)

#### New Contributors

- [@&#8203;kishen-v](https://togithub.com/kishen-v) made their first
contribution in
[https://github.com/shirou/gopsutil/pull/1495](https://togithub.com/shirou/gopsutil/pull/1495)
- [@&#8203;chouquette](https://togithub.com/chouquette) made their first
contribution in
[https://github.com/shirou/gopsutil/pull/1498](https://togithub.com/shirou/gopsutil/pull/1498)
- [@&#8203;ozanh](https://togithub.com/ozanh) made their first
contribution in
[https://github.com/shirou/gopsutil/pull/1501](https://togithub.com/shirou/gopsutil/pull/1501)
- [@&#8203;gdvalle](https://togithub.com/gdvalle) made their first
contribution in
[https://github.com/shirou/gopsutil/pull/1508](https://togithub.com/shirou/gopsutil/pull/1508)

**Full Changelog**:
https://github.com/shirou/gopsutil/compare/v3.23.7...v3.23.8

</details>

---

### Configuration

📅 **Schedule**: Branch creation - At any time (no schedule defined),
Automerge - At any time (no schedule defined).

🚦 **Automerge**: Disabled by config. Please merge this manually once you
are satisfied.

♻ **Rebasing**: Whenever PR becomes conflicted, or you tick the
rebase/retry checkbox.

🔕 **Ignore**: Close this PR and you won't be reminded about this update
again.

---

- [ ] <!-- rebase-check -->If you want to rebase/retry this PR, check
this box

---

This PR has been generated by [Mend
Renovate](https://www.mend.io/free-developer-tools/renovate/). View
repository job log
[here](https://developer.mend.io/github/go-skynet/LocalAI).

<!--renovate-debug:eyJjcmVhdGVkSW5WZXIiOiIzNi42OC4xIiwidXBkYXRlZEluVmVyIjoiMzYuNjguMSIsInRhcmdldEJyYW5jaCI6Im1hc3RlciJ9-->

Co-authored-by: renovate[bot] <29139614+renovate[bot]@users.noreply.github.com>
2023-09-01 13:22:53 -04:00
Jirubizu
355e9d4fb5 [API] expose all the jobs via /models/jobs endpoint (#983)
**Description**

This PR fixes #


**Notes for Reviewers**


**[Signed
commits](../CONTRIBUTING.md#signing-off-on-commits-developer-certificate-of-origin)**
- [ ] Yes, I signed my commits.
 

<!--
Thank you for contributing to LocalAI! 

Contributing Conventions:

1. Include descriptive PR titles with [<component-name>] prepended.
2. Build and test your changes before submitting a PR. 
3. Sign your commits

By following the community's contribution conventions upfront, the
review process will
be accelerated and your PR merged more quickly.
-->

Co-authored-by: Jirubizu <jirubizu@jirubizu.cc>
2023-08-31 15:03:03 +00:00
renovate[bot]
629185e10a fix(deps): update module github.com/sashabaranov/go-openai to v1.15.1 (#984)
[![Mend
Renovate](https://app.renovatebot.com/images/banner.svg)](https://renovatebot.com)

This PR contains the following updates:

| Package | Type | Update | Change |
|---|---|---|---|
|
[github.com/sashabaranov/go-openai](https://togithub.com/sashabaranov/go-openai)
| require | minor | `v1.14.2` -> `v1.15.1` |

---

### Release Notes

<details>
<summary>sashabaranov/go-openai
(github.com/sashabaranov/go-openai)</summary>

###
[`v1.15.1`](https://togithub.com/sashabaranov/go-openai/releases/tag/v1.15.1)

[Compare
Source](https://togithub.com/sashabaranov/go-openai/compare/v1.14.2...v1.15.1)

#### What's Changed

- Chore Deprecate legacy fine tunes API by
[@&#8203;henomis](https://togithub.com/henomis) in
[https://github.com/sashabaranov/go-openai/pull/484](https://togithub.com/sashabaranov/go-openai/pull/484)

**Full Changelog**:
https://github.com/sashabaranov/go-openai/compare/v1.15...v1.15.1

</details>

---

### Configuration

📅 **Schedule**: Branch creation - At any time (no schedule defined),
Automerge - At any time (no schedule defined).

🚦 **Automerge**: Disabled by config. Please merge this manually once you
are satisfied.

♻ **Rebasing**: Whenever PR becomes conflicted, or you tick the
rebase/retry checkbox.

🔕 **Ignore**: Close this PR and you won't be reminded about this update
again.

---

- [ ] <!-- rebase-check -->If you want to rebase/retry this PR, check
this box

---

This PR has been generated by [Mend
Renovate](https://www.mend.io/free-developer-tools/renovate/). View
repository job log
[here](https://developer.mend.io/github/go-skynet/LocalAI).

<!--renovate-debug:eyJjcmVhdGVkSW5WZXIiOiIzNi42OC4xIiwidXBkYXRlZEluVmVyIjoiMzYuNjguMSIsInRhcmdldEJyYW5jaCI6Im1hc3RlciJ9-->

Co-authored-by: renovate[bot] <29139614+renovate[bot]@users.noreply.github.com>
2023-08-31 14:35:13 +00:00
Samuel Maynard
deeef5fc24 fix(utf8): prevent multi-byte utf8 characters from being mangled (#981)
**Description**

This PR fixes #677 using [suggested
solution](https://github.com/go-skynet/LocalAI/issues/677#issuecomment-1695939097)
from @yantoz

before:
```
❯ curl -N http://localhost:57541/v1/completions -H "Content-Type: application/json" -d '{
     "model": "ggml-model-q4_0.bin",
     "prompt": "",
     "max_tokens": 32,
     "temperature": 0.7,
     "stream": true
   }'
data: {"object":"text_completion","model":"ggml-model-q4_0.bin","choices":[{"text":"\ufffd"}],"usage":{"prompt_tokens":0,"completion_tokens":0,"total_tokens":0}}

data: {"object":"text_completion","model":"ggml-model-q4_0.bin","choices":[{"text":"\ufffd"}],"usage":{"prompt_tokens":0,"completion_tokens":0,"total_tokens":0}}

data: {"object":"text_completion","model":"ggml-model-q4_0.bin","choices":[{"text":"\ufffd"}],"usage":{"prompt_tokens":0,"completion_tokens":0,"total_tokens":0}}

data: {"object":"text_completion","model":"ggml-model-q4_0.bin","choices":[{"text":"\ufffd"}],"usage":{"prompt_tokens":0,"completion_tokens":0,"total_tokens":0}}

data: {"object":"text_completion","model":"ggml-model-q4_0.bin","choices":[{"text":" |"}],"usage":{"prompt_tokens":0,"completion_tokens":0,"total_tokens":0}}

data: {"object":"text_completion","model":"ggml-model-q4_0.bin","choices":[{"text":" I"}],"usage":{"prompt_tokens":0,"completion_tokens":0,"total_tokens":0}}

data: {"object":"text_completion","model":"ggml-model-q4_0.bin","choices":[{"text":"'"}],"usage":{"prompt_tokens":0,"completion_tokens":0,"total_tokens":0}}

data: {"object":"text_completion","model":"ggml-model-q4_0.bin","choices":[{"text":"m"}],"usage":{"prompt_tokens":0,"completion_tokens":0,"total_tokens":0}}
```

now:
```
❯ curl -N http://localhost:57541/v1/completions -H Content-Type: application/json -d {
   "model": "ggml-model-q4_0.bin",
   "prompt": "",
   "max_tokens": 32,
   "temperature": 0.7,
   "stream": true
 }
data: {"object":"text_completion","model":"ggml-model-q4_0.bin","choices":[{"index":0,"text":"😂"}],"usage":{"prompt_tokens":0,"completion_tokens":0,"total_tokens":0}}

data: {"object":"text_completion","model":"ggml-model-q4_0.bin","choices":[{"index":0,"text":" "}],"usage":{"prompt_tokens":0,"completion_tokens":0,"total_tokens":0}}

data: {"object":"text_completion","model":"ggml-model-q4_0.bin","choices":[{"index":0,"text":"|"}],"usage":{"prompt_tokens":0,"completion_tokens":0,"total_tokens":0}}

data: {"object":"text_completion","model":"ggml-model-q4_0.bin","choices":[{"index":0,"text":" "}],"usage":{"prompt_tokens":0,"completion_tokens":0,"total_tokens":0}}

data: {"object":"text_completion","model":"ggml-model-q4_0.bin","choices":[{"index":0,"text":"I"}],"usage":{"prompt_tokens":0,"completion_tokens":0,"total_tokens":0}}

data: {"object":"text_completion","model":"ggml-model-q4_0.bin","choices":[{"index":0,"text":"'"}],"usage":{"prompt_tokens":0,"completion_tokens":0,"total_tokens":0}}

data: {"object":"text_completion","model":"ggml-model-q4_0.bin","choices":[{"index":0,"text":"m"}],"usage":{"prompt_tokens":0,"completion_tokens":0,"total_tokens":0}}
```

**Notes for Reviewers**


**[Signed
commits](../CONTRIBUTING.md#signing-off-on-commits-developer-certificate-of-origin)**
- [X] Yes, I signed my commits.
 

<!--
Thank you for contributing to LocalAI! 

Contributing Conventions:

1. Include descriptive PR titles with [<component-name>] prepended.
2. Build and test your changes before submitting a PR. 
3. Sign your commits

By following the community's contribution conventions upfront, the
review process will
be accelerated and your PR merged more quickly.
-->

Co-authored-by: Ettore Di Giacinto <mudler@users.noreply.github.com>
2023-08-30 23:56:59 +00:00
renovate[bot]
b905c07650 fix(deps): update github.com/go-skynet/go-llama.cpp digest to bf3f946 (#979)
[![Mend
Renovate](https://app.renovatebot.com/images/banner.svg)](https://renovatebot.com)

This PR contains the following updates:

| Package | Type | Update | Change |
|---|---|---|---|
|
[github.com/go-skynet/go-llama.cpp](https://togithub.com/go-skynet/go-llama.cpp)
| require | digest | `9072315` -> `bf3f946` |

---

### Configuration

📅 **Schedule**: Branch creation - At any time (no schedule defined),
Automerge - At any time (no schedule defined).

🚦 **Automerge**: Disabled by config. Please merge this manually once you
are satisfied.

♻ **Rebasing**: Whenever PR becomes conflicted, or you tick the
rebase/retry checkbox.

🔕 **Ignore**: Close this PR and you won't be reminded about this update
again.

---

- [ ] <!-- rebase-check -->If you want to rebase/retry this PR, check
this box

---

This PR has been generated by [Mend
Renovate](https://www.mend.io/free-developer-tools/renovate/). View
repository job log
[here](https://developer.mend.io/github/go-skynet/LocalAI).

<!--renovate-debug:eyJjcmVhdGVkSW5WZXIiOiIzNi42OC4xIiwidXBkYXRlZEluVmVyIjoiMzYuNjguMSIsInRhcmdldEJyYW5jaCI6Im1hc3RlciJ9-->

Co-authored-by: renovate[bot] <29139614+renovate[bot]@users.noreply.github.com>
2023-08-30 23:02:19 +02:00
Ettore Di Giacinto
1ff30034e8 fix(deps): update go-llama.cpp (#980)
**Description**

This PR bumps llama.cpp (adding support to gguf v2) and changes the
default test model

---------

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2023-08-30 23:01:55 +02:00
renovate[bot]
c64b59c80c fix(deps): update module github.com/valyala/fasthttp to v1.49.0 (#971)
[![Mend
Renovate](https://app.renovatebot.com/images/banner.svg)](https://renovatebot.com)

This PR contains the following updates:

| Package | Type | Update | Change |
|---|---|---|---|
| [github.com/valyala/fasthttp](https://togithub.com/valyala/fasthttp) |
require | minor | `v1.48.0` -> `v1.49.0` |

---

### Release Notes

<details>
<summary>valyala/fasthttp (github.com/valyala/fasthttp)</summary>

###
[`v1.49.0`](https://togithub.com/valyala/fasthttp/releases/tag/v1.49.0)

[Compare
Source](https://togithub.com/valyala/fasthttp/compare/v1.48.0...v1.49.0)

- [`0e99e64`](https://togithub.com/valyala/fasthttp/commit/0e99e64)
Update golangci-lint and gosec
([#&#8203;1609](https://togithub.com/valyala/fasthttp/issues/1609))
(Erik Dubbelboer)
- [`6aea1e0`](https://togithub.com/valyala/fasthttp/commit/6aea1e0) fix
round2\_32, split round2 tests because they depend on sizeof int at
compile time
([#&#8203;1607](https://togithub.com/valyala/fasthttp/issues/1607))
(Duncan Overbruck)
- [`4b0e6c7`](https://togithub.com/valyala/fasthttp/commit/4b0e6c7)
Update ErrNoMultipartForm (Erik Dubbelboer)
- [`727021a`](https://togithub.com/valyala/fasthttp/commit/727021a)
Update security policy (Erik Dubbelboer)
- [`54fdc7a`](https://togithub.com/valyala/fasthttp/commit/54fdc7a)
Abstracts the RoundTripper interface and provides a default implement
([#&#8203;1602](https://togithub.com/valyala/fasthttp/issues/1602))
(Tim)
- [`e181af1`](https://togithub.com/valyala/fasthttp/commit/e181af1)
fasthttpproxy support ipv6
([#&#8203;1597](https://togithub.com/valyala/fasthttp/issues/1597))
(Pluto)
- [`6eb2249`](https://togithub.com/valyala/fasthttp/commit/6eb2249)
fix:fasthttp server with tlsConfig
([#&#8203;1595](https://togithub.com/valyala/fasthttp/issues/1595))
(Zhang Xiaopei)
- [`1c85d43`](https://togithub.com/valyala/fasthttp/commit/1c85d43) Fix
round2 (Erik Dubbelboer)
- [`064124e`](https://togithub.com/valyala/fasthttp/commit/064124e)
Avoid nolint:errcheck in header tests
([#&#8203;1589](https://togithub.com/valyala/fasthttp/issues/1589))
(Oleksandr Redko)
- [`0d0bbfe`](https://togithub.com/valyala/fasthttp/commit/0d0bbfe) Auto
add 'Vary' header after compression
([#&#8203;1585](https://togithub.com/valyala/fasthttp/issues/1585))
(AutumnSun)
- [`d229959`](https://togithub.com/valyala/fasthttp/commit/d229959)
Remove unnecessary indent blocks
([#&#8203;1586](https://togithub.com/valyala/fasthttp/issues/1586))
(Oleksandr Redko)
- [`6b68042`](https://togithub.com/valyala/fasthttp/commit/6b68042) Use
timeout in TCPDialer to resolveTCPAddrs
([#&#8203;1582](https://togithub.com/valyala/fasthttp/issues/1582))
(un000)

</details>

---

### Configuration

📅 **Schedule**: Branch creation - At any time (no schedule defined),
Automerge - At any time (no schedule defined).

🚦 **Automerge**: Disabled by config. Please merge this manually once you
are satisfied.

♻ **Rebasing**: Whenever PR becomes conflicted, or you tick the
rebase/retry checkbox.

🔕 **Ignore**: Close this PR and you won't be reminded about this update
again.

---

- [ ] <!-- rebase-check -->If you want to rebase/retry this PR, check
this box

---

This PR has been generated by [Mend
Renovate](https://www.mend.io/free-developer-tools/renovate/). View
repository job log
[here](https://developer.mend.io/github/go-skynet/LocalAI).

<!--renovate-debug:eyJjcmVhdGVkSW5WZXIiOiIzNi42OC4wIiwidXBkYXRlZEluVmVyIjoiMzYuNjguMCIsInRhcmdldEJyYW5jaCI6Im1hc3RlciJ9-->

Co-authored-by: renovate[bot] <29139614+renovate[bot]@users.noreply.github.com>
2023-08-29 22:16:10 +02:00
renovate[bot]
9a869bbaf6 fix(deps): update github.com/tmc/langchaingo digest to c85d396 (#962)
[![Mend
Renovate](https://app.renovatebot.com/images/banner.svg)](https://renovatebot.com)

This PR contains the following updates:

| Package | Type | Update | Change |
|---|---|---|---|
| [github.com/tmc/langchaingo](https://togithub.com/tmc/langchaingo) |
require | digest | `1e2a401` -> `c85d396` |

---

### Configuration

📅 **Schedule**: Branch creation - At any time (no schedule defined),
Automerge - At any time (no schedule defined).

🚦 **Automerge**: Disabled by config. Please merge this manually once you
are satisfied.

♻ **Rebasing**: Whenever PR becomes conflicted, or you tick the
rebase/retry checkbox.

🔕 **Ignore**: Close this PR and you won't be reminded about this update
again.

---

- [ ] <!-- rebase-check -->If you want to rebase/retry this PR, check
this box

---

This PR has been generated by [Mend
Renovate](https://www.mend.io/free-developer-tools/renovate/). View
repository job log
[here](https://developer.mend.io/github/go-skynet/LocalAI).

<!--renovate-debug:eyJjcmVhdGVkSW5WZXIiOiIzNi41Ni4wIiwidXBkYXRlZEluVmVyIjoiMzYuNjguMSIsInRhcmdldEJyYW5jaCI6Im1hc3RlciJ9-->

Co-authored-by: renovate[bot] <29139614+renovate[bot]@users.noreply.github.com>
2023-08-29 22:15:43 +02:00
renovate[bot]
fe1b54b713 fix(deps): update module github.com/gofiber/fiber/v2 to v2.49.0 (#966)
[![Mend
Renovate](https://app.renovatebot.com/images/banner.svg)](https://renovatebot.com)

This PR contains the following updates:

| Package | Type | Update | Change |
|---|---|---|---|
| [github.com/gofiber/fiber/v2](https://togithub.com/gofiber/fiber) |
require | minor | `v2.48.0` -> `v2.49.0` |

---

### Release Notes

<details>
<summary>gofiber/fiber (github.com/gofiber/fiber/v2)</summary>

### [`v2.49.0`](https://togithub.com/gofiber/fiber/releases/tag/v2.49.0)

[Compare
Source](https://togithub.com/gofiber/fiber/compare/v2.48.0...v2.49.0)

####  Breaking Changes

- Add config to enable splitting by comma in parsers
([#&#8203;2560](https://togithub.com/gofiber/fiber/issues/2560))
    https://docs.gofiber.io/api/fiber#config

> EnableSplittingOnParsers splits the query/body/header parameters by
comma when it's true (default: false).
>
> For example, you can use it to parse multiple values from a query
parameter like this:
> /api?foo=bar,baz == foo\[]=bar\&foo\[]=baz

#### 🚀 New

- Add custom data property to favicon middleware config
([#&#8203;2579](https://togithub.com/gofiber/fiber/issues/2579))
    https://docs.gofiber.io/api/middleware/favicon#config

> This allows the user to use //go:embed flags to load favicon data
during build-time, and supply it to the middleware instead of reading
the file every time the application starts.

#### 🧹 Updates

- Middleware/logger: Latency match gin-gonic/gin formatter
([#&#8203;2569](https://togithub.com/gofiber/fiber/issues/2569))
- Middleware/filesystem: Refactor: use `errors.Is` instead of
`os.IsNotExist`
([#&#8203;2558](https://togithub.com/gofiber/fiber/issues/2558))
- Use Global vars instead of local vars for isLocalHost
([#&#8203;2595](https://togithub.com/gofiber/fiber/issues/2595))
- Remove redundant nil check
([#&#8203;2584](https://togithub.com/gofiber/fiber/issues/2584))
- Bump github.com/mattn/go-runewidth from 0.0.14 to 0.0.15
([#&#8203;2551](https://togithub.com/gofiber/fiber/issues/2551))
- Bump github.com/google/uuid from 1.3.0 to 1.3.1
([#&#8203;2592](https://togithub.com/gofiber/fiber/issues/2592))
- Bump golang.org/x/sys from 0.10.0 to 0.11.0
([#&#8203;2563](https://togithub.com/gofiber/fiber/issues/2563))
- Add go 1.21 to ci and readmes
([#&#8203;2588](https://togithub.com/gofiber/fiber/issues/2588))

#### 🐛 Fixes

- Middleware/logger: Default latency output format
([#&#8203;2580](https://togithub.com/gofiber/fiber/issues/2580))
- Decompress request body when multi Content-Encoding sent on request
headers ([#&#8203;2555](https://togithub.com/gofiber/fiber/issues/2555))

#### 📚 Documentation

- Fix wrong JSON docs
([#&#8203;2554](https://togithub.com/gofiber/fiber/issues/2554))
- Update io/ioutil package to io package
([#&#8203;2589](https://togithub.com/gofiber/fiber/issues/2589))
- Replace EG flag with the proper and smaller SVG
([#&#8203;2585](https://togithub.com/gofiber/fiber/issues/2585))
- Added Egyptian Arabic readme file
([#&#8203;2565](https://togithub.com/gofiber/fiber/issues/2565))
- Translate README to Portuguese
([#&#8203;2567](https://togithub.com/gofiber/fiber/issues/2567))
- Improve \*fiber.Client section
([#&#8203;2553](https://togithub.com/gofiber/fiber/issues/2553))
- Improved the config section of the middleware readme´s
([#&#8203;2552](https://togithub.com/gofiber/fiber/issues/2552))
- Added documentation about ctx Fresh
([#&#8203;2549](https://togithub.com/gofiber/fiber/issues/2549))
- Update intro.md
([#&#8203;2550](https://togithub.com/gofiber/fiber/issues/2550))
- Fixed link to slim template engine
([#&#8203;2547](https://togithub.com/gofiber/fiber/issues/2547))

**Full Changelog**:
https://github.com/gofiber/fiber/compare/v2.48.0...v2.49.0

Thank you [@&#8203;Jictyvoo](https://togithub.com/Jictyvoo),
[@&#8203;Juneezee](https://togithub.com/Juneezee),
[@&#8203;Kirari04](https://togithub.com/Kirari04),
[@&#8203;LimJiAn](https://togithub.com/LimJiAn),
[@&#8203;PassTheMayo](https://togithub.com/PassTheMayo),
[@&#8203;andersonmiranda-com](https://togithub.com/andersonmiranda-com),
[@&#8203;bigpreshy](https://togithub.com/bigpreshy),
[@&#8203;efectn](https://togithub.com/efectn),
[@&#8203;renanbastos93](https://togithub.com/renanbastos93),
[@&#8203;scandar](https://togithub.com/scandar),
[@&#8203;sixcolors](https://togithub.com/sixcolors) and
[@&#8203;stefanb](https://togithub.com/stefanb) for making this update
possible.

</details>

---

### Configuration

📅 **Schedule**: Branch creation - At any time (no schedule defined),
Automerge - At any time (no schedule defined).

🚦 **Automerge**: Disabled by config. Please merge this manually once you
are satisfied.

♻ **Rebasing**: Whenever PR becomes conflicted, or you tick the
rebase/retry checkbox.

🔕 **Ignore**: Close this PR and you won't be reminded about this update
again.

---

- [ ] <!-- rebase-check -->If you want to rebase/retry this PR, check
this box

---

This PR has been generated by [Mend
Renovate](https://www.mend.io/free-developer-tools/renovate/). View
repository job log
[here](https://developer.mend.io/github/go-skynet/LocalAI).

<!--renovate-debug:eyJjcmVhdGVkSW5WZXIiOiIzNi42NC44IiwidXBkYXRlZEluVmVyIjoiMzYuNjQuOCIsInRhcmdldEJyYW5jaCI6Im1hc3RlciJ9-->

Co-authored-by: renovate[bot] <29139614+renovate[bot]@users.noreply.github.com>
2023-08-28 08:24:13 +02:00
ci-robbot [bot]
cc84dfd50f ⬆️ Update go-skynet/go-llama.cpp (#968)
Bump of go-skynet/go-llama.cpp version

Signed-off-by: GitHub <noreply@github.com>
Co-authored-by: mudler <mudler@users.noreply.github.com>
2023-08-28 08:23:51 +02:00
Ettore Di Giacinto
158c7867e7 fix(diffusers): correctly check alpha (#967)
**Description**

Loras that have no alpha would crash otherwise

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2023-08-27 15:35:59 +02:00
renovate[bot]
997c39ccd5 fix(deps): update github.com/go-skynet/go-llama.cpp digest to 9072315 (#963)
[![Mend
Renovate](https://app.renovatebot.com/images/banner.svg)](https://renovatebot.com)

This PR contains the following updates:

| Package | Type | Update | Change |
|---|---|---|---|
|
[github.com/go-skynet/go-llama.cpp](https://togithub.com/go-skynet/go-llama.cpp)
| require | digest | `bf63302` -> `9072315` |

---

### Configuration

📅 **Schedule**: Branch creation - At any time (no schedule defined),
Automerge - At any time (no schedule defined).

🚦 **Automerge**: Disabled by config. Please merge this manually once you
are satisfied.

♻ **Rebasing**: Whenever PR becomes conflicted, or you tick the
rebase/retry checkbox.

🔕 **Ignore**: Close this PR and you won't be reminded about this update
again.

---

- [ ] <!-- rebase-check -->If you want to rebase/retry this PR, check
this box

---

This PR has been generated by [Mend
Renovate](https://www.mend.io/free-developer-tools/renovate/). View
repository job log
[here](https://developer.mend.io/github/go-skynet/LocalAI).

<!--renovate-debug:eyJjcmVhdGVkSW5WZXIiOiIzNi41Ni4wIiwidXBkYXRlZEluVmVyIjoiMzYuNTYuMCIsInRhcmdldEJyYW5jaCI6Im1hc3RlciJ9-->

Co-authored-by: renovate[bot] <29139614+renovate[bot]@users.noreply.github.com>
2023-08-27 10:11:45 +02:00
Ettore Di Giacinto
3bab307904 fix(llama): resolve lora adapters correctly from the model file (#964)
**Description**

we were otherwise expecting absolute paths. this make it relative to the
model file (as someone would expect)

**Notes for Reviewers**


**[Signed
commits](../CONTRIBUTING.md#signing-off-on-commits-developer-certificate-of-origin)**
- [ ] Yes, I signed my commits.
 

<!--
Thank you for contributing to LocalAI! 

Contributing Conventions:

1. Include descriptive PR titles with [<component-name>] prepended.
2. Build and test your changes before submitting a PR. 
3. Sign your commits

By following the community's contribution conventions upfront, the
review process will
be accelerated and your PR merged more quickly.
-->
2023-08-27 10:11:32 +02:00
Ettore Di Giacinto
02704e38d3 feat(diffusers): Add lora (#965)
**Description**

This PR fixes #914 

Now diffusers respects the `lora_adapter` configuration parameter.

---------

Signed-off-by: Ettore Di Giacinto <mudler@users.noreply.github.com>
2023-08-27 10:11:16 +02:00
renovate[bot]
9e5fb29965 fix(deps): update module github.com/otiai10/openaigo to v1.6.0 (#960)
[![Mend
Renovate](https://app.renovatebot.com/images/banner.svg)](https://renovatebot.com)

This PR contains the following updates:

| Package | Type | Update | Change |
|---|---|---|---|
| [github.com/otiai10/openaigo](https://togithub.com/otiai10/openaigo) |
require | minor | `v1.5.2` -> `v1.6.0` |

---

### Release Notes

<details>
<summary>otiai10/openaigo (github.com/otiai10/openaigo)</summary>

###
[`v1.6.0`](https://togithub.com/otiai10/openaigo/compare/v1.5.2...v1.6.0)

[Compare
Source](https://togithub.com/otiai10/openaigo/compare/v1.5.2...v1.6.0)

</details>

---

### Configuration

📅 **Schedule**: Branch creation - At any time (no schedule defined),
Automerge - At any time (no schedule defined).

🚦 **Automerge**: Disabled by config. Please merge this manually once you
are satisfied.

♻ **Rebasing**: Whenever PR becomes conflicted, or you tick the
rebase/retry checkbox.

🔕 **Ignore**: Close this PR and you won't be reminded about this update
again.

---

- [ ] <!-- rebase-check -->If you want to rebase/retry this PR, check
this box

---

This PR has been generated by [Mend
Renovate](https://www.mend.io/free-developer-tools/renovate/). View
repository job log
[here](https://developer.mend.io/github/go-skynet/LocalAI).

<!--renovate-debug:eyJjcmVhdGVkSW5WZXIiOiIzNi41Ni4wIiwidXBkYXRlZEluVmVyIjoiMzYuNTYuMCIsInRhcmdldEJyYW5jaCI6Im1hc3RlciJ9-->

Co-authored-by: renovate[bot] <29139614+renovate[bot]@users.noreply.github.com>
2023-08-26 14:18:06 +02:00
renovate[bot]
7dba131d5f fix(deps): update github.com/tmc/langchaingo digest to 1e2a401 (#948)
[![Mend
Renovate](https://app.renovatebot.com/images/banner.svg)](https://renovatebot.com)

This PR contains the following updates:

| Package | Type | Update | Change |
|---|---|---|---|
| [github.com/tmc/langchaingo](https://togithub.com/tmc/langchaingo) |
require | digest | `fef0821` -> `1e2a401` |

---

### Configuration

📅 **Schedule**: Branch creation - At any time (no schedule defined),
Automerge - At any time (no schedule defined).

🚦 **Automerge**: Disabled by config. Please merge this manually once you
are satisfied.

♻ **Rebasing**: Whenever PR becomes conflicted, or you tick the
rebase/retry checkbox.

🔕 **Ignore**: Close this PR and you won't be reminded about this update
again.

---

- [ ] <!-- rebase-check -->If you want to rebase/retry this PR, check
this box

---

This PR has been generated by [Mend
Renovate](https://www.mend.io/free-developer-tools/renovate/). View
repository job log
[here](https://developer.mend.io/github/go-skynet/LocalAI).

<!--renovate-debug:eyJjcmVhdGVkSW5WZXIiOiIzNi41Ni4wIiwidXBkYXRlZEluVmVyIjoiMzYuNTYuMCIsInRhcmdldEJyYW5jaCI6Im1hc3RlciJ9-->

Co-authored-by: renovate[bot] <29139614+renovate[bot]@users.noreply.github.com>
2023-08-26 14:17:48 +02:00
renovate[bot]
ce0b771217 fix(deps): update github.com/go-skynet/go-llama.cpp digest to bf63302 (#930)
[![Mend
Renovate](https://app.renovatebot.com/images/banner.svg)](https://renovatebot.com)

This PR contains the following updates:

| Package | Type | Update | Change |
|---|---|---|---|
|
[github.com/go-skynet/go-llama.cpp](https://togithub.com/go-skynet/go-llama.cpp)
| require | digest | `f03869d` -> `bf63302` |

---

### Configuration

📅 **Schedule**: Branch creation - At any time (no schedule defined),
Automerge - At any time (no schedule defined).

🚦 **Automerge**: Disabled by config. Please merge this manually once you
are satisfied.

♻ **Rebasing**: Whenever PR becomes conflicted, or you tick the
rebase/retry checkbox.

🔕 **Ignore**: Close this PR and you won't be reminded about this update
again.

---

- [ ] <!-- rebase-check -->If you want to rebase/retry this PR, check
this box

---

This PR has been generated by [Mend
Renovate](https://www.mend.io/free-developer-tools/renovate/). View
repository job log
[here](https://developer.mend.io/github/go-skynet/LocalAI).

<!--renovate-debug:eyJjcmVhdGVkSW5WZXIiOiIzNi40My4yIiwidXBkYXRlZEluVmVyIjoiMzYuNTYuMCIsInRhcmdldEJyYW5jaCI6Im1hc3RlciJ9-->

Co-authored-by: renovate[bot] <29139614+renovate[bot]@users.noreply.github.com>
2023-08-26 14:17:22 +02:00
Ettore Di Giacinto
44bc7aa3d0 feat: Allow to load lora adapters for llama.cpp (#955)
**Description**

This PR fixes #

**Notes for Reviewers**


**[Signed
commits](../CONTRIBUTING.md#signing-off-on-commits-developer-certificate-of-origin)**
- [ ] Yes, I signed my commits.
 

<!--
Thank you for contributing to LocalAI! 

Contributing Conventions:

1. Include descriptive PR titles with [<component-name>] prepended.
2. Build and test your changes before submitting a PR. 
3. Sign your commits

By following the community's contribution conventions upfront, the
review process will
be accelerated and your PR merged more quickly.
-->

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2023-08-25 21:58:46 +02:00
ci-robbot [bot]
7f0c88ed3e ⬆️ Update go-skynet/go-llama.cpp (#954)
Bump of go-skynet/go-llama.cpp version

Signed-off-by: GitHub <noreply@github.com>
Co-authored-by: mudler <mudler@users.noreply.github.com>
2023-08-25 18:45:40 +02:00
ci-robbot [bot]
d15508f52c ⬆️ Update nomic-ai/gpt4all (#953)
Bump of nomic-ai/gpt4all version

Signed-off-by: GitHub <noreply@github.com>
Co-authored-by: mudler <mudler@users.noreply.github.com>
2023-08-25 01:19:48 +02:00
renovate[bot]
b111423b9c fix(deps): update github.com/nomic-ai/gpt4all/gpt4all-bindings/golang digest to 27a8b02 (#947)
[![Mend
Renovate](https://app.renovatebot.com/images/banner.svg)](https://renovatebot.com)

This PR contains the following updates:

| Package | Type | Update | Change |
|---|---|---|---|
|
[github.com/nomic-ai/gpt4all/gpt4all-bindings/golang](https://togithub.com/nomic-ai/gpt4all)
| require | digest | `36f7fb5` -> `27a8b02` |

---

### Configuration

📅 **Schedule**: Branch creation - At any time (no schedule defined),
Automerge - At any time (no schedule defined).

🚦 **Automerge**: Disabled by config. Please merge this manually once you
are satisfied.

♻ **Rebasing**: Whenever PR becomes conflicted, or you tick the
rebase/retry checkbox.

🔕 **Ignore**: Close this PR and you won't be reminded about this update
again.

---

- [ ] <!-- rebase-check -->If you want to rebase/retry this PR, check
this box

---

This PR has been generated by [Mend
Renovate](https://www.mend.io/free-developer-tools/renovate/). View
repository job log
[here](https://developer.mend.io/github/go-skynet/LocalAI).

<!--renovate-debug:eyJjcmVhdGVkSW5WZXIiOiIzNi41Ni4wIiwidXBkYXRlZEluVmVyIjoiMzYuNTYuMCIsInRhcmdldEJyYW5jaCI6Im1hc3RlciJ9-->

Co-authored-by: renovate[bot] <29139614+renovate[bot]@users.noreply.github.com>
2023-08-24 23:52:49 +02:00
renovate[bot]
215a51c4c1 fix(deps): update module github.com/onsi/ginkgo/v2 to v2.12.0 (#949)
[![Mend
Renovate](https://app.renovatebot.com/images/banner.svg)](https://renovatebot.com)

This PR contains the following updates:

| Package | Type | Update | Change |
|---|---|---|---|
| [github.com/onsi/ginkgo/v2](https://togithub.com/onsi/ginkgo) |
require | minor | `v2.11.0` -> `v2.12.0` |

---

### Release Notes

<details>
<summary>onsi/ginkgo (github.com/onsi/ginkgo/v2)</summary>

### [`v2.12.0`](https://togithub.com/onsi/ginkgo/releases/tag/v2.12.0)

[Compare
Source](https://togithub.com/onsi/ginkgo/compare/v2.11.0...v2.12.0)

#### 2.12.0

##### Features

- feat: allow MustPassRepeatedly decorator to be set at suite level
([#&#8203;1266](https://togithub.com/onsi/ginkgo/issues/1266))
\[[`05de518`](https://togithub.com/onsi/ginkgo/commit/05de518)]

##### Fixes

- fix-errors-in-readme
([#&#8203;1244](https://togithub.com/onsi/ginkgo/issues/1244))
\[[`27c2f5d`](https://togithub.com/onsi/ginkgo/commit/27c2f5d)]

##### Maintenance

Various chores/dependency bumps.

</details>

---

### Configuration

📅 **Schedule**: Branch creation - At any time (no schedule defined),
Automerge - At any time (no schedule defined).

🚦 **Automerge**: Disabled by config. Please merge this manually once you
are satisfied.

♻ **Rebasing**: Whenever PR becomes conflicted, or you tick the
rebase/retry checkbox.

🔕 **Ignore**: Close this PR and you won't be reminded about this update
again.

---

- [ ] <!-- rebase-check -->If you want to rebase/retry this PR, check
this box

---

This PR has been generated by [Mend
Renovate](https://www.mend.io/free-developer-tools/renovate/). View
repository job log
[here](https://developer.mend.io/github/go-skynet/LocalAI).

<!--renovate-debug:eyJjcmVhdGVkSW5WZXIiOiIzNi41Ni4wIiwidXBkYXRlZEluVmVyIjoiMzYuNTYuMCIsInRhcmdldEJyYW5jaCI6Im1hc3RlciJ9-->

Co-authored-by: renovate[bot] <29139614+renovate[bot]@users.noreply.github.com>
2023-08-24 18:58:06 +02:00
Ettore Di Giacinto
1120847f72 feat: bump llama.cpp, add gguf support (#943)
**Description**

This PR syncs up the `llama` backend to use `gguf`
(https://github.com/go-skynet/go-llama.cpp/pull/180). It also adds
`llama-stable` to the targets so we can still load ggml. It adapts the
current tests to use the `llama-backend` for ggml and uses a `gguf`
model to run tests on the new backend.

In order to consume the new version of go-llama.cpp, it also bump go to
1.21 (images, pipelines, etc)

---------

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2023-08-24 01:18:58 +02:00
Dave
704323b805 initial draft of an importable Insomnia profile for developers (#942)
This is a starting point for developers to easily import a collection of
requests to hit LocalAI. Insomnia was chosen as it's open source, has a
graphical user interface for users desiring that, and has the ability to
easily export requests as cURL commands for our documentation site.
2023-08-23 18:39:27 +02:00
Dave
10b0e13882 feat: backend monitor shutdown endpoint, process based (#938)
This PR adds a new endpoint to the backend monitor section
`/backend/shutdown` which terminates the grpc process for the related
model.
2023-08-23 18:38:37 +02:00
Dave
901f0709c5 Feat: rwkv improvements: (#937) 2023-08-22 18:48:06 +02:00
Gruber
0d6165e481 Example: Continue (dev) (#940) 2023-08-22 18:46:45 +02:00
renovate[bot]
6583eed6b2 fix(deps): update module github.com/google/uuid to v1.3.1 (#936)
Co-authored-by: renovate[bot] <29139614+renovate[bot]@users.noreply.github.com>
2023-08-22 10:51:04 +02:00
Dave
a9ca70ad4a infra: add setup-go@4, test against 1.20.x (go.mod) and stable (1.21) (#935) 2023-08-21 22:16:47 +02:00
Ettore Di Giacinto
ab5b75eb01 feat: add llama-stable backend (#932)
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2023-08-20 16:35:42 +02:00
Ettore Di Giacinto
cc060a283d fix: drop racy code, refactor and group API schema (#931)
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2023-08-20 14:04:45 +02:00
Ettore Di Giacinto
28db83e17b fix: disable usage by default (still experimental) (#929)
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2023-08-19 16:15:22 +02:00
ci-robbot [bot]
dbb1f86455 ⬆️ Update nomic-ai/gpt4all (#911)
Signed-off-by: GitHub <noreply@github.com>
Co-authored-by: mudler <mudler@users.noreply.github.com>
2023-08-19 10:17:41 +02:00
renovate[bot]
02f7c555af fix(deps): update github.com/tmc/langchaingo digest to fef0821 (#922)
Co-authored-by: renovate[bot] <29139614+renovate[bot]@users.noreply.github.com>
2023-08-19 01:50:04 +02:00
renovate[bot]
d982b38f76 fix(deps): update github.com/nomic-ai/gpt4all/gpt4all-bindings/golang digest to 36f7fb5 (#908)
Co-authored-by: renovate[bot] <29139614+renovate[bot]@users.noreply.github.com>
2023-08-19 01:49:51 +02:00
renovate[bot]
bc2e4b952e fix(deps): update module github.com/shirou/gopsutil/v3 to v3.23.7 (#924)
Co-authored-by: renovate[bot] <29139614+renovate[bot]@users.noreply.github.com>
2023-08-19 01:49:43 +02:00
Ettore Di Giacinto
afdc0ebfd7 feat: add --single-active-backend to allow only one backend active at the time (#925)
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2023-08-19 01:49:33 +02:00
Ettore Di Giacinto
1079b18ff7 feat(diffusers): be consistent with pipelines, support also depthimg2img (#926)
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2023-08-18 22:06:24 +02:00
Dave
8cb1061c11 Usage Features (#863) 2023-08-18 21:23:14 +02:00
Ettore Di Giacinto
2bacd0180d feat(diffusers): add img2img and clip_skip, support more kernels schedulers (#906)
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2023-08-17 23:38:59 +02:00
renovate[bot]
ddf9bc2335 fix(deps): update github.com/nomic-ai/gpt4all/gpt4all-bindings/golang digest to a630935 (#898)
Co-authored-by: renovate[bot] <29139614+renovate[bot]@users.noreply.github.com>
2023-08-16 22:25:28 +02:00
renovate[bot]
a1afd940e3 fix(deps): update github.com/go-skynet/go-llama.cpp digest to f03869d (#901)
Co-authored-by: renovate[bot] <29139614+renovate[bot]@users.noreply.github.com>
2023-08-16 22:25:14 +02:00
renovate[bot]
8bb76201c0 fix(deps): update github.com/tmc/langchaingo digest to eb0cbd3 (#902)
Co-authored-by: renovate[bot] <29139614+renovate[bot]@users.noreply.github.com>
2023-08-16 22:25:02 +02:00
Ettore Di Giacinto
ede71d398c feat(diffusers): overcome prompt limit (#904)
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2023-08-16 22:24:52 +02:00
ci-robbot [bot]
0c73a637f1 ⬆️ Update nomic-ai/gpt4all (#899)
Signed-off-by: GitHub <noreply@github.com>
Co-authored-by: mudler <mudler@users.noreply.github.com>
2023-08-16 01:11:54 +02:00
Ettore Di Giacinto
37700f2d98 feat(diffusers): add DPMSolverMultistepScheduler++, DPMSolverMultistepSchedulerSDE++, guidance_scale (#903)
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2023-08-16 01:11:42 +02:00
Ettore Di Giacinto
0ec695f9e4 feat: make initializer accept gRPC delay times (#900)
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2023-08-16 01:11:32 +02:00
renovate[bot]
7ffd21dbc8 fix(deps): update github.com/go-skynet/go-llama.cpp digest to 18f25c2 (#894)
Co-authored-by: renovate[bot] <29139614+renovate[bot]@users.noreply.github.com>
2023-08-15 09:25:40 +02:00
renovate[bot]
48b3920656 fix(deps): update github.com/nomic-ai/gpt4all/gpt4all-bindings/golang digest to 4e55940 (#893)
Co-authored-by: renovate[bot] <29139614+renovate[bot]@users.noreply.github.com>
2023-08-15 09:25:27 +02:00
ci-robbot [bot]
63d91af555 ⬆️ Update nomic-ai/gpt4all (#878)
Signed-off-by: GitHub <noreply@github.com>
Co-authored-by: mudler <mudler@users.noreply.github.com>
2023-08-15 09:25:10 +02:00
Ettore Di Giacinto
a96c3bc885 feat(diffusers): various enhancements (#895) 2023-08-14 23:12:00 +02:00
Ettore Di Giacinto
77e1ae3d70 feat(Makefile): allow to restrict backend builds (#890)
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2023-08-13 20:04:08 +02:00
renovate[bot]
9cc8d90865 fix(deps): update module github.com/sashabaranov/go-openai to v1.14.2 (#884)
Co-authored-by: renovate[bot] <29139614+renovate[bot]@users.noreply.github.com>
2023-08-12 22:41:12 +02:00
Ettore Di Giacinto
a6c621ef7f feat: pre-configure LocalAI galleries (#886)
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2023-08-12 11:25:17 +02:00
renovate[bot]
328289099a fix(deps): update github.com/nomic-ai/gpt4all/gpt4all-bindings/golang digest to 4d855af (#875)
Co-authored-by: renovate[bot] <29139614+renovate[bot]@users.noreply.github.com>
2023-08-12 08:58:55 +02:00
renovate[bot]
22ffd5f490 fix(deps): update github.com/tmc/langchaingo digest to fd8b7f0 (#882)
Co-authored-by: renovate[bot] <29139614+renovate[bot]@users.noreply.github.com>
2023-08-12 08:56:15 +02:00
Ettore Di Giacinto
81708bb1e6 fix: workaround exllama import error (#885)
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2023-08-12 08:56:01 +02:00
Ettore Di Giacinto
c81e9d8d1f fix: add exllama to protogen 2023-08-11 01:02:31 +02:00
Ettore Di Giacinto
ff3ab5fcca feat: Add exllama (#881)
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2023-08-11 00:49:40 +02:00
Michael Nesbitt
1d1cae8e4d feat: add API_KEY list support (#877)
Co-authored-by: Harold Sun <sunhua@amazon.com>
2023-08-10 00:06:21 +02:00
Ettore Di Giacinto
8c781a6a44 feat: Add Diffusers (#874)
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2023-08-09 08:38:51 +02:00
Ettore Di Giacinto
93a4bec06b fix: upgrade pip (#872)
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2023-08-08 23:20:03 +02:00
renovate[bot]
c93f57efd6 fix(deps): update github.com/nomic-ai/gpt4all/gpt4all-bindings/golang digest to 0f2bb50 (#869)
Co-authored-by: renovate[bot] <29139614+renovate[bot]@users.noreply.github.com>
2023-08-08 21:57:16 +02:00
ci-robbot [bot]
0e4f93c5cf ⬆️ Update nomic-ai/gpt4all (#870)
Signed-off-by: GitHub <noreply@github.com>
Co-authored-by: mudler <mudler@users.noreply.github.com>
2023-08-08 21:57:01 +02:00
Ettore Di Giacinto
5b3fedebfe feat: add bark and AutoGPTQ (#871) 2023-08-08 20:41:49 +02:00
Ettore Di Giacinto
219751bb21 fix: cut prompt from AutoGPTQ answers
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2023-08-08 01:27:38 +02:00
Ettore Di Giacinto
bb7772a364 fix: byte utf-8 encode results from autogptq
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2023-08-08 01:20:07 +02:00
Ettore Di Giacinto
3c8fc37c56 feat: Add UseFastTokenizer
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2023-08-08 01:10:05 +02:00
Ettore Di Giacinto
39805b09e5 fix: pass by env in managed services
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2023-08-08 00:58:38 +02:00
Ettore Di Giacinto
63b01199fe fix: match lowercase of the input, not of the model 2023-08-08 00:46:22 +02:00
Ettore Di Giacinto
b09bae3443 fix: autogptq requirements
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2023-08-08 00:22:15 +02:00
Ettore Di Giacinto
de6fb98bed feat: register autogptq and bark in the container image 2023-08-07 22:53:28 +02:00
Ettore Di Giacinto
433605e282 feat: add initial Bark backend implementation
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2023-08-07 22:53:28 +02:00
Ettore Di Giacinto
a843e64fc2 feat: add initial AutoGPTQ backend implementation 2023-08-07 22:53:28 +02:00
scott4290
71611d2dec docs: base-Update comments in .env for cublas, openblas, clblas (#867) 2023-08-07 08:22:42 +00:00
Ettore Di Giacinto
abf48e8a5d readme: link to hot topics in the website 2023-08-07 00:31:46 +02:00
Ettore Di Giacinto
ac5ea0cd4d readme: link usage to docs 2023-08-07 00:04:28 +02:00
Ettore Di Giacinto
a46fcacedd readme: simplify, remove dups with website 2023-08-07 00:01:01 +02:00
Ettore Di Giacinto
df947fc933 examples: Update README 2023-08-06 23:07:06 +02:00
Ettore Di Giacinto
91d49cfe9f Update README.md 2023-08-06 11:57:28 +02:00
Ettore Di Giacinto
19d15f83db Update README.md 2023-08-06 00:04:06 +02:00
Ettore Di Giacinto
cde61cc518 Update README.md 2023-08-05 23:14:09 +02:00
Ettore Di Giacinto
acd829a7a0 fix: do not break on newlines on function returns (#864)
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2023-08-04 21:46:36 +02:00
Ettore Di Giacinto
4aa5dac768 feat: update integer, number and string rules - allow primitives as root types (#862)
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2023-08-03 23:32:30 +02:00
renovate[bot]
08b59b5cc5 fix(deps): update github.com/go-skynet/go-llama.cpp digest to 50cee77 (#861)
Co-authored-by: renovate[bot] <29139614+renovate[bot]@users.noreply.github.com>
2023-08-03 19:08:04 +02:00
ci-robbot [bot]
6b900e28cd ⬆️ Update nomic-ai/gpt4all (#859)
Signed-off-by: GitHub <noreply@github.com>
Co-authored-by: mudler <mudler@users.noreply.github.com>
2023-08-03 19:07:53 +02:00
Ettore Di Giacinto
5ca21ee398 feat: add ngqa and RMSNormEps parameters (#860)
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2023-08-03 00:51:08 +02:00
renovate[bot]
953e30814a fix(deps): update github.com/nomic-ai/gpt4all/gpt4all-bindings/golang digest to c449b71 (#858)
Co-authored-by: renovate[bot] <29139614+renovate[bot]@users.noreply.github.com>
2023-08-03 00:20:45 +02:00
renovate[bot]
a65344cf25 fix(deps): update github.com/tmc/langchaingo digest to 271e9bd (#857)
Co-authored-by: renovate[bot] <29139614+renovate[bot]@users.noreply.github.com>
2023-08-03 00:20:22 +02:00
Dave
7fb8b4191f feat: "simple" chat/edit/completion template system prompt from config (#856) 2023-08-03 00:19:55 +02:00
Tyler Harpool
fc8aec7324 Update to working k8sgpt + localai example in documentation (#852) 2023-08-01 22:31:36 +02:00
Ettore Di Giacinto
c309aac8f5 fix(gallery): use inline YAML (#851)
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2023-08-01 19:09:32 +02:00
Ettore Di Giacinto
1e37ec727d Revert "⬆️ Update go-skynet/go-llama.cpp" (#850) 2023-08-01 19:09:18 +02:00
ci-robbot [bot]
ae36bae59d ⬆️ Update nomic-ai/gpt4all (#847)
Signed-off-by: GitHub <noreply@github.com>
Co-authored-by: mudler <mudler@users.noreply.github.com>
2023-08-01 00:48:10 +02:00
renovate[bot]
e663beebf0 fix(deps): update github.com/nomic-ai/gpt4all/gpt4all-bindings/golang digest to cbdcde8 (#833)
Co-authored-by: renovate[bot] <29139614+renovate[bot]@users.noreply.github.com>
2023-08-01 00:47:06 +02:00
Ettore Di Giacinto
9d0292e9e1 Update README.md 2023-08-01 00:42:56 +02:00
Ettore Di Giacinto
fe27bb7982 feat: Update logo (#849) 2023-08-01 00:31:40 +02:00
Ettore Di Giacinto
d603a9cbb5 fix(gallery): preload from file should by in YAML format (#846)
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2023-07-31 21:13:16 +02:00
Ettore Di Giacinto
c1fc22e746 fix(examples): use pinned versions in the k8sgpt example (#845)
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2023-07-31 19:15:57 +02:00
renovate[bot]
85d3710924 fix(deps): update github.com/tmc/langchaingo digest to 8f10160 (#843)
Co-authored-by: renovate[bot] <29139614+renovate[bot]@users.noreply.github.com>
2023-07-31 19:15:13 +02:00
ci-robbot [bot]
a0324245f1 ⬆️ Update nomic-ai/gpt4all (#841)
Signed-off-by: GitHub <noreply@github.com>
Co-authored-by: mudler <mudler@users.noreply.github.com>
2023-07-31 19:14:56 +02:00
Dave
ce8e9dc690 feature: model list :: filter query string parameter (#830) 2023-07-31 19:14:32 +02:00
Andrew Zigler
32ca7efbeb 📝 Add OpenOps to README's project list (#832)
Co-authored-by: Ettore Di Giacinto <mudler@users.noreply.github.com>
2023-07-30 15:30:14 +02:00
longy2k
27520eb169 Added "BMO Chatbot" to "Projects already using LocalAI to run local models" section." (#828) 2023-07-30 15:29:23 +02:00
Andrew
9843adb4f1 Create .gitattributes to force git clone to keep the LF line endings on .sh files (#838) 2023-07-30 15:27:43 +02:00
Dave
8e8d474ae8 refactor: Remove remaining uses of depreciated package io/ioutil (#837) 2023-07-30 11:23:43 +00:00
renovate[bot]
6151ea1c4d fix(deps): update github.com/tmc/langchaingo digest to 7df4fe5 (#826)
Co-authored-by: renovate[bot] <29139614+renovate[bot]@users.noreply.github.com>
2023-07-30 09:49:12 +02:00
renovate[bot]
d969025f87 fix(deps): update github.com/go-skynet/go-llama.cpp digest to 8c51308 (#822)
Co-authored-by: renovate[bot] <29139614+renovate[bot]@users.noreply.github.com>
2023-07-30 09:48:52 +02:00
ci-robbot [bot]
18e1cb9c92 ⬆️ Update nomic-ai/gpt4all (#825)
Signed-off-by: GitHub <noreply@github.com>
Co-authored-by: mudler <mudler@users.noreply.github.com>
2023-07-30 09:48:30 +02:00
ci-robbot [bot]
e7ceb9e8f5 ⬆️ Update go-skynet/go-llama.cpp (#824)
Signed-off-by: GitHub <noreply@github.com>
Co-authored-by: mudler <mudler@users.noreply.github.com>
2023-07-30 09:48:10 +02:00
renovate[bot]
3a4675c8c3 fix(deps): update module github.com/rs/zerolog to v1.30.0 (#836)
Co-authored-by: renovate[bot] <29139614+renovate[bot]@users.noreply.github.com>
2023-07-30 09:47:49 +02:00
Dave
5ce0f216cf Fix: Model Gallery Downloads (#835) 2023-07-30 09:47:22 +02:00
Ettore Di Giacinto
688f150463 fix: symlink libphonemize in the container (#831) 2023-07-29 12:47:34 +02:00
Ettore Di Giacinto
00ccb8d4f1 fix: set default rope freq base to 10000 during model load
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2023-07-29 10:40:56 +02:00
Ettore Di Giacinto
e70b91aaef tests: set a small context_size
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2023-07-29 10:29:47 +02:00
Dave
8b90ac2b1a 1000 -> 10,000 for ropeFreqBase?
the error message talks about a default of 10k, so setting this to 10k instead of 1k experimentally.
2023-07-29 02:37:24 -04:00
Ettore Di Giacinto
f085baa77d fix: set default rope if not specified
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2023-07-29 01:07:16 +02:00
Ettore Di Giacinto
fa4de05c14 fix: symlink libphonemize in the container
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2023-07-28 19:40:21 +02:00
Ettore Di Giacinto
dde12b492b fix: select function calls if 'name' is set in the request (#827)
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2023-07-28 01:17:11 +02:00
Ettore Di Giacinto
096d98c3d9 fix: add rope settings during model load, fix CUDA (#821)
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2023-07-27 21:56:05 +02:00
renovate[bot]
147cae9ed8 fix(deps): update github.com/nomic-ai/gpt4all/gpt4all-bindings/golang digest to 39acbc8 (#817)
Co-authored-by: renovate[bot] <29139614+renovate[bot]@users.noreply.github.com>
2023-07-27 18:56:59 +02:00
renovate[bot]
c63709014b fix(deps): update github.com/go-skynet/go-llama.cpp digest to 6ba16de (#820)
Co-authored-by: renovate[bot] <29139614+renovate[bot]@users.noreply.github.com>
2023-07-27 18:56:39 +02:00
Wendy Liga
9b307799ce fix missing openai_api_base on langchain-chroma example (#818) 2023-07-27 18:41:53 +02:00
renovate[bot]
78e36779cf fix(deps): update module google.golang.org/grpc to v1.57.0 (#815)
Co-authored-by: renovate[bot] <29139614+renovate[bot]@users.noreply.github.com>
2023-07-27 18:41:29 +02:00
ci-robbot [bot]
90ae35e2e4 ⬆️ Update nomic-ai/gpt4all (#814)
Signed-off-by: GitHub <noreply@github.com>
Co-authored-by: mudler <mudler@users.noreply.github.com>
2023-07-27 18:41:15 +02:00
Ettore Di Giacinto
b96e30e66c fix: use bytes in gRPC proto instead of strings (#813)
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2023-07-27 18:41:04 +02:00
renovate[bot]
0af0df7423 fix(deps): update module github.com/sashabaranov/go-openai to v1.14.1 (#783)
Co-authored-by: renovate[bot] <29139614+renovate[bot]@users.noreply.github.com>
2023-07-27 18:40:50 +02:00
renovate[bot]
0883d324d9 fix(deps): update github.com/go-skynet/go-llama.cpp digest to 562d2b5 (#766)
Co-authored-by: renovate[bot] <29139614+renovate[bot]@users.noreply.github.com>
2023-07-26 22:06:05 +02:00
renovate[bot]
77597e6a16 fix(deps): update github.com/nomic-ai/gpt4all/gpt4all-bindings/golang digest to 9100b2e (#753)
Co-authored-by: renovate[bot] <29139614+renovate[bot]@users.noreply.github.com>
2023-07-26 22:05:55 +02:00
renovate[bot]
eae6b36d03 fix(deps): update github.com/donomii/go-rwkv.cpp digest to c898cd0 (#748)
Co-authored-by: renovate[bot] <29139614+renovate[bot]@users.noreply.github.com>
2023-07-26 22:05:42 +02:00
renovate[bot]
c4bc7c41b1 fix(deps): update github.com/tmc/langchaingo digest to 7d5f9fd (#768)
Co-authored-by: renovate[bot] <29139614+renovate[bot]@users.noreply.github.com>
2023-07-26 22:05:32 +02:00
ci-robbot [bot]
c79ddd6fc4 ⬆️ Update nomic-ai/gpt4all (#807)
Signed-off-by: GitHub <noreply@github.com>
Co-authored-by: mudler <mudler@users.noreply.github.com>
2023-07-25 23:03:02 +02:00
Dave
ae58fb8821 fix: update gitignore and make clean (#798)
Co-authored-by: Ettore Di Giacinto <mudler@users.noreply.github.com>
2023-07-25 23:02:46 +02:00
Ettore Di Giacinto
569c1d1163 feat: add rope settings and negative prompt, drop grammar backend (#797)
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2023-07-25 19:05:27 +02:00
Aman Gupta Karmani
12fe0932c4 feat: cancel stream generation if client disappears (#792) 2023-07-24 23:10:54 +02:00
finger42
72e3e236de Added CPU information to entrypoint.sh (#794) 2023-07-23 19:27:55 +00:00
Ettore Di Giacinto
ab59b238b3 fix: update README 2023-07-23 18:58:24 +02:00
ci-robbot [bot]
bed9570e48 ⬆️ Update nomic-ai/gpt4all (#785)
Signed-off-by: GitHub <noreply@github.com>
Co-authored-by: mudler <mudler@users.noreply.github.com>
2023-07-23 09:51:42 +02:00
Dave
c6bf67f446 feat(llama2): add template for chat messages (#782)
Co-authored-by: Aman Karmani <aman@tmm1.net>

Lays some of the groundwork for LLAMA2 compatibility as well as other future models with complex prompting schemes.

Started small refactoring in pkg/model/loader.go regarding template loading. Currently still a part of ModelLoader, but should be easy to add template loading for situations other than overall prompt templates and the new chat-specific per-message templates
Adds support for new chat-endpoint-specific, per-message templates as an alternative to the existing Role: XYZ sprintf method.
Includes a temporary prompt template as an example, since I have a few questions before we merge in the model-gallery side changes (see )
Minor debug logging changes.
2023-07-22 11:31:39 -04:00
ci-robbot [bot]
5ee186b8e5 ⬆️ Update go-skynet/go-llama.cpp (#723)
Signed-off-by: GitHub <noreply@github.com>
Co-authored-by: mudler <mudler@users.noreply.github.com>
2023-07-22 00:55:33 +02:00
Ettore Di Giacinto
94817b557c fix: make completions endpoint more close to OpenAI specification (#790)
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2023-07-22 00:53:52 +02:00
Ettore Di Giacinto
26e1496075 Update README.md 2023-07-21 23:10:02 +02:00
Ettore Di Giacinto
92fca8ae74 ci: release space before build
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2023-07-21 22:56:43 +02:00
Stepan
7fa5b8401d [Telegram-bot example] Fix lint for command docker-compose (#787)
Co-authored-by: Stepan Zhashkov <steven.z@spectral-team.com>
2023-07-21 20:56:04 +02:00
Ettore Di Giacinto
0eac0402e1 feat: backends improvements (#778) 2023-07-21 20:55:49 +02:00
Ettore Di Giacinto
c71c729bc2 debug 2023-07-21 10:53:26 +02:00
Ettore Di Giacinto
e459f114cd fix: fix tests, small refactors
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2023-07-20 23:52:04 +02:00
Ettore Di Giacinto
982a7e86a8 feat: add huggingface embeddings backend
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2023-07-20 22:10:42 +02:00
Ettore Di Giacinto
94916749c5 feat: add external grpc and model autoloading 2023-07-20 22:10:12 +02:00
Ettore Di Giacinto
5ce5f87a26 fix: move metal file to grpcs assets (#777)
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2023-07-20 22:00:07 +02:00
Ettore Di Giacinto
1d2ae46ddc tests: clean up logs
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2023-07-20 01:36:34 +02:00
ci-robbot [bot]
71ac331f90 ⬆️ Update nomic-ai/gpt4all (#775)
Signed-off-by: GitHub <noreply@github.com>
Co-authored-by: mudler <mudler@users.noreply.github.com>
2023-07-20 01:22:44 +02:00
Ettore Di Giacinto
47cc95fc9f feat: add all backends to autoload
Now since gRPCs are not crashing the main thread we can just greedly
attempt all the backends we have available.

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2023-07-20 00:40:28 +02:00
Ettore Di Giacinto
3feb632eb4 refactor: rename "llama-master" and "llama" (#776)
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2023-07-20 00:36:16 +02:00
Ettore Di Giacinto
236497e331 feat: resolve JSONSchema refs (planners) (#774) 2023-07-19 22:56:13 +02:00
ci-robbot [bot]
a38dc497b2 ⬆️ Update go-skynet/go-llama.cpp (#770)
Signed-off-by: GitHub <noreply@github.com>
Co-authored-by: mudler <mudler@users.noreply.github.com>
2023-07-19 19:44:33 +02:00
ci-robbot [bot]
28ed52fa94 ⬆️ Update nomic-ai/gpt4all (#769)
Signed-off-by: GitHub <noreply@github.com>
Co-authored-by: mudler <mudler@users.noreply.github.com>
2023-07-19 19:44:21 +02:00
Enzo Einhorn
e995b95c94 [build] pass build type to cmake on libtransformers.a build (#741)
Co-authored-by: Enzo Einhorn <enzo.einhorn@hiventive.com>
2023-07-18 19:04:19 +02:00
Ettore Di Giacinto
8379cce209 example(functions): Add OpenAI functions example (#767) 2023-07-18 00:04:21 +02:00
ci-robbot [bot]
3c6b798522 ⬆️ Update nomic-ai/gpt4all (#759)
Signed-off-by: GitHub <noreply@github.com>
Co-authored-by: mudler <mudler@users.noreply.github.com>
2023-07-17 23:58:40 +02:00
ci-robbot [bot]
c18770a61a ⬆️ Update go-skynet/go-bert.cpp (#758)
Signed-off-by: GitHub <noreply@github.com>
Co-authored-by: mudler <mudler@users.noreply.github.com>
2023-07-17 23:58:25 +02:00
Ettore Di Giacinto
6352448b72 feat: add llama-master backend (#752)
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2023-07-17 23:58:15 +02:00
159 changed files with 8921 additions and 1368 deletions

26
.env
View File

@@ -23,7 +23,16 @@ MODELS_PATH=/models
## Enable debug mode
# DEBUG=true
## Disables COMPEL (Diffusers)
# COMPEL=0
## Enable/Disable single backend (useful if only one GPU is available)
# SINGLE_ACTIVE_BACKEND=true
## Specify a build type. Available: cublas, openblas, clblas.
## cuBLAS: This is a GPU-accelerated version of the complete standard BLAS (Basic Linear Algebra Subprograms) library. It's provided by Nvidia and is part of their CUDA toolkit.
## OpenBLAS: This is an open-source implementation of the BLAS library that aims to provide highly optimized code for various platforms. It includes support for multi-threading and can be compiled to use hardware-specific features for additional performance. OpenBLAS can run on many kinds of hardware, including CPUs from Intel, AMD, and ARM.
## clBLAS: This is an open-source implementation of the BLAS library that uses OpenCL, a framework for writing programs that execute across heterogeneous platforms consisting of CPUs, GPUs, and other processors. clBLAS is designed to take advantage of the parallel computing power of GPUs but can also run on any hardware that supports OpenCL. This includes hardware from different vendors like Nvidia, AMD, and Intel.
# BUILD_TYPE=openblas
## Uncomment and set to true to enable rebuilding from source
@@ -41,3 +50,20 @@ MODELS_PATH=/models
## Specify a default upload limit in MB (whisper)
# UPLOAD_LIMIT
## List of external GRPC backends (note on the container image this variable is already set to use extra backends available in extra/)
# EXTERNAL_GRPC_BACKENDS=my-backend:127.0.0.1:9000,my-backend2:/usr/bin/backend.py
### Advanced settings ###
### Those are not really used by LocalAI, but from components in the stack ###
##
### Preload libraries
# LD_PRELOAD=
### Huggingface cache for models
# HUGGINGFACE_HUB_CACHE=/usr/local/huggingface
### Python backends GRPC max workers
### Default number of workers for GRPC Python backends.
### This actually controls wether a backend can process multiple requests or not.
# PYTHON_GRPC_MAX_WORKERS=1

1
.gitattributes vendored Normal file
View File

@@ -0,0 +1 @@
*.sh text eol=lf

View File

@@ -8,16 +8,24 @@ This PR fixes #
**[Signed commits](../CONTRIBUTING.md#signing-off-on-commits-developer-certificate-of-origin)**
- [ ] Yes, I signed my commits.
<!--
Thank you for contributing to LocalAI!
Contributing Conventions:
Contributing Conventions
-------------------------
1. Include descriptive PR titles with [<component-name>] prepended.
2. Build and test your changes before submitting a PR.
The draft above helps to give a quick overview of your PR.
Remember to remove this comment and to at least:
1. Include descriptive PR titles with [<component-name>] prepended. We use [conventional commits](https://www.conventionalcommits.org/en/v1.0.0/).
2. Build and test your changes before submitting a PR (`make build`).
3. Sign your commits
4. **Tag maintainer:** for a quicker response, tag the relevant maintainer (see below).
5. **X/Twitter handle:** we announce bigger features on X/Twitter. If your PR gets announced, and you'd like a mention, we'll gladly shout you out!
By following the community's contribution conventions upfront, the review process will
be accelerated and your PR merged more quickly.
If no one reviews your PR within a few days, please @-mention @mudler.
-->

View File

@@ -59,12 +59,44 @@ jobs:
runs-on: ubuntu-latest
steps:
- name: Release space from worker
run: |
echo "Listing top largest packages"
pkgs=$(dpkg-query -Wf '${Installed-Size}\t${Package}\t${Status}\n' | awk '$NF == "installed"{print $1 "\t" $2}' | sort -nr)
head -n 30 <<< "${pkgs}"
echo
df -h
echo
sudo apt-get remove -y '^llvm-.*|^libllvm.*' || true
sudo apt-get remove --auto-remove android-sdk-platform-tools || true
sudo apt-get purge --auto-remove android-sdk-platform-tools || true
sudo rm -rf /usr/local/lib/android
sudo apt-get remove -y '^dotnet-.*|^aspnetcore-.*' || true
sudo rm -rf /usr/share/dotnet
sudo apt-get remove -y '^mono-.*' || true
sudo apt-get remove -y '^ghc-.*' || true
sudo apt-get remove -y '.*jdk.*|.*jre.*' || true
sudo apt-get remove -y 'php.*' || true
sudo apt-get remove -y hhvm powershell firefox monodoc-manual msbuild || true
sudo apt-get remove -y '^google-.*' || true
sudo apt-get remove -y azure-cli || true
sudo apt-get remove -y '^mongo.*-.*|^postgresql-.*|^mysql-.*|^mssql-.*' || true
sudo apt-get remove -y '^gfortran-.*' || true
sudo apt-get autoremove -y
sudo apt-get clean
echo
echo "Listing top largest packages"
pkgs=$(dpkg-query -Wf '${Installed-Size}\t${Package}\t${Status}\n' | awk '$NF == "installed"{print $1 "\t" $2}' | sort -nr)
head -n 30 <<< "${pkgs}"
echo
sudo rm -rfv build || true
df -h
- name: Checkout
uses: actions/checkout@v3
- name: Docker meta
id: meta
uses: docker/metadata-action@v4
uses: docker/metadata-action@v5
with:
images: quay.io/go-skynet/local-ai
tags: |
@@ -86,14 +118,14 @@ jobs:
- name: Login to DockerHub
if: github.event_name != 'pull_request'
uses: docker/login-action@v2
uses: docker/login-action@v3
with:
registry: quay.io
username: ${{ secrets.LOCALAI_REGISTRY_USERNAME }}
password: ${{ secrets.LOCALAI_REGISTRY_PASSWORD }}
- name: Build and push
uses: docker/build-push-action@v4
uses: docker/build-push-action@v5
with:
builder: ${{ steps.buildx.outputs.name }}
build-args: |

View File

@@ -22,6 +22,9 @@ jobs:
uses: actions/checkout@v3
with:
submodules: true
- uses: actions/setup-go@v4
with:
go-version: '>=1.21.0'
- name: Dependencies
run: |
sudo apt-get update
@@ -60,6 +63,9 @@ jobs:
uses: actions/checkout@v3
with:
submodules: true
- uses: actions/setup-go@v4
with:
go-version: '>=1.21.0'
- name: Build
id: build
env:

View File

@@ -16,12 +16,21 @@ concurrency:
jobs:
ubuntu-latest:
runs-on: ubuntu-latest
strategy:
matrix:
go-version: ['1.21.x']
steps:
- name: Clone
uses: actions/checkout@v3
with:
submodules: true
- name: Setup Go ${{ matrix.go-version }}
uses: actions/setup-go@v4
with:
go-version: ${{ matrix.go-version }}
# You can test your matrix by printing the current Go version
- name: Display Go version
run: go version
- name: Dependencies
run: |
sudo apt-get update
@@ -29,6 +38,7 @@ jobs:
sudo apt-get install -y ca-certificates cmake curl patch
sudo apt-get install -y libopencv-dev && sudo ln -s /usr/include/opencv4/opencv2 /usr/include/opencv2
sudo pip install -r extra/requirements.txt
sudo mkdir /build && sudo chmod -R 777 /build && cd /build && \
curl -L "https://github.com/gabime/spdlog/archive/refs/tags/v1.11.0.tar.gz" | \
@@ -42,23 +52,30 @@ jobs:
mkdir -p "lib/Linux-$(uname -m)/piper_phonemize" && \
curl -L "https://github.com/rhasspy/piper-phonemize/releases/download/v1.0.0/libpiper_phonemize-amd64.tar.gz" | \
tar -C "lib/Linux-$(uname -m)/piper_phonemize" -xzvf - && ls -liah /build/lib/Linux-$(uname -m)/piper_phonemize/ && \
sudo cp -rfv /build/lib/Linux-$(uname -m)/piper_phonemize/lib/. /lib64/ && \
sudo cp -rfv /build/lib/Linux-$(uname -m)/piper_phonemize/lib/. /usr/lib/ && \
sudo ln -s /usr/lib/libpiper_phonemize.so /usr/lib/libpiper_phonemize.so.1 && \
sudo cp -rfv /build/lib/Linux-$(uname -m)/piper_phonemize/include/. /usr/include/
- name: Test
run: |
ESPEAK_DATA="/build/lib/Linux-$(uname -m)/piper_phonemize/lib/espeak-ng-data" GO_TAGS="tts stablediffusion" make test
macOS-latest:
runs-on: macOS-latest
strategy:
matrix:
go-version: ['1.21.x']
steps:
- name: Clone
uses: actions/checkout@v3
with:
submodules: true
- name: Setup Go ${{ matrix.go-version }}
uses: actions/setup-go@v4
with:
go-version: ${{ matrix.go-version }}
# You can test your matrix by printing the current Go version
- name: Display Go version
run: go version
- name: Test
run: |
CMAKE_ARGS="-DLLAMA_F16C=OFF -DLLAMA_AVX512=OFF -DLLAMA_AVX2=OFF -DLLAMA_FMA=OFF" make test

8
.gitignore vendored
View File

@@ -1,11 +1,13 @@
# go-llama build artifacts
go-llama
go-llama-stable
/gpt4all
go-stable-diffusion
go-piper
/go-bert
go-ggllm
/piper
__pycache__/
*.a
get-sources
@@ -21,6 +23,8 @@ LocalAI
local-ai
# prevent above rules from omitting the helm chart
!charts/*
# prevent above rules from omitting the api/localai folder
!api/localai
# Ignore models
models/*
@@ -35,5 +39,5 @@ release/
# Generated during build
backend-assets/
prepare
/ggml-metal.metal

72
CONTRIBUTING.md Normal file
View File

@@ -0,0 +1,72 @@
# Contributing to localAI
Thank you for your interest in contributing to LocalAI! We appreciate your time and effort in helping to improve our project. Before you get started, please take a moment to review these guidelines.
## Table of Contents
- [Getting Started](#getting-started)
- [Prerequisites](#prerequisites)
- [Setting up the Development Environment](#setting-up-the-development-environment)
- [Contributing](#contributing)
- [Submitting an Issue](#submitting-an-issue)
- [Creating a Pull Request (PR)](#creating-a-pull-request-pr)
- [Coding Guidelines](#coding-guidelines)
- [Testing](#testing)
- [Documentation](#documentation)
- [Community and Communication](#community-and-communication)
## Getting Started
### Prerequisites
- Golang [1.21]
- Git
- macOS/Linux
### Setting up the Development Environment and running localAI in the local environment
1. Clone the repository: `git clone https://github.com/go-skynet/LocalAI.git`
2. Navigate to the project directory: `cd LocalAI`
3. Install the required dependencies: `make prepare`
4. Run LocalAI: `make run`
## Contributing
We welcome contributions from everyone! To get started, follow these steps:
### Submitting an Issue
If you find a bug, have a feature request, or encounter any issues, please check the [issue tracker](https://github.com/go-skynet/LocalAI/issues) to see if a similar issue has already been reported. If not, feel free to [create a new issue](https://github.com/go-skynet/LocalAI/issues/new) and provide as much detail as possible.
### Creating a Pull Request (PR)
1. Fork the repository.
2. Create a new branch with a descriptive name: `git checkout -b [branch name]`
3. Make your changes and commit them.
4. Push the changes to your fork: `git push origin [branch name]`
5. Create a new pull request from your branch to the main project's `main` or `master` branch.
6. Provide a clear description of your changes in the pull request.
7. Make any requested changes during the review process.
8. Once your PR is approved, it will be merged into the main project.
## Coding Guidelines
- No specific coding guidelines at the moment. Please make sure the code can be tested. The most popular lint tools like []`golangci-lint`](https://golangci-lint.run) can help you here.
## Testing
`make test` cannot handle all the model now. Please be sure to add a test case for the new features or the part was changed.
## Documentation
- We are welcome the contribution of the documents, please open new PR in the official document repo [localai-website](https://github.com/go-skynet/localai-website)
## Community and Communication
- You can reach out via the Github issue tracker.
- Open a new discussion at [Discussion](https://github.com/go-skynet/LocalAI/discussions)
- Join the Discord channel [Discord](https://discord.gg/uJAeKSAGDy)
---

View File

@@ -1,4 +1,4 @@
ARG GO_VERSION=1.20-bullseye
ARG GO_VERSION=1.21-bullseye
FROM golang:$GO_VERSION as requirements
@@ -11,10 +11,16 @@ ARG TARGETARCH
ARG TARGETVARIANT
ENV BUILD_TYPE=${BUILD_TYPE}
ENV EXTERNAL_GRPC_BACKENDS="huggingface-embeddings:/build/extra/grpc/huggingface/huggingface.py,autogptq:/build/extra/grpc/autogptq/autogptq.py,bark:/build/extra/grpc/bark/ttsbark.py,diffusers:/build/extra/grpc/diffusers/backend_diffusers.py,exllama:/build/extra/grpc/exllama/exllama.py,vall-e-x:/build/extra/grpc/vall-e-x/ttsvalle.py,vllm:/build/extra/grpc/vllm/backend_vllm.py"
ENV GALLERIES='[{"name":"model-gallery", "url":"github:go-skynet/model-gallery/index.yaml"}, {"url": "github:go-skynet/model-gallery/huggingface.yaml","name":"huggingface"}]'
ARG GO_TAGS="stablediffusion tts"
RUN apt-get update && \
apt-get install -y ca-certificates cmake curl patch
apt-get install -y ca-certificates cmake curl patch pip
# Use the variables in subsequent instructions
RUN echo "Target Architecture: $TARGETARCH"
RUN echo "Target Variant: $TARGETVARIANT"
# CuBLAS requirements
RUN if [ "${BUILD_TYPE}" = "cublas" ]; then \
@@ -24,10 +30,26 @@ RUN if [ "${BUILD_TYPE}" = "cublas" ]; then \
dpkg -i cuda-keyring_1.0-1_all.deb && \
rm -f cuda-keyring_1.0-1_all.deb && \
apt-get update && \
apt-get install -y cuda-nvcc-${CUDA_MAJOR_VERSION}-${CUDA_MINOR_VERSION} libcublas-dev-${CUDA_MAJOR_VERSION}-${CUDA_MINOR_VERSION} \
apt-get install -y cuda-nvcc-${CUDA_MAJOR_VERSION}-${CUDA_MINOR_VERSION} libcublas-dev-${CUDA_MAJOR_VERSION}-${CUDA_MINOR_VERSION} libcusparse-dev-${CUDA_MAJOR_VERSION}-${CUDA_MINOR_VERSION} libcusolver-dev-${CUDA_MAJOR_VERSION}-${CUDA_MINOR_VERSION} \
; fi
ENV PATH /usr/local/cuda/bin:${PATH}
# Extras requirements
COPY extra/requirements.txt /build/extra/requirements.txt
ENV PATH="/root/.cargo/bin:${PATH}"
RUN pip install --upgrade pip
RUN curl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs | sh -s -- -y
RUN if [ "${TARGETARCH}" = "amd64" ]; then \
pip install git+https://github.com/suno-ai/bark.git diffusers invisible_watermark transformers accelerate safetensors;\
fi
RUN if [ "${BUILD_TYPE}" = "cublas" ] && [ "${TARGETARCH}" = "amd64" ]; then \
pip install torch vllm && pip install auto-gptq https://github.com/jllllll/exllama/releases/download/0.0.10/exllama-0.0.10+cu${CUDA_MAJOR_VERSION}${CUDA_MINOR_VERSION}-cp39-cp39-linux_x86_64.whl;\
fi
RUN pip install -r /build/extra/requirements.txt && rm -rf /build/extra/requirements.txt
# Vall-e-X
RUN git clone https://github.com/Plachtaa/VALL-E-X.git /usr/lib/vall-e-x && cd /usr/lib/vall-e-x && pip install -r requirements.txt
WORKDIR /build
# OpenBLAS requirements
@@ -37,9 +59,6 @@ RUN apt-get install -y libopenblas-dev
RUN apt-get install -y libopencv-dev && \
ln -s /usr/include/opencv4/opencv2 /usr/include/opencv2
# Use the variables in subsequent instructions
RUN echo "Target Architecture: $TARGETARCH"
RUN echo "Target Variant: $TARGETVARIANT"
# piper requirements
# Use pre-compiled Piper phonemization library (includes onnxruntime)
@@ -58,8 +77,8 @@ RUN curl -L "https://github.com/gabime/spdlog/archive/refs/tags/v${SPDLOG_VERSIO
mkdir -p "lib/Linux-$(uname -m)/piper_phonemize" && \
curl -L "https://github.com/rhasspy/piper-phonemize/releases/download/v${PIPER_PHONEMIZE_VERSION}/libpiper_phonemize-${TARGETARCH:-$(go env GOARCH)}${TARGETVARIANT}.tar.gz" | \
tar -C "lib/Linux-$(uname -m)/piper_phonemize" -xzvf - && ls -liah /build/lib/Linux-$(uname -m)/piper_phonemize/ && \
cp -rfv /build/lib/Linux-$(uname -m)/piper_phonemize/lib/. /lib64/ && \
cp -rfv /build/lib/Linux-$(uname -m)/piper_phonemize/lib/. /usr/lib/ && \
ln -s /usr/lib/libpiper_phonemize.so /usr/lib/libpiper_phonemize.so.1 && \
cp -rfv /build/lib/Linux-$(uname -m)/piper_phonemize/include/. /usr/include/
# \
# ; fi
@@ -93,7 +112,10 @@ RUN ESPEAK_DATA=/build/lib/Linux-$(uname -m)/piper_phonemize/lib/espeak-ng-data
FROM requirements
ARG FFMPEG
ARG BUILD_TYPE
ARG TARGETARCH
ENV BUILD_TYPE=${BUILD_TYPE}
ENV REBUILD=false
ENV HEALTHCHECK_ENDPOINT=http://localhost:8080/readyz
@@ -112,6 +134,13 @@ COPY . .
RUN make prepare-sources
COPY --from=builder /build/local-ai ./
# Copy VALLE-X as it's not a real "lib"
RUN cp -rfv /usr/lib/vall-e-x/* ./
# To resolve exllama import error
RUN if [ "${BUILD_TYPE}" = "cublas" ] && [ "${TARGETARCH:-$(go env GOARCH)}" = "amd64" ]; then \
cp -rfv /usr/local/lib/python3.9/dist-packages/exllama extra/grpc/exllama/;\
fi
# Define the health check command
HEALTHCHECK --interval=1m --timeout=10m --retries=10 \
CMD curl -f $HEALTHCHECK_ENDPOINT || exit 1

View File

@@ -4,20 +4,13 @@ GOVET=$(GOCMD) vet
BINARY_NAME=local-ai
# llama.cpp versions
# Temporarly pinned to https://github.com/go-skynet/go-llama.cpp/pull/124
GOLLAMA_VERSION?=cb8d7cd4cb95725a04504a9e3a26dd72a12b69ac
GOLLAMA_VERSION?=d9f6176409de0a2b5ce798de502545c6721e346e
# Temporary set a specific version of llama.cpp
# containing: https://github.com/ggerganov/llama.cpp/pull/1773 and
# rebased on top of master.
# This pin can be dropped when the PR above is merged, and go-llama has merged changes as well
# Set empty to use the version pinned by go-llama
LLAMA_CPP_REPO?=https://github.com/mudler/llama.cpp
LLAMA_CPP_VERSION?=48ce8722a05a018681634af801fd0fd45b3a87cc
GOLLAMA_STABLE_VERSION?=50cee7712066d9e38306eccadcfbb44ea87df4b7
# gpt4all version
GPT4ALL_REPO?=https://github.com/nomic-ai/gpt4all
GPT4ALL_VERSION?=cfd70b69fcf5e587b8e0e3e9b9aaa90e19cbbc51
GPT4ALL_VERSION?=27a8b020c36b0df8f8b82a252d261cda47cf44b8
# go-ggml-transformers version
GOGGMLTRANSFORMERS_VERSION?=ffb09d7dd71e2cbc6c5d7d05357d230eea6f369a
@@ -30,7 +23,7 @@ RWKV_VERSION?=c898cd0f62df8f2a7830e53d1d513bef4f6f792b
WHISPER_CPP_VERSION?=85ed71aaec8e0612a84c0b67804bde75aa75a273
# bert.cpp version
BERT_VERSION?=6069103f54b9969c02e789d0fb12a23bd614285f
BERT_VERSION?=6abe312cded14042f6b7c3cd8edf082713334a4d
# go-piper version
PIPER_VERSION?=56b8a81b4760a6fbee1a82e62f007ae7e8f010a7
@@ -45,6 +38,7 @@ STABLEDIFFUSION_VERSION?=d89260f598afb809279bc72aa0107b4292587632
GOGGLLM_VERSION?=862477d16eefb0805261c19c9b0d053e3b2b684b
export BUILD_TYPE?=
export CMAKE_ARGS?=
CGO_LDFLAGS?=
CUDA_LIBPATH?=/usr/local/cuda/lib64/
GO_TAGS?=
@@ -71,9 +65,12 @@ ifndef UNAME_S
UNAME_S := $(shell uname -s)
endif
# workaround for rwkv.cpp
ifeq ($(UNAME_S),Darwin)
CGO_LDFLAGS += -lcblas -framework Accelerate
CGO_LDFLAGS += -lcblas -framework Accelerate
ifneq ($(BUILD_TYPE),metal)
# explicit disable metal if on Darwin and metal is disabled
CMAKE_ARGS+=-DLLAMA_METAL=OFF
endif
endif
ifeq ($(BUILD_TYPE),openblas)
@@ -110,6 +107,8 @@ ifeq ($(findstring tts,$(GO_TAGS)),tts)
OPTIONAL_GRPC+=backend-assets/grpc/piper
endif
GRPC_BACKENDS?=backend-assets/grpc/langchain-huggingface backend-assets/grpc/falcon-ggml backend-assets/grpc/bert-embeddings backend-assets/grpc/falcon backend-assets/grpc/bloomz backend-assets/grpc/llama backend-assets/grpc/llama-stable backend-assets/grpc/gpt4all backend-assets/grpc/dolly backend-assets/grpc/gpt2 backend-assets/grpc/gptj backend-assets/grpc/gptneox backend-assets/grpc/mpt backend-assets/grpc/replit backend-assets/grpc/starcoder backend-assets/grpc/rwkv backend-assets/grpc/whisper $(OPTIONAL_GRPC)
.PHONY: all test build vendor
all: help
@@ -188,7 +187,7 @@ go-ggml-transformers:
cd go-ggml-transformers && git checkout -b build $(GOGPT2_VERSION) && git submodule update --init --recursive --depth 1
go-ggml-transformers/libtransformers.a: go-ggml-transformers
$(MAKE) -C go-ggml-transformers libtransformers.a
$(MAKE) -C go-ggml-transformers BUILD_TYPE=$(BUILD_TYPE) libtransformers.a
whisper.cpp:
git clone https://github.com/ggerganov/whisper.cpp.git
@@ -200,21 +199,24 @@ whisper.cpp/libwhisper.a: whisper.cpp
go-llama:
git clone --recurse-submodules https://github.com/go-skynet/go-llama.cpp go-llama
cd go-llama && git checkout -b build $(GOLLAMA_VERSION) && git submodule update --init --recursive --depth 1
ifneq ($(LLAMA_CPP_REPO),)
cd go-llama && rm -rf llama.cpp && git clone $(LLAMA_CPP_REPO) llama.cpp && cd llama.cpp && git checkout -b build $(LLAMA_CPP_VERSION) && git submodule update --init --recursive --depth 1
endif
go-llama-stable:
git clone --recurse-submodules https://github.com/go-skynet/go-llama.cpp go-llama-stable
cd go-llama-stable && git checkout -b build $(GOLLAMA_STABLE_VERSION) && git submodule update --init --recursive --depth 1
go-llama/libbinding.a: go-llama
$(MAKE) -C go-llama BUILD_TYPE=$(BUILD_TYPE) libbinding.a
go-llama-stable/libbinding.a: go-llama-stable
$(MAKE) -C go-llama-stable BUILD_TYPE=$(BUILD_TYPE) libbinding.a
go-piper/libpiper_binding.a:
$(MAKE) -C go-piper libpiper_binding.a example/main
get-sources: go-llama go-ggllm go-ggml-transformers gpt4all go-piper go-rwkv whisper.cpp go-bert bloomz go-stable-diffusion
get-sources: go-llama go-llama-stable go-ggllm go-ggml-transformers gpt4all go-piper go-rwkv whisper.cpp go-bert bloomz go-stable-diffusion
touch $@
replace:
$(GOCMD) mod edit -replace github.com/go-skynet/go-llama.cpp=$(shell pwd)/go-llama
$(GOCMD) mod edit -replace github.com/nomic-ai/gpt4all/gpt4all-bindings/golang=$(shell pwd)/gpt4all/gpt4all-bindings/golang
$(GOCMD) mod edit -replace github.com/go-skynet/go-ggml-transformers.cpp=$(shell pwd)/go-ggml-transformers
$(GOCMD) mod edit -replace github.com/donomii/go-rwkv.cpp=$(shell pwd)/go-rwkv
@@ -232,6 +234,7 @@ prepare-sources: get-sources replace
rebuild: ## Rebuilds the project
$(GOCMD) clean -cache
$(MAKE) -C go-llama clean
$(MAKE) -C go-llama-stable clean
$(MAKE) -C gpt4all/gpt4all-bindings/golang/ clean
$(MAKE) -C go-ggml-transformers clean
$(MAKE) -C go-rwkv clean
@@ -243,13 +246,15 @@ rebuild: ## Rebuilds the project
$(MAKE) -C go-ggllm clean
$(MAKE) build
prepare: prepare-sources $(OPTIONAL_TARGETS)
prepare: prepare-sources $(OPTIONAL_TARGETS)
touch $@
clean: ## Remove build related file
$(GOCMD) clean -cache
rm -fr ./go-llama
rm -f prepare
rm -rf ./go-llama
rm -rf ./gpt4all
rm -rf ./go-llama-stable
rm -rf ./go-gpt2
rm -rf ./go-stable-diffusion
rm -rf ./go-ggml-transformers
@@ -272,9 +277,6 @@ build: grpcs prepare ## Build the project
$(info ${GREEN}I LD_FLAGS: ${YELLOW}$(LD_FLAGS)${RESET})
CGO_LDFLAGS="$(CGO_LDFLAGS)" $(GOCMD) build -ldflags "$(LD_FLAGS)" -tags "$(GO_TAGS)" -o $(BINARY_NAME) ./
ifeq ($(BUILD_TYPE),metal)
cp go-llama/build/bin/ggml-metal.metal .
endif
dist: build
mkdir -p release
@@ -303,10 +305,11 @@ test: prepare test-models/testmodel grpcs
@echo 'Running tests'
export GO_TAGS="tts stablediffusion"
$(MAKE) prepare-test
TEST_DIR=$(abspath ./)/test-dir/ FIXTURES=$(abspath ./)/tests/fixtures CONFIG_FILE=$(abspath ./)/test-models/config.yaml MODELS_PATH=$(abspath ./)/test-models \
$(GOCMD) run github.com/onsi/ginkgo/v2/ginkgo --label-filter="!gpt4all && !llama" --flake-attempts 5 -v -r ./api ./pkg
HUGGINGFACE_GRPC=$(abspath ./)/extra/grpc/huggingface/huggingface.py TEST_DIR=$(abspath ./)/test-dir/ FIXTURES=$(abspath ./)/tests/fixtures CONFIG_FILE=$(abspath ./)/test-models/config.yaml MODELS_PATH=$(abspath ./)/test-models \
$(GOCMD) run github.com/onsi/ginkgo/v2/ginkgo --label-filter="!gpt4all && !llama && !llama-gguf" --flake-attempts 5 -v -r ./api ./pkg
$(MAKE) test-gpt4all
$(MAKE) test-llama
$(MAKE) test-llama-gguf
$(MAKE) test-tts
$(MAKE) test-stablediffusion
@@ -318,6 +321,10 @@ test-llama: prepare-test
TEST_DIR=$(abspath ./)/test-dir/ FIXTURES=$(abspath ./)/tests/fixtures CONFIG_FILE=$(abspath ./)/test-models/config.yaml MODELS_PATH=$(abspath ./)/test-models \
$(GOCMD) run github.com/onsi/ginkgo/v2/ginkgo --label-filter="llama" --flake-attempts 5 -v -r ./api ./pkg
test-llama-gguf: prepare-test
TEST_DIR=$(abspath ./)/test-dir/ FIXTURES=$(abspath ./)/tests/fixtures CONFIG_FILE=$(abspath ./)/test-models/config.yaml MODELS_PATH=$(abspath ./)/test-models \
$(GOCMD) run github.com/onsi/ginkgo/v2/ginkgo --label-filter="llama-gguf" --flake-attempts 5 -v -r ./api ./pkg
test-tts: prepare-test
TEST_DIR=$(abspath ./)/test-dir/ FIXTURES=$(abspath ./)/tests/fixtures CONFIG_FILE=$(abspath ./)/test-models/config.yaml MODELS_PATH=$(abspath ./)/test-models \
$(GOCMD) run github.com/onsi/ginkgo/v2/ginkgo --label-filter="tts" --flake-attempts 1 -v -r ./api ./pkg
@@ -328,9 +335,7 @@ test-stablediffusion: prepare-test
test-container:
docker build --target requirements -t local-ai-test-container .
docker run --name localai-tests -e GO_TAGS=$(GO_TAGS) -ti -v $(abspath ./):/build local-ai-test-container make test
docker rm localai-tests
docker rmi local-ai-test-container
docker run -ti --rm --entrypoint /bin/bash -ti -v $(abspath ./):/build local-ai-test-container
## Help:
help: ## Show this help.
@@ -344,10 +349,21 @@ help: ## Show this help.
else if (/^## .*$$/) {printf " ${CYAN}%s${RESET}\n", substr($$1,4)} \
}' $(MAKEFILE_LIST)
protogen:
protogen: protogen-go protogen-python
protogen-go:
protoc --go_out=. --go_opt=paths=source_relative --go-grpc_out=. --go-grpc_opt=paths=source_relative \
pkg/grpc/proto/backend.proto
protogen-python:
python3 -m grpc_tools.protoc -Ipkg/grpc/proto/ --python_out=extra/grpc/huggingface/ --grpc_python_out=extra/grpc/huggingface/ pkg/grpc/proto/backend.proto
python3 -m grpc_tools.protoc -Ipkg/grpc/proto/ --python_out=extra/grpc/autogptq/ --grpc_python_out=extra/grpc/autogptq/ pkg/grpc/proto/backend.proto
python3 -m grpc_tools.protoc -Ipkg/grpc/proto/ --python_out=extra/grpc/exllama/ --grpc_python_out=extra/grpc/exllama/ pkg/grpc/proto/backend.proto
python3 -m grpc_tools.protoc -Ipkg/grpc/proto/ --python_out=extra/grpc/bark/ --grpc_python_out=extra/grpc/bark/ pkg/grpc/proto/backend.proto
python3 -m grpc_tools.protoc -Ipkg/grpc/proto/ --python_out=extra/grpc/diffusers/ --grpc_python_out=extra/grpc/diffusers/ pkg/grpc/proto/backend.proto
python3 -m grpc_tools.protoc -Ipkg/grpc/proto/ --python_out=extra/grpc/vall-e-x/ --grpc_python_out=extra/grpc/vall-e-x/ pkg/grpc/proto/backend.proto
python3 -m grpc_tools.protoc -Ipkg/grpc/proto/ --python_out=extra/grpc/vllm/ --grpc_python_out=extra/grpc/vllm/ pkg/grpc/proto/backend.proto
## GRPC
backend-assets/grpc:
@@ -358,8 +374,18 @@ backend-assets/grpc/falcon: backend-assets/grpc go-ggllm/libggllm.a
$(GOCMD) build -ldflags "$(LD_FLAGS)" -tags "$(GO_TAGS)" -o backend-assets/grpc/falcon ./cmd/grpc/falcon/
backend-assets/grpc/llama: backend-assets/grpc go-llama/libbinding.a
$(GOCMD) mod edit -replace github.com/go-skynet/go-llama.cpp=$(shell pwd)/go-llama
CGO_LDFLAGS="$(CGO_LDFLAGS)" C_INCLUDE_PATH=$(shell pwd)/go-llama LIBRARY_PATH=$(shell pwd)/go-llama \
$(GOCMD) build -ldflags "$(LD_FLAGS)" -tags "$(GO_TAGS)" -o backend-assets/grpc/llama ./cmd/grpc/llama/
# TODO: every binary should have its own folder instead, so can have different metal implementations
ifeq ($(BUILD_TYPE),metal)
cp go-llama/build/bin/ggml-metal.metal backend-assets/grpc/
endif
backend-assets/grpc/llama-stable: backend-assets/grpc go-llama-stable/libbinding.a
$(GOCMD) mod edit -replace github.com/go-skynet/go-llama.cpp=$(shell pwd)/go-llama-stable
CGO_LDFLAGS="$(CGO_LDFLAGS)" C_INCLUDE_PATH=$(shell pwd)/go-llama-stable LIBRARY_PATH=$(shell pwd)/go-llama \
$(GOCMD) build -ldflags "$(LD_FLAGS)" -tags "$(GO_TAGS)" -o backend-assets/grpc/llama-stable ./cmd/grpc/llama-stable/
backend-assets/grpc/gpt4all: backend-assets/grpc backend-assets/gpt4all gpt4all/gpt4all-bindings/golang/libgpt4all.a
CGO_LDFLAGS="$(CGO_LDFLAGS)" C_INCLUDE_PATH=$(shell pwd)/gpt4all/gpt4all-bindings/golang/ LIBRARY_PATH=$(shell pwd)/gpt4all/gpt4all-bindings/golang/ \
@@ -424,4 +450,4 @@ backend-assets/grpc/whisper: backend-assets/grpc whisper.cpp/libwhisper.a
CGO_LDFLAGS="$(CGO_LDFLAGS)" C_INCLUDE_PATH=$(shell pwd)/whisper.cpp LIBRARY_PATH=$(shell pwd)/whisper.cpp \
$(GOCMD) build -ldflags "$(LD_FLAGS)" -tags "$(GO_TAGS)" -o backend-assets/grpc/whisper ./cmd/grpc/whisper/
grpcs: prepare backend-assets/grpc/langchain-huggingface backend-assets/grpc/falcon-ggml backend-assets/grpc/bert-embeddings backend-assets/grpc/falcon backend-assets/grpc/bloomz backend-assets/grpc/llama backend-assets/grpc/gpt4all backend-assets/grpc/dolly backend-assets/grpc/gpt2 backend-assets/grpc/gptj backend-assets/grpc/gptneox backend-assets/grpc/mpt backend-assets/grpc/replit backend-assets/grpc/starcoder backend-assets/grpc/rwkv backend-assets/grpc/whisper $(OPTIONAL_GRPC)
grpcs: prepare $(GRPC_BACKENDS)

283
README.md
View File

@@ -1,208 +1,135 @@
<h1 align="center">
<br>
<img height="300" src="https://user-images.githubusercontent.com/2420543/233147843-88697415-6dbf-4368-a862-ab217f9f7342.jpeg"> <br>
<img height="300" src="https://github.com/go-skynet/LocalAI/assets/2420543/0966aa2a-166e-4f99-a3e5-6c915fc997dd"> <br>
LocalAI
<br>
</h1>
[![tests](https://github.com/go-skynet/LocalAI/actions/workflows/test.yml/badge.svg)](https://github.com/go-skynet/LocalAI/actions/workflows/test.yml) [![build container images](https://github.com/go-skynet/LocalAI/actions/workflows/image.yml/badge.svg)](https://github.com/go-skynet/LocalAI/actions/workflows/image.yml)
<p align="center">
<a href="https://github.com/go-skynet/LocalAI/fork" target="blank">
<img src="https://img.shields.io/github/forks/go-skynet/LocalAI?style=for-the-badge" alt="LocalAI forks"/>
</a>
<a href="https://github.com/go-skynet/LocalAI/stargazers" target="blank">
<img src="https://img.shields.io/github/stars/go-skynet/LocalAI?style=for-the-badge" alt="LocalAI stars"/>
</a>
<a href="https://github.com/go-skynet/LocalAI/pulls" target="blank">
<img src="https://img.shields.io/github/issues-pr/go-skynet/LocalAI?style=for-the-badge" alt="LocalAI pull-requests"/>
</a>
<a href='https://github.com/go-skynet/LocalAI/releases'>
<img src='https://img.shields.io/github/release/go-skynet/LocalAI?&label=Latest&style=for-the-badge'>
</a>
</p>
[![](https://dcbadge.vercel.app/api/server/uJAeKSAGDy?style=flat-square&theme=default-inverted)](https://discord.gg/uJAeKSAGDy)
> :bulb: Get help - [❓FAQ](https://localai.io/faq/) [💭Discussions](https://github.com/go-skynet/LocalAI/discussions) [:speech_balloon: Discord](https://discord.gg/uJAeKSAGDy) [:book: Documentation website](https://localai.io/)
>
> [💻 Quickstart](https://localai.io/basics/getting_started/) [📣 News](https://localai.io/basics/news/) [ 🛫 Examples ](https://github.com/go-skynet/LocalAI/tree/master/examples/) [ 🖼️ Models ](https://localai.io/models/)
[Documentation website](https://localai.io/)
**LocalAI** is a drop-in replacement REST API that's compatible with OpenAI API specifications for local inferencing. It allows you to run LLMs (and not only) locally or on-prem with consumer grade hardware, supporting multiple model families that are compatible with the ggml format. Does not require GPU.
[![tests](https://github.com/go-skynet/LocalAI/actions/workflows/test.yml/badge.svg)](https://github.com/go-skynet/LocalAI/actions/workflows/test.yml)[![Build and Release](https://github.com/go-skynet/LocalAI/actions/workflows/release.yaml/badge.svg)](https://github.com/go-skynet/LocalAI/actions/workflows/release.yaml)[![build container images](https://github.com/go-skynet/LocalAI/actions/workflows/image.yml/badge.svg)](https://github.com/go-skynet/LocalAI/actions/workflows/image.yml)[![Bump dependencies](https://github.com/go-skynet/LocalAI/actions/workflows/bump_deps.yaml/badge.svg)](https://github.com/go-skynet/LocalAI/actions/workflows/bump_deps.yaml)[![Artifact Hub](https://img.shields.io/endpoint?url=https://artifacthub.io/badge/repository/localai)](https://artifacthub.io/packages/search?repo=localai)
For a list of the supported model families, please see [the model compatibility table](https://localai.io/model-compatibility/index.html#model-compatibility-table).
**LocalAI** is a drop-in replacement REST API that's compatible with OpenAI API specifications for local inferencing. It allows you to run LLMs (and not only) locally or on-prem with consumer grade hardware, supporting multiple model families that are compatible with the ggml format, pytorch and more. Does not require GPU.
<p align="center"><b>Follow LocalAI </b></p>
<p align="center">
<a href="https://twitter.com/LocalAI_API" target="blank">
<img src="https://img.shields.io/twitter/follow/LocalAI_API?label=Follow: LocalAI_API&style=social" alt="Follow LocalAI_API"/>
</a>
<a href="https://discord.gg/uJAeKSAGDy" target="blank">
<img src="https://dcbadge.vercel.app/api/server/uJAeKSAGDy?style=flat-square&theme=default-inverted" alt="Join LocalAI Discord Community"/>
</a>
<p align="center"><b>Connect with the Creator </b></p>
<p align="center">
<a href="https://twitter.com/mudler_it" target="blank">
<img src="https://img.shields.io/twitter/follow/mudler_it?label=Follow: mudler_it&style=social" alt="Follow mudler_it"/>
</a>
<a href='https://github.com/mudler'>
<img alt="Follow on Github" src="https://img.shields.io/badge/Follow-mudler-black?logo=github&link=https%3A%2F%2Fgithub.com%2Fmudler">
</a>
</p>
<p align="center"><b>Share LocalAI Repository</b></p>
<p align="center">
<a href="https://twitter.com/intent/tweet?text=Check%20this%20GitHub%20repository%20out.%20LocalAI%20-%20Let%27s%20you%20easily%20run%20LLM%20locally.&url=https://github.com/go-skynet/LocalAI&hashtags=LocalAI,AI" target="blank">
<img src="https://img.shields.io/twitter/follow/_LocalAI?label=Share Repo on Twitter&style=social" alt="Follow _LocalAI"/></a>
<a href="https://t.me/share/url?text=Check%20this%20GitHub%20repository%20out.%20LocalAI%20-%20Let%27s%20you%20easily%20run%20LLM%20locally.&url=https://github.com/go-skynet/LocalAI" target="_blank"><img src="https://img.shields.io/twitter/url?label=Telegram&logo=Telegram&style=social&url=https://github.com/go-skynet/LocalAI" alt="Share on Telegram"/></a>
<a href="https://api.whatsapp.com/send?text=Check%20this%20GitHub%20repository%20out.%20LocalAI%20-%20Let%27s%20you%20easily%20run%20LLM%20locally.%20https://github.com/go-skynet/LocalAI"><img src="https://img.shields.io/twitter/url?label=whatsapp&logo=whatsapp&style=social&url=https://github.com/go-skynet/LocalAI" /></a> <a href="https://www.reddit.com/submit?url=https://github.com/go-skynet/LocalAI&title=Check%20this%20GitHub%20repository%20out.%20LocalAI%20-%20Let%27s%20you%20easily%20run%20LLM%20locally.
" target="blank">
<img src="https://img.shields.io/twitter/url?label=Reddit&logo=Reddit&style=social&url=https://github.com/go-skynet/LocalAI" alt="Share on Reddit"/>
</a> <a href="mailto:?subject=Check%20this%20GitHub%20repository%20out.%20LocalAI%20-%20Let%27s%20you%20easily%20run%20LLM%20locally.%3A%0Ahttps://github.com/go-skynet/LocalAI" target="_blank"><img src="https://img.shields.io/twitter/url?label=Gmail&logo=Gmail&style=social&url=https://github.com/go-skynet/LocalAI"/></a> <a href="https://www.buymeacoffee.com/mudler" target="_blank"><img src="https://cdn.buymeacoffee.com/buttons/default-orange.png" alt="Buy Me A Coffee" height="23" width="100" style="border-radius:1px"></a>
</p>
<hr>
In a nutshell:
- Local, OpenAI drop-in alternative REST API. You own your data.
- NO GPU required. NO Internet access is required either
- Optional, GPU Acceleration is available in `llama.cpp`-compatible LLMs. See also the [build section](https://localai.io/basics/build/index.html).
- Supports multiple models:
- 📖 Text generation with GPTs (`llama.cpp`, `gpt4all.cpp`, ... and more)
- 🗣 Text to Audio 🎺🆕
- 🔈 Audio to Text (Audio transcription with `whisper.cpp`)
- 🎨 Image generation with stable diffusion
- Supports multiple models
- 🏃 Once loaded the first time, it keep models loaded in memory for faster inference
- ⚡ Doesn't shell-out, but uses C++ bindings for a faster inference and better performance.
- ⚡ Doesn't shell-out, but uses C++ bindings for a faster inference and better performance.
LocalAI was created by [Ettore Di Giacinto](https://github.com/mudler/) and is a community-driven project, focused on making the AI accessible to anyone. Any contribution, feedback and PR is welcome!
See the [Getting started](https://localai.io/basics/getting_started/index.html) and [examples](https://github.com/go-skynet/LocalAI/tree/master/examples/) sections to learn how to use LocalAI. For a list of curated models check out the [model gallery](https://localai.io/models/).
Note that this started just as a [fun weekend project](https://localai.io/#backstory) in order to try to create the necessary pieces for a full AI assistant like `ChatGPT`: the community is growing fast and we are working hard to make it better and more stable. If you want to help, please consider contributing (see below)!
## 🔥🔥 [Hot topics / Roadmap](https://localai.io/#-hot-topics--roadmap)
## 🚀 [Features](https://localai.io/features/)
- 📖 [Text generation with GPTs](https://localai.io/features/text-generation/) (`llama.cpp`, `gpt4all.cpp`, ... [:book: and more](https://localai.io/model-compatibility/index.html#model-compatibility-table))
- 🗣 [Text to Audio](https://localai.io/features/text-to-audio/)
- 🔈 [Audio to Text](https://localai.io/features/audio-to-text/) (Audio transcription with `whisper.cpp`)
- 🎨 [Image generation with stable diffusion](https://localai.io/features/image-generation)
- 🔥 [OpenAI functions](https://localai.io/features/openai-functions/) 🆕
- 🧠 [Embeddings generation for vector databases](https://localai.io/features/embeddings/)
- ✍️ [Constrained grammars](https://localai.io/features/constrained_grammars/)
- 🖼️ [Download Models directly from Huggingface ](https://localai.io/models/)
| [ChatGPT OSS alternative](https://github.com/go-skynet/LocalAI/tree/master/examples/chatbot-ui) | [Image generation](https://localai.io/api-endpoints/index.html#image-generation) |
|------------------------------------------------------------------------------------------------------------------------|------------------------------------------------------------------------------------------------------------------------|
| ![Screenshot from 2023-04-26 23-59-55](https://user-images.githubusercontent.com/2420543/234715439-98d12e03-d3ce-4f94-ab54-2b256808e05e.png) | ![b6441997879](https://github.com/go-skynet/LocalAI/assets/2420543/d50af51c-51b7-4f39-b6c2-bf04c403894c) |
| [Telegram bot](https://github.com/go-skynet/LocalAI/tree/master/examples/telegram-bot) | [Flowise](https://github.com/go-skynet/LocalAI/tree/master/examples/flowise) |
|------------------------------------------------------------------------------------------------------------------------|------------------------------------------------------------------------------------------------------------------------|
![Screenshot from 2023-06-09 00-36-26](https://github.com/go-skynet/LocalAI/assets/2420543/e98b4305-fa2d-41cf-9d2f-1bb2d75ca902) | ![Screenshot from 2023-05-30 18-01-03](https://github.com/go-skynet/LocalAI/assets/2420543/02458782-0549-4131-971c-95ee56ec1af8)| |
## Hot topics / Roadmap
- [x] Support for embeddings
- [x] Support for audio transcription with https://github.com/ggerganov/whisper.cpp
- [X] Support for text-to-audio
- [x] GPU/CUDA support ( https://github.com/go-skynet/LocalAI/issues/69 )
- [X] Enable automatic downloading of models from a curated gallery
- [X] Enable automatic downloading of models from HuggingFace
- [ ] Upstream our golang bindings to llama.cpp (https://github.com/ggerganov/llama.cpp/issues/351)
- [ ] Enable gallery management directly from the webui.
- [ ] 🔥 OpenAI functions: https://github.com/go-skynet/LocalAI/issues/588
## News
- 🔥🔥🔥 28-06-2023: **v1.20.0**: Added text to audio and gallery huggingface repositories! [Release notes](https://localai.io/basics/news/index.html#-28-06-2023-__v1200__-) [Changelog](https://github.com/go-skynet/LocalAI/releases/tag/v1.20.0)
- 🔥🔥🔥 19-06-2023: **v1.19.0**: CUDA support! [Release notes](https://localai.io/basics/news/index.html#-19-06-2023-__v1190__-) [Changelog](https://github.com/go-skynet/LocalAI/releases/tag/v1.19.0)
- 🔥🔥🔥 06-06-2023: **v1.18.0**: Many updates, new features, and much more 🚀, check out the [Release notes](https://localai.io/basics/news/index.html#-06-06-2023-__v1180__-)!
- 29-05-2023: LocalAI now has a website, [https://localai.io](https://localai.io)! check the news in the [dedicated section](https://localai.io/basics/news/index.html)!
For latest news, follow also on Twitter [@LocalAI_API](https://twitter.com/LocalAI_API) and [@mudler_it](https://twitter.com/mudler_it)
## Media, Blogs, Social
## :book: 🎥 [Media, Blogs, Social](https://localai.io/basics/news/#media-blogs-social)
- [Create a slackbot for teams and OSS projects that answer to documentation](https://mudler.pm/posts/smart-slackbot-for-teams/)
- [LocalAI meets k8sgpt](https://www.youtube.com/watch?v=PKrDNuJ_dfE)
- [Question Answering on Documents locally with LangChain, LocalAI, Chroma, and GPT4All](https://mudler.pm/posts/localai-question-answering/)
- [Tutorial to use k8sgpt with LocalAI](https://medium.com/@tyler_97636/k8sgpt-localai-unlock-kubernetes-superpowers-for-free-584790de9b65)
## Contribute and help
## 💻 Usage
To help the project you can:
Check out the [Getting started](https://localai.io/basics/getting_started/index.html) section in our documentation.
- [Hacker news post](https://news.ycombinator.com/item?id=35726934) - help us out by voting if you like this project.
### 💡 Example: Use Luna-AI Llama model
- If you have technological skills and want to contribute to development, have a look at the open issues. If you are new you can have a look at the [good-first-issue](https://github.com/go-skynet/LocalAI/issues?q=is%3Aissue+is%3Aopen+label%3A%22good+first+issue%22) and [help-wanted](https://github.com/go-skynet/LocalAI/issues?q=is%3Aissue+is%3Aopen+label%3A%22help+wanted%22) labels.
See the [documentation](https://localai.io/basics/getting_started)
- If you don't have technological skills you can still help improving documentation or add examples or share your user-stories with our community, any help and contribution is welcome!
### 🔗 Resources
## Usage
- [How to build locally](https://localai.io/basics/build/index.html)
- [How to install in Kubernetes](https://localai.io/basics/getting_started/index.html#run-localai-in-kubernetes)
- [Projects integrating LocalAI](https://localai.io/integrations/)
- [How tos section](https://localai.io/howtos/) (curated by our community)
## Citation
Check out the [Getting started](https://localai.io/basics/getting_started/index.html) section. Here below you will find generic, quick instructions to get ready and use LocalAI.
The easiest way to run LocalAI is by using `docker-compose` (to build locally, see [building LocalAI](https://localai.io/basics/build/index.html)):
```bash
git clone https://github.com/go-skynet/LocalAI
cd LocalAI
# (optional) Checkout a specific LocalAI tag
# git checkout -b build <TAG>
# copy your models to models/
cp your-model.bin models/
# (optional) Edit the .env file to set things like context size and threads
# vim .env
# start with docker-compose
docker-compose up -d --pull always
# or you can build the images with:
# docker-compose up -d --build
# Now API is accessible at localhost:8080
curl http://localhost:8080/v1/models
# {"object":"list","data":[{"id":"your-model.bin","object":"model"}]}
curl http://localhost:8080/v1/completions -H "Content-Type: application/json" -d '{
"model": "your-model.bin",
"prompt": "A long time ago in a galaxy far, far away",
"temperature": 0.7
}'
```
### Example: Use GPT4ALL-J model
<details>
```bash
# Clone LocalAI
git clone https://github.com/go-skynet/LocalAI
cd LocalAI
# (optional) Checkout a specific LocalAI tag
# git checkout -b build <TAG>
# Download gpt4all-j to models/
wget https://gpt4all.io/models/ggml-gpt4all-j.bin -O models/ggml-gpt4all-j
# Use a template from the examples
cp -rf prompt-templates/ggml-gpt4all-j.tmpl models/
# (optional) Edit the .env file to set things like context size and threads
# vim .env
# start with docker-compose
docker-compose up -d --pull always
# or you can build the images with:
# docker-compose up -d --build
# Now API is accessible at localhost:8080
curl http://localhost:8080/v1/models
# {"object":"list","data":[{"id":"ggml-gpt4all-j","object":"model"}]}
curl http://localhost:8080/v1/chat/completions -H "Content-Type: application/json" -d '{
"model": "ggml-gpt4all-j",
"messages": [{"role": "user", "content": "How are you?"}],
"temperature": 0.9
}'
# {"model":"ggml-gpt4all-j","choices":[{"message":{"role":"assistant","content":"I'm doing well, thanks. How about you?"}}]}
```
</details>
### Build locally
<details>
In order to build the `LocalAI` container image locally you can use `docker`:
If you utilize this repository, data in a downstream project, please consider citing it with:
```
# build the image
docker build -t localai .
docker run localai
@misc{localai,
author = {Ettore Di Giacinto},
title = {LocalAI: The free, Open source OpenAI alternative},
year = {2023},
publisher = {GitHub},
journal = {GitHub repository},
howpublished = {\url{https://github.com/go-skynet/LocalAI}},
```
Or you can build the binary with `make`:
```
make build
```
</details>
See the [build section](https://localai.io/basics/build/index.html) in our documentation for detailed instructions.
### Run LocalAI in Kubernetes
LocalAI can be installed inside Kubernetes with helm. See [installation instructions](https://localai.io/basics/getting_started/index.html#run-localai-in-kubernetes).
## Supported API endpoints
See the [list of the supported API endpoints](https://localai.io/api-endpoints/index.html) and how to configure image generation and audio transcription.
## Frequently asked questions
See [the FAQ](https://localai.io/faq/index.html) section for a list of common questions.
## Projects already using LocalAI to run local models
Feel free to open up a PR to get your project listed!
- [Kairos](https://github.com/kairos-io/kairos)
- [k8sgpt](https://github.com/k8sgpt-ai/k8sgpt#running-local-models)
- [Spark](https://github.com/cedriking/spark)
- [autogpt4all](https://github.com/aorumbayev/autogpt4all)
- [Mods](https://github.com/charmbracelet/mods)
- [Flowise](https://github.com/FlowiseAI/Flowise)
## Sponsors
## ❤️ Sponsors
> Do you find LocalAI useful?
@@ -215,21 +142,22 @@ A huge thank you to our generous sponsors who support this project:
| [Spectro Cloud](https://www.spectrocloud.com/) |
| Spectro Cloud kindly supports LocalAI by providing GPU and computing resources to run tests on lamdalabs! |
## Star history
And a huge shout-out to individuals sponsoring the project by donating hardware or backing the project.
- [Sponsor list](https://github.com/sponsors/mudler)
- JDAM00 (donating HW for the CI)
## 🌟 Star history
[![LocalAI Star history Chart](https://api.star-history.com/svg?repos=go-skynet/LocalAI&type=Date)](https://star-history.com/#go-skynet/LocalAI&Date)
## License
## 📖 License
LocalAI is a community-driven project created by [Ettore Di Giacinto](https://github.com/mudler/).
MIT
MIT - Author Ettore Di Giacinto
## Author
Ettore Di Giacinto and others
## Acknowledgements
## 🙇 Acknowledgements
LocalAI couldn't have been built without the help of great software already available from the community. Thank you!
@@ -240,9 +168,12 @@ LocalAI couldn't have been built without the help of great software already avai
- https://github.com/EdVince/Stable-Diffusion-NCNN
- https://github.com/ggerganov/whisper.cpp
- https://github.com/saharNooby/rwkv.cpp
- https://github.com/rhasspy/piper
- https://github.com/cmp-nct/ggllm.cpp
## Contributors
## 🤗 Contributors
This is a community project, a special thanks to our contributors! 🤗
<a href="https://github.com/go-skynet/LocalAI/graphs/contributors">
<img src="https://contrib.rocks/image?repo=go-skynet/LocalAI" />
</a>

View File

@@ -2,11 +2,14 @@ package api
import (
"errors"
"fmt"
"strings"
config "github.com/go-skynet/LocalAI/api/config"
"github.com/go-skynet/LocalAI/api/localai"
"github.com/go-skynet/LocalAI/api/openai"
"github.com/go-skynet/LocalAI/api/options"
"github.com/go-skynet/LocalAI/api/schema"
"github.com/go-skynet/LocalAI/internal"
"github.com/go-skynet/LocalAI/pkg/assets"
@@ -18,7 +21,7 @@ import (
"github.com/rs/zerolog/log"
)
func App(opts ...options.AppOption) (*fiber.App, error) {
func Startup(opts ...options.AppOption) (*options.Option, *config.ConfigLoader, error) {
options := options.NewOptions(opts...)
zerolog.SetGlobalLevel(zerolog.InfoLevel)
@@ -26,6 +29,65 @@ func App(opts ...options.AppOption) (*fiber.App, error) {
zerolog.SetGlobalLevel(zerolog.DebugLevel)
}
log.Info().Msgf("Starting LocalAI using %d threads, with models path: %s", options.Threads, options.Loader.ModelPath)
log.Info().Msgf("LocalAI version: %s", internal.PrintableVersion())
cl := config.NewConfigLoader()
if err := cl.LoadConfigs(options.Loader.ModelPath); err != nil {
log.Error().Msgf("error loading config files: %s", err.Error())
}
if options.ConfigFile != "" {
if err := cl.LoadConfigFile(options.ConfigFile); err != nil {
log.Error().Msgf("error loading config file: %s", err.Error())
}
}
if options.Debug {
for _, v := range cl.ListConfigs() {
cfg, _ := cl.GetConfig(v)
log.Debug().Msgf("Model: %s (config: %+v)", v, cfg)
}
}
if options.AssetsDestination != "" {
// Extract files from the embedded FS
err := assets.ExtractFiles(options.BackendAssets, options.AssetsDestination)
log.Debug().Msgf("Extracting backend assets files to %s", options.AssetsDestination)
if err != nil {
log.Warn().Msgf("Failed extracting backend assets files: %s (might be required for some backends to work properly, like gpt4all)", err)
}
}
if options.PreloadJSONModels != "" {
if err := localai.ApplyGalleryFromString(options.Loader.ModelPath, options.PreloadJSONModels, cl, options.Galleries); err != nil {
return nil, nil, err
}
}
if options.PreloadModelsFromPath != "" {
if err := localai.ApplyGalleryFromFile(options.Loader.ModelPath, options.PreloadModelsFromPath, cl, options.Galleries); err != nil {
return nil, nil, err
}
}
// turn off any process that was started by GRPC if the context is canceled
go func() {
<-options.Context.Done()
log.Debug().Msgf("Context canceled, shutting down")
options.Loader.StopAllGRPC()
}()
return options, cl, nil
}
func App(opts ...options.AppOption) (*fiber.App, error) {
options, cl, err := Startup(opts...)
if err != nil {
return nil, fmt.Errorf("failed basic startup tasks with error %s", err.Error())
}
// Return errors as JSON responses
app := fiber.New(fiber.Config{
BodyLimit: options.UploadLimitMB * 1024 * 1024, // this is the default limit of 4MB
@@ -43,8 +105,8 @@ func App(opts ...options.AppOption) (*fiber.App, error) {
// Send custom error page
return ctx.Status(code).JSON(
openai.ErrorResponse{
Error: &openai.APIError{Message: err.Error(), Code: code},
schema.ErrorResponse{
Error: &schema.APIError{Message: err.Error(), Code: code},
},
)
},
@@ -56,49 +118,33 @@ func App(opts ...options.AppOption) (*fiber.App, error) {
}))
}
log.Info().Msgf("Starting LocalAI using %d threads, with models path: %s", options.Threads, options.Loader.ModelPath)
log.Info().Msgf("LocalAI version: %s", internal.PrintableVersion())
cm := config.NewConfigLoader()
if err := cm.LoadConfigs(options.Loader.ModelPath); err != nil {
log.Error().Msgf("error loading config files: %s", err.Error())
}
if options.ConfigFile != "" {
if err := cm.LoadConfigFile(options.ConfigFile); err != nil {
log.Error().Msgf("error loading config file: %s", err.Error())
}
}
if options.Debug {
for _, v := range cm.ListConfigs() {
cfg, _ := cm.GetConfig(v)
log.Debug().Msgf("Model: %s (config: %+v)", v, cfg)
}
}
if options.AssetsDestination != "" {
// Extract files from the embedded FS
err := assets.ExtractFiles(options.BackendAssets, options.AssetsDestination)
log.Debug().Msgf("Extracting backend assets files to %s", options.AssetsDestination)
if err != nil {
log.Warn().Msgf("Failed extracting backend assets files: %s (might be required for some backends to work properly, like gpt4all)", err)
}
}
// Default middleware config
app.Use(recover.New())
if options.PreloadJSONModels != "" {
if err := localai.ApplyGalleryFromString(options.Loader.ModelPath, options.PreloadJSONModels, cm, options.Galleries); err != nil {
return nil, err
}
}
// Auth middleware checking if API key is valid. If no API key is set, no auth is required.
auth := func(c *fiber.Ctx) error {
if len(options.ApiKeys) > 0 {
authHeader := c.Get("Authorization")
if authHeader == "" {
return c.Status(fiber.StatusUnauthorized).JSON(fiber.Map{"message": "Authorization header missing"})
}
authHeaderParts := strings.Split(authHeader, " ")
if len(authHeaderParts) != 2 || authHeaderParts[0] != "Bearer" {
return c.Status(fiber.StatusUnauthorized).JSON(fiber.Map{"message": "Invalid Authorization header format"})
}
if options.PreloadModelsFromPath != "" {
if err := localai.ApplyGalleryFromFile(options.Loader.ModelPath, options.PreloadModelsFromPath, cm, options.Galleries); err != nil {
return nil, err
apiKey := authHeaderParts[1]
validApiKey := false
for _, key := range options.ApiKeys {
if apiKey == key {
validApiKey = true
}
}
if !validApiKey {
return c.Status(fiber.StatusUnauthorized).JSON(fiber.Map{"message": "Invalid API key"})
}
}
return c.Next()
}
if options.CORS {
@@ -114,44 +160,49 @@ func App(opts ...options.AppOption) (*fiber.App, error) {
// LocalAI API endpoints
galleryService := localai.NewGalleryService(options.Loader.ModelPath)
galleryService.Start(options.Context, cm)
galleryService.Start(options.Context, cl)
app.Get("/version", func(c *fiber.Ctx) error {
app.Get("/version", auth, func(c *fiber.Ctx) error {
return c.JSON(struct {
Version string `json:"version"`
}{Version: internal.PrintableVersion()})
})
app.Post("/models/apply", localai.ApplyModelGalleryEndpoint(options.Loader.ModelPath, cm, galleryService.C, options.Galleries))
app.Get("/models/available", localai.ListModelFromGalleryEndpoint(options.Galleries, options.Loader.ModelPath))
app.Get("/models/jobs/:uuid", localai.GetOpStatusEndpoint(galleryService))
modelGalleryService := localai.CreateModelGalleryService(options.Galleries, options.Loader.ModelPath, galleryService)
app.Post("/models/apply", auth, modelGalleryService.ApplyModelGalleryEndpoint())
app.Get("/models/available", auth, modelGalleryService.ListModelFromGalleryEndpoint())
app.Get("/models/galleries", auth, modelGalleryService.ListModelGalleriesEndpoint())
app.Post("/models/galleries", auth, modelGalleryService.AddModelGalleryEndpoint())
app.Delete("/models/galleries", auth, modelGalleryService.RemoveModelGalleryEndpoint())
app.Get("/models/jobs/:uuid", auth, modelGalleryService.GetOpStatusEndpoint())
app.Get("/models/jobs", auth, modelGalleryService.GetAllStatusEndpoint())
// openAI compatible API endpoint
// chat
app.Post("/v1/chat/completions", openai.ChatEndpoint(cm, options))
app.Post("/chat/completions", openai.ChatEndpoint(cm, options))
app.Post("/v1/chat/completions", auth, openai.ChatEndpoint(cl, options))
app.Post("/chat/completions", auth, openai.ChatEndpoint(cl, options))
// edit
app.Post("/v1/edits", openai.EditEndpoint(cm, options))
app.Post("/edits", openai.EditEndpoint(cm, options))
app.Post("/v1/edits", auth, openai.EditEndpoint(cl, options))
app.Post("/edits", auth, openai.EditEndpoint(cl, options))
// completion
app.Post("/v1/completions", openai.CompletionEndpoint(cm, options))
app.Post("/completions", openai.CompletionEndpoint(cm, options))
app.Post("/v1/engines/:model/completions", openai.CompletionEndpoint(cm, options))
app.Post("/v1/completions", auth, openai.CompletionEndpoint(cl, options))
app.Post("/completions", auth, openai.CompletionEndpoint(cl, options))
app.Post("/v1/engines/:model/completions", auth, openai.CompletionEndpoint(cl, options))
// embeddings
app.Post("/v1/embeddings", openai.EmbeddingsEndpoint(cm, options))
app.Post("/embeddings", openai.EmbeddingsEndpoint(cm, options))
app.Post("/v1/engines/:model/embeddings", openai.EmbeddingsEndpoint(cm, options))
app.Post("/v1/embeddings", auth, openai.EmbeddingsEndpoint(cl, options))
app.Post("/embeddings", auth, openai.EmbeddingsEndpoint(cl, options))
app.Post("/v1/engines/:model/embeddings", auth, openai.EmbeddingsEndpoint(cl, options))
// audio
app.Post("/v1/audio/transcriptions", openai.TranscriptEndpoint(cm, options))
app.Post("/tts", localai.TTSEndpoint(cm, options))
app.Post("/v1/audio/transcriptions", auth, openai.TranscriptEndpoint(cl, options))
app.Post("/tts", auth, localai.TTSEndpoint(cl, options))
// images
app.Post("/v1/images/generations", openai.ImageEndpoint(cm, options))
app.Post("/v1/images/generations", auth, openai.ImageEndpoint(cl, options))
if options.ImageDir != "" {
app.Static("/generated-images", options.ImageDir)
@@ -169,16 +220,14 @@ func App(opts ...options.AppOption) (*fiber.App, error) {
app.Get("/healthz", ok)
app.Get("/readyz", ok)
// models
app.Get("/v1/models", openai.ListModelsEndpoint(options.Loader, cm))
app.Get("/models", openai.ListModelsEndpoint(options.Loader, cm))
// Experimental Backend Statistics Module
backendMonitor := localai.NewBackendMonitor(cl, options) // Split out for now
app.Get("/backend/monitor", localai.BackendMonitorEndpoint(backendMonitor))
app.Post("/backend/shutdown", localai.BackendShutdownEndpoint(backendMonitor))
// turn off any process that was started by GRPC if the context is canceled
go func() {
<-options.Context.Done()
log.Debug().Msgf("Context canceled, shutting down")
options.Loader.StopGRPC()
}()
// models
app.Get("/v1/models", auth, openai.ListModelsEndpoint(options.Loader, cl))
app.Get("/models", auth, openai.ListModelsEndpoint(options.Loader, cl))
return app, nil
}

View File

@@ -8,7 +8,6 @@ import (
"errors"
"fmt"
"io"
"io/ioutil"
"net/http"
"os"
"path/filepath"
@@ -30,10 +29,10 @@ import (
)
type modelApplyRequest struct {
ID string `json:"id"`
URL string `json:"url"`
Name string `json:"name"`
Overrides map[string]string `json:"overrides"`
ID string `json:"id"`
URL string `json:"url"`
Name string `json:"name"`
Overrides map[string]interface{} `json:"overrides"`
}
func getModelStatus(url string) (response map[string]interface{}) {
@@ -45,7 +44,7 @@ func getModelStatus(url string) (response map[string]interface{}) {
}
defer resp.Body.Close()
body, err := ioutil.ReadAll(resp.Body)
body, err := io.ReadAll(resp.Body)
if err != nil {
fmt.Println("Error reading response body:", err)
return
@@ -97,7 +96,7 @@ func postModelApplyRequest(url string, request modelApplyRequest) (response map[
}
defer resp.Body.Close()
body, err := ioutil.ReadAll(resp.Body)
body, err := io.ReadAll(resp.Body)
if err != nil {
fmt.Println("Error reading response body:", err)
return
@@ -125,6 +124,11 @@ var _ = Describe("API test", func() {
var cancel context.CancelFunc
var tmpdir string
commonOpts := []options.AppOption{
options.WithDebug(true),
options.WithDisableMessage(true),
}
Context("API with ephemeral models", func() {
BeforeEach(func() {
var err error
@@ -143,12 +147,12 @@ var _ = Describe("API test", func() {
Name: "bert2",
URL: "https://raw.githubusercontent.com/go-skynet/model-gallery/main/bert-embeddings.yaml",
Overrides: map[string]interface{}{"foo": "bar"},
AdditionalFiles: []gallery.File{gallery.File{Filename: "foo.yaml", URI: "https://raw.githubusercontent.com/go-skynet/model-gallery/main/bert-embeddings.yaml"}},
AdditionalFiles: []gallery.File{{Filename: "foo.yaml", URI: "https://raw.githubusercontent.com/go-skynet/model-gallery/main/bert-embeddings.yaml"}},
},
}
out, err := yaml.Marshal(g)
Expect(err).ToNot(HaveOccurred())
err = ioutil.WriteFile(filepath.Join(tmpdir, "gallery_simple.yaml"), out, 0644)
err = os.WriteFile(filepath.Join(tmpdir, "gallery_simple.yaml"), out, 0644)
Expect(err).ToNot(HaveOccurred())
galleries := []gallery.Gallery{
@@ -159,9 +163,10 @@ var _ = Describe("API test", func() {
}
app, err = App(
options.WithContext(c),
options.WithGalleries(galleries),
options.WithModelLoader(modelLoader), options.WithBackendAssets(backendAssets), options.WithBackendAssetsOutput(tmpdir))
append(commonOpts,
options.WithContext(c),
options.WithGalleries(galleries),
options.WithModelLoader(modelLoader), options.WithBackendAssets(backendAssets), options.WithBackendAssetsOutput(tmpdir))...)
Expect(err).ToNot(HaveOccurred())
go app.Listen("127.0.0.1:9090")
@@ -237,7 +242,7 @@ var _ = Describe("API test", func() {
response := postModelApplyRequest("http://127.0.0.1:9090/models/apply", modelApplyRequest{
URL: "https://raw.githubusercontent.com/go-skynet/model-gallery/main/bert-embeddings.yaml",
Name: "bert",
Overrides: map[string]string{
Overrides: map[string]interface{}{
"backend": "llama",
},
})
@@ -263,7 +268,7 @@ var _ = Describe("API test", func() {
response := postModelApplyRequest("http://127.0.0.1:9090/models/apply", modelApplyRequest{
URL: "https://raw.githubusercontent.com/go-skynet/model-gallery/main/bert-embeddings.yaml",
Name: "bert",
Overrides: map[string]string{},
Overrides: map[string]interface{}{},
})
Expect(response["uuid"]).ToNot(BeEmpty(), fmt.Sprint(response))
@@ -291,7 +296,7 @@ var _ = Describe("API test", func() {
response := postModelApplyRequest("http://127.0.0.1:9090/models/apply", modelApplyRequest{
URL: "github:go-skynet/model-gallery/openllama_3b.yaml",
Name: "openllama_3b",
Overrides: map[string]string{},
Overrides: map[string]interface{}{"backend": "llama-stable", "mmap": true, "f16": true, "context_size": 128},
})
Expect(response["uuid"]).ToNot(BeEmpty(), fmt.Sprint(response))
@@ -346,6 +351,82 @@ var _ = Describe("API test", func() {
Expect(resp2.Choices[0].Message.FunctionCall).ToNot(BeNil())
Expect(resp2.Choices[0].Message.FunctionCall.Name).To(Equal("get_current_weather"), resp2.Choices[0].Message.FunctionCall.Name)
var res map[string]string
err = json.Unmarshal([]byte(resp2.Choices[0].Message.FunctionCall.Arguments), &res)
Expect(err).ToNot(HaveOccurred())
Expect(res["location"]).To(Equal("San Francisco, California, United States"), fmt.Sprint(res))
Expect(res["unit"]).To(Equal("celcius"), fmt.Sprint(res))
Expect(string(resp2.Choices[0].FinishReason)).To(Equal("function_call"), fmt.Sprint(resp2.Choices[0].FinishReason))
})
It("runs openllama gguf", Label("llama-gguf"), func() {
if runtime.GOOS != "linux" {
Skip("test supported only on linux")
}
modelName := "codellama"
response := postModelApplyRequest("http://127.0.0.1:9090/models/apply", modelApplyRequest{
URL: "github:go-skynet/model-gallery/codellama-7b-instruct.yaml",
Name: modelName,
Overrides: map[string]interface{}{"backend": "llama", "mmap": true, "f16": true, "context_size": 128},
})
Expect(response["uuid"]).ToNot(BeEmpty(), fmt.Sprint(response))
uuid := response["uuid"].(string)
Eventually(func() bool {
response := getModelStatus("http://127.0.0.1:9090/models/jobs/" + uuid)
return response["processed"].(bool)
}, "360s", "10s").Should(Equal(true))
By("testing chat")
resp, err := client.CreateChatCompletion(context.TODO(), openai.ChatCompletionRequest{Model: modelName, Messages: []openai.ChatCompletionMessage{
{
Role: "user",
Content: "How much is 2+2?",
},
}})
Expect(err).ToNot(HaveOccurred())
Expect(len(resp.Choices)).To(Equal(1))
Expect(resp.Choices[0].Message.Content).To(Or(ContainSubstring("4"), ContainSubstring("four")))
By("testing functions")
resp2, err := client.CreateChatCompletion(
context.TODO(),
openai.ChatCompletionRequest{
Model: modelName,
Messages: []openai.ChatCompletionMessage{
{
Role: "user",
Content: "What is the weather like in San Francisco (celsius)?",
},
},
Functions: []openai.FunctionDefinition{
openai.FunctionDefinition{
Name: "get_current_weather",
Description: "Get the current weather",
Parameters: jsonschema.Definition{
Type: jsonschema.Object,
Properties: map[string]jsonschema.Definition{
"location": {
Type: jsonschema.String,
Description: "The city and state, e.g. San Francisco, CA",
},
"unit": {
Type: jsonschema.String,
Enum: []string{"celcius", "fahrenheit"},
},
},
Required: []string{"location"},
},
},
},
})
Expect(err).ToNot(HaveOccurred())
Expect(len(resp2.Choices)).To(Equal(1))
Expect(resp2.Choices[0].Message.FunctionCall).ToNot(BeNil())
Expect(resp2.Choices[0].Message.FunctionCall.Name).To(Equal("get_current_weather"), resp2.Choices[0].Message.FunctionCall.Name)
var res map[string]string
err = json.Unmarshal([]byte(resp2.Choices[0].Message.FunctionCall.Arguments), &res)
Expect(err).ToNot(HaveOccurred())
@@ -360,9 +441,8 @@ var _ = Describe("API test", func() {
}
response := postModelApplyRequest("http://127.0.0.1:9090/models/apply", modelApplyRequest{
URL: "github:go-skynet/model-gallery/gpt4all-j.yaml",
Name: "gpt4all-j",
Overrides: map[string]string{},
URL: "github:go-skynet/model-gallery/gpt4all-j.yaml",
Name: "gpt4all-j",
})
Expect(response["uuid"]).ToNot(BeEmpty(), fmt.Sprint(response))
@@ -400,13 +480,14 @@ var _ = Describe("API test", func() {
}
app, err = App(
options.WithContext(c),
options.WithAudioDir(tmpdir),
options.WithImageDir(tmpdir),
options.WithGalleries(galleries),
options.WithModelLoader(modelLoader),
options.WithBackendAssets(backendAssets),
options.WithBackendAssetsOutput(tmpdir),
append(commonOpts,
options.WithContext(c),
options.WithAudioDir(tmpdir),
options.WithImageDir(tmpdir),
options.WithGalleries(galleries),
options.WithModelLoader(modelLoader),
options.WithBackendAssets(backendAssets),
options.WithBackendAssetsOutput(tmpdir))...,
)
Expect(err).ToNot(HaveOccurred())
go app.Listen("127.0.0.1:9090")
@@ -465,6 +546,9 @@ var _ = Describe("API test", func() {
response := postModelApplyRequest("http://127.0.0.1:9090/models/apply", modelApplyRequest{
ID: "model-gallery@stablediffusion",
Overrides: map[string]interface{}{
"parameters": map[string]interface{}{"model": "stablediffusion_assets"},
},
})
Expect(response["uuid"]).ToNot(BeEmpty(), fmt.Sprint(response))
@@ -500,7 +584,12 @@ var _ = Describe("API test", func() {
c, cancel = context.WithCancel(context.Background())
var err error
app, err = App(options.WithContext(c), options.WithModelLoader(modelLoader))
app, err = App(
append(commonOpts,
options.WithExternalBackend("huggingface", os.Getenv("HUGGINGFACE_GRPC")),
options.WithContext(c),
options.WithModelLoader(modelLoader),
)...)
Expect(err).ToNot(HaveOccurred())
go app.Listen("127.0.0.1:9090")
@@ -524,7 +613,7 @@ var _ = Describe("API test", func() {
It("returns the models list", func() {
models, err := client.ListModels(context.TODO())
Expect(err).ToNot(HaveOccurred())
Expect(len(models.Models)).To(Equal(10))
Expect(len(models.Models)).To(Equal(6)) // If "config.yaml" should be included, this should be 8?
})
It("can generate completions", func() {
resp, err := client.CreateCompletion(context.TODO(), openai.CompletionRequest{Model: "testmodel", Prompt: "abcdedfghikl"})
@@ -555,9 +644,10 @@ var _ = Describe("API test", func() {
})
It("returns errors", func() {
backends := len(model.AutoLoadBackends) + 1 // +1 for huggingface
_, err := client.CreateCompletion(context.TODO(), openai.CompletionRequest{Model: "foomodel", Prompt: "abcdedfghikl"})
Expect(err).To(HaveOccurred())
Expect(err.Error()).To(ContainSubstring("error, status code: 500, message: could not load model - all backends returned error: 12 errors occurred:"))
Expect(err.Error()).To(ContainSubstring(fmt.Sprintf("error, status code: 500, message: could not load model - all backends returned error: %d errors occurred:", backends)))
})
It("transcribes audio", func() {
if runtime.GOOS != "linux" {
@@ -601,6 +691,36 @@ var _ = Describe("API test", func() {
Expect(resp2.Data[0].Embedding).To(Equal(sunEmbedding))
})
Context("External gRPC calls", func() {
It("calculate embeddings with huggingface", func() {
if runtime.GOOS != "linux" {
Skip("test supported only on linux")
}
resp, err := client.CreateEmbeddings(
context.Background(),
openai.EmbeddingRequest{
Model: openai.AdaCodeSearchCode,
Input: []string{"sun", "cat"},
},
)
Expect(err).ToNot(HaveOccurred())
Expect(len(resp.Data[0].Embedding)).To(BeNumerically("==", 384))
Expect(len(resp.Data[1].Embedding)).To(BeNumerically("==", 384))
sunEmbedding := resp.Data[0].Embedding
resp2, err := client.CreateEmbeddings(
context.Background(),
openai.EmbeddingRequest{
Model: openai.AdaCodeSearchCode,
Input: []string{"sun"},
},
)
Expect(err).ToNot(HaveOccurred())
Expect(resp2.Data[0].Embedding).To(Equal(sunEmbedding))
Expect(resp2.Data[0].Embedding).ToNot(Equal(resp.Data[1].Embedding))
})
})
Context("backends", func() {
It("runs rwkv completion", func() {
if runtime.GOOS != "linux" {
@@ -673,7 +793,12 @@ var _ = Describe("API test", func() {
c, cancel = context.WithCancel(context.Background())
var err error
app, err = App(options.WithContext(c), options.WithModelLoader(modelLoader), options.WithConfigFile(os.Getenv("CONFIG_FILE")))
app, err = App(
append(commonOpts,
options.WithContext(c),
options.WithModelLoader(modelLoader),
options.WithConfigFile(os.Getenv("CONFIG_FILE")))...,
)
Expect(err).ToNot(HaveOccurred())
go app.Listen("127.0.0.1:9090")
@@ -692,19 +817,14 @@ var _ = Describe("API test", func() {
cancel()
app.Shutdown()
})
It("can generate chat completions from config file", func() {
models, err := client.ListModels(context.TODO())
Expect(err).ToNot(HaveOccurred())
Expect(len(models.Models)).To(Equal(12))
})
It("can generate chat completions from config file", func() {
resp, err := client.CreateChatCompletion(context.TODO(), openai.ChatCompletionRequest{Model: "list1", Messages: []openai.ChatCompletionMessage{openai.ChatCompletionMessage{Role: "user", Content: "abcdedfghikl"}}})
It("can generate chat completions from config file (list1)", func() {
resp, err := client.CreateChatCompletion(context.TODO(), openai.ChatCompletionRequest{Model: "list1", Messages: []openai.ChatCompletionMessage{{Role: "user", Content: "abcdedfghikl"}}})
Expect(err).ToNot(HaveOccurred())
Expect(len(resp.Choices)).To(Equal(1))
Expect(resp.Choices[0].Message.Content).ToNot(BeEmpty())
})
It("can generate chat completions from config file", func() {
resp, err := client.CreateChatCompletion(context.TODO(), openai.ChatCompletionRequest{Model: "list2", Messages: []openai.ChatCompletionMessage{openai.ChatCompletionMessage{Role: "user", Content: "abcdedfghikl"}}})
It("can generate chat completions from config file (list2)", func() {
resp, err := client.CreateChatCompletion(context.TODO(), openai.ChatCompletionRequest{Model: "list2", Messages: []openai.ChatCompletionMessage{{Role: "user", Content: "abcdedfghikl"}}})
Expect(err).ToNot(HaveOccurred())
Expect(len(resp.Choices)).To(Equal(1))
Expect(resp.Choices[0].Message.Content).ToNot(BeEmpty())

View File

@@ -2,7 +2,6 @@ package backend
import (
"fmt"
"sync"
config "github.com/go-skynet/LocalAI/api/config"
"github.com/go-skynet/LocalAI/api/options"
@@ -22,13 +21,13 @@ func ModelEmbedding(s string, tokens []int, loader *model.ModelLoader, c config.
var inferenceModel interface{}
var err error
opts := []model.Option{
model.WithLoadGRPCLLMModelOpts(grpcOpts),
opts := modelOpts(c, o, []model.Option{
model.WithLoadGRPCLoadModelOpts(grpcOpts),
model.WithThreads(uint32(c.Threads)),
model.WithAssetDir(o.AssetsDestination),
model.WithModelFile(modelFile),
model.WithModel(modelFile),
model.WithContext(o.Context),
}
})
if c.Backend == "" {
inferenceModel, err = loader.GreedyLoader(opts...)
@@ -76,18 +75,6 @@ func ModelEmbedding(s string, tokens []int, loader *model.ModelLoader, c config.
}
return func() ([]float32, error) {
// This is still needed, see: https://github.com/ggerganov/llama.cpp/discussions/784
mutexMap.Lock()
l, ok := mutexes[modelFile]
if !ok {
m := &sync.Mutex{}
mutexes[modelFile] = m
l = m
}
mutexMap.Unlock()
l.Lock()
defer l.Unlock()
embeds, err := fn()
if err != nil {
return embeds, err

View File

@@ -1,26 +1,36 @@
package backend
import (
"fmt"
"sync"
config "github.com/go-skynet/LocalAI/api/config"
"github.com/go-skynet/LocalAI/api/options"
"github.com/go-skynet/LocalAI/pkg/grpc/proto"
model "github.com/go-skynet/LocalAI/pkg/model"
)
func ImageGeneration(height, width, mode, step, seed int, positive_prompt, negative_prompt, dst string, loader *model.ModelLoader, c config.Config, o *options.Option) (func() error, error) {
if c.Backend != model.StableDiffusionBackend {
return nil, fmt.Errorf("endpoint only working with stablediffusion models")
}
func ImageGeneration(height, width, mode, step, seed int, positive_prompt, negative_prompt, src, dst string, loader *model.ModelLoader, c config.Config, o *options.Option) (func() error, error) {
inferenceModel, err := loader.BackendLoader(
opts := modelOpts(c, o, []model.Option{
model.WithBackendString(c.Backend),
model.WithAssetDir(o.AssetsDestination),
model.WithThreads(uint32(c.Threads)),
model.WithContext(o.Context),
model.WithModelFile(c.ImageGenerationAssets),
model.WithModel(c.Model),
model.WithLoadGRPCLoadModelOpts(&proto.ModelOptions{
CUDA: c.Diffusers.CUDA,
SchedulerType: c.Diffusers.SchedulerType,
PipelineType: c.Diffusers.PipelineType,
CFGScale: c.Diffusers.CFGScale,
LoraAdapter: c.LoraAdapter,
LoraBase: c.LoraBase,
IMG2IMG: c.Diffusers.IMG2IMG,
CLIPModel: c.Diffusers.ClipModel,
CLIPSubfolder: c.Diffusers.ClipSubFolder,
CLIPSkip: int32(c.Diffusers.ClipSkip),
}),
})
inferenceModel, err := loader.BackendLoader(
opts...,
)
if err != nil {
return nil, err
@@ -30,31 +40,20 @@ func ImageGeneration(height, width, mode, step, seed int, positive_prompt, negat
_, err := inferenceModel.GenerateImage(
o.Context,
&proto.GenerateImageRequest{
Height: int32(height),
Width: int32(width),
Mode: int32(mode),
Step: int32(step),
Seed: int32(seed),
PositivePrompt: positive_prompt,
NegativePrompt: negative_prompt,
Dst: dst,
Height: int32(height),
Width: int32(width),
Mode: int32(mode),
Step: int32(step),
Seed: int32(seed),
CLIPSkip: int32(c.Diffusers.ClipSkip),
PositivePrompt: positive_prompt,
NegativePrompt: negative_prompt,
Dst: dst,
Src: src,
EnableParameters: c.Diffusers.EnableParameters,
})
return err
}
return func() error {
// This is still needed, see: https://github.com/ggerganov/llama.cpp/discussions/784
mutexMap.Lock()
l, ok := mutexes[c.Backend]
if !ok {
m := &sync.Mutex{}
mutexes[c.Backend] = m
l = m
}
mutexMap.Unlock()
l.Lock()
defer l.Unlock()
return fn()
}, nil
return fn, nil
}

View File

@@ -1,17 +1,32 @@
package backend
import (
"context"
"os"
"regexp"
"strings"
"sync"
"unicode/utf8"
config "github.com/go-skynet/LocalAI/api/config"
"github.com/go-skynet/LocalAI/api/options"
"github.com/go-skynet/LocalAI/pkg/gallery"
"github.com/go-skynet/LocalAI/pkg/grpc"
model "github.com/go-skynet/LocalAI/pkg/model"
"github.com/go-skynet/LocalAI/pkg/utils"
)
func ModelInference(s string, loader *model.ModelLoader, c config.Config, o *options.Option, tokenCallback func(string) bool) (func() (string, error), error) {
type LLMResponse struct {
Response string // should this be []byte?
Usage TokenUsage
}
type TokenUsage struct {
Prompt int
Completion int
}
func ModelInference(ctx context.Context, s string, loader *model.ModelLoader, c config.Config, o *options.Option, tokenCallback func(string, TokenUsage) bool) (func() (LLMResponse, error), error) {
modelFile := c.Model
grpcOpts := gRPCModelOpts(c)
@@ -19,56 +34,106 @@ func ModelInference(s string, loader *model.ModelLoader, c config.Config, o *opt
var inferenceModel *grpc.Client
var err error
opts := []model.Option{
model.WithLoadGRPCLLMModelOpts(grpcOpts),
opts := modelOpts(c, o, []model.Option{
model.WithLoadGRPCLoadModelOpts(grpcOpts),
model.WithThreads(uint32(c.Threads)), // some models uses this to allocate threads during startup
model.WithAssetDir(o.AssetsDestination),
model.WithModelFile(modelFile),
model.WithModel(modelFile),
model.WithContext(o.Context),
})
if c.Backend != "" {
opts = append(opts, model.WithBackendString(c.Backend))
}
// Check if the modelFile exists, if it doesn't try to load it from the gallery
if o.AutoloadGalleries { // experimental
if _, err := os.Stat(modelFile); os.IsNotExist(err) {
utils.ResetDownloadTimers()
// if we failed to load the model, we try to download it
err := gallery.InstallModelFromGalleryByName(o.Galleries, modelFile, loader.ModelPath, gallery.GalleryModel{}, utils.DisplayDownloadFunction)
if err != nil {
return nil, err
}
}
}
if c.Backend == "" {
inferenceModel, err = loader.GreedyLoader(opts...)
} else {
opts = append(opts, model.WithBackendString(c.Backend))
inferenceModel, err = loader.BackendLoader(opts...)
}
if err != nil {
return nil, err
}
// in GRPC, the backend is supposed to answer to 1 single token if stream is not supported
fn := func() (string, error) {
fn := func() (LLMResponse, error) {
opts := gRPCPredictOpts(c, loader.ModelPath)
opts.Prompt = s
tokenUsage := TokenUsage{}
// check the per-model feature flag for usage, since tokenCallback may have a cost.
// Defaults to off as for now it is still experimental
if c.FeatureFlag.Enabled("usage") {
userTokenCallback := tokenCallback
if userTokenCallback == nil {
userTokenCallback = func(token string, usage TokenUsage) bool {
return true
}
}
promptInfo, pErr := inferenceModel.TokenizeString(ctx, opts)
if pErr == nil && promptInfo.Length > 0 {
tokenUsage.Prompt = int(promptInfo.Length)
}
tokenCallback = func(token string, usage TokenUsage) bool {
tokenUsage.Completion++
return userTokenCallback(token, tokenUsage)
}
}
if tokenCallback != nil {
ss := ""
err := inferenceModel.PredictStream(o.Context, opts, func(s string) {
tokenCallback(s)
ss += s
var partialRune []byte
err := inferenceModel.PredictStream(ctx, opts, func(chars []byte) {
partialRune = append(partialRune, chars...)
for len(partialRune) > 0 {
r, size := utf8.DecodeRune(partialRune)
if r == utf8.RuneError {
// incomplete rune, wait for more bytes
break
}
tokenCallback(string(r), tokenUsage)
ss += string(r)
partialRune = partialRune[size:]
}
})
return ss, err
return LLMResponse{
Response: ss,
Usage: tokenUsage,
}, err
} else {
reply, err := inferenceModel.Predict(o.Context, opts)
return reply.Message, err
// TODO: Is the chicken bit the only way to get here? is that acceptable?
reply, err := inferenceModel.Predict(ctx, opts)
if err != nil {
return LLMResponse{}, err
}
return LLMResponse{
Response: string(reply.Message),
Usage: tokenUsage,
}, err
}
}
return func() (string, error) {
// This is still needed, see: https://github.com/ggerganov/llama.cpp/discussions/784
mutexMap.Lock()
l, ok := mutexes[modelFile]
if !ok {
m := &sync.Mutex{}
mutexes[modelFile] = m
l = m
}
mutexMap.Unlock()
l.Lock()
defer l.Unlock()
return fn()
}, nil
return fn, nil
}
var cutstrings map[string]*regexp.Regexp = make(map[string]*regexp.Regexp)

View File

@@ -1,22 +0,0 @@
package backend
import "sync"
// mutex still needed, see: https://github.com/ggerganov/llama.cpp/discussions/784
var mutexMap sync.Mutex
var mutexes map[string]*sync.Mutex = make(map[string]*sync.Mutex)
func Lock(s string) *sync.Mutex {
// This is still needed, see: https://github.com/ggerganov/llama.cpp/discussions/784
mutexMap.Lock()
l, ok := mutexes[s]
if !ok {
m := &sync.Mutex{}
mutexes[s] = m
l = m
}
mutexMap.Unlock()
l.Lock()
return l
}

View File

@@ -5,29 +5,69 @@ import (
"path/filepath"
pb "github.com/go-skynet/LocalAI/pkg/grpc/proto"
model "github.com/go-skynet/LocalAI/pkg/model"
config "github.com/go-skynet/LocalAI/api/config"
"github.com/go-skynet/LocalAI/api/options"
)
func modelOpts(c config.Config, o *options.Option, opts []model.Option) []model.Option {
if o.SingleBackend {
opts = append(opts, model.WithSingleActiveBackend())
}
if c.GRPC.Attempts != 0 {
opts = append(opts, model.WithGRPCAttempts(c.GRPC.Attempts))
}
if c.GRPC.AttemptsSleepTime != 0 {
opts = append(opts, model.WithGRPCAttemptsDelay(c.GRPC.AttemptsSleepTime))
}
for k, v := range o.ExternalGRPCBackends {
opts = append(opts, model.WithExternalBackend(k, v))
}
return opts
}
func gRPCModelOpts(c config.Config) *pb.ModelOptions {
b := 512
if c.Batch != 0 {
b = c.Batch
}
return &pb.ModelOptions{
ContextSize: int32(c.ContextSize),
Seed: int32(c.Seed),
NBatch: int32(b),
F16Memory: c.F16,
MLock: c.MMlock,
NUMA: c.NUMA,
Embeddings: c.Embeddings,
LowVRAM: c.LowVRAM,
NGPULayers: int32(c.NGPULayers),
MMap: c.MMap,
MainGPU: c.MainGPU,
Threads: int32(c.Threads),
TensorSplit: c.TensorSplit,
ContextSize: int32(c.ContextSize),
Seed: int32(c.Seed),
NBatch: int32(b),
NoMulMatQ: c.NoMulMatQ,
DraftModel: c.DraftModel,
AudioPath: c.VallE.AudioPath,
Quantization: c.Quantization,
LoraAdapter: c.LoraAdapter,
LoraBase: c.LoraBase,
NGQA: c.NGQA,
RMSNormEps: c.RMSNormEps,
F16Memory: c.F16,
MLock: c.MMlock,
RopeFreqBase: c.RopeFreqBase,
RopeFreqScale: c.RopeFreqScale,
NUMA: c.NUMA,
Embeddings: c.Embeddings,
LowVRAM: c.LowVRAM,
NGPULayers: int32(c.NGPULayers),
MMap: c.MMap,
MainGPU: c.MainGPU,
Threads: int32(c.Threads),
TensorSplit: c.TensorSplit,
// AutoGPTQ
ModelBaseName: c.AutoGPTQ.ModelBaseName,
Device: c.AutoGPTQ.Device,
UseTriton: c.AutoGPTQ.Triton,
UseFastTokenizer: c.AutoGPTQ.UseFastTokenizer,
// RWKV
Tokenizer: c.Tokenizer,
}
}
@@ -39,34 +79,38 @@ func gRPCPredictOpts(c config.Config, modelPath string) *pb.PredictOptions {
promptCachePath = p
}
return &pb.PredictOptions{
Temperature: float32(c.Temperature),
TopP: float32(c.TopP),
TopK: int32(c.TopK),
Tokens: int32(c.Maxtokens),
Threads: int32(c.Threads),
PromptCacheAll: c.PromptCacheAll,
PromptCacheRO: c.PromptCacheRO,
PromptCachePath: promptCachePath,
F16KV: c.F16,
DebugMode: c.Debug,
Grammar: c.Grammar,
Mirostat: int32(c.Mirostat),
MirostatETA: float32(c.MirostatETA),
MirostatTAU: float32(c.MirostatTAU),
Debug: c.Debug,
StopPrompts: c.StopWords,
Repeat: int32(c.RepeatPenalty),
NKeep: int32(c.Keep),
Batch: int32(c.Batch),
IgnoreEOS: c.IgnoreEOS,
Seed: int32(c.Seed),
FrequencyPenalty: float32(c.FrequencyPenalty),
MLock: c.MMlock,
MMap: c.MMap,
MainGPU: c.MainGPU,
TensorSplit: c.TensorSplit,
TailFreeSamplingZ: float32(c.TFZ),
TypicalP: float32(c.TypicalP),
Temperature: float32(c.Temperature),
TopP: float32(c.TopP),
NDraft: c.NDraft,
TopK: int32(c.TopK),
Tokens: int32(c.Maxtokens),
Threads: int32(c.Threads),
PromptCacheAll: c.PromptCacheAll,
PromptCacheRO: c.PromptCacheRO,
PromptCachePath: promptCachePath,
F16KV: c.F16,
DebugMode: c.Debug,
Grammar: c.Grammar,
NegativePromptScale: c.NegativePromptScale,
RopeFreqBase: c.RopeFreqBase,
RopeFreqScale: c.RopeFreqScale,
NegativePrompt: c.NegativePrompt,
Mirostat: int32(c.LLMConfig.Mirostat),
MirostatETA: float32(c.LLMConfig.MirostatETA),
MirostatTAU: float32(c.LLMConfig.MirostatTAU),
Debug: c.Debug,
StopPrompts: c.StopWords,
Repeat: int32(c.RepeatPenalty),
NKeep: int32(c.Keep),
Batch: int32(c.Batch),
IgnoreEOS: c.IgnoreEOS,
Seed: int32(c.Seed),
FrequencyPenalty: float32(c.FrequencyPenalty),
MLock: c.MMlock,
MMap: c.MMap,
MainGPU: c.MainGPU,
TensorSplit: c.TensorSplit,
TailFreeSamplingZ: float32(c.TFZ),
TypicalP: float32(c.TypicalP),
}
}

39
api/backend/transcript.go Normal file
View File

@@ -0,0 +1,39 @@
package backend
import (
"context"
"fmt"
config "github.com/go-skynet/LocalAI/api/config"
"github.com/go-skynet/LocalAI/api/schema"
"github.com/go-skynet/LocalAI/api/options"
"github.com/go-skynet/LocalAI/pkg/grpc/proto"
model "github.com/go-skynet/LocalAI/pkg/model"
)
func ModelTranscription(audio, language string, loader *model.ModelLoader, c config.Config, o *options.Option) (*schema.Result, error) {
opts := modelOpts(c, o, []model.Option{
model.WithBackendString(model.WhisperBackend),
model.WithModel(c.Model),
model.WithContext(o.Context),
model.WithThreads(uint32(c.Threads)),
model.WithAssetDir(o.AssetsDestination),
})
whisperModel, err := o.Loader.BackendLoader(opts...)
if err != nil {
return nil, err
}
if whisperModel == nil {
return nil, fmt.Errorf("could not load whisper model")
}
return whisperModel.AudioTranscription(context.Background(), &proto.TranscriptRequest{
Dst: audio,
Language: language,
Threads: uint32(c.Threads),
})
}

75
api/backend/tts.go Normal file
View File

@@ -0,0 +1,75 @@
package backend
import (
"context"
"fmt"
"os"
"path/filepath"
api_config "github.com/go-skynet/LocalAI/api/config"
"github.com/go-skynet/LocalAI/api/options"
"github.com/go-skynet/LocalAI/pkg/grpc/proto"
model "github.com/go-skynet/LocalAI/pkg/model"
"github.com/go-skynet/LocalAI/pkg/utils"
)
func generateUniqueFileName(dir, baseName, ext string) string {
counter := 1
fileName := baseName + ext
for {
filePath := filepath.Join(dir, fileName)
_, err := os.Stat(filePath)
if os.IsNotExist(err) {
return fileName
}
counter++
fileName = fmt.Sprintf("%s_%d%s", baseName, counter, ext)
}
}
func ModelTTS(backend, text, modelFile string, loader *model.ModelLoader, o *options.Option) (string, *proto.Result, error) {
bb := backend
if bb == "" {
bb = model.PiperBackend
}
opts := modelOpts(api_config.Config{}, o, []model.Option{
model.WithBackendString(bb),
model.WithModel(modelFile),
model.WithContext(o.Context),
model.WithAssetDir(o.AssetsDestination),
})
piperModel, err := o.Loader.BackendLoader(opts...)
if err != nil {
return "", nil, err
}
if piperModel == nil {
return "", nil, fmt.Errorf("could not load piper model")
}
if err := os.MkdirAll(o.AudioDir, 0755); err != nil {
return "", nil, fmt.Errorf("failed creating audio directory: %s", err)
}
fileName := generateUniqueFileName(o.AudioDir, "piper", ".wav")
filePath := filepath.Join(o.AudioDir, fileName)
// If the model file is not empty, we pass it joined with the model path
modelPath := ""
if modelFile != "" {
modelPath = filepath.Join(o.Loader.ModelPath, modelFile)
if err := utils.VerifyPath(modelPath, o.Loader.ModelPath); err != nil {
return "", nil, err
}
}
res, err := piperModel.TTS(context.Background(), &proto.TTSRequest{
Text: text,
Model: modelPath,
Dst: filePath,
})
return filePath, res, err
}

View File

@@ -13,42 +13,104 @@ import (
type Config struct {
PredictionOptions `yaml:"parameters"`
Name string `yaml:"name"`
StopWords []string `yaml:"stopwords"`
Cutstrings []string `yaml:"cutstrings"`
TrimSpace []string `yaml:"trimspace"`
ContextSize int `yaml:"context_size"`
F16 bool `yaml:"f16"`
NUMA bool `yaml:"numa"`
Threads int `yaml:"threads"`
Debug bool `yaml:"debug"`
Roles map[string]string `yaml:"roles"`
Embeddings bool `yaml:"embeddings"`
Backend string `yaml:"backend"`
TemplateConfig TemplateConfig `yaml:"template"`
MirostatETA float64 `yaml:"mirostat_eta"`
MirostatTAU float64 `yaml:"mirostat_tau"`
Mirostat int `yaml:"mirostat"`
NGPULayers int `yaml:"gpu_layers"`
MMap bool `yaml:"mmap"`
MMlock bool `yaml:"mmlock"`
LowVRAM bool `yaml:"low_vram"`
Name string `yaml:"name"`
TensorSplit string `yaml:"tensor_split"`
MainGPU string `yaml:"main_gpu"`
ImageGenerationAssets string `yaml:"asset_dir"`
F16 bool `yaml:"f16"`
Threads int `yaml:"threads"`
Debug bool `yaml:"debug"`
Roles map[string]string `yaml:"roles"`
Embeddings bool `yaml:"embeddings"`
Backend string `yaml:"backend"`
TemplateConfig TemplateConfig `yaml:"template"`
PromptCachePath string `yaml:"prompt_cache_path"`
PromptCacheAll bool `yaml:"prompt_cache_all"`
PromptCacheRO bool `yaml:"prompt_cache_ro"`
Grammar string `yaml:"grammar"`
PromptStrings, InputStrings []string
InputToken [][]int
functionCallString, functionCallNameString string
PromptStrings, InputStrings []string `yaml:"-"`
InputToken [][]int `yaml:"-"`
functionCallString, functionCallNameString string `yaml:"-"`
FunctionsConfig Functions `yaml:"function"`
FeatureFlag FeatureFlag `yaml:"feature_flags"` // Feature Flag registry. We move fast, and features may break on a per model/backend basis. Registry for (usually temporary) flags that indicate aborting something early.
// LLM configs (GPT4ALL, Llama.cpp, ...)
LLMConfig `yaml:",inline"`
// AutoGPTQ specifics
AutoGPTQ AutoGPTQ `yaml:"autogptq"`
// Diffusers
Diffusers Diffusers `yaml:"diffusers"`
Step int `yaml:"step"`
// GRPC Options
GRPC GRPC `yaml:"grpc"`
// Vall-e-x
VallE VallE `yaml:"vall-e"`
}
type VallE struct {
AudioPath string `yaml:"audio_path"`
}
type FeatureFlag map[string]*bool
func (ff FeatureFlag) Enabled(s string) bool {
v, exist := ff[s]
return exist && v != nil && *v
}
type GRPC struct {
Attempts int `yaml:"attempts"`
AttemptsSleepTime int `yaml:"attempts_sleep_time"`
}
type Diffusers struct {
PipelineType string `yaml:"pipeline_type"`
SchedulerType string `yaml:"scheduler_type"`
CUDA bool `yaml:"cuda"`
EnableParameters string `yaml:"enable_parameters"` // A list of comma separated parameters to specify
CFGScale float32 `yaml:"cfg_scale"` // Classifier-Free Guidance Scale
IMG2IMG bool `yaml:"img2img"` // Image to Image Diffuser
ClipSkip int `yaml:"clip_skip"` // Skip every N frames
ClipModel string `yaml:"clip_model"` // Clip model to use
ClipSubFolder string `yaml:"clip_subfolder"` // Subfolder to use for clip model
}
type LLMConfig struct {
SystemPrompt string `yaml:"system_prompt"`
TensorSplit string `yaml:"tensor_split"`
MainGPU string `yaml:"main_gpu"`
RMSNormEps float32 `yaml:"rms_norm_eps"`
NGQA int32 `yaml:"ngqa"`
PromptCachePath string `yaml:"prompt_cache_path"`
PromptCacheAll bool `yaml:"prompt_cache_all"`
PromptCacheRO bool `yaml:"prompt_cache_ro"`
MirostatETA float64 `yaml:"mirostat_eta"`
MirostatTAU float64 `yaml:"mirostat_tau"`
Mirostat int `yaml:"mirostat"`
NGPULayers int `yaml:"gpu_layers"`
MMap bool `yaml:"mmap"`
MMlock bool `yaml:"mmlock"`
LowVRAM bool `yaml:"low_vram"`
Grammar string `yaml:"grammar"`
StopWords []string `yaml:"stopwords"`
Cutstrings []string `yaml:"cutstrings"`
TrimSpace []string `yaml:"trimspace"`
ContextSize int `yaml:"context_size"`
NUMA bool `yaml:"numa"`
LoraAdapter string `yaml:"lora_adapter"`
LoraBase string `yaml:"lora_base"`
NoMulMatQ bool `yaml:"no_mulmatq"`
DraftModel string `yaml:"draft_model"`
NDraft int32 `yaml:"n_draft"`
Quantization string `yaml:"quantization"`
}
type AutoGPTQ struct {
ModelBaseName string `yaml:"model_base_name"`
Device string `yaml:"device"`
Triton bool `yaml:"triton"`
UseFastTokenizer bool `yaml:"use_fast_tokenizer"`
}
type Functions struct {
@@ -58,10 +120,11 @@ type Functions struct {
}
type TemplateConfig struct {
Completion string `yaml:"completion"`
Functions string `yaml:"function"`
Chat string `yaml:"chat"`
Edit string `yaml:"edit"`
Chat string `yaml:"chat"`
ChatMessage string `yaml:"chat_message"`
Completion string `yaml:"completion"`
Edit string `yaml:"edit"`
Functions string `yaml:"function"`
}
type ConfigLoader struct {
@@ -169,6 +232,16 @@ func (cm *ConfigLoader) GetConfig(m string) (Config, bool) {
return v, exists
}
func (cm *ConfigLoader) GetAllConfigs() []Config {
cm.Lock()
defer cm.Unlock()
var res []Config
for _, v := range cm.configs {
res = append(res, v)
}
return res
}
func (cm *ConfigLoader) ListConfigs() []string {
cm.Lock()
defer cm.Unlock()

View File

@@ -34,4 +34,17 @@ type PredictionOptions struct {
TypicalP float64 `json:"typical_p" yaml:"typical_p"`
Seed int `json:"seed" yaml:"seed"`
NegativePrompt string `json:"negative_prompt" yaml:"negative_prompt"`
RopeFreqBase float32 `json:"rope_freq_base" yaml:"rope_freq_base"`
RopeFreqScale float32 `json:"rope_freq_scale" yaml:"rope_freq_scale"`
NegativePromptScale float32 `json:"negative_prompt_scale" yaml:"negative_prompt_scale"`
// AutoGPTQ
UseFastTokenizer bool `json:"use_fast_tokenizer" yaml:"use_fast_tokenizer"`
// Diffusers
ClipSkip int `json:"clip_skip" yaml:"clip_skip"`
// RWKV (?)
Tokenizer string `json:"tokenizer" yaml:"tokenizer"`
}

View File

@@ -0,0 +1,163 @@
package localai
import (
"context"
"fmt"
"strings"
config "github.com/go-skynet/LocalAI/api/config"
"github.com/go-skynet/LocalAI/pkg/grpc/proto"
"github.com/go-skynet/LocalAI/api/options"
"github.com/gofiber/fiber/v2"
"github.com/rs/zerolog/log"
gopsutil "github.com/shirou/gopsutil/v3/process"
)
type BackendMonitorRequest struct {
Model string `json:"model" yaml:"model"`
}
type BackendMonitorResponse struct {
MemoryInfo *gopsutil.MemoryInfoStat
MemoryPercent float32
CPUPercent float64
}
type BackendMonitor struct {
configLoader *config.ConfigLoader
options *options.Option // Taking options in case we need to inspect ExternalGRPCBackends, though that's out of scope for now, hence the name.
}
func NewBackendMonitor(configLoader *config.ConfigLoader, options *options.Option) BackendMonitor {
return BackendMonitor{
configLoader: configLoader,
options: options,
}
}
func (bm *BackendMonitor) SampleLocalBackendProcess(model string) (*BackendMonitorResponse, error) {
config, exists := bm.configLoader.GetConfig(model)
var backend string
if exists {
backend = config.Model
} else {
// Last ditch effort: use it raw, see if a backend happens to match.
backend = model
}
if !strings.HasSuffix(backend, ".bin") {
backend = fmt.Sprintf("%s.bin", backend)
}
pid, err := bm.options.Loader.GetGRPCPID(backend)
if err != nil {
log.Error().Msgf("model %s : failed to find pid %+v", model, err)
return nil, err
}
// Name is slightly frightening but this does _not_ create a new process, rather it looks up an existing process by PID.
backendProcess, err := gopsutil.NewProcess(int32(pid))
if err != nil {
log.Error().Msgf("model %s [PID %d] : error getting process info %+v", model, pid, err)
return nil, err
}
memInfo, err := backendProcess.MemoryInfo()
if err != nil {
log.Error().Msgf("model %s [PID %d] : error getting memory info %+v", model, pid, err)
return nil, err
}
memPercent, err := backendProcess.MemoryPercent()
if err != nil {
log.Error().Msgf("model %s [PID %d] : error getting memory percent %+v", model, pid, err)
return nil, err
}
cpuPercent, err := backendProcess.CPUPercent()
if err != nil {
log.Error().Msgf("model %s [PID %d] : error getting cpu percent %+v", model, pid, err)
return nil, err
}
return &BackendMonitorResponse{
MemoryInfo: memInfo,
MemoryPercent: memPercent,
CPUPercent: cpuPercent,
}, nil
}
func (bm BackendMonitor) getModelLoaderIDFromCtx(c *fiber.Ctx) (string, error) {
input := new(BackendMonitorRequest)
// Get input data from the request body
if err := c.BodyParser(input); err != nil {
return "", err
}
config, exists := bm.configLoader.GetConfig(input.Model)
var backendId string
if exists {
backendId = config.Model
} else {
// Last ditch effort: use it raw, see if a backend happens to match.
backendId = input.Model
}
if !strings.HasSuffix(backendId, ".bin") {
backendId = fmt.Sprintf("%s.bin", backendId)
}
return backendId, nil
}
func BackendMonitorEndpoint(bm BackendMonitor) func(c *fiber.Ctx) error {
return func(c *fiber.Ctx) error {
backendId, err := bm.getModelLoaderIDFromCtx(c)
if err != nil {
return err
}
client := bm.options.Loader.CheckIsLoaded(backendId)
if client == nil {
return fmt.Errorf("backend %s is not currently loaded", backendId)
}
status, rpcErr := client.Status(context.TODO())
if rpcErr != nil {
log.Warn().Msgf("backend %s experienced an error retrieving status info: %s", backendId, rpcErr.Error())
val, slbErr := bm.SampleLocalBackendProcess(backendId)
if slbErr != nil {
return fmt.Errorf("backend %s experienced an error retrieving status info via rpc: %s, then failed local node process sample: %s", backendId, rpcErr.Error(), slbErr.Error())
}
return c.JSON(proto.StatusResponse{
State: proto.StatusResponse_ERROR,
Memory: &proto.MemoryUsageData{
Total: val.MemoryInfo.VMS,
Breakdown: map[string]uint64{
"gopsutil-RSS": val.MemoryInfo.RSS,
},
},
})
}
return c.JSON(status)
}
}
func BackendShutdownEndpoint(bm BackendMonitor) func(c *fiber.Ctx) error {
return func(c *fiber.Ctx) error {
backendId, err := bm.getModelLoaderIDFromCtx(c)
if err != nil {
return err
}
return bm.options.Loader.ShutdownModel(backendId)
}
}

View File

@@ -4,13 +4,17 @@ import (
"context"
"fmt"
"os"
"slices"
"strings"
"sync"
"time"
json "github.com/json-iterator/go"
"gopkg.in/yaml.v3"
config "github.com/go-skynet/LocalAI/api/config"
"github.com/go-skynet/LocalAI/pkg/gallery"
"github.com/go-skynet/LocalAI/pkg/utils"
"github.com/gofiber/fiber/v2"
"github.com/google/uuid"
"github.com/rs/zerolog/log"
@@ -24,6 +28,7 @@ type galleryOp struct {
}
type galleryOpStatus struct {
FileName string `json:"file_name"`
Error error `json:"error"`
Processed bool `json:"processed"`
Message string `json:"message"`
@@ -47,7 +52,6 @@ func NewGalleryService(modelPath string) *galleryApplier {
}
}
// prepareModel applies a
func prepareModel(modelPath string, req gallery.GalleryModel, cm *config.ConfigLoader, downloadStatus func(string, string, string, float64)) error {
config, err := gallery.GetGalleryConfigFromURL(req.URL)
@@ -73,6 +77,13 @@ func (g *galleryApplier) getStatus(s string) *galleryOpStatus {
return g.statuses[s]
}
func (g *galleryApplier) getAllStatus() map[string]*galleryOpStatus {
g.Lock()
defer g.Unlock()
return g.statuses
}
func (g *galleryApplier) Start(c context.Context, cm *config.ConfigLoader) {
go func() {
for {
@@ -80,6 +91,8 @@ func (g *galleryApplier) Start(c context.Context, cm *config.ConfigLoader) {
case <-c.Done():
return
case op := <-g.C:
utils.ResetDownloadTimers()
g.updateStatus(op.id, &galleryOpStatus{Message: "processing", Progress: 0})
// updates the status with an error
@@ -89,14 +102,18 @@ func (g *galleryApplier) Start(c context.Context, cm *config.ConfigLoader) {
// displayDownload displays the download progress
progressCallback := func(fileName string, current string, total string, percentage float64) {
g.updateStatus(op.id, &galleryOpStatus{Message: "processing", Progress: percentage, TotalFileSize: total, DownloadedFileSize: current})
displayDownload(fileName, current, total, percentage)
g.updateStatus(op.id, &galleryOpStatus{Message: "processing", FileName: fileName, Progress: percentage, TotalFileSize: total, DownloadedFileSize: current})
utils.DisplayDownloadFunction(fileName, current, total, percentage)
}
var err error
// if the request contains a gallery name, we apply the gallery from the gallery list
if op.galleryName != "" {
err = gallery.InstallModelFromGallery(op.galleries, op.galleryName, g.modelPath, op.req, progressCallback)
if strings.Contains(op.galleryName, "@") {
err = gallery.InstallModelFromGallery(op.galleries, op.galleryName, g.modelPath, op.req, progressCallback)
} else {
err = gallery.InstallModelFromGalleryByName(op.galleries, op.galleryName, g.modelPath, op.req, progressCallback)
}
} else {
err = prepareModel(g.modelPath, op.req, cm, progressCallback)
}
@@ -119,34 +136,28 @@ func (g *galleryApplier) Start(c context.Context, cm *config.ConfigLoader) {
}()
}
var lastProgress time.Time = time.Now()
var startTime time.Time = time.Now()
func displayDownload(fileName string, current string, total string, percentage float64) {
currentTime := time.Now()
if currentTime.Sub(lastProgress) >= 5*time.Second {
lastProgress = currentTime
// calculate ETA based on percentage and elapsed time
var eta time.Duration
if percentage > 0 {
elapsed := currentTime.Sub(startTime)
eta = time.Duration(float64(elapsed)*(100/percentage) - float64(elapsed))
}
if total != "" {
log.Debug().Msgf("Downloading %s: %s/%s (%.2f%%) ETA: %s", fileName, current, total, percentage, eta)
} else {
log.Debug().Msgf("Downloading: %s", current)
}
}
type galleryModel struct {
gallery.GalleryModel `yaml:",inline"` // https://github.com/go-yaml/yaml/issues/63
ID string `json:"id"`
}
type galleryModel struct {
gallery.GalleryModel
ID string `json:"id"`
func processRequests(modelPath, s string, cm *config.ConfigLoader, galleries []gallery.Gallery, requests []galleryModel) error {
var err error
for _, r := range requests {
utils.ResetDownloadTimers()
if r.ID == "" {
err = prepareModel(modelPath, r.GalleryModel, cm, utils.DisplayDownloadFunction)
} else {
if strings.Contains(r.ID, "@") {
err = gallery.InstallModelFromGallery(
galleries, r.ID, modelPath, r.GalleryModel, utils.DisplayDownloadFunction)
} else {
err = gallery.InstallModelFromGalleryByName(
galleries, r.ID, modelPath, r.GalleryModel, utils.DisplayDownloadFunction)
}
}
}
return err
}
func ApplyGalleryFromFile(modelPath, s string, cm *config.ConfigLoader, galleries []gallery.Gallery) error {
@@ -154,7 +165,13 @@ func ApplyGalleryFromFile(modelPath, s string, cm *config.ConfigLoader, gallerie
if err != nil {
return err
}
return ApplyGalleryFromString(modelPath, string(dat), cm, galleries)
var requests []galleryModel
if err := yaml.Unmarshal(dat, &requests); err != nil {
return err
}
return processRequests(modelPath, s, cm, galleries, requests)
}
func ApplyGalleryFromString(modelPath, s string, cm *config.ConfigLoader, galleries []gallery.Gallery) error {
@@ -164,29 +181,15 @@ func ApplyGalleryFromString(modelPath, s string, cm *config.ConfigLoader, galler
return err
}
for _, r := range requests {
if r.ID == "" {
err = prepareModel(modelPath, r.GalleryModel, cm, displayDownload)
} else {
err = gallery.InstallModelFromGallery(galleries, r.ID, modelPath, r.GalleryModel, displayDownload)
}
}
return err
return processRequests(modelPath, s, cm, galleries, requests)
}
/// Endpoints
/// Endpoint Service
func GetOpStatusEndpoint(g *galleryApplier) func(c *fiber.Ctx) error {
return func(c *fiber.Ctx) error {
status := g.getStatus(c.Params("uuid"))
if status == nil {
return fmt.Errorf("could not find any status for ID")
}
return c.JSON(status)
}
type ModelGalleryService struct {
galleries []gallery.Gallery
modelPath string
galleryApplier *galleryApplier
}
type GalleryModel struct {
@@ -194,7 +197,31 @@ type GalleryModel struct {
gallery.GalleryModel
}
func ApplyModelGalleryEndpoint(modelPath string, cm *config.ConfigLoader, g chan galleryOp, galleries []gallery.Gallery) func(c *fiber.Ctx) error {
func CreateModelGalleryService(galleries []gallery.Gallery, modelPath string, galleryApplier *galleryApplier) ModelGalleryService {
return ModelGalleryService{
galleries: galleries,
modelPath: modelPath,
galleryApplier: galleryApplier,
}
}
func (mgs *ModelGalleryService) GetOpStatusEndpoint() func(c *fiber.Ctx) error {
return func(c *fiber.Ctx) error {
status := mgs.galleryApplier.getStatus(c.Params("uuid"))
if status == nil {
return fmt.Errorf("could not find any status for ID")
}
return c.JSON(status)
}
}
func (mgs *ModelGalleryService) GetAllStatusEndpoint() func(c *fiber.Ctx) error {
return func(c *fiber.Ctx) error {
return c.JSON(mgs.galleryApplier.getAllStatus())
}
}
func (mgs *ModelGalleryService) ApplyModelGalleryEndpoint() func(c *fiber.Ctx) error {
return func(c *fiber.Ctx) error {
input := new(GalleryModel)
// Get input data from the request body
@@ -206,11 +233,11 @@ func ApplyModelGalleryEndpoint(modelPath string, cm *config.ConfigLoader, g chan
if err != nil {
return err
}
g <- galleryOp{
mgs.galleryApplier.C <- galleryOp{
req: input.GalleryModel,
id: uuid.String(),
galleryName: input.ID,
galleries: galleries,
galleries: mgs.galleries,
}
return c.JSON(struct {
ID string `json:"uuid"`
@@ -219,11 +246,11 @@ func ApplyModelGalleryEndpoint(modelPath string, cm *config.ConfigLoader, g chan
}
}
func ListModelFromGalleryEndpoint(galleries []gallery.Gallery, basePath string) func(c *fiber.Ctx) error {
func (mgs *ModelGalleryService) ListModelFromGalleryEndpoint() func(c *fiber.Ctx) error {
return func(c *fiber.Ctx) error {
log.Debug().Msgf("Listing models from galleries: %+v", galleries)
log.Debug().Msgf("Listing models from galleries: %+v", mgs.galleries)
models, err := gallery.AvailableGalleryModels(galleries, basePath)
models, err := gallery.AvailableGalleryModels(mgs.galleries, mgs.modelPath)
if err != nil {
return err
}
@@ -238,3 +265,56 @@ func ListModelFromGalleryEndpoint(galleries []gallery.Gallery, basePath string)
return c.Send(dat)
}
}
// NOTE: This is different (and much simpler!) than above! This JUST lists the model galleries that have been loaded, not their contents!
func (mgs *ModelGalleryService) ListModelGalleriesEndpoint() func(c *fiber.Ctx) error {
return func(c *fiber.Ctx) error {
log.Debug().Msgf("Listing model galleries %+v", mgs.galleries)
dat, err := json.Marshal(mgs.galleries)
if err != nil {
return err
}
return c.Send(dat)
}
}
func (mgs *ModelGalleryService) AddModelGalleryEndpoint() func(c *fiber.Ctx) error {
return func(c *fiber.Ctx) error {
input := new(gallery.Gallery)
// Get input data from the request body
if err := c.BodyParser(input); err != nil {
return err
}
if slices.ContainsFunc(mgs.galleries, func(gallery gallery.Gallery) bool {
return gallery.Name == input.Name
}) {
return fmt.Errorf("%s already exists", input.Name)
}
dat, err := json.Marshal(mgs.galleries)
if err != nil {
return err
}
log.Debug().Msgf("Adding %+v to gallery list", *input)
mgs.galleries = append(mgs.galleries, *input)
return c.Send(dat)
}
}
func (mgs *ModelGalleryService) RemoveModelGalleryEndpoint() func(c *fiber.Ctx) error {
return func(c *fiber.Ctx) error {
input := new(gallery.Gallery)
// Get input data from the request body
if err := c.BodyParser(input); err != nil {
return err
}
if !slices.ContainsFunc(mgs.galleries, func(gallery gallery.Gallery) bool {
return gallery.Name == input.Name
}) {
return fmt.Errorf("%s is not currently registered", input.Name)
}
mgs.galleries = slices.DeleteFunc(mgs.galleries, func(gallery gallery.Gallery) bool {
return gallery.Name == input.Name
})
return c.Send(nil)
}
}

View File

@@ -1,39 +1,17 @@
package localai
import (
"context"
"fmt"
"os"
"path/filepath"
"github.com/go-skynet/LocalAI/api/backend"
config "github.com/go-skynet/LocalAI/api/config"
"github.com/go-skynet/LocalAI/api/options"
"github.com/go-skynet/LocalAI/pkg/grpc/proto"
model "github.com/go-skynet/LocalAI/pkg/model"
"github.com/go-skynet/LocalAI/pkg/utils"
"github.com/gofiber/fiber/v2"
)
type TTSRequest struct {
Model string `json:"model" yaml:"model"`
Input string `json:"input" yaml:"input"`
}
func generateUniqueFileName(dir, baseName, ext string) string {
counter := 1
fileName := baseName + ext
for {
filePath := filepath.Join(dir, fileName)
_, err := os.Stat(filePath)
if os.IsNotExist(err) {
return fileName
}
counter++
fileName = fmt.Sprintf("%s_%d%s", baseName, counter, ext)
}
Model string `json:"model" yaml:"model"`
Input string `json:"input" yaml:"input"`
Backend string `json:"backend" yaml:"backend"`
}
func TTSEndpoint(cm *config.ConfigLoader, o *options.Option) func(c *fiber.Ctx) error {
@@ -45,40 +23,10 @@ func TTSEndpoint(cm *config.ConfigLoader, o *options.Option) func(c *fiber.Ctx)
return err
}
piperModel, err := o.Loader.BackendLoader(
model.WithBackendString(model.PiperBackend),
model.WithModelFile(input.Model),
model.WithContext(o.Context),
model.WithAssetDir(o.AssetsDestination))
filePath, _, err := backend.ModelTTS(input.Backend, input.Input, input.Model, o.Loader, o)
if err != nil {
return err
}
if piperModel == nil {
return fmt.Errorf("could not load piper model")
}
if err := os.MkdirAll(o.AudioDir, 0755); err != nil {
return fmt.Errorf("failed creating audio directory: %s", err)
}
fileName := generateUniqueFileName(o.AudioDir, "piper", ".wav")
filePath := filepath.Join(o.AudioDir, fileName)
modelPath := filepath.Join(o.Loader.ModelPath, input.Model)
if err := utils.VerifyPath(modelPath, o.Loader.ModelPath); err != nil {
return err
}
if _, err := piperModel.TTS(context.Background(), &proto.TTSRequest{
Text: input.Input,
Model: modelPath,
Dst: filePath,
}); err != nil {
return err
}
return c.Download(filePath)
}
}

View File

@@ -10,8 +10,10 @@ import (
"github.com/go-skynet/LocalAI/api/backend"
config "github.com/go-skynet/LocalAI/api/config"
"github.com/go-skynet/LocalAI/api/options"
"github.com/go-skynet/LocalAI/api/schema"
"github.com/go-skynet/LocalAI/pkg/grammar"
model "github.com/go-skynet/LocalAI/pkg/model"
"github.com/go-skynet/LocalAI/pkg/utils"
"github.com/gofiber/fiber/v2"
"github.com/rs/zerolog/log"
"github.com/valyala/fasthttp"
@@ -20,19 +22,24 @@ import (
func ChatEndpoint(cm *config.ConfigLoader, o *options.Option) func(c *fiber.Ctx) error {
emptyMessage := ""
process := func(s string, req *OpenAIRequest, config *config.Config, loader *model.ModelLoader, responses chan OpenAIResponse) {
initialMessage := OpenAIResponse{
process := func(s string, req *schema.OpenAIRequest, config *config.Config, loader *model.ModelLoader, responses chan schema.OpenAIResponse) {
initialMessage := schema.OpenAIResponse{
Model: req.Model, // we have to return what the user sent here, due to OpenAI spec.
Choices: []Choice{{Delta: &Message{Role: "assistant", Content: &emptyMessage}}},
Choices: []schema.Choice{{Delta: &schema.Message{Role: "assistant", Content: &emptyMessage}}},
Object: "chat.completion.chunk",
}
responses <- initialMessage
ComputeChoices(s, req.N, config, o, loader, func(s string, c *[]Choice) {}, func(s string) bool {
resp := OpenAIResponse{
ComputeChoices(req, s, config, o, loader, func(s string, c *[]schema.Choice) {}, func(s string, usage backend.TokenUsage) bool {
resp := schema.OpenAIResponse{
Model: req.Model, // we have to return what the user sent here, due to OpenAI spec.
Choices: []Choice{{Delta: &Message{Content: &s}, Index: 0}},
Choices: []schema.Choice{{Delta: &schema.Message{Content: &s}, Index: 0}},
Object: "chat.completion.chunk",
Usage: schema.OpenAIUsage{
PromptTokens: usage.Prompt,
CompletionTokens: usage.Completion,
TotalTokens: usage.Prompt + usage.Completion,
},
}
responses <- resp
@@ -43,12 +50,12 @@ func ChatEndpoint(cm *config.ConfigLoader, o *options.Option) func(c *fiber.Ctx)
return func(c *fiber.Ctx) error {
processFunctions := false
funcs := grammar.Functions{}
model, input, err := readInput(c, o.Loader, true)
modelFile, input, err := readInput(c, o, true)
if err != nil {
return fmt.Errorf("failed reading parameters from request:%w", err)
}
config, input, err := readConfig(model, input, cm, o.Loader, o.Debug, o.Threads, o.ContextSize, o.F16)
config, input, err := readConfig(modelFile, input, cm, o.Loader, o.Debug, o.Threads, o.ContextSize, o.F16)
if err != nil {
return fmt.Errorf("failed reading parameters from request:%w", err)
}
@@ -109,10 +116,12 @@ func ChatEndpoint(cm *config.ConfigLoader, o *options.Option) func(c *fiber.Ctx)
var predInput string
suppressConfigSystemPrompt := false
mess := []string{}
for _, i := range input.Messages {
for messageIndex, i := range input.Messages {
var content string
role := i.Role
// if function call, we might want to customize the role so we can display better that the "assistant called a json action"
// if an "assistant_function_call" role is defined, we use it, otherwise we use the role that is passed by in the request
if i.FunctionCall != nil && i.Role == "assistant" {
@@ -124,33 +133,61 @@ func ChatEndpoint(cm *config.ConfigLoader, o *options.Option) func(c *fiber.Ctx)
}
r := config.Roles[role]
contentExists := i.Content != nil && *i.Content != ""
if r != "" {
if contentExists {
content = fmt.Sprint(r, " ", *i.Content)
// First attempt to populate content via a chat message specific template
if config.TemplateConfig.ChatMessage != "" {
chatMessageData := model.ChatMessageTemplateData{
SystemPrompt: config.SystemPrompt,
Role: r,
RoleName: role,
Content: *i.Content,
MessageIndex: messageIndex,
}
if i.FunctionCall != nil {
j, err := json.Marshal(i.FunctionCall)
if err == nil {
if contentExists {
content += "\n" + fmt.Sprint(r, " ", string(j))
} else {
content = fmt.Sprint(r, " ", string(j))
templatedChatMessage, err := o.Loader.EvaluateTemplateForChatMessage(config.TemplateConfig.ChatMessage, chatMessageData)
if err != nil {
log.Error().Msgf("error processing message %+v using template \"%s\": %v. Skipping!", chatMessageData, config.TemplateConfig.ChatMessage, err)
} else {
if templatedChatMessage == "" {
log.Warn().Msgf("template \"%s\" produced blank output for %+v. Skipping!", config.TemplateConfig.ChatMessage, chatMessageData)
continue // TODO: This continue is here intentionally to skip over the line `mess = append(mess, content)` below, and to prevent the sprintf
}
log.Debug().Msgf("templated message for chat: %s", templatedChatMessage)
content = templatedChatMessage
}
}
// If this model doesn't have such a template, or if that template fails to return a value, template at the message level.
if content == "" {
if r != "" {
if contentExists {
content = fmt.Sprint(r, " ", *i.Content)
}
if i.FunctionCall != nil {
j, err := json.Marshal(i.FunctionCall)
if err == nil {
if contentExists {
content += "\n" + fmt.Sprint(r, " ", string(j))
} else {
content = fmt.Sprint(r, " ", string(j))
}
}
}
} else {
if contentExists {
content = fmt.Sprint(*i.Content)
}
if i.FunctionCall != nil {
j, err := json.Marshal(i.FunctionCall)
if err == nil {
if contentExists {
content += "\n" + string(j)
} else {
content = string(j)
}
}
}
}
} else {
if contentExists {
content = fmt.Sprint(*i.Content)
}
if i.FunctionCall != nil {
j, err := json.Marshal(i.FunctionCall)
if err == nil {
if contentExists {
content += "\n" + string(j)
} else {
content = string(j)
}
}
// Special Handling: System. We care if it was printed at all, not the r branch, so check seperately
if contentExists && role == "system" {
suppressConfigSystemPrompt = true
}
}
@@ -181,12 +218,11 @@ func ChatEndpoint(cm *config.ConfigLoader, o *options.Option) func(c *fiber.Ctx)
}
// A model can have a "file.bin.tmpl" file associated with a prompt template prefix
templatedInput, err := o.Loader.TemplatePrefix(templateFile, struct {
Input string
Functions []grammar.Function
}{
Input: predInput,
Functions: funcs,
templatedInput, err := o.Loader.EvaluateTemplateForPrompt(model.ChatPromptTemplate, templateFile, model.PromptTemplateData{
SystemPrompt: config.SystemPrompt,
SuppressSystemPrompt: suppressConfigSystemPrompt,
Input: predInput,
Functions: funcs,
})
if err == nil {
predInput = templatedInput
@@ -201,31 +237,39 @@ func ChatEndpoint(cm *config.ConfigLoader, o *options.Option) func(c *fiber.Ctx)
}
if toStream {
responses := make(chan OpenAIResponse)
responses := make(chan schema.OpenAIResponse)
go process(predInput, input, config, o.Loader, responses)
c.Context().SetBodyStreamWriter(fasthttp.StreamWriter(func(w *bufio.Writer) {
usage := &schema.OpenAIUsage{}
for ev := range responses {
usage = &ev.Usage // Copy a pointer to the latest usage chunk so that the stop message can reference it
var buf bytes.Buffer
enc := json.NewEncoder(&buf)
enc.Encode(ev)
log.Debug().Msgf("Sending chunk: %s", buf.String())
fmt.Fprintf(w, "data: %v\n", buf.String())
_, err := fmt.Fprintf(w, "data: %v\n", buf.String())
if err != nil {
log.Debug().Msgf("Sending chunk failed: %v", err)
input.Cancel()
break
}
w.Flush()
}
resp := &OpenAIResponse{
resp := &schema.OpenAIResponse{
Model: input.Model, // we have to return what the user sent here, due to OpenAI spec.
Choices: []Choice{
Choices: []schema.Choice{
{
FinishReason: "stop",
Index: 0,
Delta: &Message{Content: &emptyMessage},
Delta: &schema.Message{Content: &emptyMessage},
}},
Object: "chat.completion.chunk",
Usage: *usage,
}
respData, _ := json.Marshal(resp)
@@ -236,10 +280,12 @@ func ChatEndpoint(cm *config.ConfigLoader, o *options.Option) func(c *fiber.Ctx)
return nil
}
result, err := ComputeChoices(predInput, input.N, config, o, o.Loader, func(s string, c *[]Choice) {
result, tokenUsage, err := ComputeChoices(input, predInput, config, o, o.Loader, func(s string, c *[]schema.Choice) {
if processFunctions {
// As we have to change the result before processing, we can't stream the answer (yet?)
ss := map[string]interface{}{}
// This prevent newlines to break JSON parsing for clients
s = utils.EscapeNewLines(s)
json.Unmarshal([]byte(s), &ss)
log.Debug().Msgf("Function return: %s %+v", s, ss)
@@ -268,7 +314,7 @@ func ChatEndpoint(cm *config.ConfigLoader, o *options.Option) func(c *fiber.Ctx)
message = backend.Finetune(*config, predInput, message)
log.Debug().Msgf("Reply received from LLM(finetuned): %s", message)
*c = append(*c, Choice{Message: &Message{Role: "assistant", Content: &message}})
*c = append(*c, schema.Choice{Message: &schema.Message{Role: "assistant", Content: &message}})
return
}
}
@@ -278,7 +324,7 @@ func ChatEndpoint(cm *config.ConfigLoader, o *options.Option) func(c *fiber.Ctx)
// Otherwise ask the LLM to understand the JSON output and the context, and return a message
// Note: This costs (in term of CPU) another computation
config.Grammar = ""
predFunc, err := backend.ModelInference(predInput, o.Loader, *config, o, nil)
predFunc, err := backend.ModelInference(input.Context, predInput, o.Loader, *config, o, nil)
if err != nil {
log.Error().Msgf("inference error: %s", err.Error())
return
@@ -290,28 +336,33 @@ func ChatEndpoint(cm *config.ConfigLoader, o *options.Option) func(c *fiber.Ctx)
return
}
prediction = backend.Finetune(*config, predInput, prediction)
*c = append(*c, Choice{Message: &Message{Role: "assistant", Content: &prediction}})
fineTunedResponse := backend.Finetune(*config, predInput, prediction.Response)
*c = append(*c, schema.Choice{Message: &schema.Message{Role: "assistant", Content: &fineTunedResponse}})
} else {
// otherwise reply with the function call
*c = append(*c, Choice{
*c = append(*c, schema.Choice{
FinishReason: "function_call",
Message: &Message{Role: "assistant", FunctionCall: ss},
Message: &schema.Message{Role: "assistant", FunctionCall: ss},
})
}
return
}
*c = append(*c, Choice{Message: &Message{Role: "assistant", Content: &s}})
*c = append(*c, schema.Choice{FinishReason: "stop", Index: 0, Message: &schema.Message{Role: "assistant", Content: &s}})
}, nil)
if err != nil {
return err
}
resp := &OpenAIResponse{
resp := &schema.OpenAIResponse{
Model: input.Model, // we have to return what the user sent here, due to OpenAI spec.
Choices: result,
Object: "chat.completion",
Usage: schema.OpenAIUsage{
PromptTokens: tokenUsage.Prompt,
CompletionTokens: tokenUsage.Completion,
TotalTokens: tokenUsage.Prompt + tokenUsage.Completion,
},
}
respData, _ := json.Marshal(resp)
log.Debug().Msgf("Response: %s", respData)

View File

@@ -7,8 +7,10 @@ import (
"errors"
"fmt"
"github.com/go-skynet/LocalAI/api/backend"
config "github.com/go-skynet/LocalAI/api/config"
"github.com/go-skynet/LocalAI/api/options"
"github.com/go-skynet/LocalAI/api/schema"
model "github.com/go-skynet/LocalAI/pkg/model"
"github.com/gofiber/fiber/v2"
"github.com/rs/zerolog/log"
@@ -17,17 +19,22 @@ import (
// https://platform.openai.com/docs/api-reference/completions
func CompletionEndpoint(cm *config.ConfigLoader, o *options.Option) func(c *fiber.Ctx) error {
process := func(s string, req *OpenAIRequest, config *config.Config, loader *model.ModelLoader, responses chan OpenAIResponse) {
ComputeChoices(s, req.N, config, o, loader, func(s string, c *[]Choice) {}, func(s string) bool {
resp := OpenAIResponse{
process := func(s string, req *schema.OpenAIRequest, config *config.Config, loader *model.ModelLoader, responses chan schema.OpenAIResponse) {
ComputeChoices(req, s, config, o, loader, func(s string, c *[]schema.Choice) {}, func(s string, usage backend.TokenUsage) bool {
resp := schema.OpenAIResponse{
Model: req.Model, // we have to return what the user sent here, due to OpenAI spec.
Choices: []Choice{
Choices: []schema.Choice{
{
Index: 0,
Text: s,
},
},
Object: "text_completion",
Usage: schema.OpenAIUsage{
PromptTokens: usage.Prompt,
CompletionTokens: usage.Completion,
TotalTokens: usage.Prompt + usage.Completion,
},
}
log.Debug().Msgf("Sending goroutine: %s", s)
@@ -38,14 +45,14 @@ func CompletionEndpoint(cm *config.ConfigLoader, o *options.Option) func(c *fibe
}
return func(c *fiber.Ctx) error {
model, input, err := readInput(c, o.Loader, true)
modelFile, input, err := readInput(c, o, true)
if err != nil {
return fmt.Errorf("failed reading parameters from request:%w", err)
}
log.Debug().Msgf("`input`: %+v", input)
config, input, err := readConfig(model, input, cm, o.Loader, o.Debug, o.Threads, o.ContextSize, o.F16)
config, input, err := readConfig(modelFile, input, cm, o.Loader, o.Debug, o.Threads, o.ContextSize, o.F16)
if err != nil {
return fmt.Errorf("failed reading parameters from request:%w", err)
}
@@ -76,9 +83,7 @@ func CompletionEndpoint(cm *config.ConfigLoader, o *options.Option) func(c *fibe
predInput := config.PromptStrings[0]
// A model can have a "file.bin.tmpl" file associated with a prompt template prefix
templatedInput, err := o.Loader.TemplatePrefix(templateFile, struct {
Input string
}{
templatedInput, err := o.Loader.EvaluateTemplateForPrompt(model.CompletionPromptTemplate, templateFile, model.PromptTemplateData{
Input: predInput,
})
if err == nil {
@@ -86,7 +91,7 @@ func CompletionEndpoint(cm *config.ConfigLoader, o *options.Option) func(c *fibe
log.Debug().Msgf("Template found, input modified to: %s", predInput)
}
responses := make(chan OpenAIResponse)
responses := make(chan schema.OpenAIResponse)
go process(predInput, input, config, o.Loader, responses)
@@ -102,9 +107,9 @@ func CompletionEndpoint(cm *config.ConfigLoader, o *options.Option) func(c *fibe
w.Flush()
}
resp := &OpenAIResponse{
resp := &schema.OpenAIResponse{
Model: input.Model, // we have to return what the user sent here, due to OpenAI spec.
Choices: []Choice{
Choices: []schema.Choice{
{
Index: 0,
FinishReason: "stop",
@@ -121,33 +126,44 @@ func CompletionEndpoint(cm *config.ConfigLoader, o *options.Option) func(c *fibe
return nil
}
var result []Choice
for _, i := range config.PromptStrings {
var result []schema.Choice
totalTokenUsage := backend.TokenUsage{}
for k, i := range config.PromptStrings {
// A model can have a "file.bin.tmpl" file associated with a prompt template prefix
templatedInput, err := o.Loader.TemplatePrefix(templateFile, struct {
Input string
}{
Input: i,
templatedInput, err := o.Loader.EvaluateTemplateForPrompt(model.CompletionPromptTemplate, templateFile, model.PromptTemplateData{
SystemPrompt: config.SystemPrompt,
Input: i,
})
if err == nil {
i = templatedInput
log.Debug().Msgf("Template found, input modified to: %s", i)
}
r, err := ComputeChoices(i, input.N, config, o, o.Loader, func(s string, c *[]Choice) {
*c = append(*c, Choice{Text: s})
}, nil)
r, tokenUsage, err := ComputeChoices(
input, i, config, o, o.Loader, func(s string, c *[]schema.Choice) {
*c = append(*c, schema.Choice{Text: s, FinishReason: "stop", Index: k})
}, nil)
if err != nil {
return err
}
totalTokenUsage.Prompt += tokenUsage.Prompt
totalTokenUsage.Completion += tokenUsage.Completion
result = append(result, r...)
}
resp := &OpenAIResponse{
resp := &schema.OpenAIResponse{
Model: input.Model, // we have to return what the user sent here, due to OpenAI spec.
Choices: result,
Object: "text_completion",
Usage: schema.OpenAIUsage{
PromptTokens: totalTokenUsage.Prompt,
CompletionTokens: totalTokenUsage.Completion,
TotalTokens: totalTokenUsage.Prompt + totalTokenUsage.Completion,
},
}
jsonResult, _ := json.Marshal(resp)

View File

@@ -4,20 +4,24 @@ import (
"encoding/json"
"fmt"
"github.com/go-skynet/LocalAI/api/backend"
config "github.com/go-skynet/LocalAI/api/config"
"github.com/go-skynet/LocalAI/api/options"
"github.com/go-skynet/LocalAI/api/schema"
model "github.com/go-skynet/LocalAI/pkg/model"
"github.com/gofiber/fiber/v2"
"github.com/rs/zerolog/log"
)
func EditEndpoint(cm *config.ConfigLoader, o *options.Option) func(c *fiber.Ctx) error {
return func(c *fiber.Ctx) error {
model, input, err := readInput(c, o.Loader, true)
modelFile, input, err := readInput(c, o, true)
if err != nil {
return fmt.Errorf("failed reading parameters from request:%w", err)
}
config, input, err := readConfig(model, input, cm, o.Loader, o.Debug, o.Threads, o.ContextSize, o.F16)
config, input, err := readConfig(modelFile, input, cm, o.Loader, o.Debug, o.Threads, o.ContextSize, o.F16)
if err != nil {
return fmt.Errorf("failed reading parameters from request:%w", err)
}
@@ -30,32 +34,43 @@ func EditEndpoint(cm *config.ConfigLoader, o *options.Option) func(c *fiber.Ctx)
templateFile = config.TemplateConfig.Edit
}
var result []Choice
var result []schema.Choice
totalTokenUsage := backend.TokenUsage{}
for _, i := range config.InputStrings {
// A model can have a "file.bin.tmpl" file associated with a prompt template prefix
templatedInput, err := o.Loader.TemplatePrefix(templateFile, struct {
Input string
Instruction string
}{Input: i})
templatedInput, err := o.Loader.EvaluateTemplateForPrompt(model.EditPromptTemplate, templateFile, model.PromptTemplateData{
Input: i,
Instruction: input.Instruction,
SystemPrompt: config.SystemPrompt,
})
if err == nil {
i = templatedInput
log.Debug().Msgf("Template found, input modified to: %s", i)
}
r, err := ComputeChoices(i, input.N, config, o, o.Loader, func(s string, c *[]Choice) {
*c = append(*c, Choice{Text: s})
r, tokenUsage, err := ComputeChoices(input, i, config, o, o.Loader, func(s string, c *[]schema.Choice) {
*c = append(*c, schema.Choice{Text: s})
}, nil)
if err != nil {
return err
}
totalTokenUsage.Prompt += tokenUsage.Prompt
totalTokenUsage.Completion += tokenUsage.Completion
result = append(result, r...)
}
resp := &OpenAIResponse{
resp := &schema.OpenAIResponse{
Model: input.Model, // we have to return what the user sent here, due to OpenAI spec.
Choices: result,
Object: "edit",
Usage: schema.OpenAIUsage{
PromptTokens: totalTokenUsage.Prompt,
CompletionTokens: totalTokenUsage.Completion,
TotalTokens: totalTokenUsage.Prompt + totalTokenUsage.Completion,
},
}
jsonResult, _ := json.Marshal(resp)

View File

@@ -6,6 +6,8 @@ import (
"github.com/go-skynet/LocalAI/api/backend"
config "github.com/go-skynet/LocalAI/api/config"
"github.com/go-skynet/LocalAI/api/schema"
"github.com/go-skynet/LocalAI/api/options"
"github.com/gofiber/fiber/v2"
"github.com/rs/zerolog/log"
@@ -14,7 +16,7 @@ import (
// https://platform.openai.com/docs/api-reference/embeddings
func EmbeddingsEndpoint(cm *config.ConfigLoader, o *options.Option) func(c *fiber.Ctx) error {
return func(c *fiber.Ctx) error {
model, input, err := readInput(c, o.Loader, true)
model, input, err := readInput(c, o, true)
if err != nil {
return fmt.Errorf("failed reading parameters from request:%w", err)
}
@@ -25,7 +27,7 @@ func EmbeddingsEndpoint(cm *config.ConfigLoader, o *options.Option) func(c *fibe
}
log.Debug().Msgf("Parameter Config: %+v", config)
items := []Item{}
items := []schema.Item{}
for i, s := range config.InputToken {
// get the model function to call for the result
@@ -38,7 +40,7 @@ func EmbeddingsEndpoint(cm *config.ConfigLoader, o *options.Option) func(c *fibe
if err != nil {
return err
}
items = append(items, Item{Embedding: embeddings, Index: i, Object: "embedding"})
items = append(items, schema.Item{Embedding: embeddings, Index: i, Object: "embedding"})
}
for i, s := range config.InputStrings {
@@ -52,10 +54,10 @@ func EmbeddingsEndpoint(cm *config.ConfigLoader, o *options.Option) func(c *fibe
if err != nil {
return err
}
items = append(items, Item{Embedding: embeddings, Index: i, Object: "embedding"})
items = append(items, schema.Item{Embedding: embeddings, Index: i, Object: "embedding"})
}
resp := &OpenAIResponse{
resp := &schema.OpenAIResponse{
Model: input.Model, // we have to return what the user sent here, due to OpenAI spec.
Data: items,
Object: "list",

View File

@@ -1,10 +1,11 @@
package openai
import (
"bufio"
"encoding/base64"
"encoding/json"
"fmt"
"io/ioutil"
"github.com/go-skynet/LocalAI/api/schema"
"os"
"path/filepath"
"strconv"
@@ -35,7 +36,7 @@ import (
*/
func ImageEndpoint(cm *config.ConfigLoader, o *options.Option) func(c *fiber.Ctx) error {
return func(c *fiber.Ctx) error {
m, input, err := readInput(c, o.Loader, false)
m, input, err := readInput(c, o, false)
if err != nil {
return fmt.Errorf("failed reading parameters from request:%w", err)
}
@@ -50,6 +51,31 @@ func ImageEndpoint(cm *config.ConfigLoader, o *options.Option) func(c *fiber.Ctx
return fmt.Errorf("failed reading parameters from request:%w", err)
}
src := ""
if input.File != "" {
//base 64 decode the file and write it somewhere
// that we will cleanup
decoded, err := base64.StdEncoding.DecodeString(input.File)
if err != nil {
return err
}
// Create a temporary file
outputFile, err := os.CreateTemp(o.ImageDir, "b64")
if err != nil {
return err
}
// write the base64 result
writer := bufio.NewWriter(outputFile)
_, err = writer.Write(decoded)
if err != nil {
outputFile.Close()
return err
}
outputFile.Close()
src = outputFile.Name()
defer os.RemoveAll(src)
}
log.Debug().Msgf("Parameter Config: %+v", config)
// XXX: Only stablediffusion is supported for now
@@ -74,8 +100,8 @@ func ImageEndpoint(cm *config.ConfigLoader, o *options.Option) func(c *fiber.Ctx
if input.ResponseFormat == "b64_json" {
b64JSON = true
}
var result []Item
// src and clip_skip
var result []schema.Item
for _, i := range config.PromptStrings {
n := input.N
if input.N == 0 {
@@ -90,7 +116,10 @@ func ImageEndpoint(cm *config.ConfigLoader, o *options.Option) func(c *fiber.Ctx
}
mode := 0
step := 15
step := config.Step
if step == 0 {
step = 15
}
if input.Mode != 0 {
mode = input.Mode
@@ -105,7 +134,7 @@ func ImageEndpoint(cm *config.ConfigLoader, o *options.Option) func(c *fiber.Ctx
tempDir = o.ImageDir
}
// Create a temporary file
outputFile, err := ioutil.TempFile(tempDir, "b64")
outputFile, err := os.CreateTemp(tempDir, "b64")
if err != nil {
return err
}
@@ -119,7 +148,7 @@ func ImageEndpoint(cm *config.ConfigLoader, o *options.Option) func(c *fiber.Ctx
baseURL := c.BaseURL()
fn, err := backend.ImageGeneration(height, width, mode, step, input.Seed, positive_prompt, negative_prompt, output, o.Loader, *config, o)
fn, err := backend.ImageGeneration(height, width, mode, step, input.Seed, positive_prompt, negative_prompt, src, output, o.Loader, *config, o)
if err != nil {
return err
}
@@ -127,7 +156,7 @@ func ImageEndpoint(cm *config.ConfigLoader, o *options.Option) func(c *fiber.Ctx
return err
}
item := &Item{}
item := &schema.Item{}
if b64JSON {
defer os.RemoveAll(output)
@@ -145,7 +174,7 @@ func ImageEndpoint(cm *config.ConfigLoader, o *options.Option) func(c *fiber.Ctx
}
}
resp := &OpenAIResponse{
resp := &schema.OpenAIResponse{
Data: result,
}

View File

@@ -4,33 +4,47 @@ import (
"github.com/go-skynet/LocalAI/api/backend"
config "github.com/go-skynet/LocalAI/api/config"
"github.com/go-skynet/LocalAI/api/options"
"github.com/go-skynet/LocalAI/api/schema"
model "github.com/go-skynet/LocalAI/pkg/model"
)
func ComputeChoices(predInput string, n int, config *config.Config, o *options.Option, loader *model.ModelLoader, cb func(string, *[]Choice), tokenCallback func(string) bool) ([]Choice, error) {
result := []Choice{}
func ComputeChoices(
req *schema.OpenAIRequest,
predInput string,
config *config.Config,
o *options.Option,
loader *model.ModelLoader,
cb func(string, *[]schema.Choice),
tokenCallback func(string, backend.TokenUsage) bool) ([]schema.Choice, backend.TokenUsage, error) {
n := req.N // number of completions to return
result := []schema.Choice{}
if n == 0 {
n = 1
}
// get the model function to call for the result
predFunc, err := backend.ModelInference(predInput, loader, *config, o, tokenCallback)
predFunc, err := backend.ModelInference(req.Context, predInput, loader, *config, o, tokenCallback)
if err != nil {
return result, err
return result, backend.TokenUsage{}, err
}
tokenUsage := backend.TokenUsage{}
for i := 0; i < n; i++ {
prediction, err := predFunc()
if err != nil {
return result, err
return result, backend.TokenUsage{}, err
}
prediction = backend.Finetune(*config, predInput, prediction)
cb(prediction, &result)
tokenUsage.Prompt += prediction.Usage.Prompt
tokenUsage.Completion += prediction.Usage.Completion
finetunedResponse := backend.Finetune(*config, predInput, prediction.Response)
cb(finetunedResponse, &result)
//result = append(result, Choice{Text: prediction})
}
return result, err
return result, tokenUsage, err
}

View File

@@ -1,7 +1,10 @@
package openai
import (
"regexp"
config "github.com/go-skynet/LocalAI/api/config"
"github.com/go-skynet/LocalAI/api/schema"
model "github.com/go-skynet/LocalAI/pkg/model"
"github.com/gofiber/fiber/v2"
)
@@ -14,21 +17,50 @@ func ListModelsEndpoint(loader *model.ModelLoader, cm *config.ConfigLoader) func
}
var mm map[string]interface{} = map[string]interface{}{}
dataModels := []OpenAIModel{}
for _, m := range models {
mm[m] = nil
dataModels = append(dataModels, OpenAIModel{ID: m, Object: "model"})
dataModels := []schema.OpenAIModel{}
var filterFn func(name string) bool
filter := c.Query("filter")
// If filter is not specified, do not filter the list by model name
if filter == "" {
filterFn = func(_ string) bool { return true }
} else {
// If filter _IS_ specified, we compile it to a regex which is used to create the filterFn
rxp, err := regexp.Compile(filter)
if err != nil {
return err
}
filterFn = func(name string) bool {
return rxp.MatchString(name)
}
}
for _, k := range cm.ListConfigs() {
if _, exists := mm[k]; !exists {
dataModels = append(dataModels, OpenAIModel{ID: k, Object: "model"})
// By default, exclude any loose files that are already referenced by a configuration file.
excludeConfigured := c.QueryBool("excludeConfigured", true)
// Start with the known configurations
for _, c := range cm.GetAllConfigs() {
if excludeConfigured {
mm[c.Model] = nil
}
if filterFn(c.Name) {
dataModels = append(dataModels, schema.OpenAIModel{ID: c.Name, Object: "model"})
}
}
// Then iterate through the loose files:
for _, m := range models {
// And only adds them if they shouldn't be skipped.
if _, exists := mm[m]; !exists && filterFn(m) {
dataModels = append(dataModels, schema.OpenAIModel{ID: m, Object: "model"})
}
}
return c.JSON(struct {
Object string `json:"object"`
Data []OpenAIModel `json:"data"`
Object string `json:"object"`
Data []schema.OpenAIModel `json:"data"`
}{
Object: "list",
Data: dataModels,

View File

@@ -1,6 +1,7 @@
package openai
import (
"context"
"encoding/json"
"fmt"
"os"
@@ -8,13 +9,19 @@ import (
"strings"
config "github.com/go-skynet/LocalAI/api/config"
options "github.com/go-skynet/LocalAI/api/options"
"github.com/go-skynet/LocalAI/api/schema"
model "github.com/go-skynet/LocalAI/pkg/model"
"github.com/gofiber/fiber/v2"
"github.com/rs/zerolog/log"
)
func readInput(c *fiber.Ctx, loader *model.ModelLoader, randomModel bool) (string, *OpenAIRequest, error) {
input := new(OpenAIRequest)
func readInput(c *fiber.Ctx, o *options.Option, randomModel bool) (string, *schema.OpenAIRequest, error) {
loader := o.Loader
input := new(schema.OpenAIRequest)
ctx, cancel := context.WithCancel(o.Context)
input.Context = ctx
input.Cancel = cancel
// Get input data from the request body
if err := c.BodyParser(input); err != nil {
return "", nil, err
@@ -54,7 +61,7 @@ func readInput(c *fiber.Ctx, loader *model.ModelLoader, randomModel bool) (strin
return modelFile, input, nil
}
func updateConfig(config *config.Config, input *OpenAIRequest) {
func updateConfig(config *config.Config, input *schema.OpenAIRequest) {
if input.Echo {
config.Echo = input.Echo
}
@@ -65,6 +72,38 @@ func updateConfig(config *config.Config, input *OpenAIRequest) {
config.TopP = input.TopP
}
if input.Backend != "" {
config.Backend = input.Backend
}
if input.ClipSkip != 0 {
config.Diffusers.ClipSkip = input.ClipSkip
}
if input.ModelBaseName != "" {
config.AutoGPTQ.ModelBaseName = input.ModelBaseName
}
if input.NegativePromptScale != 0 {
config.NegativePromptScale = input.NegativePromptScale
}
if input.UseFastTokenizer {
config.UseFastTokenizer = input.UseFastTokenizer
}
if input.NegativePrompt != "" {
config.NegativePrompt = input.NegativePrompt
}
if input.RopeFreqBase != 0 {
config.RopeFreqBase = input.RopeFreqBase
}
if input.RopeFreqScale != 0 {
config.RopeFreqScale = input.RopeFreqScale
}
if input.Grammar != "" {
config.Grammar = input.Grammar
}
@@ -115,15 +154,15 @@ func updateConfig(config *config.Config, input *OpenAIRequest) {
}
if input.Mirostat != 0 {
config.Mirostat = input.Mirostat
config.LLMConfig.Mirostat = input.Mirostat
}
if input.MirostatETA != 0 {
config.MirostatETA = input.MirostatETA
config.LLMConfig.MirostatETA = input.MirostatETA
}
if input.MirostatTAU != 0 {
config.MirostatTAU = input.MirostatTAU
config.LLMConfig.MirostatTAU = input.MirostatTAU
}
if input.TypicalP != 0 {
@@ -161,7 +200,7 @@ func updateConfig(config *config.Config, input *OpenAIRequest) {
n, exists := fnc["name"]
if exists {
nn, e := n.(string)
if !e {
if e {
name = nn
}
}
@@ -180,7 +219,7 @@ func updateConfig(config *config.Config, input *OpenAIRequest) {
}
}
func readConfig(modelFile string, input *OpenAIRequest, cm *config.ConfigLoader, loader *model.ModelLoader, debug bool, threads, ctx int, f16 bool) (*config.Config, *OpenAIRequest, error) {
func readConfig(modelFile string, input *schema.OpenAIRequest, cm *config.ConfigLoader, loader *model.ModelLoader, debug bool, threads, ctx int, f16 bool) (*config.Config, *schema.OpenAIRequest, error) {
// Load a config file if present after the model name
modelConfig := filepath.Join(loader.ModelPath, modelFile+".yaml")

View File

@@ -1,7 +1,6 @@
package openai
import (
"context"
"fmt"
"io"
"net/http"
@@ -9,10 +8,9 @@ import (
"path"
"path/filepath"
"github.com/go-skynet/LocalAI/api/backend"
config "github.com/go-skynet/LocalAI/api/config"
"github.com/go-skynet/LocalAI/api/options"
"github.com/go-skynet/LocalAI/pkg/grpc/proto"
model "github.com/go-skynet/LocalAI/pkg/model"
"github.com/gofiber/fiber/v2"
"github.com/rs/zerolog/log"
@@ -21,7 +19,7 @@ import (
// https://platform.openai.com/docs/api-reference/audio/create
func TranscriptEndpoint(cm *config.ConfigLoader, o *options.Option) func(c *fiber.Ctx) error {
return func(c *fiber.Ctx) error {
m, input, err := readInput(c, o.Loader, false)
m, input, err := readInput(c, o, false)
if err != nil {
return fmt.Errorf("failed reading parameters from request:%w", err)
}
@@ -61,25 +59,7 @@ func TranscriptEndpoint(cm *config.ConfigLoader, o *options.Option) func(c *fibe
log.Debug().Msgf("Audio file copied to: %+v", dst)
whisperModel, err := o.Loader.BackendLoader(
model.WithBackendString(model.WhisperBackend),
model.WithModelFile(config.Model),
model.WithContext(o.Context),
model.WithThreads(uint32(config.Threads)),
model.WithAssetDir(o.AssetsDestination))
if err != nil {
return err
}
if whisperModel == nil {
return fmt.Errorf("could not load whisper model")
}
tr, err := whisperModel.AudioTranscription(context.Background(), &proto.TranscriptRequest{
Dst: dst,
Language: input.Language,
Threads: uint32(config.Threads),
})
tr, err := backend.ModelTranscription(dst, input.Language, o.Loader, *config, o)
if err != nil {
return err
}

View File

@@ -23,11 +23,18 @@ type Option struct {
PreloadJSONModels string
PreloadModelsFromPath string
CORSAllowOrigins string
ApiKeys []string
Galleries []gallery.Gallery
BackendAssets embed.FS
AssetsDestination string
ExternalGRPCBackends map[string]string
AutoloadGalleries bool
SingleBackend bool
}
type AppOption func(*Option)
@@ -53,6 +60,23 @@ func WithCors(b bool) AppOption {
}
}
var EnableSingleBackend = func(o *Option) {
o.SingleBackend = true
}
var EnableGalleriesAutoload = func(o *Option) {
o.AutoloadGalleries = true
}
func WithExternalBackend(name string, uri string) AppOption {
return func(o *Option) {
if o.ExternalGRPCBackends == nil {
o.ExternalGRPCBackends = make(map[string]string)
}
o.ExternalGRPCBackends[name] = uri
}
}
func WithCorsAllowOrigins(b string) AppOption {
return func(o *Option) {
o.CORSAllowOrigins = b
@@ -75,6 +99,7 @@ func WithStringGalleries(galls string) AppOption {
return func(o *Option) {
if galls == "" {
log.Debug().Msgf("no galleries to load")
o.Galleries = []gallery.Gallery{}
return
}
var galleries []gallery.Gallery
@@ -167,3 +192,9 @@ func WithImageDir(imageDir string) AppOption {
o.ImageDir = imageDir
}
}
func WithApiKeys(apiKeys []string) AppOption {
return func(o *Option) {
o.ApiKeys = apiKeys
}
}

View File

@@ -1,6 +1,8 @@
package openai
package schema
import (
"context"
config "github.com/go-skynet/LocalAI/api/config"
"github.com/go-skynet/LocalAI/pkg/grammar"
@@ -46,7 +48,7 @@ type OpenAIResponse struct {
}
type Choice struct {
Index int `json:"index,omitempty"`
Index int `json:"index"`
FinishReason string `json:"finish_reason,omitempty"`
Message *Message `json:"message,omitempty"`
Delta *Message `json:"delta,omitempty"`
@@ -70,6 +72,9 @@ type OpenAIModel struct {
type OpenAIRequest struct {
config.PredictionOptions
Context context.Context
Cancel context.CancelFunc
// whisper
File string `json:"file" validate:"required"`
//whisper/image
@@ -102,4 +107,9 @@ type OpenAIRequest struct {
Grammar string `json:"grammar" yaml:"grammar"`
JSONFunctionGrammarObject *grammar.JSONFunctionStructure `json:"grammar_json_functions" yaml:"grammar_json_functions"`
Backend string `json:"backend" yaml:"backend"`
// AutoGPTQ
ModelBaseName string `json:"model_base_name" yaml:"model_base_name"`
}

View File

@@ -1,4 +1,4 @@
package api
package schema
import "time"

View File

@@ -5,8 +5,8 @@ package main
import (
"flag"
bert "github.com/go-skynet/LocalAI/pkg/backend/llm/bert"
grpc "github.com/go-skynet/LocalAI/pkg/grpc"
bert "github.com/go-skynet/LocalAI/pkg/grpc/llm/bert"
)
var (

View File

@@ -5,7 +5,7 @@ package main
import (
"flag"
bloomz "github.com/go-skynet/LocalAI/pkg/grpc/llm/bloomz"
bloomz "github.com/go-skynet/LocalAI/pkg/backend/llm/bloomz"
grpc "github.com/go-skynet/LocalAI/pkg/grpc"
)

View File

@@ -5,7 +5,7 @@ package main
import (
"flag"
transformers "github.com/go-skynet/LocalAI/pkg/grpc/llm/transformers"
transformers "github.com/go-skynet/LocalAI/pkg/backend/llm/transformers"
grpc "github.com/go-skynet/LocalAI/pkg/grpc"
)

View File

@@ -5,7 +5,7 @@ package main
import (
"flag"
transformers "github.com/go-skynet/LocalAI/pkg/grpc/llm/transformers"
transformers "github.com/go-skynet/LocalAI/pkg/backend/llm/transformers"
grpc "github.com/go-skynet/LocalAI/pkg/grpc"
)

View File

@@ -7,7 +7,7 @@ package main
import (
"flag"
falcon "github.com/go-skynet/LocalAI/pkg/grpc/llm/falcon"
falcon "github.com/go-skynet/LocalAI/pkg/backend/llm/falcon"
grpc "github.com/go-skynet/LocalAI/pkg/grpc"
)

View File

@@ -5,7 +5,7 @@ package main
import (
"flag"
transformers "github.com/go-skynet/LocalAI/pkg/grpc/llm/transformers"
transformers "github.com/go-skynet/LocalAI/pkg/backend/llm/transformers"
grpc "github.com/go-skynet/LocalAI/pkg/grpc"
)

View File

@@ -5,7 +5,7 @@ package main
import (
"flag"
gpt4all "github.com/go-skynet/LocalAI/pkg/grpc/llm/gpt4all"
gpt4all "github.com/go-skynet/LocalAI/pkg/backend/llm/gpt4all"
grpc "github.com/go-skynet/LocalAI/pkg/grpc"
)

View File

@@ -5,7 +5,7 @@ package main
import (
"flag"
transformers "github.com/go-skynet/LocalAI/pkg/grpc/llm/transformers"
transformers "github.com/go-skynet/LocalAI/pkg/backend/llm/transformers"
grpc "github.com/go-skynet/LocalAI/pkg/grpc"
)

View File

@@ -5,7 +5,7 @@ package main
import (
"flag"
transformers "github.com/go-skynet/LocalAI/pkg/grpc/llm/transformers"
transformers "github.com/go-skynet/LocalAI/pkg/backend/llm/transformers"
grpc "github.com/go-skynet/LocalAI/pkg/grpc"
)

View File

@@ -5,7 +5,7 @@ package main
import (
"flag"
langchain "github.com/go-skynet/LocalAI/pkg/grpc/llm/langchain"
langchain "github.com/go-skynet/LocalAI/pkg/backend/llm/langchain"
grpc "github.com/go-skynet/LocalAI/pkg/grpc"
)

View File

@@ -0,0 +1,21 @@
package main
import (
"flag"
llama "github.com/go-skynet/LocalAI/pkg/backend/llm/llama-stable"
grpc "github.com/go-skynet/LocalAI/pkg/grpc"
)
var (
addr = flag.String("addr", "localhost:50051", "the address to connect to")
)
func main() {
flag.Parse()
if err := grpc.StartServer(*addr, &llama.LLM{}); err != nil {
panic(err)
}
}

View File

@@ -7,7 +7,7 @@ package main
import (
"flag"
llama "github.com/go-skynet/LocalAI/pkg/grpc/llm/llama"
llama "github.com/go-skynet/LocalAI/pkg/backend/llm/llama"
grpc "github.com/go-skynet/LocalAI/pkg/grpc"
)

View File

@@ -5,7 +5,7 @@ package main
import (
"flag"
transformers "github.com/go-skynet/LocalAI/pkg/grpc/llm/transformers"
transformers "github.com/go-skynet/LocalAI/pkg/backend/llm/transformers"
grpc "github.com/go-skynet/LocalAI/pkg/grpc"
)

View File

@@ -5,7 +5,7 @@ package main
import (
"flag"
tts "github.com/go-skynet/LocalAI/pkg/grpc/tts"
tts "github.com/go-skynet/LocalAI/pkg/backend/tts"
grpc "github.com/go-skynet/LocalAI/pkg/grpc"
)

View File

@@ -5,7 +5,7 @@ package main
import (
"flag"
transformers "github.com/go-skynet/LocalAI/pkg/grpc/llm/transformers"
transformers "github.com/go-skynet/LocalAI/pkg/backend/llm/transformers"
grpc "github.com/go-skynet/LocalAI/pkg/grpc"
)

View File

@@ -5,7 +5,7 @@ package main
import (
"flag"
rwkv "github.com/go-skynet/LocalAI/pkg/grpc/llm/rwkv"
rwkv "github.com/go-skynet/LocalAI/pkg/backend/llm/rwkv"
grpc "github.com/go-skynet/LocalAI/pkg/grpc"
)

View File

@@ -5,7 +5,7 @@ package main
import (
"flag"
image "github.com/go-skynet/LocalAI/pkg/grpc/image"
image "github.com/go-skynet/LocalAI/pkg/backend/image"
grpc "github.com/go-skynet/LocalAI/pkg/grpc"
)

View File

@@ -5,7 +5,7 @@ package main
import (
"flag"
transformers "github.com/go-skynet/LocalAI/pkg/grpc/llm/transformers"
transformers "github.com/go-skynet/LocalAI/pkg/backend/llm/transformers"
grpc "github.com/go-skynet/LocalAI/pkg/grpc"
)

View File

@@ -5,7 +5,7 @@ package main
import (
"flag"
transcribe "github.com/go-skynet/LocalAI/pkg/grpc/transcribe"
transcribe "github.com/go-skynet/LocalAI/pkg/backend/transcribe"
grpc "github.com/go-skynet/LocalAI/pkg/grpc"
)

View File

@@ -16,6 +16,25 @@ else
echo "see the documentation at: https://localai.io/basics/build/index.html"
echo "Note: See also https://github.com/go-skynet/LocalAI/issues/288"
echo "@@@@@"
echo "CPU info:"
grep -e "model\sname" /proc/cpuinfo | head -1
grep -e "flags" /proc/cpuinfo | head -1
if grep -q -e "\savx\s" /proc/cpuinfo ; then
echo "CPU: AVX found OK"
else
echo "CPU: no AVX found"
fi
if grep -q -e "\savx2\s" /proc/cpuinfo ; then
echo "CPU: AVX2 found OK"
else
echo "CPU: no AVX2 found"
fi
if grep -q -e "\savx512" /proc/cpuinfo ; then
echo "CPU: AVX512 found OK"
else
echo "CPU: no AVX512 found"
fi
echo "@@@@@"
fi
./local-ai "$@"

View File

@@ -1,7 +1,16 @@
# Examples
| [ChatGPT OSS alternative](https://github.com/go-skynet/LocalAI/tree/master/examples/chatbot-ui) | [Image generation](https://localai.io/api-endpoints/index.html#image-generation) |
|------------------------------------------------------------------------------------------------------------------------|------------------------------------------------------------------------------------------------------------------------|
| ![Screenshot from 2023-04-26 23-59-55](https://user-images.githubusercontent.com/2420543/234715439-98d12e03-d3ce-4f94-ab54-2b256808e05e.png) | ![b6441997879](https://github.com/go-skynet/LocalAI/assets/2420543/d50af51c-51b7-4f39-b6c2-bf04c403894c) |
| [Telegram bot](https://github.com/go-skynet/LocalAI/tree/master/examples/telegram-bot) | [Flowise](https://github.com/go-skynet/LocalAI/tree/master/examples/flowise) |
|------------------------------------------------------------------------------------------------------------------------|------------------------------------------------------------------------------------------------------------------------|
![Screenshot from 2023-06-09 00-36-26](https://github.com/go-skynet/LocalAI/assets/2420543/e98b4305-fa2d-41cf-9d2f-1bb2d75ca902) | ![Screenshot from 2023-05-30 18-01-03](https://github.com/go-skynet/LocalAI/assets/2420543/02458782-0549-4131-971c-95ee56ec1af8)| |
Here is a list of projects that can easily be integrated with the LocalAI backend.
### Projects
### AutoGPT
@@ -64,6 +73,14 @@ A ready to use example to show e2e how to integrate LocalAI with langchain
[Check it out here](https://github.com/go-skynet/LocalAI/tree/master/examples/langchain-python/)
### LocalAI functions
_by [@mudler](https://github.com/mudler)_
A ready to use example to show how to use OpenAI functions with LocalAI
[Check it out here](https://github.com/go-skynet/LocalAI/tree/master/examples/functions/)
### LocalAI WebUI
_by [@dhruvgera](https://github.com/dhruvgera)_
@@ -140,6 +157,26 @@ Allows to run any LocalAI-compatible model as a backend on the servers of https:
[Check it out here](https://runpod.io/gsc?template=uv9mtqnrd0&ref=984wlcra)
### Continue
_by [@gruberdev](https://github.com/gruberdev)_
<img src="continue/img/screen.png" width="600" height="200" alt="Screenshot">
Demonstrates how to integrate an open-source copilot alternative that enhances code analysis, completion, and improvements. This approach seamlessly integrates with any LocalAI model, offering a more user-friendly experience.
[Check it out here](https://github.com/go-skynet/LocalAI/tree/master/examples/continue/)
### Streamlit bot
_by [@majoshi1](https://github.com/majoshi1)_
![Screenshot](streamlit-bot/streamlit-bot.png)
A chat bot made using `Streamlit` & LocalAI.
[Check it out here](https://github.com/go-skynet/LocalAI/tree/master/examples/streamlit-bot/)
## Want to contribute?
Create an issue, and put `Example: <description>` in the title! We will post your examples here.

View File

@@ -24,10 +24,13 @@ docker-compose up -d --pull always
# docker-compose up -d --build
```
Then browse to `http://localhost:3000` to view the Web UI.
## Pointing chatbot-ui to a separately managed LocalAI service
If you want to use the [chatbot-ui example](https://github.com/go-skynet/LocalAI/tree/master/examples/chatbot-ui) with an externally managed LocalAI service, you can alter the `docker-compose` file so that it looks like the below. You will notice the file is smaller, because we have removed the section that would normally start the LocalAI service. Take care to update the IP address (or FQDN) that the chatbot-ui service tries to access (marked `<<LOCALAI_IP>>` below):
```
If you want to use the [chatbot-ui example](https://github.com/go-skynet/LocalAI/tree/master/examples/chatbot-ui) with an externally managed LocalAI service, you can alter the `docker-compose.yaml` file so that it looks like the below. You will notice the file is smaller, because we have removed the section that would normally start the LocalAI service. Take care to update the IP address (or FQDN) that the chatbot-ui service tries to access (marked `<<LOCALAI_IP>>` below):
```yaml
version: '3.6'
services:
@@ -40,9 +43,8 @@ services:
- 'OPENAI_API_HOST=http://<<LOCALAI_IP>>:8080'
```
Once you've edited the Dockerfile, you can start it with `docker compose up`, then browse to `http://localhost:3000`.
Once you've edited the `docker-compose.yaml`, you can start it with `docker compose up`, then browse to `http://localhost:3000` to view the Web UI.
## Accessing chatbot-ui
Open http://localhost:3000 for the Web UI.

View File

@@ -20,10 +20,13 @@ docker-compose up --pull always
# docker-compose up -d --build
```
Then browse to `http://localhost:3000` to view the Web UI.
## Pointing chatbot-ui to a separately managed LocalAI service
If you want to use the [chatbot-ui example](https://github.com/go-skynet/LocalAI/tree/master/examples/chatbot-ui) with an externally managed LocalAI service, you can alter the `docker-compose` file so that it looks like the below. You will notice the file is smaller, because we have removed the section that would normally start the LocalAI service. Take care to update the IP address (or FQDN) that the chatbot-ui service tries to access (marked `<<LOCALAI_IP>>` below):
```
If you want to use the [chatbot-ui example](https://github.com/go-skynet/LocalAI/tree/master/examples/chatbot-ui) with an externally managed LocalAI service, you can alter the `docker-compose.yaml` file so that it looks like the below. You will notice the file is smaller, because we have removed the section that would normally start the LocalAI service. Take care to update the IP address (or FQDN) that the chatbot-ui service tries to access (marked `<<LOCALAI_IP>>` below):
```yaml
version: '3.6'
services:
@@ -36,9 +39,8 @@ services:
- 'OPENAI_API_HOST=http://<<LOCALAI_IP>>:8080'
```
Once you've edited the Dockerfile, you can start it with `docker compose up`, then browse to `http://localhost:3000`.
Once you've edited the `docker-compose.yaml`, you can start it with `docker compose up`, then browse to `http://localhost:3000` to view the Web UI.
## Accessing chatbot-ui
Open http://localhost:3000 for the Web UI.

View File

@@ -0,0 +1,53 @@
# Continue
![logo](https://continue.dev/docs/assets/images/continue-cover-logo-aa135cc83fe8a14af480d1633ed74eb5.png)
This document presents an example of integration with [continuedev/continue](https://github.com/continuedev/continue).
![Screenshot](https://continue.dev/docs/assets/images/continue-screenshot-1f36b99467817f755739d7f4c4c08fe3.png)
For a live demonstration, please click on the link below:
- [How it works (Video demonstration)](https://www.youtube.com/watch?v=3Ocrc-WX4iQ)
## Integration Setup Walkthrough
1. [As outlined in `continue`'s documentation](https://continue.dev/docs/getting-started), install the [Visual Studio Code extension from the marketplace](https://marketplace.visualstudio.com/items?itemName=Continue.continue) and open it.
2. In this example, LocalAI will download the gpt4all model and set it up as "gpt-3.5-turbo". Refer to the `docker-compose.yaml` file for details.
```bash
# Clone LocalAI
git clone https://github.com/go-skynet/LocalAI
cd LocalAI/examples/continue
# Start with docker-compose
docker-compose up --build -d
```
3. Type `/config` within Continue's VSCode extension, or edit the file located at `~/.continue/config.py` on your system with the following configuration:
```py
from continuedev.src.continuedev.libs.llm.openai import OpenAI
config = ContinueConfig(
...
models=Models(
default=OpenAI(
api_key="my-api-key",
model="gpt-3.5-turbo",
api_base="http://localhost:8080",
)
),
)
```
This setup enables you to make queries directly to your model running in the Docker container. Note that the `api_key` does not need to be properly set up; it is included here as a placeholder.
If editing the configuration seems confusing, you may copy and paste the provided default `config.py` file over the existing one in `~/.continue/config.py` after initializing the extension in the VSCode IDE.
## Additional Resources
- [Official Continue documentation](https://continue.dev/docs/intro)
- [Documentation page on using self-hosted models](https://continue.dev/docs/customization#self-hosting-an-open-source-model)
- [Official extension link](https://marketplace.visualstudio.com/items?itemName=Continue.continue)

148
examples/continue/config.py Normal file
View File

@@ -0,0 +1,148 @@
"""
This is the Continue configuration file.
See https://continue.dev/docs/customization to learn more.
"""
import subprocess
from continuedev.src.continuedev.core.main import Step
from continuedev.src.continuedev.core.sdk import ContinueSDK
from continuedev.src.continuedev.core.models import Models
from continuedev.src.continuedev.core.config import CustomCommand, SlashCommand, ContinueConfig
from continuedev.src.continuedev.plugins.context_providers.github import GitHubIssuesContextProvider
from continuedev.src.continuedev.plugins.context_providers.google import GoogleContextProvider
from continuedev.src.continuedev.plugins.policies.default import DefaultPolicy
from continuedev.src.continuedev.libs.llm.openai import OpenAI, OpenAIServerInfo
from continuedev.src.continuedev.libs.llm.ggml import GGML
from continuedev.src.continuedev.plugins.steps.open_config import OpenConfigStep
from continuedev.src.continuedev.plugins.steps.clear_history import ClearHistoryStep
from continuedev.src.continuedev.plugins.steps.feedback import FeedbackStep
from continuedev.src.continuedev.plugins.steps.comment_code import CommentCodeStep
from continuedev.src.continuedev.plugins.steps.share_session import ShareSessionStep
from continuedev.src.continuedev.plugins.steps.main import EditHighlightedCodeStep
from continuedev.src.continuedev.plugins.context_providers.search import SearchContextProvider
from continuedev.src.continuedev.plugins.context_providers.diff import DiffContextProvider
from continuedev.src.continuedev.plugins.context_providers.url import URLContextProvider
class CommitMessageStep(Step):
"""
This is a Step, the building block of Continue.
It can be used below as a slash command, so that
run will be called when you type '/commit'.
"""
async def run(self, sdk: ContinueSDK):
# Get the root directory of the workspace
dir = sdk.ide.workspace_directory
# Run git diff in that directory
diff = subprocess.check_output(
["git", "diff"], cwd=dir).decode("utf-8")
# Ask the LLM to write a commit message,
# and set it as the description of this step
self.description = await sdk.models.default.complete(
f"{diff}\n\nWrite a short, specific (less than 50 chars) commit message about the above changes:")
config = ContinueConfig(
# If set to False, we will not collect any usage data
# See here to learn what anonymous data we collect: https://continue.dev/docs/telemetry
allow_anonymous_telemetry=True,
models = Models(
default = OpenAI(
api_key = "my-api-key",
model = "gpt-3.5-turbo",
openai_server_info = OpenAIServerInfo(
api_base = "http://localhost:8080",
model = "gpt-3.5-turbo"
)
)
),
# Set a system message with information that the LLM should always keep in mind
# E.g. "Please give concise answers. Always respond in Spanish."
system_message=None,
# Set temperature to any value between 0 and 1. Higher values will make the LLM
# more creative, while lower values will make it more predictable.
temperature=0.5,
# Custom commands let you map a prompt to a shortened slash command
# They are like slash commands, but more easily defined - write just a prompt instead of a Step class
# Their output will always be in chat form
custom_commands=[
# CustomCommand(
# name="test",
# description="Write unit tests for the higlighted code",
# prompt="Write a comprehensive set of unit tests for the selected code. It should setup, run tests that check for correctness including important edge cases, and teardown. Ensure that the tests are complete and sophisticated. Give the tests just as chat output, don't edit any file.",
# )
],
# Slash commands let you run a Step from a slash command
slash_commands=[
# SlashCommand(
# name="commit",
# description="This is an example slash command. Use /config to edit it and create more",
# step=CommitMessageStep,
# )
SlashCommand(
name="edit",
description="Edit code in the current file or the highlighted code",
step=EditHighlightedCodeStep,
),
SlashCommand(
name="config",
description="Customize Continue - slash commands, LLMs, system message, etc.",
step=OpenConfigStep,
),
SlashCommand(
name="comment",
description="Write comments for the current file or highlighted code",
step=CommentCodeStep,
),
SlashCommand(
name="feedback",
description="Send feedback to improve Continue",
step=FeedbackStep,
),
SlashCommand(
name="clear",
description="Clear step history",
step=ClearHistoryStep,
),
SlashCommand(
name="share",
description="Download and share the session transcript",
step=ShareSessionStep,
)
],
# Context providers let you quickly select context by typing '@'
# Uncomment the following to
# - quickly reference GitHub issues
# - show Google search results to the LLM
context_providers=[
# GitHubIssuesContextProvider(
# repo_name="<your github username or organization>/<your repo name>",
# auth_token="<your github auth token>"
# ),
# GoogleContextProvider(
# serper_api_key="<your serper.dev api key>"
# )
SearchContextProvider(),
DiffContextProvider(),
URLContextProvider(
preset_urls = [
# Add any common urls you reference here so they appear in autocomplete
]
)
],
# Policies hold the main logic that decides which Step to take next
# You can use them to design agents, or deeply customize Continue
policy=DefaultPolicy()
)

View File

@@ -0,0 +1,27 @@
version: '3.6'
services:
api:
image: quay.io/go-skynet/local-ai:latest
# As initially LocalAI will download the models defined in PRELOAD_MODELS
# you might need to tweak the healthcheck values here according to your network connection.
# Here we give a timespan of 20m to download all the required files.
healthcheck:
test: ["CMD", "curl", "-f", "http://localhost:8080/readyz"]
interval: 1m
timeout: 20m
retries: 20
build:
context: ../../
dockerfile: Dockerfile
ports:
- 8080:8080
environment:
- DEBUG=true
- MODELS_PATH=/models
# You can preload different models here as well.
# See: https://github.com/go-skynet/model-gallery
- 'PRELOAD_MODELS=[{"url": "github:go-skynet/model-gallery/gpt4all-j.yaml", "name": "gpt-3.5-turbo"}]'
volumes:
- ./models:/models:cached
command: ["/usr/bin/local-ai" ]

BIN
examples/continue/img/screen.png Executable file
View File

Binary file not shown.

After

Width:  |  Height:  |  Size: 196 KiB

9
examples/functions/.env Normal file
View File

@@ -0,0 +1,9 @@
OPENAI_API_KEY=sk---anystringhere
OPENAI_API_BASE=http://api:8080/v1
# Models to preload at start
# Here we configure gpt4all as gpt-3.5-turbo and bert as embeddings
PRELOAD_MODELS=[{"url": "github:go-skynet/model-gallery/openllama-7b-open-instruct.yaml", "name": "gpt-3.5-turbo"}]
## Change the default number of threads
#THREADS=14

View File

@@ -0,0 +1,5 @@
FROM python:3.10-bullseye
COPY . /app
WORKDIR /app
RUN pip install --no-cache-dir -r requirements.txt
ENTRYPOINT [ "python", "./functions-openai.py" ];

View File

@@ -0,0 +1,18 @@
# LocalAI functions
Example of using LocalAI functions, see the [OpenAI](https://openai.com/blog/function-calling-and-other-api-updates) blog post.
## Run
```bash
# Clone LocalAI
git clone https://github.com/go-skynet/LocalAI
cd LocalAI/examples/functions
docker-compose run --rm functions
```
Note: The example automatically downloads the `openllama` model as it is under a permissive license.
See the `.env` configuration file to set a different model with the [model-gallery](https://github.com/go-skynet/model-gallery) by editing `PRELOAD_MODELS`.

View File

@@ -0,0 +1,23 @@
version: "3.9"
services:
api:
image: quay.io/go-skynet/local-ai:master
ports:
- 8080:8080
env_file:
- .env
environment:
- DEBUG=true
- MODELS_PATH=/models
volumes:
- ./models:/models:cached
command: ["/usr/bin/local-ai" ]
functions:
build:
context: .
dockerfile: Dockerfile
depends_on:
api:
condition: service_healthy
env_file:
- .env

View File

@@ -0,0 +1,76 @@
import openai
import json
# Example dummy function hard coded to return the same weather
# In production, this could be your backend API or an external API
def get_current_weather(location, unit="fahrenheit"):
"""Get the current weather in a given location"""
weather_info = {
"location": location,
"temperature": "72",
"unit": unit,
"forecast": ["sunny", "windy"],
}
return json.dumps(weather_info)
def run_conversation():
# Step 1: send the conversation and available functions to GPT
messages = [{"role": "user", "content": "What's the weather like in Boston?"}]
functions = [
{
"name": "get_current_weather",
"description": "Get the current weather in a given location",
"parameters": {
"type": "object",
"properties": {
"location": {
"type": "string",
"description": "The city and state, e.g. San Francisco, CA",
},
"unit": {"type": "string", "enum": ["celsius", "fahrenheit"]},
},
"required": ["location"],
},
}
]
response = openai.ChatCompletion.create(
model="gpt-3.5-turbo",
messages=messages,
functions=functions,
function_call="auto", # auto is default, but we'll be explicit
)
response_message = response["choices"][0]["message"]
# Step 2: check if GPT wanted to call a function
if response_message.get("function_call"):
# Step 3: call the function
# Note: the JSON response may not always be valid; be sure to handle errors
available_functions = {
"get_current_weather": get_current_weather,
} # only one function in this example, but you can have multiple
function_name = response_message["function_call"]["name"]
fuction_to_call = available_functions[function_name]
function_args = json.loads(response_message["function_call"]["arguments"])
function_response = fuction_to_call(
location=function_args.get("location"),
unit=function_args.get("unit"),
)
# Step 4: send the info on the function call and function response to GPT
messages.append(response_message) # extend conversation with assistant's reply
messages.append(
{
"role": "function",
"name": function_name,
"content": function_response,
}
) # extend conversation with function response
second_response = openai.ChatCompletion.create(
model="gpt-3.5-turbo",
messages=messages,
) # get a new response from GPT where it can see the function response
return second_response
print(run_conversation())

View File

@@ -0,0 +1,2 @@
langchain==0.0.234
openai==0.27.8

View File

File diff suppressed because one or more lines are too long

View File

@@ -0,0 +1,17 @@
# Insomnia
Developer Testing Request Collection for [Insomnia](https://insomnia.rest/), an open-source REST client
## Instructions
* Install Insomnia as normal
* [Import](https://docs.insomnia.rest/insomnia/import-export-data) `Insomnia_LocalAI.json`
* Control + E opens the environment settings -
| **Parameter Name** | **Default Value** | **Description** |
|--------------------|-------------------|------------------------------------------|
| HOST | localhost | LocalAI base URL |
| PORT | 8080 | LocalAI port |
| DEFAULT_MODEL | gpt-3.5-turbo | Name of the model used on most requests. |
** you may want to duplicate localhost into a "Private" environment to avoid saving private settings back to this file **

View File

@@ -38,7 +38,7 @@ helm install local-ai go-skynet/local-ai --create-namespace --namespace local-ai
# Install k8sgpt
helm repo add k8sgpt https://charts.k8sgpt.ai/
helm repo update
helm install release k8sgpt/k8sgpt-operator -n k8sgpt-operator-system --create-namespace
helm install release k8sgpt/k8sgpt-operator -n k8sgpt-operator-system --create-namespace --version 0.0.17
```
Apply the k8sgpt-operator configuration:
@@ -55,7 +55,6 @@ spec:
baseUrl: http://local-ai.local-ai.svc.cluster.local:8080/v1
noCache: false
model: gpt-3.5-turbo
noCache: false
version: v0.3.0
enableAI: true
EOF
@@ -67,4 +66,7 @@ Apply a broken pod:
```
kubectl apply -f broken-pod.yaml
```
```
## ArgoCD Deployment Example
[Deploy K8sgpt + localai with Argocd](https://github.com/tyler-harpool/gitops/tree/main/infra/k8gpt)

View File

@@ -2,12 +2,13 @@ replicaCount: 1
deployment:
# https://quay.io/repository/go-skynet/local-ai?tab=tags
image: quay.io/go-skynet/local-ai:latest
image: quay.io/go-skynet/local-ai:v1.23.0
env:
threads: 4
debug: "true"
context_size: 512
preload_models: '[{ "url": "github:go-skynet/model-gallery/wizard.yaml", "name": "gpt-3.5-turbo", "overrides": { "parameters": { "model": "WizardLM-7B-uncensored.ggmlv3.q5_1" }},"files": [ { "uri": "https://huggingface.co//WizardLM-7B-uncensored-GGML/resolve/main/WizardLM-7B-uncensored.ggmlv3.q5_1.bin", "sha256": "d92a509d83a8ea5e08ba4c2dbaf08f29015932dc2accd627ce0665ac72c2bb2b", "filename": "WizardLM-7B-uncensored.ggmlv3.q5_1" }]}]'
galleries: '[{"name":"model-gallery", "url":"github:go-skynet/model-gallery/index.yaml"}, {"url": "github:go-skynet/model-gallery/huggingface.yaml","name":"huggingface"}]'
preload_models: '[{ "id": "huggingface@thebloke__open-llama-13b-open-instruct-ggml__open-llama-13b-open-instruct.ggmlv3.q3_k_m.bin", "name": "gpt-3.5-turbo", "overrides": { "f16": true, "mmap": true }}]'
modelsPath: "/models"
resources:

View File

@@ -9,7 +9,7 @@ from langchain.vectorstores.base import VectorStoreRetriever
base_path = os.environ.get('OPENAI_API_BASE', 'http://localhost:8080/v1')
# Load and process the text
embedding = OpenAIEmbeddings()
embedding = OpenAIEmbeddings(model="text-embedding-ada-002", openai_api_base=base_path)
persist_directory = 'db'
# Now we can load the persisted database from disk, and use it as normal.

View File

@@ -18,8 +18,8 @@ texts = text_splitter.split_documents(documents)
# Supplying a persist_directory will store the embeddings on disk
persist_directory = 'db'
embedding = OpenAIEmbeddings(model="text-embedding-ada-002")
embedding = OpenAIEmbeddings(model="text-embedding-ada-002", openai_api_base=base_path)
vectordb = Chroma.from_documents(documents=texts, embedding=embedding, persist_directory=persist_directory)
vectordb.persist()
vectordb = None
vectordb = None

View File

@@ -15,7 +15,7 @@ llm_predictor = LLMPredictor(llm=OpenAI(temperature=0, model_name="gpt-3.5-turbo
# Configure prompt parameters and initialise helper
max_input_size = 500
num_output = 256
max_chunk_overlap = 20
max_chunk_overlap = 0.2
prompt_helper = PromptHelper(max_input_size, num_output, max_chunk_overlap)

View File

@@ -15,7 +15,7 @@ llm_predictor = LLMPredictor(llm=OpenAI(temperature=0, model_name="gpt-3.5-turbo
# Configure prompt parameters and initialise helper
max_input_size = 400
num_output = 400
max_chunk_overlap = 30
max_chunk_overlap = 0.3
prompt_helper = PromptHelper(max_input_size, num_output, max_chunk_overlap)

1
examples/streamlit-bot/.gitignore vendored Normal file
View File

@@ -0,0 +1 @@
installer_files

View File

@@ -0,0 +1,21 @@
MIT License
Copyright (c) 2023 Manohar Joshi
Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
in the Software without restriction, including without limitation the rights
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
copies of the Software, and to permit persons to whom the Software is
furnished to do so, subject to the following conditions:
The above copyright notice and this permission notice shall be included in all
copies or substantial portions of the Software.
THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
SOFTWARE.

View File

@@ -0,0 +1,70 @@
import streamlit as st
import time
import requests
import json
def ask(prompt):
url = 'http://localhost:8080/v1/chat/completions'
myobj = {
"model": "ggml-gpt4all-j.bin",
"messages": [{"role": "user", "content": prompt}],
"temperature": 0.9
}
myheaders = { "Content-Type" : "application/json" }
x = requests.post(url, json = myobj, headers=myheaders)
print(x.text)
json_data = json.loads(x.text)
return json_data["choices"][0]["message"]["content"]
def main():
# Page setup
st.set_page_config(page_title="Ask your LLM")
st.header("Ask your Question 💬")
# Initialize chat history
if "messages" not in st.session_state:
st.session_state.messages = []
# Display chat messages from history on app rerun
for message in st.session_state.messages:
with st.chat_message(message["role"]):
st.markdown(message["content"])
# Scroll to bottom
st.markdown(
"""
<script>
var element = document.getElementById("end-of-chat");
element.scrollIntoView({behavior: "smooth"});
</script>
""",
unsafe_allow_html=True,
)
# React to user input
if prompt := st.chat_input("What is up?"):
# Display user message in chat message container
st.chat_message("user").markdown(prompt)
# Add user message to chat history
st.session_state.messages.append({"role": "user", "content": prompt})
print(f"User has asked the following question: {prompt}")
# Process
response = ""
with st.spinner('Processing...'):
response = ask(prompt)
#response = f"Echo: {prompt}"
# Display assistant response in chat message container
with st.chat_message("assistant"):
st.markdown(response)
# Add assistant response to chat history
st.session_state.messages.append({"role": "assistant", "content": response})
if __name__ == "__main__":
main()

View File

@@ -0,0 +1,54 @@
## Streamlit bot
![Screenshot](streamlit-bot.png)
This is an example to deploy a Streamlit bot with LocalAI instead of OpenAI. Instructions are for Windows.
```bash
# Install & run Git Bash
# Clone LocalAI
git clone https://github.com/go-skynet/LocalAI.git
cd LocalAI
# (optional) Checkout a specific LocalAI tag
# git checkout -b build <TAG>
# Use a template from the examples
cp -rf prompt-templates/ggml-gpt4all-j.tmpl models/
# (optional) Edit the .env file to set things like context size and threads
# vim .env
# Download model
curl --progress-bar -C - -O https://gpt4all.io/models/ggml-gpt4all-j.bin > models/ggml-gpt4all-j.bin
# Install & Run Docker Desktop for Windows
https://www.docker.com/products/docker-desktop/
# start with docker-compose
docker-compose up -d --pull always
# or you can build the images with:
# docker-compose up -d --build
# Now API is accessible at localhost:8080
curl http://localhost:8080/v1/models
# {"object":"list","data":[{"id":"ggml-gpt4all-j","object":"model"}]}
curl http://localhost:8080/v1/chat/completions -H "Content-Type: application/json" -d '{
"model": "ggml-gpt4all-j",
"messages": [{"role": "user", "content": "How are you?"}],
"temperature": 0.9
}'
# {"model":"ggml-gpt4all-j","choices":[{"message":{"role":"assistant","content":"I'm doing well, thanks. How about you?"}}]}
cd examples/streamlit-bot
install_requirements.bat
# run the bot
start_windows.bat
# UI will be launched automatically (http://localhost:8501/) in browser.
```

View File

@@ -0,0 +1,31 @@
@echo off
cd /D "%~dp0"
set PATH=%PATH%;%SystemRoot%\system32
echo "%CD%"| findstr /C:" " >nul && echo This script relies on Miniconda which can not be silently installed under a path with spaces. && goto end
@rem fix failed install when installing to a separate drive
set TMP=%cd%\installer_files
set TEMP=%cd%\installer_files
@rem config
set CONDA_ROOT_PREFIX=%cd%\installer_files\conda
set INSTALL_ENV_DIR=%cd%\installer_files\env
@rem environment isolation
set PYTHONNOUSERSITE=1
set PYTHONPATH=
set PYTHONHOME=
set "CUDA_PATH=%INSTALL_ENV_DIR%"
set "CUDA_HOME=%CUDA_PATH%"
@rem activate installer env
call "%CONDA_ROOT_PREFIX%\condabin\conda.bat" activate "%INSTALL_ENV_DIR%" || ( echo. && echo Miniconda hook not found. && goto end )
@rem enter commands
cmd /k "%*"
:end
pause

View File

@@ -0,0 +1,81 @@
@echo off
cd /D "%~dp0"
set PATH=%PATH%;%SystemRoot%\system32
echo "%CD%"| findstr /C:" " >nul && echo This script relies on Miniconda which can not be silently installed under a path with spaces. && goto end
@rem Check for special characters in installation path
set "SPCHARMESSAGE="WARNING: Special characters were detected in the installation path!" " This can cause the installation to fail!""
echo "%CD%"| findstr /R /C:"[!#\$%&()\*+,;<=>?@\[\]\^`{|}~]" >nul && (
call :PrintBigMessage %SPCHARMESSAGE%
)
set SPCHARMESSAGE=
@rem fix failed install when installing to a separate drive
set TMP=%cd%\installer_files
set TEMP=%cd%\installer_files
@rem config
set INSTALL_DIR=%cd%\installer_files
set CONDA_ROOT_PREFIX=%cd%\installer_files\conda
set INSTALL_ENV_DIR=%cd%\installer_files\env
set MINICONDA_DOWNLOAD_URL=https://repo.anaconda.com/miniconda/Miniconda3-py310_23.3.1-0-Windows-x86_64.exe
set conda_exists=F
@rem figure out whether git and conda needs to be installed
call "%CONDA_ROOT_PREFIX%\_conda.exe" --version >nul 2>&1
if "%ERRORLEVEL%" EQU "0" set conda_exists=T
@rem (if necessary) install git and conda into a contained environment
@rem download conda
if "%conda_exists%" == "F" (
echo Downloading Miniconda from %MINICONDA_DOWNLOAD_URL% to %INSTALL_DIR%\miniconda_installer.exe
mkdir "%INSTALL_DIR%"
call curl -Lk "%MINICONDA_DOWNLOAD_URL%" > "%INSTALL_DIR%\miniconda_installer.exe" || ( echo. && echo Miniconda failed to download. && goto end )
echo Installing Miniconda to %CONDA_ROOT_PREFIX%
start /wait "" "%INSTALL_DIR%\miniconda_installer.exe" /InstallationType=JustMe /NoShortcuts=1 /AddToPath=0 /RegisterPython=0 /NoRegistry=1 /S /D=%CONDA_ROOT_PREFIX%
@rem test the conda binary
echo Miniconda version:
call "%CONDA_ROOT_PREFIX%\_conda.exe" --version || ( echo. && echo Miniconda not found. && goto end )
)
@rem create the installer env
if not exist "%INSTALL_ENV_DIR%" (
echo Packages to install: %PACKAGES_TO_INSTALL%
call "%CONDA_ROOT_PREFIX%\_conda.exe" create --no-shortcuts -y -k --prefix "%INSTALL_ENV_DIR%" python=3.10 || ( echo. && echo Conda environment creation failed. && goto end )
)
@rem check if conda environment was actually created
if not exist "%INSTALL_ENV_DIR%\python.exe" ( echo. && echo Conda environment is empty. && goto end )
@rem environment isolation
set PYTHONNOUSERSITE=1
set PYTHONPATH=
set PYTHONHOME=
set "CUDA_PATH=%INSTALL_ENV_DIR%"
set "CUDA_HOME=%CUDA_PATH%"
@rem activate installer env
call "%CONDA_ROOT_PREFIX%\condabin\conda.bat" activate "%INSTALL_ENV_DIR%" || ( echo. && echo Miniconda hook not found. && goto end )
@rem setup installer env
call pip install -r requirements.txt
@rem below are functions for the script next line skips these during normal execution
goto end
:PrintBigMessage
echo. && echo.
echo *******************************************************************
for %%M in (%*) do echo * %%~M
echo *******************************************************************
echo. && echo.
exit /b
:end
pause

View File

@@ -0,0 +1,2 @@
streamlit==1.26.0
requests

View File

@@ -0,0 +1,81 @@
@echo off
cd /D "%~dp0"
set PATH=%PATH%;%SystemRoot%\system32
echo "%CD%"| findstr /C:" " >nul && echo This script relies on Miniconda which can not be silently installed under a path with spaces. && goto end
@rem Check for special characters in installation path
set "SPCHARMESSAGE="WARNING: Special characters were detected in the installation path!" " This can cause the installation to fail!""
echo "%CD%"| findstr /R /C:"[!#\$%&()\*+,;<=>?@\[\]\^`{|}~]" >nul && (
call :PrintBigMessage %SPCHARMESSAGE%
)
set SPCHARMESSAGE=
@rem fix failed install when installing to a separate drive
set TMP=%cd%\installer_files
set TEMP=%cd%\installer_files
@rem config
set INSTALL_DIR=%cd%\installer_files
set CONDA_ROOT_PREFIX=%cd%\installer_files\conda
set INSTALL_ENV_DIR=%cd%\installer_files\env
set MINICONDA_DOWNLOAD_URL=https://repo.anaconda.com/miniconda/Miniconda3-py310_23.3.1-0-Windows-x86_64.exe
set conda_exists=F
@rem figure out whether git and conda needs to be installed
call "%CONDA_ROOT_PREFIX%\_conda.exe" --version >nul 2>&1
if "%ERRORLEVEL%" EQU "0" set conda_exists=T
@rem (if necessary) install git and conda into a contained environment
@rem download conda
if "%conda_exists%" == "F" (
echo Downloading Miniconda from %MINICONDA_DOWNLOAD_URL% to %INSTALL_DIR%\miniconda_installer.exe
mkdir "%INSTALL_DIR%"
call curl -Lk "%MINICONDA_DOWNLOAD_URL%" > "%INSTALL_DIR%\miniconda_installer.exe" || ( echo. && echo Miniconda failed to download. && goto end )
echo Installing Miniconda to %CONDA_ROOT_PREFIX%
start /wait "" "%INSTALL_DIR%\miniconda_installer.exe" /InstallationType=JustMe /NoShortcuts=1 /AddToPath=0 /RegisterPython=0 /NoRegistry=1 /S /D=%CONDA_ROOT_PREFIX%
@rem test the conda binary
echo Miniconda version:
call "%CONDA_ROOT_PREFIX%\_conda.exe" --version || ( echo. && echo Miniconda not found. && goto end )
)
@rem create the installer env
if not exist "%INSTALL_ENV_DIR%" (
echo Packages to install: %PACKAGES_TO_INSTALL%
call "%CONDA_ROOT_PREFIX%\_conda.exe" create --no-shortcuts -y -k --prefix "%INSTALL_ENV_DIR%" python=3.10 || ( echo. && echo Conda environment creation failed. && goto end )
)
@rem check if conda environment was actually created
if not exist "%INSTALL_ENV_DIR%\python.exe" ( echo. && echo Conda environment is empty. && goto end )
@rem environment isolation
set PYTHONNOUSERSITE=1
set PYTHONPATH=
set PYTHONHOME=
set "CUDA_PATH=%INSTALL_ENV_DIR%"
set "CUDA_HOME=%CUDA_PATH%"
@rem activate installer env
call "%CONDA_ROOT_PREFIX%\condabin\conda.bat" activate "%INSTALL_ENV_DIR%" || ( echo. && echo Miniconda hook not found. && goto end )
@rem setup installer env
streamlit run Main.py
@rem below are functions for the script next line skips these during normal execution
goto end
:PrintBigMessage
echo. && echo.
echo *******************************************************************
for %%M in (%*) do echo * %%~M
echo *******************************************************************
echo. && echo.
exit /b
:end
pause

View File

Binary file not shown.

After

Width:  |  Height:  |  Size: 32 KiB

View File

@@ -22,7 +22,7 @@ services:
- 'PRELOAD_MODELS=[{"url": "github:go-skynet/model-gallery/gpt4all-j.yaml", "name": "gpt-3.5-turbo"}, {"url": "github:go-skynet/model-gallery/stablediffusion.yaml"}, {"url": "github:go-skynet/model-gallery/whisper-base.yaml", "name": "whisper-1"}]'
volumes:
- ./models:/models:cached
command: ["/usr/bin/local-ai" ]
command: ["/usr/bin/local-ai"]
chatgpt_telegram_bot:
container_name: chatgpt_telegram_bot
command: python3 bot/bot.py

112
extra/grpc/autogptq/autogptq.py Executable file
View File

@@ -0,0 +1,112 @@
#!/usr/bin/env python3
import grpc
from concurrent import futures
import time
import backend_pb2
import backend_pb2_grpc
import argparse
import signal
import sys
import os
from auto_gptq import AutoGPTQForCausalLM, BaseQuantizeConfig
from pathlib import Path
from transformers import AutoTokenizer
from transformers import TextGenerationPipeline
_ONE_DAY_IN_SECONDS = 60 * 60 * 24
# If MAX_WORKERS are specified in the environment use it, otherwise default to 1
MAX_WORKERS = int(os.environ.get('PYTHON_GRPC_MAX_WORKERS', '1'))
# Implement the BackendServicer class with the service methods
class BackendServicer(backend_pb2_grpc.BackendServicer):
def Health(self, request, context):
return backend_pb2.Reply(message=bytes("OK", 'utf-8'))
def LoadModel(self, request, context):
try:
device = "cuda:0"
if request.Device != "":
device = request.Device
tokenizer = AutoTokenizer.from_pretrained(request.Model, use_fast=request.UseFastTokenizer)
model = AutoGPTQForCausalLM.from_quantized(request.Model,
model_basename=request.ModelBaseName,
use_safetensors=True,
trust_remote_code=True,
device=device,
use_triton=request.UseTriton,
quantize_config=None)
self.model = model
self.tokenizer = tokenizer
except Exception as err:
return backend_pb2.Result(success=False, message=f"Unexpected {err=}, {type(err)=}")
return backend_pb2.Result(message="Model loaded successfully", success=True)
def Predict(self, request, context):
penalty = 1.0
if request.Penalty != 0.0:
penalty = request.Penalty
tokens = 512
if request.Tokens != 0:
tokens = request.Tokens
top_p = 0.95
if request.TopP != 0.0:
top_p = request.TopP
# Implement Predict RPC
pipeline = TextGenerationPipeline(
model=self.model,
tokenizer=self.tokenizer,
max_new_tokens=tokens,
temperature=request.Temperature,
top_p=top_p,
repetition_penalty=penalty,
)
t = pipeline(request.Prompt)[0]["generated_text"]
# Remove prompt from response if present
if request.Prompt in t:
t = t.replace(request.Prompt, "")
return backend_pb2.Result(message=bytes(t, encoding='utf-8'))
def PredictStream(self, request, context):
# Implement PredictStream RPC
#for reply in some_data_generator():
# yield reply
# Not implemented yet
return self.Predict(request, context)
def serve(address):
server = grpc.server(futures.ThreadPoolExecutor(max_workers=MAX_WORKERS))
backend_pb2_grpc.add_BackendServicer_to_server(BackendServicer(), server)
server.add_insecure_port(address)
server.start()
print("Server started. Listening on: " + address, file=sys.stderr)
# Define the signal handler function
def signal_handler(sig, frame):
print("Received termination signal. Shutting down...")
server.stop(0)
sys.exit(0)
# Set the signal handlers for SIGINT and SIGTERM
signal.signal(signal.SIGINT, signal_handler)
signal.signal(signal.SIGTERM, signal_handler)
try:
while True:
time.sleep(_ONE_DAY_IN_SECONDS)
except KeyboardInterrupt:
server.stop(0)
if __name__ == "__main__":
parser = argparse.ArgumentParser(description="Run the gRPC server.")
parser.add_argument(
"--addr", default="localhost:50051", help="The address to bind the server to."
)
args = parser.parse_args()
serve(args.addr)

View File

File diff suppressed because one or more lines are too long

View File

@@ -0,0 +1,363 @@
# Generated by the gRPC Python protocol compiler plugin. DO NOT EDIT!
"""Client and server classes corresponding to protobuf-defined services."""
import grpc
import backend_pb2 as backend__pb2
class BackendStub(object):
"""Missing associated documentation comment in .proto file."""
def __init__(self, channel):
"""Constructor.
Args:
channel: A grpc.Channel.
"""
self.Health = channel.unary_unary(
'/backend.Backend/Health',
request_serializer=backend__pb2.HealthMessage.SerializeToString,
response_deserializer=backend__pb2.Reply.FromString,
)
self.Predict = channel.unary_unary(
'/backend.Backend/Predict',
request_serializer=backend__pb2.PredictOptions.SerializeToString,
response_deserializer=backend__pb2.Reply.FromString,
)
self.LoadModel = channel.unary_unary(
'/backend.Backend/LoadModel',
request_serializer=backend__pb2.ModelOptions.SerializeToString,
response_deserializer=backend__pb2.Result.FromString,
)
self.PredictStream = channel.unary_stream(
'/backend.Backend/PredictStream',
request_serializer=backend__pb2.PredictOptions.SerializeToString,
response_deserializer=backend__pb2.Reply.FromString,
)
self.Embedding = channel.unary_unary(
'/backend.Backend/Embedding',
request_serializer=backend__pb2.PredictOptions.SerializeToString,
response_deserializer=backend__pb2.EmbeddingResult.FromString,
)
self.GenerateImage = channel.unary_unary(
'/backend.Backend/GenerateImage',
request_serializer=backend__pb2.GenerateImageRequest.SerializeToString,
response_deserializer=backend__pb2.Result.FromString,
)
self.AudioTranscription = channel.unary_unary(
'/backend.Backend/AudioTranscription',
request_serializer=backend__pb2.TranscriptRequest.SerializeToString,
response_deserializer=backend__pb2.TranscriptResult.FromString,
)
self.TTS = channel.unary_unary(
'/backend.Backend/TTS',
request_serializer=backend__pb2.TTSRequest.SerializeToString,
response_deserializer=backend__pb2.Result.FromString,
)
self.TokenizeString = channel.unary_unary(
'/backend.Backend/TokenizeString',
request_serializer=backend__pb2.PredictOptions.SerializeToString,
response_deserializer=backend__pb2.TokenizationResponse.FromString,
)
self.Status = channel.unary_unary(
'/backend.Backend/Status',
request_serializer=backend__pb2.HealthMessage.SerializeToString,
response_deserializer=backend__pb2.StatusResponse.FromString,
)
class BackendServicer(object):
"""Missing associated documentation comment in .proto file."""
def Health(self, request, context):
"""Missing associated documentation comment in .proto file."""
context.set_code(grpc.StatusCode.UNIMPLEMENTED)
context.set_details('Method not implemented!')
raise NotImplementedError('Method not implemented!')
def Predict(self, request, context):
"""Missing associated documentation comment in .proto file."""
context.set_code(grpc.StatusCode.UNIMPLEMENTED)
context.set_details('Method not implemented!')
raise NotImplementedError('Method not implemented!')
def LoadModel(self, request, context):
"""Missing associated documentation comment in .proto file."""
context.set_code(grpc.StatusCode.UNIMPLEMENTED)
context.set_details('Method not implemented!')
raise NotImplementedError('Method not implemented!')
def PredictStream(self, request, context):
"""Missing associated documentation comment in .proto file."""
context.set_code(grpc.StatusCode.UNIMPLEMENTED)
context.set_details('Method not implemented!')
raise NotImplementedError('Method not implemented!')
def Embedding(self, request, context):
"""Missing associated documentation comment in .proto file."""
context.set_code(grpc.StatusCode.UNIMPLEMENTED)
context.set_details('Method not implemented!')
raise NotImplementedError('Method not implemented!')
def GenerateImage(self, request, context):
"""Missing associated documentation comment in .proto file."""
context.set_code(grpc.StatusCode.UNIMPLEMENTED)
context.set_details('Method not implemented!')
raise NotImplementedError('Method not implemented!')
def AudioTranscription(self, request, context):
"""Missing associated documentation comment in .proto file."""
context.set_code(grpc.StatusCode.UNIMPLEMENTED)
context.set_details('Method not implemented!')
raise NotImplementedError('Method not implemented!')
def TTS(self, request, context):
"""Missing associated documentation comment in .proto file."""
context.set_code(grpc.StatusCode.UNIMPLEMENTED)
context.set_details('Method not implemented!')
raise NotImplementedError('Method not implemented!')
def TokenizeString(self, request, context):
"""Missing associated documentation comment in .proto file."""
context.set_code(grpc.StatusCode.UNIMPLEMENTED)
context.set_details('Method not implemented!')
raise NotImplementedError('Method not implemented!')
def Status(self, request, context):
"""Missing associated documentation comment in .proto file."""
context.set_code(grpc.StatusCode.UNIMPLEMENTED)
context.set_details('Method not implemented!')
raise NotImplementedError('Method not implemented!')
def add_BackendServicer_to_server(servicer, server):
rpc_method_handlers = {
'Health': grpc.unary_unary_rpc_method_handler(
servicer.Health,
request_deserializer=backend__pb2.HealthMessage.FromString,
response_serializer=backend__pb2.Reply.SerializeToString,
),
'Predict': grpc.unary_unary_rpc_method_handler(
servicer.Predict,
request_deserializer=backend__pb2.PredictOptions.FromString,
response_serializer=backend__pb2.Reply.SerializeToString,
),
'LoadModel': grpc.unary_unary_rpc_method_handler(
servicer.LoadModel,
request_deserializer=backend__pb2.ModelOptions.FromString,
response_serializer=backend__pb2.Result.SerializeToString,
),
'PredictStream': grpc.unary_stream_rpc_method_handler(
servicer.PredictStream,
request_deserializer=backend__pb2.PredictOptions.FromString,
response_serializer=backend__pb2.Reply.SerializeToString,
),
'Embedding': grpc.unary_unary_rpc_method_handler(
servicer.Embedding,
request_deserializer=backend__pb2.PredictOptions.FromString,
response_serializer=backend__pb2.EmbeddingResult.SerializeToString,
),
'GenerateImage': grpc.unary_unary_rpc_method_handler(
servicer.GenerateImage,
request_deserializer=backend__pb2.GenerateImageRequest.FromString,
response_serializer=backend__pb2.Result.SerializeToString,
),
'AudioTranscription': grpc.unary_unary_rpc_method_handler(
servicer.AudioTranscription,
request_deserializer=backend__pb2.TranscriptRequest.FromString,
response_serializer=backend__pb2.TranscriptResult.SerializeToString,
),
'TTS': grpc.unary_unary_rpc_method_handler(
servicer.TTS,
request_deserializer=backend__pb2.TTSRequest.FromString,
response_serializer=backend__pb2.Result.SerializeToString,
),
'TokenizeString': grpc.unary_unary_rpc_method_handler(
servicer.TokenizeString,
request_deserializer=backend__pb2.PredictOptions.FromString,
response_serializer=backend__pb2.TokenizationResponse.SerializeToString,
),
'Status': grpc.unary_unary_rpc_method_handler(
servicer.Status,
request_deserializer=backend__pb2.HealthMessage.FromString,
response_serializer=backend__pb2.StatusResponse.SerializeToString,
),
}
generic_handler = grpc.method_handlers_generic_handler(
'backend.Backend', rpc_method_handlers)
server.add_generic_rpc_handlers((generic_handler,))
# This class is part of an EXPERIMENTAL API.
class Backend(object):
"""Missing associated documentation comment in .proto file."""
@staticmethod
def Health(request,
target,
options=(),
channel_credentials=None,
call_credentials=None,
insecure=False,
compression=None,
wait_for_ready=None,
timeout=None,
metadata=None):
return grpc.experimental.unary_unary(request, target, '/backend.Backend/Health',
backend__pb2.HealthMessage.SerializeToString,
backend__pb2.Reply.FromString,
options, channel_credentials,
insecure, call_credentials, compression, wait_for_ready, timeout, metadata)
@staticmethod
def Predict(request,
target,
options=(),
channel_credentials=None,
call_credentials=None,
insecure=False,
compression=None,
wait_for_ready=None,
timeout=None,
metadata=None):
return grpc.experimental.unary_unary(request, target, '/backend.Backend/Predict',
backend__pb2.PredictOptions.SerializeToString,
backend__pb2.Reply.FromString,
options, channel_credentials,
insecure, call_credentials, compression, wait_for_ready, timeout, metadata)
@staticmethod
def LoadModel(request,
target,
options=(),
channel_credentials=None,
call_credentials=None,
insecure=False,
compression=None,
wait_for_ready=None,
timeout=None,
metadata=None):
return grpc.experimental.unary_unary(request, target, '/backend.Backend/LoadModel',
backend__pb2.ModelOptions.SerializeToString,
backend__pb2.Result.FromString,
options, channel_credentials,
insecure, call_credentials, compression, wait_for_ready, timeout, metadata)
@staticmethod
def PredictStream(request,
target,
options=(),
channel_credentials=None,
call_credentials=None,
insecure=False,
compression=None,
wait_for_ready=None,
timeout=None,
metadata=None):
return grpc.experimental.unary_stream(request, target, '/backend.Backend/PredictStream',
backend__pb2.PredictOptions.SerializeToString,
backend__pb2.Reply.FromString,
options, channel_credentials,
insecure, call_credentials, compression, wait_for_ready, timeout, metadata)
@staticmethod
def Embedding(request,
target,
options=(),
channel_credentials=None,
call_credentials=None,
insecure=False,
compression=None,
wait_for_ready=None,
timeout=None,
metadata=None):
return grpc.experimental.unary_unary(request, target, '/backend.Backend/Embedding',
backend__pb2.PredictOptions.SerializeToString,
backend__pb2.EmbeddingResult.FromString,
options, channel_credentials,
insecure, call_credentials, compression, wait_for_ready, timeout, metadata)
@staticmethod
def GenerateImage(request,
target,
options=(),
channel_credentials=None,
call_credentials=None,
insecure=False,
compression=None,
wait_for_ready=None,
timeout=None,
metadata=None):
return grpc.experimental.unary_unary(request, target, '/backend.Backend/GenerateImage',
backend__pb2.GenerateImageRequest.SerializeToString,
backend__pb2.Result.FromString,
options, channel_credentials,
insecure, call_credentials, compression, wait_for_ready, timeout, metadata)
@staticmethod
def AudioTranscription(request,
target,
options=(),
channel_credentials=None,
call_credentials=None,
insecure=False,
compression=None,
wait_for_ready=None,
timeout=None,
metadata=None):
return grpc.experimental.unary_unary(request, target, '/backend.Backend/AudioTranscription',
backend__pb2.TranscriptRequest.SerializeToString,
backend__pb2.TranscriptResult.FromString,
options, channel_credentials,
insecure, call_credentials, compression, wait_for_ready, timeout, metadata)
@staticmethod
def TTS(request,
target,
options=(),
channel_credentials=None,
call_credentials=None,
insecure=False,
compression=None,
wait_for_ready=None,
timeout=None,
metadata=None):
return grpc.experimental.unary_unary(request, target, '/backend.Backend/TTS',
backend__pb2.TTSRequest.SerializeToString,
backend__pb2.Result.FromString,
options, channel_credentials,
insecure, call_credentials, compression, wait_for_ready, timeout, metadata)
@staticmethod
def TokenizeString(request,
target,
options=(),
channel_credentials=None,
call_credentials=None,
insecure=False,
compression=None,
wait_for_ready=None,
timeout=None,
metadata=None):
return grpc.experimental.unary_unary(request, target, '/backend.Backend/TokenizeString',
backend__pb2.PredictOptions.SerializeToString,
backend__pb2.TokenizationResponse.FromString,
options, channel_credentials,
insecure, call_credentials, compression, wait_for_ready, timeout, metadata)
@staticmethod
def Status(request,
target,
options=(),
channel_credentials=None,
call_credentials=None,
insecure=False,
compression=None,
wait_for_ready=None,
timeout=None,
metadata=None):
return grpc.experimental.unary_unary(request, target, '/backend.Backend/Status',
backend__pb2.HealthMessage.SerializeToString,
backend__pb2.StatusResponse.FromString,
options, channel_credentials,
insecure, call_credentials, compression, wait_for_ready, timeout, metadata)

View File

File diff suppressed because one or more lines are too long

View File

@@ -0,0 +1,363 @@
# Generated by the gRPC Python protocol compiler plugin. DO NOT EDIT!
"""Client and server classes corresponding to protobuf-defined services."""
import grpc
import backend_pb2 as backend__pb2
class BackendStub(object):
"""Missing associated documentation comment in .proto file."""
def __init__(self, channel):
"""Constructor.
Args:
channel: A grpc.Channel.
"""
self.Health = channel.unary_unary(
'/backend.Backend/Health',
request_serializer=backend__pb2.HealthMessage.SerializeToString,
response_deserializer=backend__pb2.Reply.FromString,
)
self.Predict = channel.unary_unary(
'/backend.Backend/Predict',
request_serializer=backend__pb2.PredictOptions.SerializeToString,
response_deserializer=backend__pb2.Reply.FromString,
)
self.LoadModel = channel.unary_unary(
'/backend.Backend/LoadModel',
request_serializer=backend__pb2.ModelOptions.SerializeToString,
response_deserializer=backend__pb2.Result.FromString,
)
self.PredictStream = channel.unary_stream(
'/backend.Backend/PredictStream',
request_serializer=backend__pb2.PredictOptions.SerializeToString,
response_deserializer=backend__pb2.Reply.FromString,
)
self.Embedding = channel.unary_unary(
'/backend.Backend/Embedding',
request_serializer=backend__pb2.PredictOptions.SerializeToString,
response_deserializer=backend__pb2.EmbeddingResult.FromString,
)
self.GenerateImage = channel.unary_unary(
'/backend.Backend/GenerateImage',
request_serializer=backend__pb2.GenerateImageRequest.SerializeToString,
response_deserializer=backend__pb2.Result.FromString,
)
self.AudioTranscription = channel.unary_unary(
'/backend.Backend/AudioTranscription',
request_serializer=backend__pb2.TranscriptRequest.SerializeToString,
response_deserializer=backend__pb2.TranscriptResult.FromString,
)
self.TTS = channel.unary_unary(
'/backend.Backend/TTS',
request_serializer=backend__pb2.TTSRequest.SerializeToString,
response_deserializer=backend__pb2.Result.FromString,
)
self.TokenizeString = channel.unary_unary(
'/backend.Backend/TokenizeString',
request_serializer=backend__pb2.PredictOptions.SerializeToString,
response_deserializer=backend__pb2.TokenizationResponse.FromString,
)
self.Status = channel.unary_unary(
'/backend.Backend/Status',
request_serializer=backend__pb2.HealthMessage.SerializeToString,
response_deserializer=backend__pb2.StatusResponse.FromString,
)
class BackendServicer(object):
"""Missing associated documentation comment in .proto file."""
def Health(self, request, context):
"""Missing associated documentation comment in .proto file."""
context.set_code(grpc.StatusCode.UNIMPLEMENTED)
context.set_details('Method not implemented!')
raise NotImplementedError('Method not implemented!')
def Predict(self, request, context):
"""Missing associated documentation comment in .proto file."""
context.set_code(grpc.StatusCode.UNIMPLEMENTED)
context.set_details('Method not implemented!')
raise NotImplementedError('Method not implemented!')
def LoadModel(self, request, context):
"""Missing associated documentation comment in .proto file."""
context.set_code(grpc.StatusCode.UNIMPLEMENTED)
context.set_details('Method not implemented!')
raise NotImplementedError('Method not implemented!')
def PredictStream(self, request, context):
"""Missing associated documentation comment in .proto file."""
context.set_code(grpc.StatusCode.UNIMPLEMENTED)
context.set_details('Method not implemented!')
raise NotImplementedError('Method not implemented!')
def Embedding(self, request, context):
"""Missing associated documentation comment in .proto file."""
context.set_code(grpc.StatusCode.UNIMPLEMENTED)
context.set_details('Method not implemented!')
raise NotImplementedError('Method not implemented!')
def GenerateImage(self, request, context):
"""Missing associated documentation comment in .proto file."""
context.set_code(grpc.StatusCode.UNIMPLEMENTED)
context.set_details('Method not implemented!')
raise NotImplementedError('Method not implemented!')
def AudioTranscription(self, request, context):
"""Missing associated documentation comment in .proto file."""
context.set_code(grpc.StatusCode.UNIMPLEMENTED)
context.set_details('Method not implemented!')
raise NotImplementedError('Method not implemented!')
def TTS(self, request, context):
"""Missing associated documentation comment in .proto file."""
context.set_code(grpc.StatusCode.UNIMPLEMENTED)
context.set_details('Method not implemented!')
raise NotImplementedError('Method not implemented!')
def TokenizeString(self, request, context):
"""Missing associated documentation comment in .proto file."""
context.set_code(grpc.StatusCode.UNIMPLEMENTED)
context.set_details('Method not implemented!')
raise NotImplementedError('Method not implemented!')
def Status(self, request, context):
"""Missing associated documentation comment in .proto file."""
context.set_code(grpc.StatusCode.UNIMPLEMENTED)
context.set_details('Method not implemented!')
raise NotImplementedError('Method not implemented!')
def add_BackendServicer_to_server(servicer, server):
rpc_method_handlers = {
'Health': grpc.unary_unary_rpc_method_handler(
servicer.Health,
request_deserializer=backend__pb2.HealthMessage.FromString,
response_serializer=backend__pb2.Reply.SerializeToString,
),
'Predict': grpc.unary_unary_rpc_method_handler(
servicer.Predict,
request_deserializer=backend__pb2.PredictOptions.FromString,
response_serializer=backend__pb2.Reply.SerializeToString,
),
'LoadModel': grpc.unary_unary_rpc_method_handler(
servicer.LoadModel,
request_deserializer=backend__pb2.ModelOptions.FromString,
response_serializer=backend__pb2.Result.SerializeToString,
),
'PredictStream': grpc.unary_stream_rpc_method_handler(
servicer.PredictStream,
request_deserializer=backend__pb2.PredictOptions.FromString,
response_serializer=backend__pb2.Reply.SerializeToString,
),
'Embedding': grpc.unary_unary_rpc_method_handler(
servicer.Embedding,
request_deserializer=backend__pb2.PredictOptions.FromString,
response_serializer=backend__pb2.EmbeddingResult.SerializeToString,
),
'GenerateImage': grpc.unary_unary_rpc_method_handler(
servicer.GenerateImage,
request_deserializer=backend__pb2.GenerateImageRequest.FromString,
response_serializer=backend__pb2.Result.SerializeToString,
),
'AudioTranscription': grpc.unary_unary_rpc_method_handler(
servicer.AudioTranscription,
request_deserializer=backend__pb2.TranscriptRequest.FromString,
response_serializer=backend__pb2.TranscriptResult.SerializeToString,
),
'TTS': grpc.unary_unary_rpc_method_handler(
servicer.TTS,
request_deserializer=backend__pb2.TTSRequest.FromString,
response_serializer=backend__pb2.Result.SerializeToString,
),
'TokenizeString': grpc.unary_unary_rpc_method_handler(
servicer.TokenizeString,
request_deserializer=backend__pb2.PredictOptions.FromString,
response_serializer=backend__pb2.TokenizationResponse.SerializeToString,
),
'Status': grpc.unary_unary_rpc_method_handler(
servicer.Status,
request_deserializer=backend__pb2.HealthMessage.FromString,
response_serializer=backend__pb2.StatusResponse.SerializeToString,
),
}
generic_handler = grpc.method_handlers_generic_handler(
'backend.Backend', rpc_method_handlers)
server.add_generic_rpc_handlers((generic_handler,))
# This class is part of an EXPERIMENTAL API.
class Backend(object):
"""Missing associated documentation comment in .proto file."""
@staticmethod
def Health(request,
target,
options=(),
channel_credentials=None,
call_credentials=None,
insecure=False,
compression=None,
wait_for_ready=None,
timeout=None,
metadata=None):
return grpc.experimental.unary_unary(request, target, '/backend.Backend/Health',
backend__pb2.HealthMessage.SerializeToString,
backend__pb2.Reply.FromString,
options, channel_credentials,
insecure, call_credentials, compression, wait_for_ready, timeout, metadata)
@staticmethod
def Predict(request,
target,
options=(),
channel_credentials=None,
call_credentials=None,
insecure=False,
compression=None,
wait_for_ready=None,
timeout=None,
metadata=None):
return grpc.experimental.unary_unary(request, target, '/backend.Backend/Predict',
backend__pb2.PredictOptions.SerializeToString,
backend__pb2.Reply.FromString,
options, channel_credentials,
insecure, call_credentials, compression, wait_for_ready, timeout, metadata)
@staticmethod
def LoadModel(request,
target,
options=(),
channel_credentials=None,
call_credentials=None,
insecure=False,
compression=None,
wait_for_ready=None,
timeout=None,
metadata=None):
return grpc.experimental.unary_unary(request, target, '/backend.Backend/LoadModel',
backend__pb2.ModelOptions.SerializeToString,
backend__pb2.Result.FromString,
options, channel_credentials,
insecure, call_credentials, compression, wait_for_ready, timeout, metadata)
@staticmethod
def PredictStream(request,
target,
options=(),
channel_credentials=None,
call_credentials=None,
insecure=False,
compression=None,
wait_for_ready=None,
timeout=None,
metadata=None):
return grpc.experimental.unary_stream(request, target, '/backend.Backend/PredictStream',
backend__pb2.PredictOptions.SerializeToString,
backend__pb2.Reply.FromString,
options, channel_credentials,
insecure, call_credentials, compression, wait_for_ready, timeout, metadata)
@staticmethod
def Embedding(request,
target,
options=(),
channel_credentials=None,
call_credentials=None,
insecure=False,
compression=None,
wait_for_ready=None,
timeout=None,
metadata=None):
return grpc.experimental.unary_unary(request, target, '/backend.Backend/Embedding',
backend__pb2.PredictOptions.SerializeToString,
backend__pb2.EmbeddingResult.FromString,
options, channel_credentials,
insecure, call_credentials, compression, wait_for_ready, timeout, metadata)
@staticmethod
def GenerateImage(request,
target,
options=(),
channel_credentials=None,
call_credentials=None,
insecure=False,
compression=None,
wait_for_ready=None,
timeout=None,
metadata=None):
return grpc.experimental.unary_unary(request, target, '/backend.Backend/GenerateImage',
backend__pb2.GenerateImageRequest.SerializeToString,
backend__pb2.Result.FromString,
options, channel_credentials,
insecure, call_credentials, compression, wait_for_ready, timeout, metadata)
@staticmethod
def AudioTranscription(request,
target,
options=(),
channel_credentials=None,
call_credentials=None,
insecure=False,
compression=None,
wait_for_ready=None,
timeout=None,
metadata=None):
return grpc.experimental.unary_unary(request, target, '/backend.Backend/AudioTranscription',
backend__pb2.TranscriptRequest.SerializeToString,
backend__pb2.TranscriptResult.FromString,
options, channel_credentials,
insecure, call_credentials, compression, wait_for_ready, timeout, metadata)
@staticmethod
def TTS(request,
target,
options=(),
channel_credentials=None,
call_credentials=None,
insecure=False,
compression=None,
wait_for_ready=None,
timeout=None,
metadata=None):
return grpc.experimental.unary_unary(request, target, '/backend.Backend/TTS',
backend__pb2.TTSRequest.SerializeToString,
backend__pb2.Result.FromString,
options, channel_credentials,
insecure, call_credentials, compression, wait_for_ready, timeout, metadata)
@staticmethod
def TokenizeString(request,
target,
options=(),
channel_credentials=None,
call_credentials=None,
insecure=False,
compression=None,
wait_for_ready=None,
timeout=None,
metadata=None):
return grpc.experimental.unary_unary(request, target, '/backend.Backend/TokenizeString',
backend__pb2.PredictOptions.SerializeToString,
backend__pb2.TokenizationResponse.FromString,
options, channel_credentials,
insecure, call_credentials, compression, wait_for_ready, timeout, metadata)
@staticmethod
def Status(request,
target,
options=(),
channel_credentials=None,
call_credentials=None,
insecure=False,
compression=None,
wait_for_ready=None,
timeout=None,
metadata=None):
return grpc.experimental.unary_unary(request, target, '/backend.Backend/Status',
backend__pb2.HealthMessage.SerializeToString,
backend__pb2.StatusResponse.FromString,
options, channel_credentials,
insecure, call_credentials, compression, wait_for_ready, timeout, metadata)

View File

@@ -0,0 +1,86 @@
#!/usr/bin/env python3
import grpc
from concurrent import futures
import time
import backend_pb2
import backend_pb2_grpc
import argparse
import signal
import sys
import os
from auto_gptq import AutoGPTQForCausalLM, BaseQuantizeConfig
from pathlib import Path
from bark import SAMPLE_RATE, generate_audio, preload_models
from scipy.io.wavfile import write as write_wav
_ONE_DAY_IN_SECONDS = 60 * 60 * 24
# If MAX_WORKERS are specified in the environment use it, otherwise default to 1
MAX_WORKERS = int(os.environ.get('PYTHON_GRPC_MAX_WORKERS', '1'))
# Implement the BackendServicer class with the service methods
class BackendServicer(backend_pb2_grpc.BackendServicer):
def Health(self, request, context):
return backend_pb2.Reply(message=bytes("OK", 'utf-8'))
def LoadModel(self, request, context):
model_name = request.Model
try:
print("Preparing models, please wait", file=sys.stderr)
# download and load all models
preload_models()
except Exception as err:
return backend_pb2.Result(success=False, message=f"Unexpected {err=}, {type(err)=}")
# Implement your logic here for the LoadModel service
# Replace this with your desired response
return backend_pb2.Result(message="Model loaded successfully", success=True)
def TTS(self, request, context):
model = request.model
print(request, file=sys.stderr)
try:
audio_array = None
if model != "":
audio_array = generate_audio(request.text, history_prompt=model)
else:
audio_array = generate_audio(request.text)
print("saving to", request.dst, file=sys.stderr)
# save audio to disk
write_wav(request.dst, SAMPLE_RATE, audio_array)
print("saved to", request.dst, file=sys.stderr)
print("tts for", file=sys.stderr)
print(request, file=sys.stderr)
except Exception as err:
return backend_pb2.Result(success=False, message=f"Unexpected {err=}, {type(err)=}")
return backend_pb2.Result(success=True)
def serve(address):
server = grpc.server(futures.ThreadPoolExecutor(max_workers=MAX_WORKERS))
backend_pb2_grpc.add_BackendServicer_to_server(BackendServicer(), server)
server.add_insecure_port(address)
server.start()
print("Server started. Listening on: " + address, file=sys.stderr)
# Define the signal handler function
def signal_handler(sig, frame):
print("Received termination signal. Shutting down...")
server.stop(0)
sys.exit(0)
# Set the signal handlers for SIGINT and SIGTERM
signal.signal(signal.SIGINT, signal_handler)
signal.signal(signal.SIGTERM, signal_handler)
try:
while True:
time.sleep(_ONE_DAY_IN_SECONDS)
except KeyboardInterrupt:
server.stop(0)
if __name__ == "__main__":
parser = argparse.ArgumentParser(description="Run the gRPC server.")
parser.add_argument(
"--addr", default="localhost:50051", help="The address to bind the server to."
)
args = parser.parse_args()
serve(args.addr)

View File

@@ -0,0 +1,381 @@
#!/usr/bin/env python3
import grpc
from concurrent import futures
import time
import backend_pb2
import backend_pb2_grpc
import argparse
import signal
import sys
import os
# import diffusers
import torch
from torch import autocast
from diffusers import StableDiffusionXLPipeline, StableDiffusionDepth2ImgPipeline, DPMSolverMultistepScheduler, StableDiffusionPipeline, DiffusionPipeline, EulerAncestralDiscreteScheduler
from diffusers.pipelines.stable_diffusion import safety_checker
from compel import Compel
from PIL import Image
from io import BytesIO
from diffusers import StableDiffusionImg2ImgPipeline
from transformers import CLIPTextModel
from enum import Enum
from collections import defaultdict
from safetensors.torch import load_file
_ONE_DAY_IN_SECONDS = 60 * 60 * 24
COMPEL=os.environ.get("COMPEL", "1") == "1"
CLIPSKIP=os.environ.get("CLIPSKIP", "1") == "1"
# If MAX_WORKERS are specified in the environment use it, otherwise default to 1
MAX_WORKERS = int(os.environ.get('PYTHON_GRPC_MAX_WORKERS', '1'))
# https://github.com/CompVis/stable-diffusion/issues/239#issuecomment-1627615287
def sc(self, clip_input, images) : return images, [False for i in images]
# edit the StableDiffusionSafetyChecker class so that, when called, it just returns the images and an array of True values
safety_checker.StableDiffusionSafetyChecker.forward = sc
from diffusers.schedulers import (
DDIMScheduler,
DPMSolverMultistepScheduler,
DPMSolverSinglestepScheduler,
EulerAncestralDiscreteScheduler,
EulerDiscreteScheduler,
HeunDiscreteScheduler,
KDPM2AncestralDiscreteScheduler,
KDPM2DiscreteScheduler,
LMSDiscreteScheduler,
PNDMScheduler,
UniPCMultistepScheduler,
)
# The scheduler list mapping was taken from here: https://github.com/neggles/animatediff-cli/blob/6f336f5f4b5e38e85d7f06f1744ef42d0a45f2a7/src/animatediff/schedulers.py#L39
# Credits to https://github.com/neggles
# See https://github.com/huggingface/diffusers/issues/4167 for more details on sched mapping from A1111
class DiffusionScheduler(str, Enum):
ddim = "ddim" # DDIM
pndm = "pndm" # PNDM
heun = "heun" # Heun
unipc = "unipc" # UniPC
euler = "euler" # Euler
euler_a = "euler_a" # Euler a
lms = "lms" # LMS
k_lms = "k_lms" # LMS Karras
dpm_2 = "dpm_2" # DPM2
k_dpm_2 = "k_dpm_2" # DPM2 Karras
dpm_2_a = "dpm_2_a" # DPM2 a
k_dpm_2_a = "k_dpm_2_a" # DPM2 a Karras
dpmpp_2m = "dpmpp_2m" # DPM++ 2M
k_dpmpp_2m = "k_dpmpp_2m" # DPM++ 2M Karras
dpmpp_sde = "dpmpp_sde" # DPM++ SDE
k_dpmpp_sde = "k_dpmpp_sde" # DPM++ SDE Karras
dpmpp_2m_sde = "dpmpp_2m_sde" # DPM++ 2M SDE
k_dpmpp_2m_sde = "k_dpmpp_2m_sde" # DPM++ 2M SDE Karras
def get_scheduler(name: str, config: dict = {}):
is_karras = name.startswith("k_")
if is_karras:
# strip the k_ prefix and add the karras sigma flag to config
name = name.lstrip("k_")
config["use_karras_sigmas"] = True
if name == DiffusionScheduler.ddim:
sched_class = DDIMScheduler
elif name == DiffusionScheduler.pndm:
sched_class = PNDMScheduler
elif name == DiffusionScheduler.heun:
sched_class = HeunDiscreteScheduler
elif name == DiffusionScheduler.unipc:
sched_class = UniPCMultistepScheduler
elif name == DiffusionScheduler.euler:
sched_class = EulerDiscreteScheduler
elif name == DiffusionScheduler.euler_a:
sched_class = EulerAncestralDiscreteScheduler
elif name == DiffusionScheduler.lms:
sched_class = LMSDiscreteScheduler
elif name == DiffusionScheduler.dpm_2:
# Equivalent to DPM2 in K-Diffusion
sched_class = KDPM2DiscreteScheduler
elif name == DiffusionScheduler.dpm_2_a:
# Equivalent to `DPM2 a`` in K-Diffusion
sched_class = KDPM2AncestralDiscreteScheduler
elif name == DiffusionScheduler.dpmpp_2m:
# Equivalent to `DPM++ 2M` in K-Diffusion
sched_class = DPMSolverMultistepScheduler
config["algorithm_type"] = "dpmsolver++"
config["solver_order"] = 2
elif name == DiffusionScheduler.dpmpp_sde:
# Equivalent to `DPM++ SDE` in K-Diffusion
sched_class = DPMSolverSinglestepScheduler
elif name == DiffusionScheduler.dpmpp_2m_sde:
# Equivalent to `DPM++ 2M SDE` in K-Diffusion
sched_class = DPMSolverMultistepScheduler
config["algorithm_type"] = "sde-dpmsolver++"
else:
raise ValueError(f"Invalid scheduler '{'k_' if is_karras else ''}{name}'")
return sched_class.from_config(config)
# Implement the BackendServicer class with the service methods
class BackendServicer(backend_pb2_grpc.BackendServicer):
def Health(self, request, context):
return backend_pb2.Reply(message=bytes("OK", 'utf-8'))
def LoadModel(self, request, context):
try:
print(f"Loading model {request.Model}...", file=sys.stderr)
print(f"Request {request}", file=sys.stderr)
torchType = torch.float32
if request.F16Memory:
torchType = torch.float16
local = False
modelFile = request.Model
cfg_scale = 7
if request.CFGScale != 0:
cfg_scale = request.CFGScale
clipmodel = "runwayml/stable-diffusion-v1-5"
if request.CLIPModel != "":
clipmodel = request.CLIPModel
clipsubfolder = "text_encoder"
if request.CLIPSubfolder != "":
clipsubfolder = request.CLIPSubfolder
# Check if ModelFile exists
if request.ModelFile != "":
if os.path.exists(request.ModelFile):
local = True
modelFile = request.ModelFile
fromSingleFile = request.Model.startswith("http") or request.Model.startswith("/") or local
if request.IMG2IMG and request.PipelineType == "":
request.PipelineType == "StableDiffusionImg2ImgPipeline"
if request.PipelineType == "":
request.PipelineType == "StableDiffusionPipeline"
## img2img
if request.PipelineType == "StableDiffusionImg2ImgPipeline":
if fromSingleFile:
self.pipe = StableDiffusionImg2ImgPipeline.from_single_file(modelFile,
torch_dtype=torchType,
guidance_scale=cfg_scale)
else:
self.pipe = StableDiffusionImg2ImgPipeline.from_pretrained(request.Model,
torch_dtype=torchType,
guidance_scale=cfg_scale)
if request.PipelineType == "StableDiffusionDepth2ImgPipeline":
self.pipe = StableDiffusionDepth2ImgPipeline.from_pretrained(request.Model,
torch_dtype=torchType,
guidance_scale=cfg_scale)
## text2img
if request.PipelineType == "StableDiffusionPipeline":
if fromSingleFile:
self.pipe = StableDiffusionPipeline.from_single_file(modelFile,
torch_dtype=torchType,
guidance_scale=cfg_scale)
else:
self.pipe = StableDiffusionPipeline.from_pretrained(request.Model,
torch_dtype=torchType,
guidance_scale=cfg_scale)
if request.PipelineType == "DiffusionPipeline":
self.pipe = DiffusionPipeline.from_pretrained(request.Model,
torch_dtype=torchType,
guidance_scale=cfg_scale)
if request.PipelineType == "StableDiffusionXLPipeline":
if fromSingleFile:
self.pipe = StableDiffusionXLPipeline.from_single_file(modelFile,
torch_dtype=torchType, use_safetensors=True,
guidance_scale=cfg_scale)
else:
self.pipe = StableDiffusionXLPipeline.from_pretrained(
request.Model,
torch_dtype=torchType,
use_safetensors=True,
# variant="fp16"
guidance_scale=cfg_scale)
# https://github.com/huggingface/diffusers/issues/4446
# do not use text_encoder in the constructor since then
# https://github.com/huggingface/diffusers/issues/3212#issuecomment-1521841481
if CLIPSKIP and request.CLIPSkip != 0:
text_encoder = CLIPTextModel.from_pretrained(clipmodel, num_hidden_layers=request.CLIPSkip, subfolder=clipsubfolder, torch_dtype=torchType)
self.pipe.text_encoder=text_encoder
# torch_dtype needs to be customized. float16 for GPU, float32 for CPU
# TODO: this needs to be customized
if request.SchedulerType != "":
self.pipe.scheduler = get_scheduler(request.SchedulerType, self.pipe.scheduler.config)
self.compel = Compel(tokenizer=self.pipe.tokenizer, text_encoder=self.pipe.text_encoder)
if request.CUDA:
self.pipe.to('cuda')
# Assume directory from request.ModelFile.
# Only if request.LoraAdapter it's not an absolute path
if request.LoraAdapter and request.ModelFile != "" and not os.path.isabs(request.LoraAdapter) and request.LoraAdapter:
# get base path of modelFile
modelFileBase = os.path.dirname(request.ModelFile)
# modify LoraAdapter to be relative to modelFileBase
request.LoraAdapter = os.path.join(modelFileBase, request.LoraAdapter)
device = "cpu" if not request.CUDA else "cuda"
self.device = device
if request.LoraAdapter:
# Check if its a local file and not a directory ( we load lora differently for a safetensor file )
if os.path.exists(request.LoraAdapter) and not os.path.isdir(request.LoraAdapter):
self.load_lora_weights(request.LoraAdapter, 1, device, torchType)
else:
self.pipe.unet.load_attn_procs(request.LoraAdapter)
except Exception as err:
return backend_pb2.Result(success=False, message=f"Unexpected {err=}, {type(err)=}")
# Implement your logic here for the LoadModel service
# Replace this with your desired response
return backend_pb2.Result(message="Model loaded successfully", success=True)
# https://github.com/huggingface/diffusers/issues/3064
def load_lora_weights(self, checkpoint_path, multiplier, device, dtype):
LORA_PREFIX_UNET = "lora_unet"
LORA_PREFIX_TEXT_ENCODER = "lora_te"
# load LoRA weight from .safetensors
state_dict = load_file(checkpoint_path, device=device)
updates = defaultdict(dict)
for key, value in state_dict.items():
# it is suggested to print out the key, it usually will be something like below
# "lora_te_text_model_encoder_layers_0_self_attn_k_proj.lora_down.weight"
layer, elem = key.split('.', 1)
updates[layer][elem] = value
# directly update weight in diffusers model
for layer, elems in updates.items():
if "text" in layer:
layer_infos = layer.split(LORA_PREFIX_TEXT_ENCODER + "_")[-1].split("_")
curr_layer = self.pipe.text_encoder
else:
layer_infos = layer.split(LORA_PREFIX_UNET + "_")[-1].split("_")
curr_layer = self.pipe.unet
# find the target layer
temp_name = layer_infos.pop(0)
while len(layer_infos) > -1:
try:
curr_layer = curr_layer.__getattr__(temp_name)
if len(layer_infos) > 0:
temp_name = layer_infos.pop(0)
elif len(layer_infos) == 0:
break
except Exception:
if len(temp_name) > 0:
temp_name += "_" + layer_infos.pop(0)
else:
temp_name = layer_infos.pop(0)
# get elements for this layer
weight_up = elems['lora_up.weight'].to(dtype)
weight_down = elems['lora_down.weight'].to(dtype)
alpha = elems['alpha'] if 'alpha' in elems else None
if alpha:
alpha = alpha.item() / weight_up.shape[1]
else:
alpha = 1.0
# update weight
if len(weight_up.shape) == 4:
curr_layer.weight.data += multiplier * alpha * torch.mm(weight_up.squeeze(3).squeeze(2), weight_down.squeeze(3).squeeze(2)).unsqueeze(2).unsqueeze(3)
else:
curr_layer.weight.data += multiplier * alpha * torch.mm(weight_up, weight_down)
def GenerateImage(self, request, context):
prompt = request.positive_prompt
# create a dictionary of values for the parameters
options = {
"negative_prompt": request.negative_prompt,
"width": request.width,
"height": request.height,
"num_inference_steps": request.step,
}
if request.src != "":
image = Image.open(request.src)
options["image"] = image
# Get the keys that we will build the args for our pipe for
keys = options.keys()
if request.EnableParameters != "":
keys = request.EnableParameters.split(",")
if request.EnableParameters == "none":
keys = []
# create a dictionary of parameters by using the keys from EnableParameters and the values from defaults
kwargs = {key: options[key] for key in keys}
# Set seed
if request.seed > 0:
kwargs["generator"] = torch.Generator(device=self.device).manual_seed(
request.seed
)
image = {}
if COMPEL:
conditioning = self.compel.build_conditioning_tensor(prompt)
kwargs["prompt_embeds"]= conditioning
# pass the kwargs dictionary to the self.pipe method
image = self.pipe(
**kwargs
).images[0]
else:
# pass the kwargs dictionary to the self.pipe method
image = self.pipe(
prompt,
**kwargs
).images[0]
# save the result
image.save(request.dst)
return backend_pb2.Result(message="Model loaded successfully", success=True)
def serve(address):
server = grpc.server(futures.ThreadPoolExecutor(max_workers=MAX_WORKERS))
backend_pb2_grpc.add_BackendServicer_to_server(BackendServicer(), server)
server.add_insecure_port(address)
server.start()
print("Server started. Listening on: " + address, file=sys.stderr)
# Define the signal handler function
def signal_handler(sig, frame):
print("Received termination signal. Shutting down...")
server.stop(0)
sys.exit(0)
# Set the signal handlers for SIGINT and SIGTERM
signal.signal(signal.SIGINT, signal_handler)
signal.signal(signal.SIGTERM, signal_handler)
try:
while True:
time.sleep(_ONE_DAY_IN_SECONDS)
except KeyboardInterrupt:
server.stop(0)
if __name__ == "__main__":
parser = argparse.ArgumentParser(description="Run the gRPC server.")
parser.add_argument(
"--addr", default="localhost:50051", help="The address to bind the server to."
)
args = parser.parse_args()
serve(args.addr)

View File

File diff suppressed because one or more lines are too long

View File

@@ -0,0 +1,363 @@
# Generated by the gRPC Python protocol compiler plugin. DO NOT EDIT!
"""Client and server classes corresponding to protobuf-defined services."""
import grpc
import backend_pb2 as backend__pb2
class BackendStub(object):
"""Missing associated documentation comment in .proto file."""
def __init__(self, channel):
"""Constructor.
Args:
channel: A grpc.Channel.
"""
self.Health = channel.unary_unary(
'/backend.Backend/Health',
request_serializer=backend__pb2.HealthMessage.SerializeToString,
response_deserializer=backend__pb2.Reply.FromString,
)
self.Predict = channel.unary_unary(
'/backend.Backend/Predict',
request_serializer=backend__pb2.PredictOptions.SerializeToString,
response_deserializer=backend__pb2.Reply.FromString,
)
self.LoadModel = channel.unary_unary(
'/backend.Backend/LoadModel',
request_serializer=backend__pb2.ModelOptions.SerializeToString,
response_deserializer=backend__pb2.Result.FromString,
)
self.PredictStream = channel.unary_stream(
'/backend.Backend/PredictStream',
request_serializer=backend__pb2.PredictOptions.SerializeToString,
response_deserializer=backend__pb2.Reply.FromString,
)
self.Embedding = channel.unary_unary(
'/backend.Backend/Embedding',
request_serializer=backend__pb2.PredictOptions.SerializeToString,
response_deserializer=backend__pb2.EmbeddingResult.FromString,
)
self.GenerateImage = channel.unary_unary(
'/backend.Backend/GenerateImage',
request_serializer=backend__pb2.GenerateImageRequest.SerializeToString,
response_deserializer=backend__pb2.Result.FromString,
)
self.AudioTranscription = channel.unary_unary(
'/backend.Backend/AudioTranscription',
request_serializer=backend__pb2.TranscriptRequest.SerializeToString,
response_deserializer=backend__pb2.TranscriptResult.FromString,
)
self.TTS = channel.unary_unary(
'/backend.Backend/TTS',
request_serializer=backend__pb2.TTSRequest.SerializeToString,
response_deserializer=backend__pb2.Result.FromString,
)
self.TokenizeString = channel.unary_unary(
'/backend.Backend/TokenizeString',
request_serializer=backend__pb2.PredictOptions.SerializeToString,
response_deserializer=backend__pb2.TokenizationResponse.FromString,
)
self.Status = channel.unary_unary(
'/backend.Backend/Status',
request_serializer=backend__pb2.HealthMessage.SerializeToString,
response_deserializer=backend__pb2.StatusResponse.FromString,
)
class BackendServicer(object):
"""Missing associated documentation comment in .proto file."""
def Health(self, request, context):
"""Missing associated documentation comment in .proto file."""
context.set_code(grpc.StatusCode.UNIMPLEMENTED)
context.set_details('Method not implemented!')
raise NotImplementedError('Method not implemented!')
def Predict(self, request, context):
"""Missing associated documentation comment in .proto file."""
context.set_code(grpc.StatusCode.UNIMPLEMENTED)
context.set_details('Method not implemented!')
raise NotImplementedError('Method not implemented!')
def LoadModel(self, request, context):
"""Missing associated documentation comment in .proto file."""
context.set_code(grpc.StatusCode.UNIMPLEMENTED)
context.set_details('Method not implemented!')
raise NotImplementedError('Method not implemented!')
def PredictStream(self, request, context):
"""Missing associated documentation comment in .proto file."""
context.set_code(grpc.StatusCode.UNIMPLEMENTED)
context.set_details('Method not implemented!')
raise NotImplementedError('Method not implemented!')
def Embedding(self, request, context):
"""Missing associated documentation comment in .proto file."""
context.set_code(grpc.StatusCode.UNIMPLEMENTED)
context.set_details('Method not implemented!')
raise NotImplementedError('Method not implemented!')
def GenerateImage(self, request, context):
"""Missing associated documentation comment in .proto file."""
context.set_code(grpc.StatusCode.UNIMPLEMENTED)
context.set_details('Method not implemented!')
raise NotImplementedError('Method not implemented!')
def AudioTranscription(self, request, context):
"""Missing associated documentation comment in .proto file."""
context.set_code(grpc.StatusCode.UNIMPLEMENTED)
context.set_details('Method not implemented!')
raise NotImplementedError('Method not implemented!')
def TTS(self, request, context):
"""Missing associated documentation comment in .proto file."""
context.set_code(grpc.StatusCode.UNIMPLEMENTED)
context.set_details('Method not implemented!')
raise NotImplementedError('Method not implemented!')
def TokenizeString(self, request, context):
"""Missing associated documentation comment in .proto file."""
context.set_code(grpc.StatusCode.UNIMPLEMENTED)
context.set_details('Method not implemented!')
raise NotImplementedError('Method not implemented!')
def Status(self, request, context):
"""Missing associated documentation comment in .proto file."""
context.set_code(grpc.StatusCode.UNIMPLEMENTED)
context.set_details('Method not implemented!')
raise NotImplementedError('Method not implemented!')
def add_BackendServicer_to_server(servicer, server):
rpc_method_handlers = {
'Health': grpc.unary_unary_rpc_method_handler(
servicer.Health,
request_deserializer=backend__pb2.HealthMessage.FromString,
response_serializer=backend__pb2.Reply.SerializeToString,
),
'Predict': grpc.unary_unary_rpc_method_handler(
servicer.Predict,
request_deserializer=backend__pb2.PredictOptions.FromString,
response_serializer=backend__pb2.Reply.SerializeToString,
),
'LoadModel': grpc.unary_unary_rpc_method_handler(
servicer.LoadModel,
request_deserializer=backend__pb2.ModelOptions.FromString,
response_serializer=backend__pb2.Result.SerializeToString,
),
'PredictStream': grpc.unary_stream_rpc_method_handler(
servicer.PredictStream,
request_deserializer=backend__pb2.PredictOptions.FromString,
response_serializer=backend__pb2.Reply.SerializeToString,
),
'Embedding': grpc.unary_unary_rpc_method_handler(
servicer.Embedding,
request_deserializer=backend__pb2.PredictOptions.FromString,
response_serializer=backend__pb2.EmbeddingResult.SerializeToString,
),
'GenerateImage': grpc.unary_unary_rpc_method_handler(
servicer.GenerateImage,
request_deserializer=backend__pb2.GenerateImageRequest.FromString,
response_serializer=backend__pb2.Result.SerializeToString,
),
'AudioTranscription': grpc.unary_unary_rpc_method_handler(
servicer.AudioTranscription,
request_deserializer=backend__pb2.TranscriptRequest.FromString,
response_serializer=backend__pb2.TranscriptResult.SerializeToString,
),
'TTS': grpc.unary_unary_rpc_method_handler(
servicer.TTS,
request_deserializer=backend__pb2.TTSRequest.FromString,
response_serializer=backend__pb2.Result.SerializeToString,
),
'TokenizeString': grpc.unary_unary_rpc_method_handler(
servicer.TokenizeString,
request_deserializer=backend__pb2.PredictOptions.FromString,
response_serializer=backend__pb2.TokenizationResponse.SerializeToString,
),
'Status': grpc.unary_unary_rpc_method_handler(
servicer.Status,
request_deserializer=backend__pb2.HealthMessage.FromString,
response_serializer=backend__pb2.StatusResponse.SerializeToString,
),
}
generic_handler = grpc.method_handlers_generic_handler(
'backend.Backend', rpc_method_handlers)
server.add_generic_rpc_handlers((generic_handler,))
# This class is part of an EXPERIMENTAL API.
class Backend(object):
"""Missing associated documentation comment in .proto file."""
@staticmethod
def Health(request,
target,
options=(),
channel_credentials=None,
call_credentials=None,
insecure=False,
compression=None,
wait_for_ready=None,
timeout=None,
metadata=None):
return grpc.experimental.unary_unary(request, target, '/backend.Backend/Health',
backend__pb2.HealthMessage.SerializeToString,
backend__pb2.Reply.FromString,
options, channel_credentials,
insecure, call_credentials, compression, wait_for_ready, timeout, metadata)
@staticmethod
def Predict(request,
target,
options=(),
channel_credentials=None,
call_credentials=None,
insecure=False,
compression=None,
wait_for_ready=None,
timeout=None,
metadata=None):
return grpc.experimental.unary_unary(request, target, '/backend.Backend/Predict',
backend__pb2.PredictOptions.SerializeToString,
backend__pb2.Reply.FromString,
options, channel_credentials,
insecure, call_credentials, compression, wait_for_ready, timeout, metadata)
@staticmethod
def LoadModel(request,
target,
options=(),
channel_credentials=None,
call_credentials=None,
insecure=False,
compression=None,
wait_for_ready=None,
timeout=None,
metadata=None):
return grpc.experimental.unary_unary(request, target, '/backend.Backend/LoadModel',
backend__pb2.ModelOptions.SerializeToString,
backend__pb2.Result.FromString,
options, channel_credentials,
insecure, call_credentials, compression, wait_for_ready, timeout, metadata)
@staticmethod
def PredictStream(request,
target,
options=(),
channel_credentials=None,
call_credentials=None,
insecure=False,
compression=None,
wait_for_ready=None,
timeout=None,
metadata=None):
return grpc.experimental.unary_stream(request, target, '/backend.Backend/PredictStream',
backend__pb2.PredictOptions.SerializeToString,
backend__pb2.Reply.FromString,
options, channel_credentials,
insecure, call_credentials, compression, wait_for_ready, timeout, metadata)
@staticmethod
def Embedding(request,
target,
options=(),
channel_credentials=None,
call_credentials=None,
insecure=False,
compression=None,
wait_for_ready=None,
timeout=None,
metadata=None):
return grpc.experimental.unary_unary(request, target, '/backend.Backend/Embedding',
backend__pb2.PredictOptions.SerializeToString,
backend__pb2.EmbeddingResult.FromString,
options, channel_credentials,
insecure, call_credentials, compression, wait_for_ready, timeout, metadata)
@staticmethod
def GenerateImage(request,
target,
options=(),
channel_credentials=None,
call_credentials=None,
insecure=False,
compression=None,
wait_for_ready=None,
timeout=None,
metadata=None):
return grpc.experimental.unary_unary(request, target, '/backend.Backend/GenerateImage',
backend__pb2.GenerateImageRequest.SerializeToString,
backend__pb2.Result.FromString,
options, channel_credentials,
insecure, call_credentials, compression, wait_for_ready, timeout, metadata)
@staticmethod
def AudioTranscription(request,
target,
options=(),
channel_credentials=None,
call_credentials=None,
insecure=False,
compression=None,
wait_for_ready=None,
timeout=None,
metadata=None):
return grpc.experimental.unary_unary(request, target, '/backend.Backend/AudioTranscription',
backend__pb2.TranscriptRequest.SerializeToString,
backend__pb2.TranscriptResult.FromString,
options, channel_credentials,
insecure, call_credentials, compression, wait_for_ready, timeout, metadata)
@staticmethod
def TTS(request,
target,
options=(),
channel_credentials=None,
call_credentials=None,
insecure=False,
compression=None,
wait_for_ready=None,
timeout=None,
metadata=None):
return grpc.experimental.unary_unary(request, target, '/backend.Backend/TTS',
backend__pb2.TTSRequest.SerializeToString,
backend__pb2.Result.FromString,
options, channel_credentials,
insecure, call_credentials, compression, wait_for_ready, timeout, metadata)
@staticmethod
def TokenizeString(request,
target,
options=(),
channel_credentials=None,
call_credentials=None,
insecure=False,
compression=None,
wait_for_ready=None,
timeout=None,
metadata=None):
return grpc.experimental.unary_unary(request, target, '/backend.Backend/TokenizeString',
backend__pb2.PredictOptions.SerializeToString,
backend__pb2.TokenizationResponse.FromString,
options, channel_credentials,
insecure, call_credentials, compression, wait_for_ready, timeout, metadata)
@staticmethod
def Status(request,
target,
options=(),
channel_credentials=None,
call_credentials=None,
insecure=False,
compression=None,
wait_for_ready=None,
timeout=None,
metadata=None):
return grpc.experimental.unary_unary(request, target, '/backend.Backend/Status',
backend__pb2.HealthMessage.SerializeToString,
backend__pb2.StatusResponse.FromString,
options, channel_credentials,
insecure, call_credentials, compression, wait_for_ready, timeout, metadata)

View File

File diff suppressed because one or more lines are too long

View File

@@ -0,0 +1,363 @@
# Generated by the gRPC Python protocol compiler plugin. DO NOT EDIT!
"""Client and server classes corresponding to protobuf-defined services."""
import grpc
import backend_pb2 as backend__pb2
class BackendStub(object):
"""Missing associated documentation comment in .proto file."""
def __init__(self, channel):
"""Constructor.
Args:
channel: A grpc.Channel.
"""
self.Health = channel.unary_unary(
'/backend.Backend/Health',
request_serializer=backend__pb2.HealthMessage.SerializeToString,
response_deserializer=backend__pb2.Reply.FromString,
)
self.Predict = channel.unary_unary(
'/backend.Backend/Predict',
request_serializer=backend__pb2.PredictOptions.SerializeToString,
response_deserializer=backend__pb2.Reply.FromString,
)
self.LoadModel = channel.unary_unary(
'/backend.Backend/LoadModel',
request_serializer=backend__pb2.ModelOptions.SerializeToString,
response_deserializer=backend__pb2.Result.FromString,
)
self.PredictStream = channel.unary_stream(
'/backend.Backend/PredictStream',
request_serializer=backend__pb2.PredictOptions.SerializeToString,
response_deserializer=backend__pb2.Reply.FromString,
)
self.Embedding = channel.unary_unary(
'/backend.Backend/Embedding',
request_serializer=backend__pb2.PredictOptions.SerializeToString,
response_deserializer=backend__pb2.EmbeddingResult.FromString,
)
self.GenerateImage = channel.unary_unary(
'/backend.Backend/GenerateImage',
request_serializer=backend__pb2.GenerateImageRequest.SerializeToString,
response_deserializer=backend__pb2.Result.FromString,
)
self.AudioTranscription = channel.unary_unary(
'/backend.Backend/AudioTranscription',
request_serializer=backend__pb2.TranscriptRequest.SerializeToString,
response_deserializer=backend__pb2.TranscriptResult.FromString,
)
self.TTS = channel.unary_unary(
'/backend.Backend/TTS',
request_serializer=backend__pb2.TTSRequest.SerializeToString,
response_deserializer=backend__pb2.Result.FromString,
)
self.TokenizeString = channel.unary_unary(
'/backend.Backend/TokenizeString',
request_serializer=backend__pb2.PredictOptions.SerializeToString,
response_deserializer=backend__pb2.TokenizationResponse.FromString,
)
self.Status = channel.unary_unary(
'/backend.Backend/Status',
request_serializer=backend__pb2.HealthMessage.SerializeToString,
response_deserializer=backend__pb2.StatusResponse.FromString,
)
class BackendServicer(object):
"""Missing associated documentation comment in .proto file."""
def Health(self, request, context):
"""Missing associated documentation comment in .proto file."""
context.set_code(grpc.StatusCode.UNIMPLEMENTED)
context.set_details('Method not implemented!')
raise NotImplementedError('Method not implemented!')
def Predict(self, request, context):
"""Missing associated documentation comment in .proto file."""
context.set_code(grpc.StatusCode.UNIMPLEMENTED)
context.set_details('Method not implemented!')
raise NotImplementedError('Method not implemented!')
def LoadModel(self, request, context):
"""Missing associated documentation comment in .proto file."""
context.set_code(grpc.StatusCode.UNIMPLEMENTED)
context.set_details('Method not implemented!')
raise NotImplementedError('Method not implemented!')
def PredictStream(self, request, context):
"""Missing associated documentation comment in .proto file."""
context.set_code(grpc.StatusCode.UNIMPLEMENTED)
context.set_details('Method not implemented!')
raise NotImplementedError('Method not implemented!')
def Embedding(self, request, context):
"""Missing associated documentation comment in .proto file."""
context.set_code(grpc.StatusCode.UNIMPLEMENTED)
context.set_details('Method not implemented!')
raise NotImplementedError('Method not implemented!')
def GenerateImage(self, request, context):
"""Missing associated documentation comment in .proto file."""
context.set_code(grpc.StatusCode.UNIMPLEMENTED)
context.set_details('Method not implemented!')
raise NotImplementedError('Method not implemented!')
def AudioTranscription(self, request, context):
"""Missing associated documentation comment in .proto file."""
context.set_code(grpc.StatusCode.UNIMPLEMENTED)
context.set_details('Method not implemented!')
raise NotImplementedError('Method not implemented!')
def TTS(self, request, context):
"""Missing associated documentation comment in .proto file."""
context.set_code(grpc.StatusCode.UNIMPLEMENTED)
context.set_details('Method not implemented!')
raise NotImplementedError('Method not implemented!')
def TokenizeString(self, request, context):
"""Missing associated documentation comment in .proto file."""
context.set_code(grpc.StatusCode.UNIMPLEMENTED)
context.set_details('Method not implemented!')
raise NotImplementedError('Method not implemented!')
def Status(self, request, context):
"""Missing associated documentation comment in .proto file."""
context.set_code(grpc.StatusCode.UNIMPLEMENTED)
context.set_details('Method not implemented!')
raise NotImplementedError('Method not implemented!')
def add_BackendServicer_to_server(servicer, server):
rpc_method_handlers = {
'Health': grpc.unary_unary_rpc_method_handler(
servicer.Health,
request_deserializer=backend__pb2.HealthMessage.FromString,
response_serializer=backend__pb2.Reply.SerializeToString,
),
'Predict': grpc.unary_unary_rpc_method_handler(
servicer.Predict,
request_deserializer=backend__pb2.PredictOptions.FromString,
response_serializer=backend__pb2.Reply.SerializeToString,
),
'LoadModel': grpc.unary_unary_rpc_method_handler(
servicer.LoadModel,
request_deserializer=backend__pb2.ModelOptions.FromString,
response_serializer=backend__pb2.Result.SerializeToString,
),
'PredictStream': grpc.unary_stream_rpc_method_handler(
servicer.PredictStream,
request_deserializer=backend__pb2.PredictOptions.FromString,
response_serializer=backend__pb2.Reply.SerializeToString,
),
'Embedding': grpc.unary_unary_rpc_method_handler(
servicer.Embedding,
request_deserializer=backend__pb2.PredictOptions.FromString,
response_serializer=backend__pb2.EmbeddingResult.SerializeToString,
),
'GenerateImage': grpc.unary_unary_rpc_method_handler(
servicer.GenerateImage,
request_deserializer=backend__pb2.GenerateImageRequest.FromString,
response_serializer=backend__pb2.Result.SerializeToString,
),
'AudioTranscription': grpc.unary_unary_rpc_method_handler(
servicer.AudioTranscription,
request_deserializer=backend__pb2.TranscriptRequest.FromString,
response_serializer=backend__pb2.TranscriptResult.SerializeToString,
),
'TTS': grpc.unary_unary_rpc_method_handler(
servicer.TTS,
request_deserializer=backend__pb2.TTSRequest.FromString,
response_serializer=backend__pb2.Result.SerializeToString,
),
'TokenizeString': grpc.unary_unary_rpc_method_handler(
servicer.TokenizeString,
request_deserializer=backend__pb2.PredictOptions.FromString,
response_serializer=backend__pb2.TokenizationResponse.SerializeToString,
),
'Status': grpc.unary_unary_rpc_method_handler(
servicer.Status,
request_deserializer=backend__pb2.HealthMessage.FromString,
response_serializer=backend__pb2.StatusResponse.SerializeToString,
),
}
generic_handler = grpc.method_handlers_generic_handler(
'backend.Backend', rpc_method_handlers)
server.add_generic_rpc_handlers((generic_handler,))
# This class is part of an EXPERIMENTAL API.
class Backend(object):
"""Missing associated documentation comment in .proto file."""
@staticmethod
def Health(request,
target,
options=(),
channel_credentials=None,
call_credentials=None,
insecure=False,
compression=None,
wait_for_ready=None,
timeout=None,
metadata=None):
return grpc.experimental.unary_unary(request, target, '/backend.Backend/Health',
backend__pb2.HealthMessage.SerializeToString,
backend__pb2.Reply.FromString,
options, channel_credentials,
insecure, call_credentials, compression, wait_for_ready, timeout, metadata)
@staticmethod
def Predict(request,
target,
options=(),
channel_credentials=None,
call_credentials=None,
insecure=False,
compression=None,
wait_for_ready=None,
timeout=None,
metadata=None):
return grpc.experimental.unary_unary(request, target, '/backend.Backend/Predict',
backend__pb2.PredictOptions.SerializeToString,
backend__pb2.Reply.FromString,
options, channel_credentials,
insecure, call_credentials, compression, wait_for_ready, timeout, metadata)
@staticmethod
def LoadModel(request,
target,
options=(),
channel_credentials=None,
call_credentials=None,
insecure=False,
compression=None,
wait_for_ready=None,
timeout=None,
metadata=None):
return grpc.experimental.unary_unary(request, target, '/backend.Backend/LoadModel',
backend__pb2.ModelOptions.SerializeToString,
backend__pb2.Result.FromString,
options, channel_credentials,
insecure, call_credentials, compression, wait_for_ready, timeout, metadata)
@staticmethod
def PredictStream(request,
target,
options=(),
channel_credentials=None,
call_credentials=None,
insecure=False,
compression=None,
wait_for_ready=None,
timeout=None,
metadata=None):
return grpc.experimental.unary_stream(request, target, '/backend.Backend/PredictStream',
backend__pb2.PredictOptions.SerializeToString,
backend__pb2.Reply.FromString,
options, channel_credentials,
insecure, call_credentials, compression, wait_for_ready, timeout, metadata)
@staticmethod
def Embedding(request,
target,
options=(),
channel_credentials=None,
call_credentials=None,
insecure=False,
compression=None,
wait_for_ready=None,
timeout=None,
metadata=None):
return grpc.experimental.unary_unary(request, target, '/backend.Backend/Embedding',
backend__pb2.PredictOptions.SerializeToString,
backend__pb2.EmbeddingResult.FromString,
options, channel_credentials,
insecure, call_credentials, compression, wait_for_ready, timeout, metadata)
@staticmethod
def GenerateImage(request,
target,
options=(),
channel_credentials=None,
call_credentials=None,
insecure=False,
compression=None,
wait_for_ready=None,
timeout=None,
metadata=None):
return grpc.experimental.unary_unary(request, target, '/backend.Backend/GenerateImage',
backend__pb2.GenerateImageRequest.SerializeToString,
backend__pb2.Result.FromString,
options, channel_credentials,
insecure, call_credentials, compression, wait_for_ready, timeout, metadata)
@staticmethod
def AudioTranscription(request,
target,
options=(),
channel_credentials=None,
call_credentials=None,
insecure=False,
compression=None,
wait_for_ready=None,
timeout=None,
metadata=None):
return grpc.experimental.unary_unary(request, target, '/backend.Backend/AudioTranscription',
backend__pb2.TranscriptRequest.SerializeToString,
backend__pb2.TranscriptResult.FromString,
options, channel_credentials,
insecure, call_credentials, compression, wait_for_ready, timeout, metadata)
@staticmethod
def TTS(request,
target,
options=(),
channel_credentials=None,
call_credentials=None,
insecure=False,
compression=None,
wait_for_ready=None,
timeout=None,
metadata=None):
return grpc.experimental.unary_unary(request, target, '/backend.Backend/TTS',
backend__pb2.TTSRequest.SerializeToString,
backend__pb2.Result.FromString,
options, channel_credentials,
insecure, call_credentials, compression, wait_for_ready, timeout, metadata)
@staticmethod
def TokenizeString(request,
target,
options=(),
channel_credentials=None,
call_credentials=None,
insecure=False,
compression=None,
wait_for_ready=None,
timeout=None,
metadata=None):
return grpc.experimental.unary_unary(request, target, '/backend.Backend/TokenizeString',
backend__pb2.PredictOptions.SerializeToString,
backend__pb2.TokenizationResponse.FromString,
options, channel_credentials,
insecure, call_credentials, compression, wait_for_ready, timeout, metadata)
@staticmethod
def Status(request,
target,
options=(),
channel_credentials=None,
call_credentials=None,
insecure=False,
compression=None,
wait_for_ready=None,
timeout=None,
metadata=None):
return grpc.experimental.unary_unary(request, target, '/backend.Backend/Status',
backend__pb2.HealthMessage.SerializeToString,
backend__pb2.StatusResponse.FromString,
options, channel_credentials,
insecure, call_credentials, compression, wait_for_ready, timeout, metadata)

145
extra/grpc/exllama/exllama.py Executable file
View File

@@ -0,0 +1,145 @@
#!/usr/bin/env python3
import grpc
from concurrent import futures
import time
import backend_pb2
import backend_pb2_grpc
import argparse
import signal
import sys
import os, glob
from pathlib import Path
import torch
import torch.nn.functional as F
from torch import version as torch_version
from exllama.generator import ExLlamaGenerator
from exllama.model import ExLlama, ExLlamaCache, ExLlamaConfig
from exllama.tokenizer import ExLlamaTokenizer
_ONE_DAY_IN_SECONDS = 60 * 60 * 24
# If MAX_WORKERS are specified in the environment use it, otherwise default to 1
MAX_WORKERS = int(os.environ.get('PYTHON_GRPC_MAX_WORKERS', '1'))
# Implement the BackendServicer class with the service methods
class BackendServicer(backend_pb2_grpc.BackendServicer):
def generate(self,prompt, max_new_tokens):
self.generator.end_beam_search()
# Tokenizing the input
ids = self.generator.tokenizer.encode(prompt)
self.generator.gen_begin_reuse(ids)
initial_len = self.generator.sequence[0].shape[0]
has_leading_space = False
decoded_text = ''
for i in range(max_new_tokens):
token = self.generator.gen_single_token()
if i == 0 and self.generator.tokenizer.tokenizer.IdToPiece(int(token)).startswith(''):
has_leading_space = True
decoded_text = self.generator.tokenizer.decode(self.generator.sequence[0][initial_len:])
if has_leading_space:
decoded_text = ' ' + decoded_text
if token.item() == self.generator.tokenizer.eos_token_id:
break
return decoded_text
def Health(self, request, context):
return backend_pb2.Reply(message=bytes("OK", 'utf-8'))
def LoadModel(self, request, context):
try:
# https://github.com/turboderp/exllama/blob/master/example_cfg.py
model_directory = request.ModelFile
# Locate files we need within that directory
tokenizer_path = os.path.join(model_directory, "tokenizer.model")
model_config_path = os.path.join(model_directory, "config.json")
st_pattern = os.path.join(model_directory, "*.safetensors")
model_path = glob.glob(st_pattern)[0]
# Create config, model, tokenizer and generator
config = ExLlamaConfig(model_config_path) # create config from config.json
config.model_path = model_path # supply path to model weights file
model = ExLlama(config) # create ExLlama instance and load the weights
tokenizer = ExLlamaTokenizer(tokenizer_path) # create tokenizer from tokenizer model file
cache = ExLlamaCache(model, batch_size = 2) # create cache for inference
generator = ExLlamaGenerator(model, tokenizer, cache) # create generator
self.generator= generator
self.model = model
self.tokenizer = tokenizer
self.cache = cache
except Exception as err:
return backend_pb2.Result(success=False, message=f"Unexpected {err=}, {type(err)=}")
return backend_pb2.Result(message="Model loaded successfully", success=True)
def Predict(self, request, context):
penalty = 1.15
if request.Penalty != 0.0:
penalty = request.Penalty
self.generator.settings.token_repetition_penalty_max = penalty
self.generator.settings.temperature = request.Temperature
self.generator.settings.top_k = request.TopK
self.generator.settings.top_p = request.TopP
tokens = 512
if request.Tokens != 0:
tokens = request.Tokens
if self.cache.batch_size == 1:
del self.cache
self.cache = ExLlamaCache(self.model, batch_size=2)
self.generator = ExLlamaGenerator(self.model, self.tokenizer, self.cache)
t = self.generate(request.Prompt, tokens)
# Remove prompt from response if present
if request.Prompt in t:
t = t.replace(request.Prompt, "")
return backend_pb2.Result(message=bytes(t, encoding='utf-8'))
def PredictStream(self, request, context):
# Implement PredictStream RPC
#for reply in some_data_generator():
# yield reply
# Not implemented yet
return self.Predict(request, context)
def serve(address):
server = grpc.server(futures.ThreadPoolExecutor(max_workers=MAX_WORKERS))
backend_pb2_grpc.add_BackendServicer_to_server(BackendServicer(), server)
server.add_insecure_port(address)
server.start()
print("Server started. Listening on: " + address, file=sys.stderr)
# Define the signal handler function
def signal_handler(sig, frame):
print("Received termination signal. Shutting down...")
server.stop(0)
sys.exit(0)
# Set the signal handlers for SIGINT and SIGTERM
signal.signal(signal.SIGINT, signal_handler)
signal.signal(signal.SIGTERM, signal_handler)
try:
while True:
time.sleep(_ONE_DAY_IN_SECONDS)
except KeyboardInterrupt:
server.stop(0)
if __name__ == "__main__":
parser = argparse.ArgumentParser(description="Run the gRPC server.")
parser.add_argument(
"--addr", default="localhost:50051", help="The address to bind the server to."
)
args = parser.parse_args()
serve(args.addr)

Some files were not shown because too many files have changed in this diff Show More