Clarify

Improve GenAI docs
Fix frame time access
2026-02-27 03:38:39 -05:00 · 2026-02-26 15:18:42 -07:00 · 2026-02-26 14:52:41 -07:00 · 2026-02-26 08:38:42 -07:00 · 2026-02-25 09:19:56 -07:00 · 2026-02-25 09:02:08 -07:00
24 changed files with 229 additions and 338 deletions
--- a/docs/docs/configuration/genai/config.md
+++ b/docs/docs/configuration/genai/config.md
@@ -5,39 +5,31 @@ title: Configuring Generative AI

 ## Configuration

-A Generative AI provider can be configured in the global config, which will make the Generative AI features available for use. There are currently 4 native providers available to integrate with Frigate. Other providers that support the OpenAI standard API can also be used. See the OpenAI section below.
+A Generative AI provider can be configured in the global config, which will make the Generative AI features available for use. There are currently 4 native providers available to integrate with Frigate. Other providers that support the OpenAI standard API can also be used. See the OpenAI-Compatible section below.

 To use Generative AI, you must define a single provider at the global level of your Frigate configuration. If the provider you choose requires an API key, you may either directly paste it in your configuration, or store it in an environment variable prefixed with `FRIGATE_`.

-## Ollama
+## Local Providers
+
+Local providers run on your own hardware and keep all data processing private. These require a GPU or dedicated hardware for best performance.

 :::warning

-Using Ollama on CPU is not recommended, high inference times make using Generative AI impractical.
+Running Generative AI models on CPU is not recommended, as high inference times make using Generative AI impractical.

 :::

-[Ollama](https://ollama.com/) allows you to self-host large language models and keep everything running locally. It is highly recommended to host this server on a machine with an Nvidia graphics card, or on a Apple silicon Mac for best performance.
+### Recommended Local Models

-Most of the 7b parameter 4-bit vision models will fit inside 8GB of VRAM. There is also a [Docker container](https://hub.docker.com/r/ollama/ollama) available.
+You must use a vision-capable model with Frigate. The following models are recommended for local deployment:

-Parallel requests also come with some caveats. You will need to set `OLLAMA_NUM_PARALLEL=1` and choose a `OLLAMA_MAX_QUEUE` and `OLLAMA_MAX_LOADED_MODELS` values that are appropriate for your hardware and preferences. See the [Ollama documentation](https://docs.ollama.com/faq#how-does-ollama-handle-concurrent-requests).
-
-### Model Types: Instruct vs Thinking
-
-Most vision-language models are available as **instruct** models, which are fine-tuned to follow instructions and respond concisely to prompts. However, some models (such as certain Qwen-VL or minigpt variants) offer both **instruct** and **thinking** versions.
-
- **Instruct models** are always recommended for use with Frigate. These models generate direct, relevant, actionable descriptions that best fit Frigate's object and event summary use case.
- **Thinking models** are fine-tuned for more free-form, open-ended, and speculative outputs, which are typically not concise and may not provide the practical summaries Frigate expects. For this reason, Frigate does **not** recommend or support using thinking models.
-
-Some models are labeled as **hybrid** (capable of both thinking and instruct tasks). In these cases, Frigate will always use instruct-style prompts and specifically disables thinking-mode behaviors to ensure concise, useful responses.
-
-**Recommendation:**
-Always select the `-instruct` or documented instruct/tagged variant of any model you use in your Frigate configuration. If in doubt, refer to your model provider’s documentation or model library for guidance on the correct model variant to use.
-
-### Supported Models
-
-You must use a vision capable model with Frigate. Current model variants can be found [in their model library](https://ollama.com/library). Note that Frigate will not automatically download the model you specify in your config, Ollama will try to download the model but it may take longer than the timeout, it is recommended to pull the model beforehand by running `ollama pull your_model` on your Ollama server/Docker container. Note that the model specified in Frigate's config must match the downloaded model tag.
+| Model         | Notes                                                                                                                                                                |
+| ------------- | -------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
+| `qwen3-vl`    | Strong visual and situational understanding, strong ability to identify smaller objects and interactions with objects.                                               |
+| `qwen3.5`     | Strong situational understanding, but lacks DeepStack from qwen3-vl, leading to worse performance identifying objects in people's hands and other small details.     |
+| `Intern3.5VL` | Relatively fast with good vision comprehension                                                                                                                       |
+| `gemma3`      | Slower model with good vision and temporal understanding                                                                                                             |
+| `qwen2.5-vl`  | Fast but capable model with good vision comprehension                                                                                                                |

 :::info

@@ -45,32 +37,64 @@ Each model is available in multiple parameter sizes (3b, 4b, 8b, etc.). Larger s

 :::

+:::note
+
+You should have at least 8 GB of RAM available (or VRAM if running on GPU) to run the 7B models, 16 GB to run the 13B models, and 24 GB to run the 33B models.
+
+:::
+
+### Model Types: Instruct vs Thinking
+
+Most vision-language models are available as **instruct** models, which are fine-tuned to follow instructions and respond concisely to prompts. However, some models (such as certain Qwen-VL or minigpt variants) offer both **instruct** and **thinking** versions.
+
+- **Instruct models** are always recommended for use with Frigate. These models generate direct, relevant, actionable descriptions that best fit Frigate's object and event summary use case.
+- **Reasoning / Thinking models** are fine-tuned for more free-form, open-ended, and speculative outputs, which are typically not concise and may not provide the practical summaries Frigate expects. For this reason, Frigate does **not** recommend or support using thinking models.
+
+Some models are labeled as **hybrid** (capable of both thinking and instruct tasks). In these cases, it is recommended to disable reasoning / thinking; how to do so is generally model-specific (see your model's documentation).
+
+**Recommendation:**
+Always select the `-instruct` or documented instruct/tagged variant of any model you use in your Frigate configuration. If in doubt, refer to your model provider's documentation or model library for guidance on the correct model variant to use.
+
+### llama.cpp
+
+[llama.cpp](https://github.com/ggml-org/llama.cpp) is a C++ implementation of LLaMA that provides a high-performance inference server.
+
+It is highly recommended to host the llama.cpp server on a machine with a discrete graphics card, or on an Apple silicon Mac for best performance.
+
+#### Supported Models
+
+You must use a vision capable model with Frigate. The llama.cpp server supports various vision models in GGUF format.
+
+#### Configuration
+
+All llama.cpp native options can be passed through `provider_options`, including `temperature`, `top_k`, `top_p`, `min_p`, `repeat_penalty`, `repeat_last_n`, `seed`, `grammar`, and more. See the [llama.cpp server documentation](https://github.com/ggml-org/llama.cpp/blob/master/tools/server/README.md) for a complete list of available parameters.
+
+```yaml
+genai:
+  provider: llamacpp
+  base_url: http://localhost:8080
+  model: your-model-name
+  provider_options:
+    context_size: 16000 # Tell Frigate your context size so it can send the appropriate amount of information.
+```
+
+### Ollama
+
+[Ollama](https://ollama.com/) allows you to self-host large language models and keep everything running locally. It is highly recommended to host this server on a machine with an Nvidia graphics card, or on an Apple silicon Mac for best performance.
+
+Most of the 7b parameter 4-bit vision models will fit inside 8GB of VRAM. There is also a [Docker container](https://hub.docker.com/r/ollama/ollama) available.
+
+Parallel requests also come with some caveats. You will need to set `OLLAMA_NUM_PARALLEL=1` and choose `OLLAMA_MAX_QUEUE` and `OLLAMA_MAX_LOADED_MODELS` values that are appropriate for your hardware and preferences. See the [Ollama documentation](https://docs.ollama.com/faq#how-does-ollama-handle-concurrent-requests).
+
 :::tip

 If you are trying to use a single model for Frigate and HomeAssistant, it will need to support vision and tools calling. qwen3-VL supports vision and tools simultaneously in Ollama.

 :::

-The following models are recommended:
+Note that Frigate will not automatically download the model you specify in your config. Ollama will try to download the model but it may take longer than the timeout, so it is recommended to pull the model beforehand by running `ollama pull your_model` on your Ollama server/Docker container. The model specified in Frigate's config must match the downloaded model tag.

-| Model         | Notes                                                                |
-| ------------- | -------------------------------------------------------------------- |
-| `qwen3-vl`    | Strong visual and situational understanding, higher vram requirement |
-| `Intern3.5VL` | Relatively fast with good vision comprehension                       |
-| `gemma3`      | Strong frame-to-frame understanding, slower inference times          |
-| `qwen2.5-vl`  | Fast but capable model with good vision comprehension                |
-
-:::note
-
-You should have at least 8 GB of RAM available (or VRAM if running on GPU) to run the 7B models, 16 GB to run the 13B models, and 32 GB to run the 33B models.
-
-:::
-
-#### Ollama Cloud models
-
-Ollama also supports [cloud models](https://ollama.com/cloud), where your local Ollama instance handles requests from Frigate, but model inference is performed in the cloud. Set up Ollama locally, sign in with your Ollama account, and specify the cloud model name in your Frigate config. For more details, see the Ollama cloud model [docs](https://docs.ollama.com/cloud).
-
-### Configuration
+#### Configuration

 ```yaml
 genai:
@@ -83,49 +107,65 @@ genai:
      num_ctx: 8192 # make sure the context matches other services that are using ollama
 ```

-## llama.cpp
+### OpenAI-Compatible

-[llama.cpp](https://github.com/ggml-org/llama.cpp) is a C++ implementation of LLaMA that provides a high-performance inference server. Using llama.cpp directly gives you access to all native llama.cpp options and parameters.
+Frigate supports any provider that implements the OpenAI API standard. This includes self-hosted solutions like [vLLM](https://docs.vllm.ai/), [LocalAI](https://localai.io/), and other OpenAI-compatible servers.

-:::warning
+:::tip

-Using llama.cpp on CPU is not recommended, high inference times make using Generative AI impractical.
-
-:::
-
-It is highly recommended to host the llama.cpp server on a machine with a discrete graphics card, or on an Apple silicon Mac for best performance.
-
-### Supported Models
-
-You must use a vision capable model with Frigate. The llama.cpp server supports various vision models in GGUF format.
-
-### Configuration
+For OpenAI-compatible servers (such as llama.cpp) that don't expose the configured context size in the API response, you can manually specify the context size in `provider_options`:

 ```yaml
 genai:
-  provider: llamacpp
-  base_url: http://localhost:8080
+  provider: openai
+  base_url: http://your-llama-server
  model: your-model-name
  provider_options:
-    temperature: 0.7
-    repeat_penalty: 1.05
-    top_p: 0.8
-    top_k: 40
-    min_p: 0.05
-    seed: -1
+    context_size: 8192 # Specify the configured context size
 ```

-All llama.cpp native options can be passed through `provider_options`, including `temperature`, `top_k`, `top_p`, `min_p`, `repeat_penalty`, `repeat_last_n`, `seed`, `grammar`, and more. See the [llama.cpp server documentation](https://github.com/ggml-org/llama.cpp/blob/master/tools/server/README.md) for a complete list of available parameters.
+This ensures Frigate uses the correct context window size when generating prompts.

-## Google Gemini
+:::
+
+#### Configuration
+
+```yaml
+genai:
+  provider: openai
+  base_url: http://your-server:port
+  api_key: your-api-key # May not be required for local servers
+  model: your-model-name
+```
+
+To use a different OpenAI-compatible API endpoint, set the `OPENAI_BASE_URL` environment variable to your provider's API URL.
+
+## Cloud Providers
+
+Cloud providers run on remote infrastructure and require an API key for authentication. These services handle all model inference on their servers.
+
+### Ollama Cloud
+
+Ollama also supports [cloud models](https://ollama.com/cloud), where your local Ollama instance handles requests from Frigate, but model inference is performed in the cloud. Set up Ollama locally, sign in with your Ollama account, and specify the cloud model name in your Frigate config. For more details, see the Ollama cloud model [docs](https://docs.ollama.com/cloud).
+
+#### Configuration
+
+```yaml
+genai:
+  provider: ollama
+  base_url: http://localhost:11434
+  model: cloud-model-name
+```
+
+### Google Gemini

 Google Gemini has a [free tier](https://ai.google.dev/pricing) for the API, however the limits may not be sufficient for standard Frigate usage. Choose a plan appropriate for your installation.

-### Supported Models
+#### Supported Models

 You must use a vision capable model with Frigate. Current model variants can be found [in their documentation](https://ai.google.dev/gemini-api/docs/models/gemini).

-### Get API Key
+#### Get API Key

 To start using Gemini, you must first get an API key from [Google AI Studio](https://aistudio.google.com).

@@ -134,7 +174,7 @@ To start using Gemini, you must first get an API key from [Google AI Studio](htt
 3. Click "Create API key in new project"
 4. Copy the API key for use in your config

-### Configuration
+#### Configuration

 ```yaml
 genai:
@@ -159,19 +199,19 @@ Other HTTP options are available, see the [python-genai documentation](https://g

 :::

-## OpenAI
+### OpenAI

 OpenAI does not have a free tier for their API. With the release of gpt-4o, pricing has been reduced and each generation should cost fractions of a cent if you choose to go this route.

-### Supported Models
+#### Supported Models

 You must use a vision capable model with Frigate. Current model variants can be found [in their documentation](https://platform.openai.com/docs/models).

-### Get API Key
+#### Get API Key

 To start using OpenAI, you must first [create an API key](https://platform.openai.com/api-keys) and [configure billing](https://platform.openai.com/settings/organization/billing/overview).

-### Configuration
+#### Configuration

 ```yaml
 genai:
@@ -180,42 +220,19 @@ genai:
  model: gpt-4o
 ```

-:::note
-
-To use a different OpenAI-compatible API endpoint, set the `OPENAI_BASE_URL` environment variable to your provider's API URL.
-
-:::
-
-:::tip
-
-For OpenAI-compatible servers (such as llama.cpp) that don't expose the configured context size in the API response, you can manually specify the context size in `provider_options`:
-
-```yaml
-genai:
-  provider: openai
-  base_url: http://your-llama-server
-  model: your-model-name
-  provider_options:
-    context_size: 8192 # Specify the configured context size
-```
-
-This ensures Frigate uses the correct context window size when generating prompts.
-
-:::
-
-## Azure OpenAI
+### Azure OpenAI

 Microsoft offers several vision models through Azure OpenAI. A subscription is required.

-### Supported Models
+#### Supported Models

 You must use a vision capable model with Frigate. Current model variants can be found [in their documentation](https://learn.microsoft.com/en-us/azure/ai-services/openai/concepts/models).

-### Create Resource and Get API Key
+#### Create Resource and Get API Key

 To start using Azure OpenAI, you must first [create a resource](https://learn.microsoft.com/azure/cognitive-services/openai/how-to/create-resource?pivots=web-portal#create-a-resource). You'll need your API key, model name, and resource URL, which must include the `api-version` parameter (see the example below).

-### Configuration
+#### Configuration

 ```yaml
 genai:
@@ -223,4 +240,4 @@ genai:
  base_url: https://instance.cognitiveservices.azure.com/openai/responses?api-version=2025-04-01-preview
  model: gpt-5-mini
  api_key: "{FRIGATE_OPENAI_API_KEY}"
-```
+```
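
Review note: the `"{FRIGATE_OPENAI_API_KEY}"` value above relies on the environment-variable substitution described near the top of this file (variables prefixed with `FRIGATE_`). The following is a minimal Python sketch of how such a placeholder could be expanded. `expand_frigate_vars` is a hypothetical helper written for illustration, not Frigate's own code, and the key value is a dummy.

```python
import os


def expand_frigate_vars(value, environ=None):
    """Replace {FRIGATE_*} placeholders in a config string.

    Hypothetical helper for illustration only; not Frigate's implementation.
    """
    env = os.environ if environ is None else environ
    # Only variables carrying the FRIGATE_ prefix are exposed to the template.
    allowed = {k: v for k, v in env.items() if k.startswith("FRIGATE_")}
    # str.format raises KeyError when a referenced variable is not available.
    return value.format(**allowed)


if __name__ == "__main__":
    # Dummy environment so the sketch runs without touching real secrets.
    demo_env = {"FRIGATE_OPENAI_API_KEY": "dummy-key", "HOME": "/root"}
    print(expand_frigate_vars("{FRIGATE_OPENAI_API_KEY}", demo_env))
```

In this sketch a placeholder without the `FRIGATE_` prefix fails with a `KeyError` instead of silently leaking an unrelated variable into the config.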
--- a/docs/docs/configuration/hardware_acceleration_enrichments.md
+++ b/docs/docs/configuration/hardware_acceleration_enrichments.md
@@ -12,20 +12,23 @@ Some of Frigate's enrichments can use a discrete GPU or integrated GPU for accel
 Object detection and enrichments (like Semantic Search, Face Recognition, and License Plate Recognition) are independent features. To use a GPU / NPU for object detection, see the [Object Detectors](/configuration/object_detectors.md) documentation. If you want to use your GPU for any supported enrichments, you must choose the appropriate Frigate Docker image for your GPU / NPU and configure the enrichment according to its specific documentation.

 - **AMD**
+
  - ROCm support in the `-rocm` Frigate image is automatically detected for enrichments, but only some enrichment models are available due to ROCm's focus on LLMs and limited stability with certain neural network models. Frigate disables models that perform poorly or are unstable to ensure reliable operation, so only compatible enrichments may be active.

 - **Intel**
+
  - OpenVINO will automatically be detected and used for enrichments in the default Frigate image.
  - **Note:** Intel NPUs have limited model support for enrichments. GPU is recommended for enrichments when available.

 - **Nvidia**
+
  - Nvidia GPUs will automatically be detected and used for enrichments in the `-tensorrt` Frigate image.
  - Jetson devices will automatically be detected and used for enrichments in the `-tensorrt-jp6` Frigate image.

 - **RockChip**
  - RockChip NPU will automatically be detected and used for semantic search v1 and face recognition in the `-rk` Frigate image.

-Utilizing a GPU for enrichments does not require you to use the same GPU for object detection. For example, you can run the `tensorrt` Docker image to run enrichments on an Nvidia GPU and still use other dedicated hardware like a Coral or Hailo for object detection. However, one combination that is not supported is the `tensorrt` image for object detection on an Nvidia GPU and Intel iGPU for enrichments.
+Utilizing a GPU for enrichments does not require you to use the same GPU for object detection. For example, you can run the `tensorrt` Docker image for enrichments and still use other dedicated hardware like a Coral or Hailo for object detection. However, one combination that is not supported is TensorRT for object detection and OpenVINO for enrichments.

 :::note

--- a/docs/docs/configuration/index.md
+++ b/docs/docs/configuration/index.md
@@ -29,12 +29,12 @@ cameras:

 When running Frigate through the HA Add-on, the Frigate `/config` directory is mapped to `/addon_configs/<addon_directory>` in the host, where `<addon_directory>` is specific to the variant of the Frigate Add-on you are running.

-| Add-on Variant             | Configuration directory                   |
-| -------------------------- | ----------------------------------------- |
-| Frigate                    | `/addon_configs/ccab4aaf_frigate`         |
-| Frigate (Full Access)      | `/addon_configs/ccab4aaf_frigate-fa`      |
-| Frigate Beta               | `/addon_configs/ccab4aaf_frigate-beta`    |
-| Frigate Beta (Full Access) | `/addon_configs/ccab4aaf_frigate-fa-beta` |
+| Add-on Variant             | Configuration directory                      |
+| -------------------------- | -------------------------------------------- |
+| Frigate                    | `/addon_configs/ccab4aaf_frigate`            |
+| Frigate (Full Access)      | `/addon_configs/ccab4aaf_frigate-fa`         |
+| Frigate Beta               | `/addon_configs/ccab4aaf_frigate-beta`       |
+| Frigate Beta (Full Access) | `/addon_configs/ccab4aaf_frigate-fa-beta`    |

 **Whenever you see `/config` in the documentation, it refers to this directory.**

@@ -109,16 +109,15 @@ detectors:

 record:
  enabled: True
-  motion:
+  retain:
    days: 7
+    mode: motion
  alerts:
    retain:
      days: 30
-      mode: motion
  detections:
    retain:
      days: 30
-      mode: motion

 snapshots:
  enabled: True
@@ -166,16 +165,15 @@ detectors:

 record:
  enabled: True
-  motion:
+  retain:
    days: 7
+    mode: motion
  alerts:
    retain:
      days: 30
-      mode: motion
  detections:
    retain:
      days: 30
-      mode: motion

 snapshots:
  enabled: True
@@ -233,16 +231,15 @@ model:

 record:
  enabled: True
-  motion:
+  retain:
    days: 7
+    mode: motion
  alerts:
    retain:
      days: 30
-      mode: motion
  detections:
    retain:
      days: 30
-      mode: motion

 snapshots:
  enabled: True
--- a/docs/docs/configuration/object_detectors.md
+++ b/docs/docs/configuration/object_detectors.md
@@ -34,7 +34,7 @@ Frigate supports multiple different detectors that work on different types of ha

 **Nvidia GPU**

- [ONNX](#onnx): Nvidia GPUs will automatically be detected and used as a detector in the `-tensorrt` Frigate image when a supported ONNX model is configured.
+- [ONNX](#onnx): TensorRT will automatically be detected and used as a detector in the `-tensorrt` Frigate image when a supported ONNX model is configured.

 **Nvidia Jetson** <CommunityBadge />

@@ -65,7 +65,7 @@ This does not affect using hardware for accelerating other tasks such as [semant

 # Officially Supported Detectors

-Frigate provides a number of builtin detector types. By default, Frigate will use a single CPU detector. Other detectors may require additional configuration as described below. When using multiple detectors they will run in dedicated processes, but pull from a common queue of detection requests from across all cameras.
+Frigate provides the following builtin detector types: `cpu`, `edgetpu`, `hailo8l`, `memryx`, `onnx`, `openvino`, `rknn`, and `tensorrt`. By default, Frigate will use a single CPU detector. Other detectors may require additional configuration as described below. When using multiple detectors they will run in dedicated processes, but pull from a common queue of detection requests from across all cameras.

 ## Edge TPU Detector

@@ -157,13 +157,7 @@ A TensorFlow Lite model is provided in the container at `/edgetpu_model.tflite`

 #### YOLOv9

-YOLOv9 models that are compiled for TensorFlow Lite and properly quantized are supported, but not included by default. [Instructions](#yolov9-for-google-coral-support) for downloading a model with support for the Google Coral.
-
-:::tip
-
-**Frigate+ Users:** Follow the [instructions](../integrations/plus#use-models) to set a model ID in your config file.
-
-:::
+YOLOv9 models that are compiled for TensorFlow Lite and properly quantized are supported, but not included by default. [Download the model](https://github.com/dbro/frigate-detector-edgetpu-yolo9/releases/download/v1.0/yolov9-s-relu6-best_320_int8_edgetpu.tflite), bind mount the file into the container, and provide the path with `model.path`. Note that the linked model requires a [labelmap file](https://raw.githubusercontent.com/dbro/frigate-detector-edgetpu-yolo9/refs/heads/main/labels-coco17.txt) that includes only 17 COCO classes.

 <details>
  <summary>YOLOv9 Setup & Config</summary>
@@ -660,9 +654,11 @@ ONNX is an open format for building machine learning models, Frigate supports ru
 If the correct build is used for your GPU then the GPU will be detected and used automatically.

 - **AMD**
+
  - ROCm will automatically be detected and used with the ONNX detector in the `-rocm` Frigate image.

 - **Intel**
+
  - OpenVINO will automatically be detected and used with the ONNX detector in the default Frigate image.

 - **Nvidia**
@@ -1560,11 +1556,7 @@ cd tensorrt_demos/yolo
 python3 yolo_to_onnx.py -m yolov7-320
 ```

-#### YOLOv9 for Google Coral Support
-
-[Download the model](https://github.com/dbro/frigate-detector-edgetpu-yolo9/releases/download/v1.0/yolov9-s-relu6-best_320_int8_edgetpu.tflite), bind mount the file into the container, and provide the path with `model.path`. Note that the linked model requires a 17-label [labelmap file](https://raw.githubusercontent.com/dbro/frigate-detector-edgetpu-yolo9/refs/heads/main/labels-coco17.txt) that includes only 17 COCO classes.
-
-#### YOLOv9 for other detectors
+#### YOLOv9

 YOLOv9 model can be exported as ONNX using the command below. You can copy and paste the whole thing to your terminal and execute, altering `MODEL_SIZE=t` and `IMG_SIZE=320` in the first line to the [model size](https://github.com/WongKinYiu/yolov9#performance) you would like to convert (available model sizes are `t`, `s`, `m`, `c`, and `e`, common image sizes are `320` and `640`).

--- a/docs/docs/frigate/hardware.md
+++ b/docs/docs/frigate/hardware.md
@@ -41,8 +41,8 @@ If the EQ13 is out of stock, the link below may take you to a suggested alternat
 | Name                                                                                                          | Capabilities                                                               | Notes                                               |
 | ------------------------------------------------------------------------------------------------------------- | -------------------------------------------------------------------------- | --------------------------------------------------- |
 | Beelink EQ13 (<a href="https://amzn.to/4jn2qVr" target="_blank" rel="nofollow noopener sponsored">Amazon</a>) | Can run object detection on several 1080p cameras with low-medium activity | Dual gigabit NICs for easy isolated camera network. |
-| Intel 1120p ([Amazon](https://www.amazon.com/Beelink-i3-1220P-Computer-Display-Gigabit/dp/B0DDCKT9YP))        | Can handle a large number of 1080p cameras with high activity              |                                                     |
-| Intel 125H ([Amazon](https://www.amazon.com/MINISFORUM-Pro-125H-Barebone-Computer-HDMI2-1/dp/B0FH21FSZM))     | Can handle a significant number of 1080p cameras with high activity        | Includes NPU for more efficient detection in 0.17+  |
+| Intel 1220P ([Amazon](https://www.amazon.com/Beelink-i3-1220P-Computer-Display-Gigabit/dp/B0DDCKT9YP))        | Can handle a large number of 1080p cameras with high activity              |                                                     |
+| Intel 125H ([Amazon](https://www.amazon.com/MINISFORUM-Pro-125H-Barebone-Computer-HDMI2-1/dp/B0FH21FSZM))     | Can handle a significant number of 1080p cameras with high activity        | Includes NPU for more efficient detection in 0.17+  |

 ## Detectors

@@ -86,7 +86,7 @@ Frigate supports multiple different detectors that work on different types of ha

 **Nvidia**

- [Nvidia GPU](#nvidia-gpus): Nvidia GPUs can provide efficient object detection.
+- [TensorRT](#tensorrt---nvidia-gpu): TensorRT can run on Nvidia GPUs to provide efficient object detection.
  - [Supports majority of model architectures via ONNX](../../configuration/object_detectors#onnx-supported-models)
  - Runs well with any size models including large

@@ -172,7 +172,7 @@ Inference speeds vary greatly depending on the CPU or GPU used, some known examp
 | Intel Arc A380 | ~ 6 ms                     |                                                   | 320: ~ 10 ms 640: ~ 22 ms | 336: 20 ms 448: 27 ms  |                                    |
 | Intel Arc A750 | ~ 4 ms                     |                                                   | 320: ~ 8 ms               |                        |                                    |

-### Nvidia GPUs
+### TensorRT - Nvidia GPU

 Frigate is able to utilize an Nvidia GPU which supports the 12.x series of CUDA libraries.

@@ -182,6 +182,8 @@ Frigate is able to utilize an Nvidia GPU which supports the 12.x series of CUDA

 Make sure your host system has the [nvidia-container-runtime](https://docs.docker.com/config/containers/resource_constraints/#access-an-nvidia-gpu) installed to pass through the GPU to the container and the host system has a compatible driver installed for your GPU.

+There are improved capabilities in newer GPU architectures that TensorRT can benefit from, such as INT8 operations and Tensor cores. The features compatible with your hardware will be optimized when the model is converted to a trt file. Currently the script provided for generating the model provides a switch to enable/disable FP16 operations. If you wish to use newer features such as INT8 optimization, more work is required.
+
 #### Compatibility References:

 [NVIDIA TensorRT Support Matrix](https://docs.nvidia.com/deeplearning/tensorrt-rtx/latest/getting-started/support-matrix.html)
@@ -190,7 +192,7 @@ Make sure your host system has the [nvidia-container-runtime](https://docs.docke

 [NVIDIA GPU Compute Capability](https://developer.nvidia.com/cuda-gpus)

-Inference is done with the `onnx` detector type. Speeds will vary greatly depending on the GPU and the model used.
+Inference speeds will vary greatly depending on the GPU and the model used.
 `tiny (t)` variants are faster than the equivalent non-tiny model, some known examples are below:

 ✅ - Accelerated with CUDA Graphs
--- a/docs/docs/frigate/installation.md
+++ b/docs/docs/frigate/installation.md
@@ -56,7 +56,7 @@ services:
    volumes:
      - /path/to/your/config:/config
      - /path/to/your/storage:/media/frigate
-      - type: tmpfs # 1GB In-memory filesystem for recording segment storage
+      - type: tmpfs # Recommended: 1GB of memory
        target: /tmp/cache
        tmpfs:
          size: 1000000000
@@ -123,7 +123,7 @@ On Raspberry Pi OS **Trixie**, the Hailo driver is no longer shipped with the ke
   :::note

   If you are **not** using a Raspberry Pi with **Bookworm OS**, skip this step and proceed directly to step 2.
-
+   
   If you are using Raspberry Pi with **Trixie OS**, also skip this step and proceed directly to step 2.

   :::
@@ -133,13 +133,13 @@ On Raspberry Pi OS **Trixie**, the Hailo driver is no longer shipped with the ke
   ```bash
   lsmod | grep hailo
   ```
-
+   
   If it shows `hailo_pci`, unload it:

   ```bash
   sudo modprobe -r hailo_pci
   ```
-
+   
   Then locate the built-in kernel driver and rename it so it cannot be loaded.
   Renaming allows the original driver to be restored later if needed.
   First, locate the currently installed kernel module:
@@ -149,29 +149,28 @@ On Raspberry Pi OS **Trixie**, the Hailo driver is no longer shipped with the ke
   ```

   Example output:
-
+   
   ```
   /lib/modules/6.6.31+rpt-rpi-2712/kernel/drivers/media/pci/hailo/hailo_pci.ko.xz
   ```
-
   Save the module path to a variable:
-
+   
   ```bash
   BUILTIN=$(modinfo -n hailo_pci)
   ```

   And rename the module by appending .bak:
-
+    
   ```bash
   sudo mv "$BUILTIN" "${BUILTIN}.bak"
   ```
-
+   
   Now refresh the kernel module map so the system recognizes the change:
-
+   
   ```bash
   sudo depmod -a
   ```
-
+   
   Reboot your Raspberry Pi:

   ```bash
@@ -207,6 +206,7 @@ On Raspberry Pi OS **Trixie**, the Hailo driver is no longer shipped with the ke
   ```

   The script will:
+
   - Install necessary build dependencies
   - Clone and build the Hailo driver from the official repository
   - Install the driver
@@ -236,18 +236,18 @@ On Raspberry Pi OS **Trixie**, the Hailo driver is no longer shipped with the ke
   ```

   Verify the driver version:
-
+   
   ```bash
   cat /sys/module/hailo_pci/version
   ```
-
+   
   Verify that the firmware was installed correctly:
-
+   
   ```bash
   ls -l /lib/firmware/hailo/hailo8_fw.bin
   ```

-   **Optional: Fix PCIe descriptor page size error**
+  **Optional: Fix PCIe descriptor page size error**

   If you encounter the following error:

@@ -462,7 +462,7 @@ services:
      - /etc/localtime:/etc/localtime:ro
      - /path/to/your/config:/config
      - /path/to/your/storage:/media/frigate
-      - type: tmpfs # 1GB In-memory filesystem for recording segment storage
+      - type: tmpfs # Recommended: 1GB of memory
        target: /tmp/cache
        tmpfs:
          size: 1000000000
@@ -502,12 +502,12 @@ The official docker image tags for the current stable version are:

 - `stable` - Standard Frigate build for amd64 & RPi Optimized Frigate build for arm64. This build includes support for Hailo devices as well.
 - `stable-standard-arm64` - Standard Frigate build for arm64
-- `stable-tensorrt` - Frigate build specific for amd64 devices running an Nvidia GPU
+- `stable-tensorrt` - Frigate build specific for amd64 devices running an nvidia GPU
 - `stable-rocm` - Frigate build for [AMD GPUs](../configuration/object_detectors.md#amdrocm-gpu-detector)

 The community supported docker image tags for the current stable version are:

-- `stable-tensorrt-jp6` - Frigate build optimized for Nvidia Jetson devices running Jetpack 6
+- `stable-tensorrt-jp6` - Frigate build optimized for nvidia Jetson devices running Jetpack 6
 - `stable-rk` - Frigate build for SBCs with Rockchip SoC

 ## Home Assistant Add-on
@@ -521,7 +521,7 @@ There are important limitations in HA OS to be aware of:
 - Separate local storage for media is not yet supported by Home Assistant
 - AMD GPUs are not supported because HA OS does not include the mesa driver.
 - Intel NPUs are not supported because HA OS does not include the NPU firmware.
-- Nvidia GPUs are not supported because addons do not support the Nvidia runtime.
+- Nvidia GPUs are not supported because addons do not support the nvidia runtime.

 :::

@@ -694,18 +694,17 @@ Log into QNAP, open Container Station. Frigate docker container should be listed

 :::warning

-macOS uses port 5000 for its Airplay Receiver service. If you want to expose port 5000 in Frigate for local app and API access the port will need to be mapped to another port on the host e.g. 5001
+macOS uses port 5000 for its Airplay Receiver service.  If you want to expose port 5000 in Frigate for local app and API access the port will need to be mapped to another port on the host e.g. 5001

 Failure to remap port 5000 on the host will result in the WebUI and all API endpoints on port 5000 being unreachable, even if port 5000 is exposed correctly in Docker.

 :::

-Docker containers on macOS can be orchestrated by either [Docker Desktop](https://docs.docker.com/desktop/setup/install/mac-install/) or [OrbStack](https://orbstack.dev) (native swift app). The difference in inference speeds is negligible, however CPU, power consumption and container start times will be lower on OrbStack because it is a native Swift application.
+Docker containers on macOS can be orchestrated by either [Docker Desktop](https://docs.docker.com/desktop/setup/install/mac-install/) or [OrbStack](https://orbstack.dev) (native swift app). The difference in inference speeds is negligible, however CPU, power consumption and container start times will be lower on OrbStack because it is a native Swift application. 

 To allow Frigate to use the Apple Silicon Neural Engine / Processing Unit (NPU) the host must be running [Apple Silicon Detector](../configuration/object_detectors.md#apple-silicon-detector) on the host (outside Docker)

 #### Docker Compose example
-
 ```yaml
 services:
  frigate:
@@ -720,7 +719,7 @@ services:
    ports:
      - "8971:8971"
      # If exposing on macOS map to a different host port like 5001 or any other port with no conflicts
-      # - "5001:5000" # Internal unauthenticated access. Expose carefully.
+      # - "5001:5000" # Internal unauthenticated access. Expose carefully. 
      - "8554:8554" # RTSP feeds
    extra_hosts:
      # This is very important
--- a/docs/docs/frigate/updating.md
+++ b/docs/docs/frigate/updating.md
@@ -20,6 +20,7 @@ Keeping Frigate up to date ensures you benefit from the latest features, perform
 If you’re running Frigate via Docker (recommended method), follow these steps:

 1. **Stop the Container**:
+
   - If using Docker Compose:
     ```bash
     docker compose down frigate
@@ -30,8 +31,9 @@ If you’re running Frigate via Docker (recommended method), follow these steps:
     ```

 2. **Update and Pull the Latest Image**:
+
   - If using Docker Compose:
-     - Edit your `docker-compose.yml` file to specify the desired version tag (e.g., `0.17.0` instead of `0.16.4`). For example:
+     - Edit your `docker-compose.yml` file to specify the desired version tag (e.g., `0.17.0` instead of `0.16.3`). For example:
       ```yaml
       services:
         frigate:
@@ -49,6 +51,7 @@ If you’re running Frigate via Docker (recommended method), follow these steps:
       ```

 3. **Start the Container**:
+
   - If using Docker Compose:
     ```bash
     docker compose up -d
@@ -72,15 +75,18 @@ If you’re running Frigate via Docker (recommended method), follow these steps:
 For users running Frigate as a Home Assistant Addon:

 1. **Check for Updates**:
+
   - Navigate to **Settings > Add-ons** in Home Assistant.
   - Find your installed Frigate addon (e.g., "Frigate NVR" or "Frigate NVR (Full Access)").
   - If an update is available, you’ll see an "Update" button.

 2. **Update the Addon**:
+
   - Click the "Update" button next to the Frigate addon.
   - Wait for the process to complete. Home Assistant will handle downloading and installing the new version.

 3. **Restart the Addon**:
+
   - After updating, go to the addon’s page and click "Restart" to apply the changes.

 4. **Verify the Update**:
@@ -99,8 +105,8 @@ If an update causes issues:
 1. Stop Frigate.
 2. Restore your backed-up config file and database.
 3. Revert to the previous image version:
-   - For Docker: Specify an older tag (e.g., `ghcr.io/blakeblackshear/frigate:0.16.4`) in your `docker run` command.
-   - For Docker Compose: Edit your `docker-compose.yml`, specify the older version tag (e.g., `ghcr.io/blakeblackshear/frigate:0.16.4`), and re-run `docker compose up -d`.
+   - For Docker: Specify an older tag (e.g., `ghcr.io/blakeblackshear/frigate:0.16.3`) in your `docker run` command.
+   - For Docker Compose: Edit your `docker-compose.yml`, specify the older version tag (e.g., `ghcr.io/blakeblackshear/frigate:0.16.3`), and re-run `docker compose up -d`.
   - For Home Assistant: Reinstall the previous addon version manually via the repository if needed and restart the addon.
 4. Verify the old version is running again.

--- a/docs/docs/guides/getting_started.md
+++ b/docs/docs/guides/getting_started.md
@@ -119,7 +119,7 @@ services:
    volumes:
      - ./config:/config
      - ./storage:/media/frigate
-      - type: tmpfs # 1GB In-memory filesystem for recording segment storage
+      - type: tmpfs # Optional: 1GB of memory, reduces SSD/SD Card wear
        target: /tmp/cache
        tmpfs:
          size: 1000000000
--- a/docs/docs/integrations/plus.md
+++ b/docs/docs/integrations/plus.md
@@ -54,8 +54,6 @@ Once you have [requested your first model](../plus/first_model.md) and gotten yo
 You can either choose the new model from the Frigate+ pane in the Settings page of the Frigate UI, or manually set the model at the root level in your config:

 ```yaml
-detectors: ...
-
 model:
  path: plus://<your_model_id>
 ```
--- a/docs/docs/plus/first_model.md
+++ b/docs/docs/plus/first_model.md
@@ -24,8 +24,6 @@ You will receive an email notification when your Frigate+ model is ready.
 Models available in Frigate+ can be used with a special model path. No other information needs to be configured because it fetches the remaining config from Frigate+ automatically.

 ```yaml
-detectors: ...
-
 model:
  path: plus://<your_model_id>
 ```
--- a/docs/docs/plus/index.md
+++ b/docs/docs/plus/index.md
@@ -15,15 +15,15 @@ There are three model types offered in Frigate+, `mobiledet`, `yolonas`, and `yo

 Not all model types are supported by all detectors, so it's important to choose a model type to match your detector as shown in the table under [supported detector types](#supported-detector-types). You can test model types for compatibility and speed on your hardware by using the base models.

-| Model Type  | Description                                                                                                                                                    |
-| ----------- | -------------------------------------------------------------------------------------------------------------------------------------------------------------- |
-| `mobiledet` | Based on the same architecture as the default model included with Frigate. Runs on Google Coral devices and CPUs.                                              |
-| `yolonas`   | A newer architecture that offers slightly higher accuracy and improved detection of small objects. Runs on Intel, NVidia GPUs, and AMD GPUs.                   |
-| `yolov9`    | A leading SOTA (state of the art) object detection model with similar performance to yolonas, but on a wider range of hardware options. Runs on most hardware. |
+| Model Type  | Description                                                                                                                                                                                                                    |
+| ----------- | ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------ |
+| `mobiledet` | Based on the same architecture as the default model included with Frigate. Runs on Google Coral devices and CPUs.                                                                                                              |
+| `yolonas`   | A newer architecture that offers slightly higher accuracy and improved detection of small objects. Runs on Intel, NVidia GPUs, and AMD GPUs.                                                                                   |
+| `yolov9`    | A leading SOTA (state of the art) object detection model with similar performance to yolonas, but on a wider range of hardware options. Runs on Intel, NVidia GPUs, AMD GPUs, Hailo, MemryX, Apple Silicon, and Rockchip NPUs. |

 ### YOLOv9 Details

-YOLOv9 models are available in `s`, `t`, `edgetpu` variants. When requesting a `yolov9` model, you will be prompted to choose a variant. If you want the model to be compatible with a Google Coral, you will need to choose the `edgetpu` variant. If you are unsure what variant to choose, you should perform some tests with the base models to find the performance level that suits you. The `s` size is most similar to the current `yolonas` models in terms of inference times and accuracy, and a good place to start is the `320x320` resolution model for `yolov9s`.
+YOLOv9 models are available in `s` and `t` sizes. When requesting a `yolov9` model, you will be prompted to choose a size. If you are unsure what size to choose, you should perform some tests with the base models to find the performance level that suits you. The `s` size is most similar to the current `yolonas` models in terms of inference times and accuracy, and a good place to start is the `320x320` resolution model for `yolov9s`.

 :::info

@@ -37,21 +37,23 @@ If you have a Hailo device, you will need to specify the hardware you have when

 #### Rockchip (RKNN) Support

-Rockchip models are automatically converted as of 0.17. For 0.16, YOLOv9 onnx models will need to be manually converted. First, you will need to configure Frigate to use the model id for your YOLOv9 onnx model so it downloads the model to your `model_cache` directory. From there, you can follow the [documentation](/configuration/object_detectors.md#converting-your-own-onnx-model-to-rknn-format) to convert it.
+For 0.16, YOLOv9 onnx models will need to be manually converted. First, you will need to configure Frigate to use the model id for your YOLOv9 onnx model so it downloads the model to your `model_cache` directory. From there, you can follow the [documentation](/configuration/object_detectors.md#converting-your-own-onnx-model-to-rknn-format) to convert it. Automatic conversion is available in 0.17 and later.

 ## Supported detector types

-Currently, Frigate+ models support CPU (`cpu`), Google Coral (`edgetpu`), OpenVino (`openvino`), ONNX (`onnx`), Hailo (`hailo8l`), and Rockchip (`rknn`) detectors.
+Currently, Frigate+ models support CPU (`cpu`), Google Coral (`edgetpu`), OpenVino (`openvino`), ONNX (`onnx`), Hailo (`hailo8l`), and Rockchip\* (`rknn`) detectors.

 | Hardware                                                                         | Recommended Detector Type | Recommended Model Type |
 | -------------------------------------------------------------------------------- | ------------------------- | ---------------------- |
 | [CPU](/configuration/object_detectors.md#cpu-detector-not-recommended)           | `cpu`                     | `mobiledet`            |
-| [Coral (all form factors)](/configuration/object_detectors.md#edge-tpu-detector) | `edgetpu`                 | `yolov9`               |
+| [Coral (all form factors)](/configuration/object_detectors.md#edge-tpu-detector) | `edgetpu`                 | `mobiledet`            |
 | [Intel](/configuration/object_detectors.md#openvino-detector)                    | `openvino`                | `yolov9`               |
 | [NVidia GPU](/configuration/object_detectors#onnx)                               | `onnx`                    | `yolov9`               |
 | [AMD ROCm GPU](/configuration/object_detectors#amdrocm-gpu-detector)             | `onnx`                    | `yolov9`               |
 | [Hailo8/Hailo8L/Hailo8R](/configuration/object_detectors#hailo-8)                | `hailo8l`                 | `yolov9`               |
-| [Rockchip NPU](/configuration/object_detectors#rockchip-platform)                | `rknn`                    | `yolov9`               |
+| [Rockchip NPU](/configuration/object_detectors#rockchip-platform)\*              | `rknn`                    | `yolov9`               |
+
+_\* Requires manual conversion in 0.16. Automatic conversion available in 0.17 and later._

 ## Improving your model

@@ -79,7 +81,7 @@ Candidate labels are also available for annotation. These labels don't have enou

 Where possible, these labels are mapped to existing labels during training. For example, any `baby` labels are mapped to `person` until support for new labels is added.

-The candidate labels are: `baby`, `bpost`, `badger`, `possum`, `rodent`, `chicken`, `groundhog`, `boar`, `hedgehog`, `tractor`, `golf cart`, `garbage truck`, `bus`, `sports ball`, `la_poste`, `lawnmower`, `heron`, `rickshaw`, `wombat`, `auspost`, `aramex`, `bobcat`, `mustelid`, `transoflex`, `airplane`, `drone`, `mountain_lion`, `crocodile`, `turkey`, `baby_stroller`, `monkey`, `coyote`, `porcupine`, `parcelforce`, `sheep`, `snake`, `helicopter`, `lizard`, `duck`, `hermes`, `cargus`, `fan_courier`, `sameday`
+The candidate labels are: `baby`, `bpost`, `badger`, `possum`, `rodent`, `chicken`, `groundhog`, `boar`, `hedgehog`, `tractor`, `golf cart`, `garbage truck`, `bus`, `sports ball`

 Candidate labels are not available for automatic suggestions.

--- a/frigate/api/app.py
+++ b/frigate/api/app.py
@@ -38,7 +38,6 @@ from frigate.config.camera.updater import (
    CameraConfigUpdateTopic,
 )
 from frigate.ffmpeg_presets import FFMPEG_HWACCEL_VAAPI, _gpu_selector
-from frigate.genai import GenAIClientManager
 from frigate.jobs.media_sync import (
    get_current_media_sync_job,
    get_media_sync_job_by_id,
@@ -433,7 +432,6 @@ def config_set(request: Request, body: AppConfigSetBody):
    if body.requires_restart == 0 or body.update_topic:
        old_config: FrigateConfig = request.app.frigate_config
        request.app.frigate_config = config
-        request.app.genai_manager = GenAIClientManager(config)

        if body.update_topic:
            if body.update_topic.startswith("config/cameras/"):
--- a/frigate/api/auth.py
+++ b/frigate/api/auth.py
@@ -1037,4 +1037,4 @@ async def get_allowed_cameras_for_filter(request: Request):
    role = current_user["role"]
    all_camera_names = set(request.app.frigate_config.cameras.keys())
    roles_dict = request.app.frigate_config.auth.roles
-    return User.get_allowed_cameras(role, roles_dict, all_camera_names)
+    return User.get_allowed_cameras(role, roles_dict, all_camera_names)
--- a/frigate/api/chat.py
+++ b/frigate/api/chat.py
@@ -23,6 +23,7 @@ from frigate.api.defs.response.chat_response import (
 )
 from frigate.api.defs.tags import Tags
 from frigate.api.event import events
+from frigate.genai import get_genai_client

 logger = logging.getLogger(__name__)

@@ -189,7 +190,7 @@ async def _execute_search_objects(
        return JSONResponse(
            content={
                "success": False,
-                "message": "Error searching objects",
+                "message": f"Error searching objects: {str(e)}",
            },
            status_code=500,
        )
@@ -278,7 +279,7 @@ async def _execute_get_live_context(
    except Exception as e:
        logger.error(f"Error executing get_live_context: {e}", exc_info=True)
        return {
-            "error": "Error getting live context",
+            "error": f"Error getting live context: {str(e)}",
        }


@@ -382,7 +383,7 @@ async def chat_completion(
    6. Repeats until final answer
    7. Returns response to user
    """
-    genai_client = request.app.genai_manager.tool_client
+    genai_client = get_genai_client(request.app.frigate_config)
    if not genai_client:
        return JSONResponse(
            content={
@@ -598,7 +599,9 @@ Always be accurate with time calculations based on the current date provided.{ca
                        f"Error executing tool {tool_name} (id: {tool_call_id}): {e}",
                        exc_info=True,
                    )
-                    error_content = json.dumps({"error": "Tool execution failed"})
+                    error_content = json.dumps(
+                        {"error": f"Tool execution failed: {str(e)}"}
+                    )
                    tool_results.append(
                        {
                            "role": "tool",
--- a/frigate/api/fastapi_app.py
+++ b/frigate/api/fastapi_app.py
@@ -33,7 +33,6 @@ from frigate.comms.event_metadata_updater import (
 from frigate.config import FrigateConfig
 from frigate.config.camera.updater import CameraConfigUpdatePublisher
 from frigate.embeddings import EmbeddingsContext
-from frigate.genai import GenAIClientManager
 from frigate.ptz.onvif import OnvifController
 from frigate.stats.emitter import StatsEmitter
 from frigate.storage import StorageMaintainer
@@ -135,7 +134,6 @@ def create_fastapi_app(
    app.include_router(record.router)
    # App Properties
    app.frigate_config = frigate_config
-    app.genai_manager = GenAIClientManager(frigate_config)
    app.embeddings = embeddings
    app.detected_frames_processor = detected_frames_processor
    app.storage_maintainer = storage_maintainer
--- a/frigate/api/review.py
+++ b/frigate/api/review.py
@@ -33,6 +33,7 @@ from frigate.api.defs.response.review_response import (
    ReviewSummaryResponse,
 )
 from frigate.api.defs.tags import Tags
+from frigate.config import FrigateConfig
 from frigate.embeddings import EmbeddingsContext
 from frigate.models import Recordings, ReviewSegment, UserReviewStatus
 from frigate.review.types import SeverityEnum
@@ -746,7 +747,9 @@ async def set_not_reviewed(
    description="Use GenAI to summarize review items over a period of time.",
 )
 def generate_review_summary(request: Request, start_ts: float, end_ts: float):
-    if not request.app.genai_manager.vision_client:
+    config: FrigateConfig = request.app.frigate_config
+
+    if not config.genai.provider:
        return JSONResponse(
            content=(
                {
--- a/frigate/config/camera/genai.py
+++ b/frigate/config/camera/genai.py
@@ -6,7 +6,7 @@ from pydantic import Field
 from ..base import FrigateBaseModel
 from ..env import EnvString

-__all__ = ["GenAIConfig", "GenAIProviderEnum", "GenAIRoleEnum"]
+__all__ = ["GenAIConfig", "GenAIProviderEnum"]


 class GenAIProviderEnum(str, Enum):
@@ -17,12 +17,6 @@ class GenAIProviderEnum(str, Enum):
    llamacpp = "llamacpp"


-class GenAIRoleEnum(str, Enum):
-    tools = "tools"
-    vision = "vision"
-    embeddings = "embeddings"
-
-
 class GenAIConfig(FrigateBaseModel):
    """Primary GenAI Config to define GenAI Provider."""

@@ -30,14 +24,6 @@ class GenAIConfig(FrigateBaseModel):
    base_url: Optional[str] = Field(default=None, title="Provider base url.")
    model: str = Field(default="gpt-4o", title="GenAI model.")
    provider: GenAIProviderEnum | None = Field(default=None, title="GenAI provider.")
-    roles: list[GenAIRoleEnum] = Field(
-        default_factory=lambda: [
-            GenAIRoleEnum.embeddings,
-            GenAIRoleEnum.vision,
-            GenAIRoleEnum.tools,
-        ],
-        title="GenAI roles (tools, vision, embeddings); one provider per role.",
-    )
    provider_options: dict[str, Any] = Field(
        default={}, title="GenAI Provider extra options."
    )
--- a/frigate/config/config.py
+++ b/frigate/config/config.py
@@ -45,7 +45,7 @@ from .camera.audio import AudioConfig
 from .camera.birdseye import BirdseyeConfig
 from .camera.detect import DetectConfig
 from .camera.ffmpeg import FfmpegConfig
-from .camera.genai import GenAIConfig, GenAIRoleEnum
+from .camera.genai import GenAIConfig
 from .camera.motion import MotionConfig
 from .camera.notification import NotificationConfig
 from .camera.objects import FilterConfig, ObjectConfig
@@ -347,9 +347,9 @@ class FrigateConfig(FrigateBaseModel):
        default_factory=ModelConfig, title="Detection model configuration."
    )

-    # GenAI config (named provider configs: name -> GenAIConfig)
-    genai: Dict[str, GenAIConfig] = Field(
-        default_factory=dict, title="Generative AI configuration (named providers)."
+    # GenAI config
+    genai: GenAIConfig = Field(
+        default_factory=GenAIConfig, title="Generative AI configuration."
    )

    # Camera config
@@ -431,18 +431,6 @@ class FrigateConfig(FrigateBaseModel):
        # set notifications state
        self.notifications.enabled_in_config = self.notifications.enabled

-        # validate genai: each role (tools, vision, embeddings) at most once
-        role_to_name: dict[GenAIRoleEnum, str] = {}
-        for name, genai_cfg in self.genai.items():
-            for role in genai_cfg.roles:
-                if role in role_to_name:
-                    raise ValueError(
-                        f"GenAI role '{role.value}' is assigned to both "
-                        f"'{role_to_name[role]}' and '{name}'; each role must have "
-                        "exactly one provider."
-                    )
-                role_to_name[role] = name
-
        # set default min_score for object attributes
        for attribute in self.model.all_attributes:
            if not self.objects.filters.get(attribute):
--- a/frigate/detectors/detection_runners.py
+++ b/frigate/detectors/detection_runners.py
@@ -603,4 +603,4 @@ def get_optimized_runner(
            provider_options=options,
        ),
        model_type=model_type,
-    )
+    )
--- a/frigate/embeddings/maintainer.py
+++ b/frigate/embeddings/maintainer.py
@@ -59,7 +59,7 @@ from frigate.data_processing.real_time.license_plate import (
 from frigate.data_processing.types import DataProcessorMetrics, PostProcessDataEnum
 from frigate.db.sqlitevecq import SqliteVecQueueDatabase
 from frigate.events.types import EventTypeEnum, RegenerateDescriptionEnum
-from frigate.genai import GenAIClientManager
+from frigate.genai import get_genai_client
 from frigate.models import Event, Recordings, ReviewSegment, Trigger
 from frigate.util.builtin import serialize
 from frigate.util.file import get_event_thumbnail_bytes
@@ -144,7 +144,7 @@ class EmbeddingMaintainer(threading.Thread):
        self.frame_manager = SharedMemoryFrameManager()

        self.detected_license_plates: dict[str, dict[str, Any]] = {}
-        self.genai_manager = GenAIClientManager(config)
+        self.genai_client = get_genai_client(config)

        # model runners to share between realtime and post processors
        if self.config.lpr.enabled:
@@ -203,15 +203,12 @@ class EmbeddingMaintainer(threading.Thread):
        # post processors
        self.post_processors: list[PostProcessorApi] = []

-        if self.genai_manager.vision_client is not None and any(
+        if self.genai_client is not None and any(
            c.review.genai.enabled_in_config for c in self.config.cameras.values()
        ):
            self.post_processors.append(
                ReviewDescriptionProcessor(
-                    self.config,
-                    self.requestor,
-                    self.metrics,
-                    self.genai_manager.vision_client,
+                    self.config, self.requestor, self.metrics, self.genai_client
                )
            )

@@ -249,7 +246,7 @@ class EmbeddingMaintainer(threading.Thread):
            )
            self.post_processors.append(semantic_trigger_processor)

-        if self.genai_manager.vision_client is not None and any(
+        if self.genai_client is not None and any(
            c.objects.genai.enabled_in_config for c in self.config.cameras.values()
        ):
            self.post_processors.append(
@@ -258,7 +255,7 @@ class EmbeddingMaintainer(threading.Thread):
                    self.embeddings,
                    self.requestor,
                    self.metrics,
-                    self.genai_manager.vision_client,
+                    self.genai_client,
                    semantic_trigger_processor,
                )
            )
--- a/frigate/genai/__init__.py
+++ b/frigate/genai/__init__.py
@@ -9,24 +9,13 @@ from typing import Any, Optional

 from playhouse.shortcuts import model_to_dict

-from frigate.config import CameraConfig, GenAIConfig, GenAIProviderEnum
+from frigate.config import CameraConfig, FrigateConfig, GenAIConfig, GenAIProviderEnum
 from frigate.const import CLIPS_DIR
 from frigate.data_processing.post.types import ReviewMetadata
-from frigate.genai.manager import GenAIClientManager
 from frigate.models import Event

 logger = logging.getLogger(__name__)

-__all__ = [
-    "GenAIClient",
-    "GenAIClientManager",
-    "GenAIConfig",
-    "GenAIProviderEnum",
-    "PROVIDERS",
-    "load_providers",
-    "register_genai_provider",
-]
-
 PROVIDERS = {}


@@ -363,6 +352,19 @@ Guidelines:
        }


+def get_genai_client(config: FrigateConfig) -> Optional[GenAIClient]:
+    """Get the GenAI client."""
+    if not config.genai.provider:
+        return None
+
+    load_providers()
+    provider = PROVIDERS.get(config.genai.provider)
+    if provider:
+        return provider(config.genai)
+
+    return None
+
+
 def load_providers():
    package_dir = os.path.dirname(__file__)
    for filename in os.listdir(package_dir):
--- a/frigate/genai/llama_cpp.py
+++ b/frigate/genai/llama_cpp.py
@@ -67,7 +67,6 @@ class LlamaCppClient(GenAIClient):

            # Build request payload with llama.cpp native options
            payload = {
-                "model": self.genai_config.model,
                "messages": [
                    {
                        "role": "user",
@@ -135,7 +134,6 @@ class LlamaCppClient(GenAIClient):
                    openai_tool_choice = "required"

            payload = {
-                "model": self.genai_config.model,
                "messages": messages,
            }

--- a/frigate/genai/manager.py
+++ b/frigate/genai/manager.py
@@ -1,89 +0,0 @@
-"""GenAI client manager for Frigate.
-
-Manages GenAI provider clients from Frigate config. Configuration is read only
-in _update_config(); no other code should read config.genai. Exposes clients
-by role: tool_client, vision_client, embeddings_client.
-"""
-
-import logging
-from typing import TYPE_CHECKING, Optional
-
-from frigate.config import FrigateConfig
-from frigate.config.camera.genai import GenAIRoleEnum
-
-if TYPE_CHECKING:
-    from frigate.genai import GenAIClient
-
-logger = logging.getLogger(__name__)
-
-
-class GenAIClientManager:
-    """Manages GenAI provider clients from Frigate config."""
-
-    def __init__(self, config: FrigateConfig) -> None:
-        self._config = config
-        self._tool_client: Optional[GenAIClient] = None
-        self._vision_client: Optional[GenAIClient] = None
-        self._embeddings_client: Optional[GenAIClient] = None
-        self._update_config()
-
-    def _update_config(self) -> None:
-        """Build role clients from current Frigate config.genai.
-
-        Called from __init__ and can be called again when config is reloaded.
-        Each role (tools, vision, embeddings) gets the client for the provider
-        that has that role in its roles list.
-        """
-        from frigate.genai import PROVIDERS, load_providers
-
-        self._tool_client = None
-        self._vision_client = None
-        self._embeddings_client = None
-
-        if not self._config.genai:
-            return
-
-        load_providers()
-
-        for _name, genai_cfg in self._config.genai.items():
-            if not genai_cfg.provider:
-                continue
-            provider_cls = PROVIDERS.get(genai_cfg.provider)
-            if not provider_cls:
-                logger.warning(
-                    "Unknown GenAI provider %s in config, skipping.",
-                    genai_cfg.provider,
-                )
-                continue
-            try:
-                client = provider_cls(genai_cfg)
-            except Exception as e:
-                logger.exception(
-                    "Failed to create GenAI client for provider %s: %s",
-                    genai_cfg.provider,
-                    e,
-                )
-                continue
-
-            for role in genai_cfg.roles:
-                if role == GenAIRoleEnum.tools:
-                    self._tool_client = client
-                elif role == GenAIRoleEnum.vision:
-                    self._vision_client = client
-                elif role == GenAIRoleEnum.embeddings:
-                    self._embeddings_client = client
-
-    @property
-    def tool_client(self) -> "Optional[GenAIClient]":
-        """Client configured for the tools role (e.g. chat with function calling)."""
-        return self._tool_client
-
-    @property
-    def vision_client(self) -> "Optional[GenAIClient]":
-        """Client configured for the vision role (e.g. review descriptions, object descriptions)."""
-        return self._vision_client
-
-    @property
-    def embeddings_client(self) -> "Optional[GenAIClient]":
-        """Client configured for the embeddings role."""
-        return self._embeddings_client
--- a/frigate/util/config.py
+++ b/frigate/util/config.py
@@ -438,13 +438,6 @@ def migrate_018_0(config: dict[str, dict[str, Any]]) -> dict[str, dict[str, Any]
    """Handle migrating frigate config to 0.18-0"""
    new_config = config.copy()

-    # Migrate GenAI to new format
-    genai = new_config.get("genai")
-
-    if genai and genai.get("provider"):
-        genai["roles"] = ["embeddings", "vision", "tools"]
-        new_config["genai"] = {"default": genai}
-
    # Remove deprecated sync_recordings from global record config
    if new_config.get("record", {}).get("sync_recordings") is not None:
        del new_config["record"]["sync_recordings"]
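
The block removed from `migrate_018_0` in the hunk above wrapped a legacy single-provider `genai` section under a `default` key and gave it all three roles. A minimal sketch of that transformation in isolation follows. The function name `wrap_legacy_genai` is hypothetical, and unlike the removed code it copies the nested dict rather than mutating it in place.

```python
from typing import Any


def wrap_legacy_genai(config: dict[str, Any]) -> dict[str, Any]:
    """Wrap a flat `genai` section as {"default": {...}} (hypothetical helper)."""
    new_config = config.copy()
    genai = new_config.get("genai")

    # Only a section that names a provider is treated as legacy format.
    if genai and genai.get("provider"):
        # Assumption: copy the section so the caller's dict is left untouched.
        genai = {**genai, "roles": ["embeddings", "vision", "tools"]}
        new_config["genai"] = {"default": genai}

    return new_config
```

With this change set the wrapping no longer happens during migration, so a flat `genai` section stays flat.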
Author	SHA1	Message	Date
Nicolas Mowen	a505775242	Clarify	2026-02-26 15:18:42 -07:00
Nicolas Mowen	b42b011a56	Improve GenAI docs	2026-02-26 14:52:41 -07:00
Nicolas Mowen	8793650c2f	Fix frame time access	2026-02-26 08:38:42 -07:00
Nicolas Mowen	9c8dd9a6ba	Adapt to new Gemini format	2026-02-25 09:19:56 -07:00
nulledy	507b495b90	ffmpeg Preview Segment Optimization for "high" and "very_high" (#21996) * Introduce qmax parameter for ffmpeg preview encoding Added PREVIEW_QMAX_PARAM to control ffmpeg encoding quality. * formatting * Fix spacing in qmax parameters for preview quality	2026-02-25 09:02:08 -07:00
nulledy	3525f32bc2	Allow API Events to be Detections or Alerts, depending on the Event Label (#21923) * - API created events will be alerts OR detections, depending on the event label, defaulting to alerts - Indefinite API events will extend the recording segment until those events are ended - API event start time is the actual start time, instead of having a pre-buffer of record.event_pre_capture * Instead of checking for indefinite events on a camera before deciding if we should end the segment, only update last_detection_time and last_alert_time if frame_time is greater, which should have the same effect * Add the ability to set a pre_capture number of seconds when creating a manual event via the API. Default behavior unchanged * Remove unnecessary _publish_segment_start() call * Formatting * handle last_alert_time or last_detection_time being None when checking them against the frame_time * comment manual_info["label"].split(": ")[0] for clarity	2026-02-25 09:02:08 -07:00
Josh Hawkins	ac142449f1	Improve jsmpeg player websocket handling (#21943) * improve jsmpeg player websocket handling prevent websocket console messages from appearing when player is destroyed * reformat files after ruff upgrade	2026-02-25 09:02:08 -07:00
FL42	47b89a1d60	feat: add X-Frame-Time when returning snapshot (#21932) Co-authored-by: Florent MORICONI <170678386+fmcloudconsulting@users.noreply.github.com>	2026-02-25 09:02:08 -07:00
Eric Work	cdcf56092c	Add networking options for configuring listening ports (#21779)	2026-02-25 09:02:08 -07:00
Nicolas Mowen	08ee2e21de	Add live context tool to LLM (#21754) * Add live context tool * Improve handling of images in request * Improve prompt caching	2026-02-25 09:02:08 -07:00
Nicolas Mowen	9ab4dd4538	Update to ROCm 7.2.0 (#21753) * Update to ROCm 7.2.0 * ROCm now works properly with JinaV1 * Arcface has compilation error	2026-02-25 09:02:08 -07:00
Josh Hawkins	fe5441349b	Offline preview image (#21752) * use latest preview frame for latest image when camera is offline * remove frame extraction logic * tests * frontend * add description to api endpoint	2026-02-25 09:02:08 -07:00
Nicolas Mowen	a4b1cc3a54	Implement LLM Chat API with tool calling support (#21731) * Implement initial tools definiton APIs * Add initial chat completion API with tool support * Implement other providers * Cleanup	2026-02-25 09:02:08 -07:00
John Shaw	99e25661b2	Remove parents in remove_empty_directories (#21726) The original implementation did a full directory tree walk to find and remove empty directories, so this implementation should remove the parents as well, like the original did.	2026-02-25 09:02:08 -07:00
Nicolas Mowen	20360db2c9	Implement llama.cpp GenAI Provider (#21690) * Implement llama.cpp GenAI Provider * Add docs * Update links * Fix broken mqtt links * Fix more broken anchors	2026-02-25 09:02:08 -07:00
John Shaw	3826d72c2a	Optimize empty directory cleanup for recordings (#21695) The previous empty directory cleanup did a full recursive directory walk, which can be extremely slow. This new implementation only removes directories which have a chance of being empty due to a recent file deletion.	2026-02-25 09:02:08 -07:00
Nicolas Mowen	3d5757c640	Refactor Time-Lapse Export (#21668) * refactor time lapse creation to be a separate API call with ability to pass arbitrary ffmpeg args * Add CPU fallback	2026-02-25 09:02:08 -07:00
Eugeny Tulupov	86100fde6f	Update go2rtc to v1.9.13 (#21648) Co-authored-by: Eugeny Tulupov <eugeny.tulupov@spirent.com>	2026-02-25 09:02:08 -07:00
Josh Hawkins	28b1195a79	Fix incorrect counting in sync_recordings (#21626)	2026-02-25 09:02:08 -07:00
Josh Hawkins	b6db38bd4e	use same logging pattern in sync_recordings as the other sync functions (#21625)	2026-02-25 09:02:08 -07:00
Josh Hawkins	92c6b8e484	Media sync API refactor and UI (#21542) * generic job infrastructure * types and dispatcher changes for jobs * save data in memory only for completed jobs * implement media sync job and endpoints * change logs to debug * websocket hook and types * frontend * i18n * docs tweaks * endpoint descriptions * tweak docs	2026-02-25 09:02:07 -07:00
Josh Hawkins	9381f26352	Add media sync API endpoint (#21526) * add media cleanup functions * add endpoint * remove scheduled sync recordings from cleanup * move to utils dir * tweak import * remove sync_recordings and add config migrator * remove sync_recordings * docs * remove key * clean up docs * docs fix * docs tweak	2026-02-25 09:02:07 -07:00
Nicolas Mowen	e0180005be	Add API to handle deleting recordings (#21520) * Add recording delete API * Re-organize recordings apis * Fix import * Consolidate query types	2026-02-25 09:02:07 -07:00
Nicolas Mowen	2041798702	Exports Improvements (#21521) * Add images to case folder view * Add ability to select case in export dialog * Add to mobile review too	2026-02-25 09:02:07 -07:00
Nicolas Mowen	3d23b5de30	Add support for GPU and NPU temperatures (#21495) * Add rockchip temps * Add support for GPU and NPU temperatures in the frontend * Add support for Nvidia temperature * Improve separation * Adjust graph scaling	2026-02-25 09:02:07 -07:00
Andrew Roberts	209bb44518	Camera-specific hwaccel settings for timelapse exports (correct base) (#21386) * added hwaccel_args to camera.record.export config struct * populate camera.record.export.hwaccel_args with a cascade up to camera then global if 'auto' * use new hwaccel args in export * added documentation for camera-specific hwaccel export * fix c/p error * missed an import * fleshed out the docs and comments a bit * ruff lint * separated out the tips in the doc * fix documentation * fix and simplify reference config doc	2026-02-25 09:02:07 -07:00
Nicolas Mowen	88462cd6c3	Refactor temperature reporting for detectors and implement Hailo temp reading (#21395) * Add Hailo temperature retrieval * Refactor `get_hailo_temps()` to use ctxmanager * Show Hailo temps in system UI * Move hailo_platform import to get_hailo_temps * Refactor temperatures calculations to use within detector block * Adjust webUI to handle new location --------- Co-authored-by: tigattack <10629864+tigattack@users.noreply.github.com>	2026-02-25 09:02:07 -07:00
Nicolas Mowen	c2cc23861a	Export filter UI (#21322) * Get started on export filters * implement basic filter * Implement filtering and adjust api * Improve filter handling * Improve navigation * Cleanup * handle scrolling	2026-02-25 09:02:07 -07:00
Josh Hawkins	2b46084260	Camera connection quality indicator (#21297) * add camera connection quality metrics and indicator * formatting * move stall calcs to watchdog * clean up * change watchdog to 1s and separately track time for ffmpeg retry_interval * implement status caching to reduce message volume	2026-02-25 09:02:07 -07:00
Nicolas Mowen	67466f215c	Case management UI (#21299) * Refactor export cards to match existing cards in other UI pages * Show cases separately from exports * Add proper filtering and display of cases * Add ability to edit and select cases for exports * Cleanup typing * Hide if no unassigned * Cleanup hiding logic * fix scrolling * Improve layout	2026-02-25 09:02:07 -07:00
Josh Hawkins	e011424947	refactor vainfo to search for first GPU (#21296) use existing LibvaGpuSelector to pick appropritate libva device	2026-02-25 09:02:07 -07:00
Nicolas Mowen	a1a0051dd7	implement case management for export apis (#21295)	2026-02-25 09:02:07 -07:00
Nicolas Mowen	ff331060c3	Create scaffolding for case management (#21293)	2026-02-25 09:02:07 -07:00
Nicolas Mowen	7aab1f02ec	Update version	2026-02-25 09:02:07 -07:00