
+++
disableToc = false
title = "🤖 Agents"
weight = 21
url = '/features/agents'
+++

LocalAI includes a built-in agent platform powered by LocalAGI. Agents are autonomous AI entities that can reason, use tools, maintain memory, and interact with external services — all running locally as part of the LocalAI process.

## Overview

The agent system provides:

- **Autonomous agents** with configurable goals, personalities, and capabilities
- **Tool/Action support** — agents can execute actions (web search, code execution, API calls, etc.)
- **Knowledge base (RAG)** — per-agent collections with document upload, chunking, and semantic search
- **Skills system** — reusable skill definitions that agents can leverage, with git-based skill repositories
- **SSE streaming** — real-time chat with agents via Server-Sent Events
- **Import/Export** — share agent configurations as JSON files
- **Agent Hub** — browse and download ready-made agents from agenthub.localai.io
- **Web UI** — full management interface for creating, editing, chatting with, and monitoring agents

## Getting Started

Agents are enabled by default. To disable them, set:

```bash
LOCALAI_DISABLE_AGENTS=true
```

### Creating an Agent

  1. Navigate to the Agents page in the web UI
  2. Click Create Agent or import one from the Agent Hub
  3. Configure the agent's name, model, system prompt, and actions
  4. Save and start chatting

### Importing an Agent

You can import agent configurations from JSON files:

  1. Download an agent configuration from the Agent Hub or export one from another LocalAI instance
  2. On the Agents page, click Import
  3. Select the JSON file — you'll be taken to the edit form to review and adjust the configuration before saving
  4. Click Create Agent to finalize the import

## Configuration

### Environment Variables

All agent-related settings can be configured via environment variables:

| Variable | Default | Description |
|----------|---------|-------------|
| `LOCALAI_DISABLE_AGENTS` | `false` | Disable the agent pool feature entirely |
| `LOCALAI_AGENT_POOL_API_URL` | (self-referencing) | Default API URL for agents. By default, agents call back into LocalAI's own API (`http://127.0.0.1:<port>`). Set this to point agents to an external LLM provider. |
| `LOCALAI_AGENT_POOL_API_KEY` | (LocalAI key) | Default API key for agents. Defaults to the first LocalAI API key. Set this when using an external provider. |
| `LOCALAI_AGENT_POOL_DEFAULT_MODEL` | (empty) | Default LLM model for new agents |
| `LOCALAI_AGENT_POOL_MULTIMODAL_MODEL` | (empty) | Default multimodal (vision) model for agents |
| `LOCALAI_AGENT_POOL_TRANSCRIPTION_MODEL` | (empty) | Default transcription (speech-to-text) model for agents |
| `LOCALAI_AGENT_POOL_TRANSCRIPTION_LANGUAGE` | (empty) | Default transcription language for agents |
| `LOCALAI_AGENT_POOL_TTS_MODEL` | (empty) | Default TTS (text-to-speech) model for agents |
| `LOCALAI_AGENT_POOL_STATE_DIR` | (data path) | Directory for persisting agent state. Defaults to `LOCALAI_DATA_PATH` if set, otherwise falls back to `LOCALAI_CONFIG_DIR` |
| `LOCALAI_AGENT_POOL_TIMEOUT` | `5m` | Default timeout for agent operations |
| `LOCALAI_AGENT_POOL_ENABLE_SKILLS` | `false` | Enable the skills service |
| `LOCALAI_AGENT_POOL_VECTOR_ENGINE` | `chromem` | Vector engine for knowledge base (`chromem` or `postgres`) |
| `LOCALAI_AGENT_POOL_EMBEDDING_MODEL` | `granite-embedding-107m-multilingual` | Embedding model for knowledge base |
| `LOCALAI_AGENT_POOL_CUSTOM_ACTIONS_DIR` | (empty) | Directory for custom action plugins |
| `LOCALAI_AGENT_POOL_DATABASE_URL` | (empty) | PostgreSQL connection string for collections (required when the vector engine is `postgres`) |
| `LOCALAI_AGENT_POOL_MAX_CHUNKING_SIZE` | `400` | Maximum chunk size for document ingestion |
| `LOCALAI_AGENT_POOL_CHUNK_OVERLAP` | `0` | Overlap between document chunks |
| `LOCALAI_AGENT_POOL_ENABLE_LOGS` | `false` | Enable detailed agent logging |
| `LOCALAI_AGENT_POOL_COLLECTION_DB_PATH` | (empty) | Custom path for the collections database |
| `LOCALAI_AGENT_HUB_URL` | `https://agenthub.localai.io` | URL for the Agent Hub (shown in the UI) |
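
Two of the variables above, `LOCALAI_AGENT_POOL_MAX_CHUNKING_SIZE` and `LOCALAI_AGENT_POOL_CHUNK_OVERLAP`, control how uploaded documents are split before embedding. The sketch below is illustrative only (the actual splitter in LocalAI may cut on sentence or token boundaries rather than raw characters); it just shows how size and overlap interact:

```python
# Illustrative sketch, NOT LocalAI's actual splitter: shows how a max
# chunk size and an overlap between consecutive chunks interact.

def chunk(text: str, max_size: int = 400, overlap: int = 0) -> list[str]:
    """Split text into windows of at most max_size characters,
    where consecutive windows share `overlap` characters."""
    if max_size <= 0:
        raise ValueError("max_size must be positive")
    step = max(1, max_size - overlap)  # how far each window advances
    return [text[i:i + max_size] for i in range(0, len(text), step)]
```

With the defaults (`400`, `0`) a 1,000-character document yields three non-overlapping chunks; a non-zero overlap makes adjacent chunks share text, which can help retrieval for passages that straddle a chunk boundary.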

### Knowledge Base Storage

By default, the knowledge base uses chromem — an in-process vector store that requires no external dependencies. For production deployments with larger knowledge bases, you can switch to PostgreSQL with pgvector support:

```bash
LOCALAI_AGENT_POOL_VECTOR_ENGINE=postgres
LOCALAI_AGENT_POOL_DATABASE_URL=postgresql://localrecall:localrecall@postgres:5432/localrecall?sslmode=disable
```

The PostgreSQL image quay.io/mudler/localrecall:v0.5.2-postgresql is pre-configured with pgvector and ready to use.

### Docker Compose Example

Basic setup with in-memory vector store:

```yaml
services:
  localai:
    image: localai/localai:latest
    ports:
      - 8080:8080
    environment:
      - MODELS_PATH=/models
      - LOCALAI_DATA_PATH=/data
      - LOCALAI_AGENT_POOL_DEFAULT_MODEL=hermes-3-llama3.1-8b
      - LOCALAI_AGENT_POOL_EMBEDDING_MODEL=granite-embedding-107m-multilingual
      - LOCALAI_AGENT_POOL_ENABLE_SKILLS=true
      - LOCALAI_AGENT_POOL_ENABLE_LOGS=true
    volumes:
      - models:/models
      - localai_data:/data
      - localai_config:/etc/localai
volumes:
  models:
  localai_data:
  localai_config:
```

Setup with PostgreSQL for persistent knowledge base:

```yaml
services:
  localai:
    image: localai/localai:latest
    depends_on:
      postgres:
        condition: service_healthy
    ports:
      - 8080:8080
    environment:
      - MODELS_PATH=/models
      - LOCALAI_AGENT_POOL_DEFAULT_MODEL=hermes-3-llama3.1-8b
      - LOCALAI_AGENT_POOL_EMBEDDING_MODEL=granite-embedding-107m-multilingual
      - LOCALAI_AGENT_POOL_ENABLE_SKILLS=true
      - LOCALAI_AGENT_POOL_ENABLE_LOGS=true
      # PostgreSQL-backed knowledge base
      - LOCALAI_AGENT_POOL_VECTOR_ENGINE=postgres
      - LOCALAI_AGENT_POOL_DATABASE_URL=postgresql://localrecall:localrecall@postgres:5432/localrecall?sslmode=disable
    volumes:
      - models:/models
      - localai_config:/etc/localai

  postgres:
    image: quay.io/mudler/localrecall:v0.5.2-postgresql
    environment:
      - POSTGRES_DB=localrecall
      - POSTGRES_USER=localrecall
      - POSTGRES_PASSWORD=localrecall
    volumes:
      - postgres_data:/var/lib/postgresql/data
    healthcheck:
      test: ["CMD-SHELL", "pg_isready -U localrecall"]
      interval: 10s
      timeout: 5s
      retries: 5

volumes:
  models:
  localai_config:
  postgres_data:
```

## Agent Configuration

Each agent has its own configuration that controls its behavior. Key settings include:

- **Name** — unique identifier for the agent
- **Model** — the LLM model the agent uses for reasoning
- **System Prompt** — defines the agent's personality and instructions
- **Actions** — tools the agent can use (web search, code execution, etc.)
- **Connectors** — external integrations (Slack, Discord, etc.)
- **Knowledge Base** — collections of documents for RAG
- **MCP Servers** — Model Context Protocol servers for additional tool access

The pool-level defaults (API URL, API key, models) can be set via environment variables. Individual agents can further override these in their configuration, allowing them to use different LLM providers (OpenAI, other LocalAI instances, etc.) on a per-agent basis.
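
As a rough illustration, a per-agent override in an agent's JSON configuration might look like the fragment below. The field names here are hypothetical; export an existing agent (via the web UI or the export endpoint) to see the exact schema used by your LocalAI version.

```json
{
  "name": "my-agent",
  "model": "hermes-3-llama3.1-8b",
  "system_prompt": "You are a helpful research assistant."
}
```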

## API Endpoints

All agent endpoints are grouped under `/api/agents/`:

### Agent Management

| Method | Path | Description |
|--------|------|-------------|
| GET | `/api/agents` | List all agents with status |
| POST | `/api/agents` | Create a new agent |
| GET | `/api/agents/:name` | Get agent info |
| PUT | `/api/agents/:name` | Update agent configuration |
| DELETE | `/api/agents/:name` | Delete an agent |
| GET | `/api/agents/:name/config` | Get agent configuration |
| PUT | `/api/agents/:name/pause` | Pause an agent |
| PUT | `/api/agents/:name/resume` | Resume a paused agent |
| GET | `/api/agents/:name/status` | Get agent status and observables |
| POST | `/api/agents/:name/chat` | Send a message to an agent |
| GET | `/api/agents/:name/sse` | SSE stream for real-time agent events |
| GET | `/api/agents/:name/export` | Export agent configuration as JSON |
| POST | `/api/agents/import` | Import an agent from JSON |
| GET | `/api/agents/config/metadata` | Get dynamic config form metadata |

### Skills

| Method | Path | Description |
|--------|------|-------------|
| GET | `/api/agents/skills` | List all skills |
| POST | `/api/agents/skills` | Create a new skill |
| GET | `/api/agents/skills/:name` | Get a skill |
| PUT | `/api/agents/skills/:name` | Update a skill |
| DELETE | `/api/agents/skills/:name` | Delete a skill |
| GET | `/api/agents/skills/search` | Search skills |
| GET | `/api/agents/skills/export/*` | Export a skill |
| POST | `/api/agents/skills/import` | Import a skill |

### Collections (Knowledge Base)

| Method | Path | Description |
|--------|------|-------------|
| GET | `/api/agents/collections` | List collections |
| POST | `/api/agents/collections` | Create a collection |
| POST | `/api/agents/collections/:name/upload` | Upload a document |
| GET | `/api/agents/collections/:name/entries` | List entries |
| POST | `/api/agents/collections/:name/search` | Search a collection |
| POST | `/api/agents/collections/:name/reset` | Reset a collection |

### Actions

| Method | Path | Description |
|--------|------|-------------|
| GET | `/api/agents/actions` | List available actions |
| POST | `/api/agents/actions/:name/definition` | Get action definition |
| POST | `/api/agents/actions/:name/run` | Execute an action |

## Using Agents via the Responses API

Agents can be used programmatically via the standard `/v1/responses` endpoint (OpenAI Responses API). Simply use the agent's name as the `model` field:

```bash
curl -X POST http://localhost:8080/v1/responses \
  -H "Content-Type: application/json" \
  -d '{
    "model": "my-agent",
    "input": "What is the weather today?"
  }'
```

This returns a standard Responses API response:

```json
{
  "id": "resp_...",
  "object": "response",
  "status": "completed",
  "model": "my-agent",
  "output": [
    {
      "type": "message",
      "role": "assistant",
      "content": [
        {
          "type": "output_text",
          "text": "The agent's response..."
        }
      ]
    }
  ]
}
```
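
A small helper (a sketch, not part of any official SDK) for pulling the assistant text out of a payload shaped like the one above; real responses may also carry other output item types, which this skips:

```python
# Extract concatenated assistant text from a Responses API payload.
# Only "message" items with "output_text" content parts are considered.

def output_text(response: dict) -> str:
    parts = []
    for item in response.get("output", []):
        if item.get("type") != "message":
            continue  # skip non-message items (e.g. tool calls)
        for content in item.get("content", []):
            if content.get("type") == "output_text":
                parts.append(content.get("text", ""))
    return "".join(parts)
```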

You can also send structured message arrays as input:

```bash
curl -X POST http://localhost:8080/v1/responses \
  -H "Content-Type: application/json" \
  -d '{
    "model": "my-agent",
    "input": [
      {"role": "user", "content": "Summarize the latest news about AI"}
    ]
  }'
```

When the model name matches an agent, the request is routed to the agent pool. If no agent matches, it falls through to the normal model-based inference pipeline.

## Chat with SSE Streaming

For real-time streaming responses, use the chat endpoint with SSE:

Send a message to an agent:

```bash
curl -X POST http://localhost:8080/api/agents/my-agent/chat \
  -H "Content-Type: application/json" \
  -d '{"message": "What is the weather today?"}'
```

Listen to real-time events via SSE:

```bash
curl -N http://localhost:8080/api/agents/my-agent/sse
```

The SSE stream emits the following event types:

- `json_message` — agent/user messages
- `json_message_status` — processing status updates (processing / completed)
- `status` — system messages (reasoning steps, action results)
- `json_error` — error notifications
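
When consuming the stream outside a browser (where `EventSource` handles framing for you), a minimal parser for the standard SSE wire format might look like the sketch below. The per-event payload schema is not documented here, so the data is JSON-decoded when possible and otherwise returned raw:

```python
import json

# Minimal SSE wire-format parser (sketch). Each event is a run of
# "event:" / "data:" lines; a blank line dispatches the event.

def parse_sse(stream: str):
    """Yield (event_type, data) tuples from raw SSE text."""
    event, data_lines = "message", []
    for line in stream.splitlines():
        if line.startswith("event:"):
            event = line[len("event:"):].strip()
        elif line.startswith("data:"):
            data_lines.append(line[len("data:"):].strip())
        elif line == "":  # blank line dispatches the accumulated event
            if data_lines:
                raw = "\n".join(data_lines)
                try:
                    data = json.loads(raw)
                except json.JSONDecodeError:
                    data = raw
                yield event, data
            event, data_lines = "message", []
```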

## Architecture

Agents run in-process within LocalAI. By default, each agent calls back into LocalAI's own API (`http://127.0.0.1:<port>/v1/chat/completions`) for LLM inference. This means:

- No external dependencies — everything runs in a single binary
- Agents use the same models loaded in LocalAI
- Per-agent overrides allow pointing individual agents to external providers
- Agent state is persisted to disk and restored on restart

```text
User → POST /api/agents/:name/chat → LocalAI
  → AgentPool → Agent reasoning loop
    → POST /v1/chat/completions (self-referencing)
      → LocalAI model inference → response
        → SSE events → GET /api/agents/:name/sse → UI
```