Debugging and Rebuilding Backends
When a backend fails at runtime (e.g. a gRPC method error, a Python import error, or a dependency conflict), use this guide to diagnose, fix, and rebuild.
Architecture Overview
- Source directory: `backend/python/<name>/` (or `backend/go/<name>/`, `backend/cpp/<name>/`)
- Installed directory: `backends/<name>/` — this is what LocalAI actually runs. It is populated by `make backends/<name>`, which builds a Docker image, exports it, and installs it via `local-ai backends install`.
- Virtual environment: `backends/<name>/venv/` — the installed Python venv (for Python backends). The Python binary is at `backends/<name>/venv/bin/python`.
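Concretely, the split looks roughly like this (illustrative layout; only the files mentioned in this guide are shown):

```
backend/python/<name>/        # source: edit here, then rebuild
  backend.py
  install.sh
  requirements*.txt
backends/<name>/              # installed: what LocalAI actually runs
  backend.py
  venv/bin/python
```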
Editing files in `backend/python/<name>/` does not affect the running backend until you rebuild with `make backends/<name>`.
Diagnosing Failures
1. Check the logs
Backend gRPC processes log to LocalAI's stdout/stderr. Look for lines tagged with the backend's model ID:
GRPC stderr id="trl-finetune-127.0.0.1:37335" line="..."
Common error patterns:
- "Method not implemented" — the backend is missing a gRPC method that the Go side calls. The model loader (`pkg/model/initializers.go`) always calls `LoadModel` after `Health`; fine-tuning backends must implement it, even as a no-op stub.
- Python import errors / `AttributeError` — usually a dependency version mismatch (e.g. `pyarrow` removing `PyExtensionType`).
- "failed to load backend" — the gRPC process crashed or never started. Check stderr lines for the traceback.
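As a rough illustration, this triage can be mechanized. The helper below is not part of LocalAI; its patterns are simply the error strings quoted above:

```python
import re

# Illustrative helper (not part of LocalAI): bucket a backend's gRPC
# stderr line into one of the common failure classes described above.
PATTERNS = {
    "missing_method": re.compile(r"Method not implemented"),
    "import_error": re.compile(r"ImportError|ModuleNotFoundError|AttributeError"),
    "load_failure": re.compile(r"failed to load backend"),
}

def classify(line: str) -> str:
    for name, pattern in PATTERNS.items():
        if pattern.search(line):
            return name
    return "unknown"

print(classify('GRPC stderr id="trl-finetune" line="Method not implemented!"'))
# → missing_method
```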
2. Test the Python environment directly
You can run the installed venv's Python to check imports without starting the full server:
backends/<name>/venv/bin/python -c "import datasets; print(datasets.__version__)"
If pip is missing from the venv, bootstrap it:
backends/<name>/venv/bin/python -m ensurepip
Then use backends/<name>/venv/bin/python -m pip install ... to test fixes in the installed venv before committing them to the source requirements.
3. Check upstream dependency constraints
When you hit a dependency conflict, check what the main library expects. For example, TRL's upstream requirements.txt:
https://github.com/huggingface/trl/blob/main/requirements.txt
Pin minimum versions in the backend's requirements files to match upstream.
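A quick way to sanity-check a pin is a minimum-version compare. The sketch below does a naive dotted-version comparison (not a full PEP 440 parse; real resolution should use the `packaging` library), and the version numbers in the example are illustrative:

```python
def meets_minimum(installed: str, minimum: str) -> bool:
    """Naive 'installed >= minimum' check on dotted version strings.
    A sketch only: pre-release tags and local versions are ignored."""
    parse = lambda v: tuple(int(part) for part in v.split(".") if part.isdigit())
    return parse(installed) >= parse(minimum)

# Example: does the installed pyarrow satisfy a hypothetical ">=12.0" pin?
print(meets_minimum("14.0.1", "12.0"))  # → True
print(meets_minimum("11.0.0", "12.0"))  # → False
```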
Common Fixes
Missing gRPC methods
If the Go side calls a method the backend doesn't implement (e.g. LoadModel), add a no-op stub in backend.py:
def LoadModel(self, request, context):
    """No-op — actual loading happens elsewhere."""
    return backend_pb2.Result(success=True, message="OK")
The gRPC contract requires LoadModel to succeed for the model loader to return a usable client, even if the backend doesn't need upfront model loading.
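To see where the stub fits, here is a minimal servicer sketch. The `backend_pb2` stand-in below is hypothetical; in a real backend the module is generated from LocalAI's proto definitions (and the class subclasses the generated servicer base). Only the Health/LoadModel handshake is shown:

```python
from types import SimpleNamespace

# Stand-in for the generated backend_pb2 module (hypothetical).
# A real backend would `import backend_pb2` and use the protoc-generated
# Result message instead.
def _result(success, message):
    return SimpleNamespace(success=success, message=message)

backend_pb2 = SimpleNamespace(Result=_result)

class BackendServicer:
    """Minimal sketch: the methods the Go model loader calls on startup."""

    def Health(self, request, context):
        # Answering Health marks the gRPC process as alive.
        return backend_pb2.Result(success=True, message="OK")

    def LoadModel(self, request, context):
        # No-op: a fine-tuning backend does its loading inside the
        # training call, but LoadModel must still report success for the
        # loader to hand back a usable client.
        return backend_pb2.Result(success=True, message="OK")
```

The gRPC registration (`grpc.server`, adding the servicer, binding the address) is omitted here; existing backends such as `backend/python/transformers/backend.py` show the full wiring.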
Dependency version conflicts
Python backends often break when a transitive dependency releases a breaking change (e.g. pyarrow removing PyExtensionType). Steps:
- Identify the broken import in the logs
- Test in the installed venv: `backends/<name>/venv/bin/python -c "import <module>"`
- Check upstream requirements for version constraints
- Update all requirements files in `backend/python/<name>/`:
  - `requirements.txt` — base deps (grpcio, protobuf)
  - `requirements-cpu.txt` — CPU-specific (includes the PyTorch CPU wheel index)
  - `requirements-cublas12.txt` — CUDA 12
  - `requirements-cublas13.txt` — CUDA 13
- Rebuild:
make backends/<name>
PyTorch index conflicts (uv resolver)
The Docker build uses uv for pip installs. When --extra-index-url points to the PyTorch wheel index, uv may refuse to fetch packages like requests from PyPI if it finds a different version on the PyTorch index first. Fix this by adding --index-strategy=unsafe-first-match to install.sh:
EXTRA_PIP_INSTALL_FLAGS+=" --upgrade --index-strategy=unsafe-first-match"
installRequirements
Most Python backends already do this — check backend/python/transformers/install.sh or similar for reference.
Rebuilding
Rebuild a single backend
make backends/<name>
This runs the Docker build (Dockerfile.python), exports the image to backend-images/<name>.tar, and installs it into backends/<name>/. It also rebuilds the local-ai Go binary (without extra tags).
Important: If you were previously running with GO_TAGS=auth, the make backends/<name> step will overwrite your binary without that tag. Rebuild the Go binary afterward:
GO_TAGS=auth make build
Rebuild and restart
After rebuilding a backend, you must restart LocalAI for it to pick up the new backend files. The backend gRPC process is spawned on demand when the model is first loaded.
# Kill existing process
kill <pid>
# Restart
./local-ai run --debug [your flags]
Quick iteration (skip Docker rebuild)
For fast iteration on a Python backend's backend.py without a full Docker rebuild, you can edit the installed copy directly:
# Edit the installed copy
vim backends/<name>/backend.py
# Restart LocalAI to respawn the gRPC process
This is useful for testing but does not persist — the next make backends/<name> will overwrite it. Always commit fixes to the source in backend/python/<name>/.
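To catch that drift before a rebuild discards it, you can diff or hash-compare the installed copy against the source. The helper below is illustrative, not part of LocalAI:

```python
import hashlib
from pathlib import Path

def same_contents(a: str, b: str) -> bool:
    """True if two files are byte-identical (SHA-256 compare).
    Illustrative helper for spotting quick edits made to an installed
    backends/<name>/backend.py that were never copied back to the
    source tree."""
    digest = lambda p: hashlib.sha256(Path(p).read_bytes()).hexdigest()
    return digest(a) == digest(b)
```

For example, a `False` from `same_contents("backend/python/trl/backend.py", "backends/trl/backend.py")` (paths illustrative) means the installed copy carries changes not yet committed to the source.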
Verification
After fixing and rebuilding:
- Start LocalAI and confirm the backend registers: look for `Registering backend name="<name>"` in the logs
- Trigger the operation that failed (e.g. start a fine-tuning job)
- Watch the GRPC stderr/stdout lines for the backend's model ID
- Confirm no errors in the traceback