wizarr/Dockerfile
Matthieu B 2283c4de68 fix: prevent startup race condition during migrations
This fixes a critical issue where Gunicorn workers would fail to start
after upgrading to v2025.11.0, causing containers to show as unhealthy
with only the uv wrapper process running and no actual workers.

Root Cause:
-----------
In v2025.11.0, library scanning and session recovery were added to the
create_app() function, which runs during EVERY app creation including:
1. During 'flask db upgrade' (migrations)
2. During Gunicorn master when_ready() hook
3. During each Gunicorn worker spawn

The migration 20251103_properly_fix_foreign_keys recreates 4 database
tables with CASCADE foreign keys using raw SQL. This holds exclusive
database locks during table recreation.

When library scanning and session recovery try to query these tables
during migration, they hit database locks, creating a race condition
that causes workers to time out and crash during startup.
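
The lock contention described above can be reproduced in isolation. The following is a minimal sketch, assuming a SQLite database in its default rollback-journal mode (the commit message does not name the database engine). The table name `library` and the function name are hypothetical, and `BEGIN EXCLUSIVE` stands in for the lock held while the migration recreates tables.

```python
import os
import sqlite3
import tempfile


def read_blocked_by_exclusive_lock() -> bool:
    """Return True if a read fails while another connection holds an exclusive lock."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "demo.db")

        # "Migration" connection: autocommit mode, so BEGIN/ROLLBACK are explicit.
        migrator = sqlite3.connect(path, isolation_level=None)
        migrator.execute("CREATE TABLE library (id INTEGER PRIMARY KEY)")
        migrator.execute("BEGIN EXCLUSIVE")  # hold the lock, as a table rebuild would

        # "Startup task" connection: timeout=0 fails at once instead of waiting.
        reader = sqlite3.connect(path, timeout=0)
        try:
            reader.execute("SELECT COUNT(*) FROM library").fetchone()
            blocked = False
        except sqlite3.OperationalError:
            # SQLite reports "database is locked" here.
            blocked = True
        finally:
            reader.close()
            migrator.execute("ROLLBACK")
            migrator.close()
        return blocked
```

With a non-zero timeout the reader would wait rather than fail immediately, which is consistent with the slow startups and worker timeouts described in this commit message.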

Fix:
----
- Skip library scanning during migrations (FLASK_SKIP_SCHEDULER=true)
- Skip activity monitoring/session recovery during migrations
- Make Gunicorn log level configurable (GUNICORN_LOG_LEVEL env var)
- Add worker lifecycle hooks for better crash debugging
- Increase healthcheck start period from 10s to 60s
- Increase Gunicorn worker timeout from 30s to 120s
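
The first two items in the list above amount to an environment-flag guard around the startup tasks. A minimal sketch of that pattern follows. The helper names and the set of accepted truthy strings are assumptions for illustration, not Wizarr's actual code; only the `FLASK_SKIP_SCHEDULER` variable name comes from this commit.

```python
import os

# Assumed set of truthy spellings; the commit only shows "true".
TRUTHY = {"1", "true", "yes", "on"}


def env_flag(name: str) -> bool:
    """Return True when the named environment variable holds a truthy value."""
    return os.environ.get(name, "").strip().lower() in TRUTHY


def startup_tasks_to_run() -> list[str]:
    """Hypothetical helper: list the background tasks create_app() would start."""
    if env_flag("FLASK_SKIP_SCHEDULER"):
        # Migration run: start nothing that queries tables being recreated.
        return []
    return ["library_scan", "session_recovery"]
```

An entrypoint script could then export `FLASK_SKIP_SCHEDULER=true` only for the `flask db upgrade` step and leave it unset when launching Gunicorn.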

Testing:
--------
- Verified app starts successfully with FLASK_SKIP_SCHEDULER=true
- Verified library scanning runs normally without the flag
- Confirmed 0.38s startup during migrations vs 1.61s normal startup

Closes #976
2025-11-03 20:41:52 +01:00


# ─── Stage 1: Dependencies ───────────────────────────────────────────────
FROM ghcr.io/astral-sh/uv:python3.13-alpine AS deps
# Install system dependencies
RUN apk add --no-cache nodejs npm
# Set working directory
WORKDIR /app
# Enable bytecode compilation for faster startup
ENV UV_COMPILE_BYTECODE=1
# Use copy link mode to avoid warnings with cache mounts
ENV UV_LINK_MODE=copy
# Copy dependency files first for better caching
COPY pyproject.toml uv.lock ./
# Install Python dependencies only (not project) with cache mount for speed
# Use --frozen to ensure reproducible builds from uv.lock
RUN --mount=type=cache,target=/root/.cache/uv \
    uv sync --frozen --no-install-project --no-dev
# Copy npm dependency files and install with cache
COPY app/static/package*.json ./app/static/
RUN npm --prefix app/static/ ci --cache /tmp/npm-cache
# ─── Stage 2: Build assets ────────────────────────────────────────────────
FROM deps AS builder
# Copy source files needed for building
COPY app/ ./app/
COPY babel.cfg ./
# Install the project now that we have source code
# Use --frozen to ensure reproducible builds from uv.lock
RUN --mount=type=cache,target=/root/.cache/uv \
    uv sync --frozen --no-dev
# Build translations (include fuzzy entries so pending translations are bundled)
RUN uv run --frozen --no-dev pybabel compile --use-fuzzy -d app/translations
# Ensure static directories exist and build static assets
RUN mkdir -p app/static/js app/static/css && npm --prefix app/static/ run build
# ─── Stage 3: Runtime ─────────────────────────────────────────────────────
FROM ghcr.io/astral-sh/uv:python3.13-alpine
# Set default environment variables for user/group IDs
ENV PUID=1000
ENV PGID=1000
# Install runtime dependencies only
RUN apk add --no-cache curl tzdata su-exec
# Set application working directory
WORKDIR /app
# Copy Python environment from builder stage (includes project)
COPY --chown=1000:1000 --from=builder /app/.venv /app/.venv
# Copy source files first (run.py, gunicorn.conf.py, migrations/, etc.)
COPY --chown=1000:1000 . /app
# Then overwrite app/ with built version (has compiled translations)
COPY --chown=1000:1000 --from=builder /app/app /app/app
# Create data directory for database (backward compatibility)
RUN mkdir -p /data/database
# Create wizard steps config directory
RUN mkdir -p /etc/wizarr/wizard_steps
# Create directories that need to be writable
RUN mkdir -p /.cache
ARG APP_VERSION=dev
ENV APP_VERSION=${APP_VERSION}
# Set Flask environment to production
ENV FLASK_ENV=production
# Healthcheck: curl to localhost:5690/health
# Increased start-period to 60s to account for:
# - Database migrations
# - Library scanning
# - Wizard step imports
# - Worker initialization (4 workers * ~10s each)
HEALTHCHECK --interval=30s --timeout=5s --start-period=60s --retries=3 \
    CMD curl -fs http://localhost:5690/health || exit 1
# Expose port 5690
EXPOSE 5690
# Copy the bundled default wizard steps into /opt/default_wizard_steps
COPY wizard_steps /opt/default_wizard_steps
# Copy entrypoint script and make it executable
COPY docker-entrypoint.sh /usr/local/bin/
RUN chmod +x /usr/local/bin/docker-entrypoint.sh
# Entrypoint and default CMD
ENTRYPOINT ["docker-entrypoint.sh"]
# By default we run Gunicorn under wizarruser
CMD ["uv", "run", "--frozen", "--no-dev", "gunicorn", \
     "--config", "gunicorn.conf.py", \
     "--bind", "0.0.0.0:5690", \
     "--umask", "007", \
     "run:app"]