Files
Anthias/docker-compose.dev.yml
Viktor Petersson f421130b24 refactor(server): collapse nginx + websocket containers into uvicorn (#2757)
* refactor(server): collapse nginx + websocket containers into uvicorn

Replace the nginx + gunicorn + gevent-websocket trio with a single
uvicorn ASGI server inside `anthias-server`:

* HTTP, /static/, /anthias_assets/, /static_with_mime/, and /hotspot
  are now served from Django (WhiteNoise + small file-serving views in
  `anthias_app/views_files.py` that re-implement nginx's IP allowlists).
* WebSockets move from a separate gevent process talking ZMQ to Django
  Channels with a Redis-backed channel layer, fanned out by celery via
  `channel_layer.group_send`.
* TLS termination is handled by uvicorn directly when SSL_CERTFILE /
  SSL_KEYFILE are set; `bin/enable_ssl.sh` now writes a compose
  override (no longer ansible) and a companion `bin/disable_ssl.sh`
  removes it. Cert + key live under `~/.anthias/ssl/`.
* `bin/upgrade_containers.sh` removes the legacy `anthias-nginx` and
  `anthias-websocket` containers on upgrade so they don't linger.
* Drop `gunicorn`, `gevent`, `gevent-websocket`, and the `websocket`
  uv group from `pyproject.toml`; add `channels`, `channels-redis`,
  `daphne`, `uvicorn[standard]`, and `whitenoise`.

Notes on hardening: `--forwarded-allow-ips` defaults to off so the IP
allowlist can't be bypassed via a spoofed `X-Forwarded-For`; operators
behind a reverse proxy can opt in via the `FORWARDED_ALLOW_IPS` env
var. Backup uploads previously sized by nginx's `client_max_body_size
4G` are preserved by setting `DATA_UPLOAD_MAX_MEMORY_SIZE = None`.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix: address review feedback on uvicorn migration

* Drop USE_X_FORWARDED_HOST (inconsistent with the deliberate
  --forwarded-allow-ips hardening; without a proxy, X-Forwarded-Host is
  client-controlled).
* Remove daphne — uvicorn runs production and the test environment now
  uses it too (bin/prepare_test_environment.sh).
* Replace _safe_join's parents-membership check with Path.is_relative_to.
* Drop AllowedHostsOriginValidator wrapper (no-op under ALLOWED_HOSTS=['*'])
  and document where to put it back if hosts are ever locked down.
* Rename DOCKER_CIDR → DOCKER_BRIDGE_CIDR with a comment that this is
  defense-in-depth, not a real perimeter (LAN clients via the published
  port also appear in 172.16/12).
* Add anthias_app/tests.py covering the IP allowlists, mime override,
  hotspot gating, and traversal/symlink rejection in _safe_join (17 tests).
* Note the single-worker ZmqPublisher bind constraint in start_server.sh
  so a future scale-up doesn't EADDRINUSE on tcp://0.0.0.0:10001.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix(security): clear SonarCloud hotspots on uvicorn migration

* Restrict views_files.anthias_assets / static_with_mime / hotspot to
  GET via @require_GET (Sonar S3752, x3): they are read-only file
  servers and should reject other methods at the view boundary.
* Mark RFC1918 / Docker-bridge CIDR literals as NOSONAR S1313 (x4):
  they are intentional, well-known private network ranges.
* Mark `http://*` in CSRF_TRUSTED_ORIGINS as NOSONAR S5332 with a
  comment explaining devices ship over HTTP and operators opt into TLS
  via bin/enable_ssl.sh.

Existing 17 view tests continue to pass.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix: clear remaining static-analysis findings

* ruff format -- the previous tests.py reformatted itself; CI's
  `ruff format --check` now passes.
* CodeQL py/path-injection on _safe_join: rewrite using
  os.path.realpath + os.path.commonpath, which CodeQL recognises as a
  sanitiser for path-injection sinks. Behaviour is identical to the
  Path.is_relative_to version (both reject `..` and symlink escapes;
  the 17 tests in anthias_app/tests.py still pass).
* SonarCloud NOSONAR markers: switch to the codebase's bare `# NOSONAR`
  form (matches host_agent.py and tests/test_backup_helper.py); the
  earlier `# NOSONAR <rule>` form was not being honoured.
* Centralise the test-fixture IPs in module-level constants so S1313
  is suppressed in one place rather than at every callsite.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix(security): inline path-injection check in views

CodeQL only treats os.path.commonpath as a sanitiser when the check
sits in the same function as the file-system sink — calling
_safe_join() from a separate function still leaves the open()/isfile()
sinks tainted (4 alerts on PR #2757).

Repeat the realpath + commonpath check inline in anthias_assets and
static_with_mime so CodeQL can prove the post-check path stays under
the configured root. _safe_join is kept for the SafeJoinTest unit
tests and as a documented helper.

Existing 17 tests in anthias_app/tests.py continue to pass.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix(security): use realpath+startswith path sanitiser for CodeQL

CodeQL's path-injection model recognises the canonical
`realpath(...).startswith(base + sep)` pattern but apparently not
`os.path.commonpath(...) == root` in this codepath. Switch the inline
check in anthias_assets and static_with_mime to startswith so the
analyser can prove the post-check path stays under the configured
root.

Behaviour is identical: traversal and symlink-escape still 404
(verified by SafeJoinTest + view tests).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix: address Copilot review feedback

* lib/utils.py imported channels/asgiref at module level. The viewer
  container imports lib.utils via viewer/__init__.py but its uv
  dependency group does not ship channels, so the viewer would
  ImportError on startup. Move the channels imports into
  YoutubeDownloadThread.run() (server/celery-only path) so lib.utils
  remains importable from the viewer.
* Drop the unused _safe_join() helper and its three SafeJoinTest
  cases — the views inline a realpath+startswith sanitiser (CodeQL
  needs the check in the same function as the sink), and the helper
  was only being exercised in isolation. Add an equivalent
  symlink-escape test against anthias_assets so the actual code path
  used by the views is covered.
* Refresh the anthias_django/settings.py docstring + Django doc URLs
  from /3.2/ → /4.2/ to match the pinned Django version.

15 view tests pass (was 17 — lost 3 SafeJoinTest + gained 1 symlink
test against the real view).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* docs: refresh architecture diagram for uvicorn migration

Drop the anthias-nginx and anthias-websocket nodes (and their edges)
from docs/d2/anthias-diagram-overview.d2 — the user now talks
directly to anthias-server (uvicorn handling HTTP + /ws), Celery
fans out asset-update events through the Redis-backed Channels
layer, and the viewer fetches media from anthias-server over HTTP.

Regenerate the SVG with d2 v0.7.1.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix: address Copilot SSL + CSRF / WS-origin feedback

* Dual uvicorn listeners when SSL is enabled (Copilot #1, #2). HTTP on
  $HTTP_PORT (default 8080) for inter-container traffic — viewer +
  webview hit anthias-server over plain HTTP on the Docker network and
  cannot validate uvicorn's self-signed cert. HTTPS on $HTTPS_PORT
  (default 8443) for external clients. bin/enable_ssl.sh now appends
  443:8443 to the compose ports list (instead of using `!override` to
  swap 80:8080 for 443:8080), so port 80 stays available for backward
  compatibility and the Docker-network HTTP port keeps working.
* Drop CSRF_TRUSTED_ORIGINS = ['http://*', 'https://*'] (Copilot #3).
  Verified via Django shell: those leading wildcards are ignored by
  Django 4.2 (only subdomain wildcards like https://*.example.com are
  honoured), so the setting was a no-op. Same-origin POSTs still pass
  through Django's built-in Origin/Host check.
* Re-add channels.security.websocket.AllowedHostsOriginValidator to
  the WebSocket router (Copilot #5). Currently a no-op under
  ALLOWED_HOSTS=['*'], but tightening ALLOWED_HOSTS later will now
  also tighten /ws.

Smoke test (dev + SSL override):
- HTTP  http://localhost:8000/      -> 200
- HTTPS https://localhost:8443/     -> 200
- HTTP  http://localhost:8443/      -> 000 (TLS-only, expected)
- internal http://localhost:8080/   -> 200
- 15 view tests still pass.

Note: Copilot #4 (Docker-bridge CIDR is bypassable via the published
port) is documented in views_files.py as defense-in-depth and matches
the original nginx posture; switching to app-layer auth is out of
scope for this PR.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* refactor(ssl): switch from in-uvicorn TLS to a Caddy sidecar

The previous SSL implementation gave anthias-server two uvicorn
listeners (HTTP + HTTPS) so the viewer/webview could keep talking
plain HTTP over the Docker network while external clients got TLS.
That dual-listener dance is non-zero overhead and complicates signal
handling. Switch to the standard reverse-proxy pattern instead.

When SSL is enabled by bin/enable_ssl.sh:

* anthias-server stays a single uvicorn listener on plain HTTP 8080
  (no SSL_CERTFILE/SSL_KEYFILE knobs, no dual-port logic).
* A Caddy sidecar (caddy:2-alpine, only present when the override is
  installed) terminates TLS on host port 443, redirects 80→443, and
  reverse-proxies to anthias-server:8080 — so X-Forwarded-Proto /
  X-Forwarded-For are forwarded as-is by Caddy.
* The override removes anthias-server's external port mapping
  (`ports: !override []`), so all external traffic must enter through
  Caddy and the IP allowlists in views_files.py see the original LAN
  client IP rather than the docker-bridge gateway. Inter-container
  traffic is unchanged.
* `FORWARDED_ALLOW_IPS=*` is set on anthias-server in the override —
  safe because anthias-server is no longer reachable from outside the
  Docker network — and `SECURE_PROXY_SSL_HEADER` is added in Django
  settings so request.is_secure() returns True for HTTPS callers.
* When SSL is *not* enabled there is zero new container, zero new
  config — the base compose file is untouched and Caddy isn't pulled
  or run.

bin/disable_ssl.sh now also removes the anthias-caddy container
before deleting the override, so HTTPS-only state is fully reversed.

Smoke-tested with a temporary Caddy override:
- HTTPS via Caddy:        200
- HTTP via Caddy:         301 → https://...
- Direct anthias-server:  refused (port mapping dropped by override)
- WebSocket upgrade:      101 Switching Protocols
- request.is_secure() with X-Forwarded-Proto=https: True
- 15 anthias_app view tests still pass.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* docs(views_files): document IP-allowlist threat model

Spell out exactly when the docker-bridge CIDR check is and isn't a
real perimeter:

* No-SSL default: anthias-server is published as 80:8080, so requests
  arrive with REMOTE_ADDR set to the docker bridge gateway (172.x) and
  LAN clients aren't actually excluded. Trying to plug the gap with
  auth would be security theatre — credentials would travel in
  plaintext over the LAN anyway.
* SSL via the Caddy sidecar: Caddy terminates TLS, rewrites
  X-Forwarded-For, uvicorn honours it (FORWARDED_ALLOW_IPS=*), and the
  check sees the real client IP — so the bypass is closed for any
  deployment that actually cares about confidentiality.

This is documentation only; no behavioural change.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* feat(ssl): add --domain (auto Let's Encrypt) + drop openssl shim

bin/enable_ssl.sh now has three modes instead of two:

* Default (no args) — Caddy issues per-SNI certs lazily from its
  built-in local CA via `tls internal { on_demand }`. Drops the
  openssl self-signed-cert generation step entirely; Caddy persists
  the CA in the anthias-caddy-data volume and rotates leaf certs
  itself. Browsers still warn (CA is local) but no openssl/cert
  hygiene is needed on the host.

* `--domain example.com [--email you@example.com] [--staging]` —
  Caddy auto-issues + renews from Let's Encrypt. Caddy auto-creates
  the HTTP→HTTPS redirect for hostname sites. Use `--staging` to point
  at the ACME staging endpoint while testing, so the production rate
  limits aren't burned.

* `--cert /path/to/cert.pem --key /path/to/key.pem [--domain ...]` —
  unchanged: bring your own cert, Caddy serves it as-is with
  `auto_https off`.

Verified:
- All three Caddyfiles pass `caddy validate`.
- Default mode end-to-end: HTTPS=200 with cert from "Caddy Local
  Authority - ECC Intermediate", per-SNI SANs (DNS:localhost,
  IP Address:192.168.99.99 etc.), HTTP→HTTPS=301, /ws upgrade=101,
  anthias-server's external port mapping is dropped so direct access
  is refused.

Docs (CLAUDE.md, docs/README.md, docs/developer-documentation.md)
updated to describe the Caddy sidecar instead of in-uvicorn TLS.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix: address self-review findings on PR #2757

* Gate SECURE_PROXY_SSL_HEADER on FORWARDED_ALLOW_IPS
  (anthias_django/settings.py): without the gate, a client on a
  plain-HTTP deploy could send `X-Forwarded-Proto: https` and flip
  `request.is_secure()`. Django reads the header from META directly,
  independent of uvicorn's --proxy-headers flag, so the previous
  unconditional setting was actually exploitable in non-SSL mode
  (secure-cookied sessions would drop on the next plain-HTTP request,
  redirects would point at https:// URLs that don't exist).

  Verified live: non-SSL → SECURE_PROXY_SSL_HEADER is None and
  is_secure() with spoofed XFP=https returns False; SSL via Caddy
  override → header is set and is_secure() returns True.

* Replace the isfile() pre-check + open() in anthias_assets and
  static_with_mime with a try/except FileNotFoundError around open()
  (anthias_app/views_files.py). Eliminates a (tiny but real) TOCTOU
  window between the stat and the open. IsADirectoryError handled
  too, since `realpath('/dir/')` resolves to the directory and open()
  would otherwise 500.

* Comment FORWARDED_ALLOW_IPS=* assumption in bin/enable_ssl.sh: the
  wildcard is only safe because the override drops anthias-server's
  external port mapping, so any future edit that re-adds a host:port
  publication has to either tighten the wildcard to Caddy's IP/CIDR
  or unset it.

* Replace ANSI-C escape sequences in the Caddyfile generator with
  plain multi-line strings. `read -r -d ''` was the first attempt
  but it strips trailing newlines, which collapsed `auto_https off`
  onto the same line as `}` in cert mode. Multi-line literals with
  echo "$VAR" are unambiguous and Caddy validates all three modes
  cleanly again.

* Add a docker-volume cleanup hint to bin/disable_ssl.sh: Caddy's
  local CA persists in anthias_anthias-caddy-data so an enable →
  disable → enable cycle reuses the same CA (intentional — browsers
  that trusted it stay trusted), and operators who want a fresh CA
  now have the exact `docker volume rm` command in the script's
  output.

15 view tests still pass; default + SSL Caddyfiles still validate;
default + SSL endpoints still return 200 / 301 / 101 in smoke tests.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix: address Copilot's host/MIME hardening feedback

Two security tightenings on top of the prior SECURE_PROXY_SSL_HEADER
gate (which Copilot flagged on a stale snapshot — that one's already
fixed in 07b784b9):

* `ALLOWED_HOSTS` is now driven by the `ALLOWED_HOSTS` env var, with
  `*` kept as the default so flexible LAN-by-IP / mDNS access still
  works out of the box. Operators on hardened LANs can opt into a
  strict allowlist (`ALLOWED_HOSTS=192.168.1.50,anthias.local,...`)
  to defend against DNS-rebinding without us guessing the right set
  of hostnames at install time. Verified the env override parses to
  `['192.168.1.50', 'anthias.local', 'localhost']`.

* `static_with_mime` now allowlists the `?mime=` query param against
  a small set of download-only types
  (`application/{gzip,octet-stream,x-gzip,x-tar,x-tgz,zip}`) instead
  of accepting whatever the caller sends. Closes the XSS footgun
  where `?mime=text/html` would have served a stored file as HTML.
  The frontend's only legitimate caller (the backup download) sends
  `application/x-tgz`, which is in the allowlist; anything else
  falls back to mimetypes.guess_type. Added
  `test_mime_override_rejects_html` to lock that behaviour in.

16 view tests pass; ruff clean.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-27 12:51:40 +01:00

45 lines
906 B
YAML

# vim: ft=yaml.docker-compose
services:
anthias-server:
build:
context: .
dockerfile: docker/Dockerfile.server
ports:
- 8000:8080
environment:
- HOME=/data
- LISTEN=0.0.0.0
- CELERY_BROKER_URL=redis://redis:6379/0
- CELERY_RESULT_BACKEND=redis://redis:6379/0
- ENVIRONMENT=development
depends_on:
- redis
restart: always
volumes:
- anthias-data:/data
- ./:/usr/src/app/
anthias-celery:
build:
context: .
dockerfile: docker/Dockerfile.celery
depends_on:
- anthias-server
- redis
environment:
- HOME=/data
- CELERY_BROKER_URL=redis://redis:6379/0
- CELERY_RESULT_BACKEND=redis://redis:6379/0
restart: always
volumes:
- anthias-data:/data
redis:
platform: "linux/amd64"
image: redis:alpine
volumes:
anthias-data:
redis-data: