Anthias/docker-compose.dev.yml
Viktor Petersson 43b937563b fix(sentry): stop reporting transient redis blips and client disconnects (#3018)
* fix(sentry): stop reporting transient redis blips and client disconnects

- Redis restarting (container recycle, compose startup before DNS
  resolves) produced an error event per process per blip even though
  every consumer self-heals: celery reconnects with backoff, the
  viewer's resolution reporter retries next tick, Channels
  re-establishes on the next frame (Sentry ANTHIAS-M, ANTHIAS-K,
  ANTHIAS-H, ANTHIAS-J)
- Add a before_send hook that drops events whose exception chain
  contains redis.exceptions.ConnectionError or asyncio.CancelledError
  (an HTTP client hanging up mid-request under ASGI — ANTHIAS-N)
- Silence celery's per-reconnect-attempt ERROR log at the logger
  (it arrives as a log message, not an exception)
- Downgrade the viewer reporter's redis-down log to a warning and
  extract the tick body into a testable helper
- Add regression tests for the filter and the reporter tick
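
A minimal sketch of the kind of filter the bullets above describe, not the code from this PR. It relies only on the documented `before_send(event, hint)` contract, where returning `None` drops the event and `hint["exc_info"]` holds the `(type, value, traceback)` triple. `TransientRedisError` is a hypothetical stand-in for `redis.exceptions.ConnectionError` so the example runs without third-party packages, and `iter_chain` and `IGNORED_TYPES` are illustrative names. The chain walk honors `__suppress_context__`, which a later commit in this series calls out.

```python
import asyncio
from typing import Any, Iterator, Optional


class TransientRedisError(Exception):
    """Hypothetical stand-in for redis.exceptions.ConnectionError."""


# Exception types treated as expected, self-healing noise (assumed list).
IGNORED_TYPES = (TransientRedisError, asyncio.CancelledError)


def iter_chain(exc: Optional[BaseException]) -> Iterator[BaseException]:
    """Yield exc and the exceptions it was raised from, guarding against cycles."""
    seen = set()
    while exc is not None and id(exc) not in seen:
        seen.add(id(exc))
        yield exc
        if exc.__cause__ is not None:
            # Explicit `raise ... from cause`.
            exc = exc.__cause__
        elif not exc.__suppress_context__:
            # Implicit chaining (raised while handling another exception).
            exc = exc.__context__
        else:
            # `raise ... from None` hides the context, so stop here.
            exc = None


def before_send(event: dict, hint: dict) -> Optional[dict]:
    """Return None to drop the event, or the event unchanged to send it."""
    exc_info: Any = hint.get("exc_info")
    if exc_info is not None:
        if any(isinstance(e, IGNORED_TYPES) for e in iter_chain(exc_info[1])):
            return None
    return event
```

In a real setup the function would be passed as `sentry_sdk.init(before_send=before_send)`. Log-message events carry no `exc_info`, which is why the per-reconnect ERROR log has to be silenced at the logger instead.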

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

* fix(sentry): address review — typed before_send, cleaner test fixtures

- Annotate the hook with sentry_sdk.types Event/Hint for strict mypy
- Build exc_info triples directly in tests instead of catching
  BaseException (Sonar S5754) and compare events by equality
  (Sonar S5796)
- Use record.getMessage() in the caplog assertion (Copilot)

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

* fix(tests): address review — make the ignored-logger test order-independent

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

* fix(tests): address review — lift the module-wide logging disable for caplog tests

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

* fix(sentry): silence celery beat's reconnect-retry log too

- The embedded beat scheduler logs every broker reconnect attempt at
  ERROR ("beat: Connection error ... Trying again"), the same
  expected-transient noise as the consumer logger (Sentry ANTHIAS-P)

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

* fix(sentry): address review — respect __suppress_context__ in the chain walk

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

* fix(compose): healthcheck redis and gate services on it answering PING

- depends_on with bare service_started only orders container
  creation; uvicorn/celery/viewer could still race a redis that
  hadn't finished loading its RDB, producing the startup
  connection-refused noise (review feedback on this PR)
- Add a redis-cli ping healthcheck to the prod template, dev, and
  test composes, and gate anthias-server / anthias-viewer /
  anthias-celery on service_healthy
- compose-only: the balena supervisor doesn't support depends_on
  conditions, and a redis container recycling mid-life is gated by
  nothing — so the Sentry-side handling of transient redis errors
  stays
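
For illustration only, the check that `redis-cli ping` performs can be approximated in a few lines of standard-library Python. This is a sketch under the assumption that the server accepts an inline `PING` and answers with the simple string `+PONG`. The function name `redis_answers_ping` and its defaults are hypothetical and are not part of the compose files.

```python
import socket


def redis_answers_ping(host: str, port: int = 6379, timeout: float = 3.0) -> bool:
    """Return True only if the server at host:port replies +PONG to PING."""
    try:
        with socket.create_connection((host, port), timeout=timeout) as sock:
            # Inline command form; a full client would use RESP arrays.
            sock.sendall(b"PING\r\n")
            reply = sock.recv(64)
    except OSError:
        # Connection refused, DNS failure, or timeout: not ready yet.
        return False
    # Any other reply (for example an error string) counts as not ready.
    return reply.startswith(b"+PONG")
```

A started container passes `service_started` as soon as it exists, whereas a probe like this only succeeds once the server actually answers, which is the distinction the `service_healthy` gate relies on.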

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-07 15:51:54 +02:00


# vim: ft=yaml.docker-compose
services:
  anthias-server:
    # Explicit image tag so anthias-celery below can reference the same
    # built image without a duplicate `build:` block (which would
    # produce a separate, byte-identical-but-distinct image tag).
    image: anthias-server:dev
    build:
      context: .
      dockerfile: docker/Dockerfile.server
    ports:
      - 8000:8080
    environment:
      - HOME=/data
      - LISTEN=0.0.0.0
      - CELERY_BROKER_URL=redis://redis:6379/0
      - CELERY_RESULT_BACKEND=redis://redis:6379/0
      - ENVIRONMENT=development
    depends_on:
      redis:
        condition: service_healthy
    restart: always
    volumes:
      - anthias-data:/data
      - ./:/usr/src/app/
  anthias-celery:
    # Reuses anthias-server:dev via the explicit image tag above.
    # Compose builds anthias-server first (it owns the build:) and
    # this service inherits the same image, only overriding CMD.
    image: anthias-server:dev
    depends_on:
      anthias-server:
        condition: service_started
      redis:
        condition: service_healthy
    command: >
      nice -n 19 ionice -c 3
      celery -A anthias_server.celery_tasks.celery worker -B -n worker@anthias
      --loglevel=info --scheduler celery.beat.Scheduler
    environment:
      - HOME=/data
      - CELERY_BROKER_URL=redis://redis:6379/0
      - CELERY_RESULT_BACKEND=redis://redis:6379/0
      - ENVIRONMENT=development
    restart: always
    volumes:
      - anthias-data:/data
      - ./:/usr/src/app/
  redis:
    platform: "linux/amd64"
    image: mirror.gcr.io/library/redis:alpine
    healthcheck:
      test: ["CMD", "redis-cli", "ping"]
      interval: 5s
      timeout: 3s
      retries: 12
      start_period: 10s
volumes:
  anthias-data:
  redis-data: