mirror of
https://github.com/Screenly/Anthias.git
synced 2026-06-10 09:08:09 -04:00
da53e045a6801b5056fb448e68e55da7bf106913
32 Commits
| Author | SHA1 | Message | Date | |
|---|---|---|---|---|
|
|
57b4f25c77 |
feat(viewer,server): per-board HW decode dispatch + codec gate on upload (#2885)
* perf(viewer): pi4-64/pi5 use mpv --vo=gpu --gpu-context=drm
On Pi the connector's preferred mode is usually 4K (most modern TVs report 3840x2160 in their EDID), and the previous --vo=drm path ran a CPU zimg upscale from 1080p source to that 4K output. On a 4-core A72 that's the bottleneck: mpv VO drops 59-75 frames per 30s on a stock 1080p H.264 signage clip. Pi5's A76 is faster but the same upscale path is still the limit.
Switching the VO to GL with the DRM context (mpv --vo=gpu --gpu-context=drm) hands the upscale to the V3D and leaves everything else identical: mpv still owns DRM master, still reads --drm-mode=1920x1080@60 (kept), still runs in --vd-lavc-threads=4 software decode (mpv 0.40 in Debian Trixie has v4l2m2m-copy but not v4l2request, so --hwdec=auto-safe falls back to software on this asset; that hasn't changed).
Measured on a 4K-connected Pi4-64 Rev 1.5, same clip, same 30 s window:
--vo=drm : 59-75 vo drops / 30 s
--vo=gpu --gpu-context=drm (this patch) : 3-6 vo drops / 30 s
`decoder-frame-drop-count` is 0 in both; the regression was purely on the VO side, and shifting scaling off the CPU is what buys the headroom.
x86 (cage + --gpu-context=wayland) is unchanged.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* perf(viewer): drop --drm-mode pin on Pi4-64/Pi5 under --gpu-context=drm
The previous commit moved Pi4-64/Pi5 to `mpv --vo=gpu --gpu-context=drm` but kept the `--drm-mode=1920x1080@60` pin from the old --vo=drm path. On-device testing showed the pin *hurts* throughput under GBM: 294 vo drops/30s with the pin, 3-6 without, on the same 4K-connected Pi4 and the same H.264 clip.
The pin existed in the first place to dodge CPU zimg upscale to 4K, which the A72 couldn't keep up with on the legacy --vo=drm path. Under --gpu-context=drm the V3D does the scaling for free at the connector's preferred mode, so the workaround is no longer needed and is in fact harmful.
`--vd-lavc-threads=4` stays: software decode under --hwdec=auto-safe (mpv 0.40 has v4l2m2m-copy but not v4l2request) still benefits from explicit threading.
Verified on a 4K-connected Pi4-64 across H.264 (30/24 fps) and HEVC clips: 2-6 vo drops/30s in every case.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* refactor(viewer): consolidate Qt6 boards onto cage + Wayland, pin Pi 4 to 1080p
Folds in PR #2883: Pi 4-64 / Pi 5 now run under cage with mpv on --vo=gpu --gpu-context=wayland, joining x86 and arm64 on a single Wayland-based display stack. Drops the --vo=drm legacy path entirely from MPVMediaPlayer. Qt 5 boards (pi2 / pi3) stay on linuxfb via VLCMediaPlayer; out of scope here.
Replaces the perf branch's `--vo=gpu --gpu-context=drm` standalone fix with the consolidated cage path. The previous standalone finding (3-6 vo drops / 30 s on Pi 4 at 4K) was a Pi-without-cage optimization; once Pi runs under cage like every other Qt6 board, the same trick applies via wayland but cage's composite step adds its own pass and the V3D on Pi 4 can't keep up at 4K (738 vo drops / 30 s measured at native 4K under cage).
Fix: move the 1080p mode pin one layer up from app code to host config: the new ansible/.../cmdline.txt.j2 conditional appends `video=HDMI-A-1:1920x1080@60 video=HDMI-A-2:1920x1080@60` when `device_type == 'pi4-64'`. With output pinned to 1080p there's no upscale anywhere in the pipeline, matching the bandwidth profile of today's --vo=drm production setup.
Pi 5 / x86 / arm64 keep the connector's preferred mode (typically 4K). Pi 5's V3D 7.1 has roughly 2× Pi 4's throughput; x86 iGPUs handle 4K via VAAPI; arm64 SBC perf varies by SoC.
Other notable changes folded in from #2883:
  * tools/image_builder/utils.py: `cage` + `qt6-wayland` move out of the per-board branch into the shared is_qt6 block. `wlr-randr` (was x86-only) goes in the shared block too since rotation now happens via wlr-randr on every Qt6 board. `va-driver-all` stays x86-only (no VAAPI on Pi / ARM SoCs).
  * docker/Dockerfile.viewer.j2: QT_QPA_PLATFORM=wayland gated on is_qt6 instead of board in ('x86', 'arm64').
  * bin/start_viewer.sh: case on DEVICE_TYPE: every Qt6 board takes the cage + sudo path. Pi2 / Pi3 stay on the legacy direct-sudo path.
  * src/anthias_viewer/media_player.py: single --vo=gpu --gpu-context=wayland for all reachable device types. The per-board rotate_args block is gone: every Qt6 device inherits the transform from cage via wlr-randr, so mpv would double-rotate if it set --video-rotate.
  * tests/test_media_player.py: parametrised tests for all four Qt6 boards (x86, arm64, pi4-64, pi5) hitting the same VO path; rotation tests assert mpv *never* sets --video-rotate under cage.
  * website/data/faq.yaml: rotation entry points at Settings page / wlr-randr; resolution entry calls out the Pi 4 1080p pin.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* fix(ansible): propagate tags into boot.yml include_tasks
The `Configure boot partition` task in system/tasks/main.yml was tagged `touches-boot-partition` / `raspberry-pi` but those tags weren't propagated to the tasks inside boot.yml: Ansible's default include_tasks behaviour matches the include against --tags but leaves the included tasks tag-less, so they get filtered back out. Running `ansible-playbook ... --tags touches-boot-partition` therefore did nothing.
Use the explicit `apply: tags:` form so the include's tags are copied onto each task in boot.yml. With this, the standalone "re-render boot config" workflow actually works, which matters on Pi 4 now that the 1080p HDMI mode pin in cmdline.txt.j2 needs to land without re-running the whole playbook.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* fix(viewer): keep Pi 4 on linuxfb; only Pi 5 / x86 / arm64 go cage
On-device testing on a Pi 4 Model B Rev 1.5 with a 4K HDMI display showed cage+wayland is fundamentally too heavy for the V3D 6.0:
--vo=drm (existing, no cage) : 59-75 drops/30s
--vo=gpu --gpu-context=drm (no cage, GPU scale) : 3-6 drops/30s
--vo=gpu --gpu-context=wayland (cage, even at 1080p HDMI cmdline pin to avoid 4K scale) : 730+ drops/30s, mpv at 99% CPU running ~1/4× real time
The 1080p HDMI pin doesn't recover Pi 4: cage's composite pass costs more than the V3D 6.0 has spare bandwidth for, regardless of output resolution, with the webview running in the background or not. Pi 5's V3D 7.1 has roughly 2× the throughput and is expected to keep up; x86 / arm64 already shipped on cage and remain unchanged.
Net result:
  * Pi 4-64 stays on Qt linuxfb (no compositor) with mpv on --vo=gpu --gpu-context=drm. mpv writes straight to KMS via libgbm and lets the V3D do video scaling, keeping the standalone perf-branch finding that drops from 59-75 → 3-6 on the same clip.
  * Pi 5 / x86 / arm64 stay (or move) onto cage + qt6-wayland + wlr-randr with mpv on --vo=gpu --gpu-context=wayland.
  * Pi 2 / Pi 3 stay on the Qt5 + VLC + linuxfb track they were already on.
  * The Pi 4 1080p HDMI cmdline pin added in the previous commit is reverted (no longer needed without cage).
  * Rotation handling: mpv emits --video-rotate=N on Pi 4 (no compositor to apply the transform) and skips it on the cage boards (wlr-randr handles it there).
Goal-wise this is the partial-consolidation we agreed to as last resort: three of four Qt6 boards share one Wayland stack, Pi 4 keeps the framebuffer path for as long as the V3D 6.0 + mpv 0.40 combo lacks the headroom. Pi 4 remains in scope for revisiting once mpv ships the v4l2request hwdec.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* fix(viewer): mirror host render-GID for all Qt 6 boards, not just cage
mpv uses /dev/dri/renderD128 for --vo=gpu on every Qt 6 board now: wayland (cage path on x86 / arm64 / pi5) and drm (linuxfb path on Pi 4) both go through Mesa GL. The render-GID mirror was inside the cage branch of start_viewer.sh, so Pi 4's mpv ran as viewer user, hit the render node owned by GID 992, got "Permission denied", and bailed with "Failed initializing any suitable GPU context!".
Hoist the render-GID setup above the per-board case so it runs for every Qt 6 board. cage / linuxfb branching stays as-is.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* fix(viewer): Pi 4 stays on --vo=drm (Qt linuxfb DRM master contention)
Earlier commits switched Pi 4 to mpv --vo=gpu --gpu-context=drm based on a 3-6 vo-drop/30 s measurement. That test was run as root in a fresh container; no Qt linuxfb in the picture. In the production viewer where AnthiasWebview holds the framebuffer via Qt linuxfb, --vo=gpu fails:
failed to open /dev/dri/renderD128: Permission denied
[vo/gpu/drm] Failed to acquire DRM master: Permission denied
[vo/gpu] Failed initializing any suitable GPU context!
Error opening/initializing the selected video_out (--vo) device.
Video: no video
Mesa GBM holds DRM master persistently and contends with Qt linuxfb's framebuffer use. mpv's classic --vo=drm has its own master juggling (briefly grab → render → drop) that coexists fine with linuxfb; that's why master's existing Pi 4 config works.
Revert Pi 4 mpv flags to the production master config:
--vo=drm --drm-mode=1920x1080@60 --vd-lavc-threads=4
The standalone perf-finding from this branch's earlier history turns out not to apply in production; retracted from the roll-up.
Pi 5 / x86 / arm64 unchanged (they're on cage + --vo=gpu --gpu-context=wayland, which has its own DRM master flow via cage).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* fix(viewer): cage opens on the first connected connector, not HDMI-A-1
Without `-o`, cage uses whatever output the DRM backend enumerates first, typically HDMI-A-1 on Pi 5 (closer to USB-C) and the on-board panel / first HDMI on x86 / arm64. If the operator plugs into the *other* port (Pi 5 HDMI-A-2, or any DP connector on x86), cage renders to a disconnected connector and the screen stays black.
start_viewer.sh now iterates /sys/class/drm/card*-*, picks the first connector whose status reads "connected", strips the cardN- prefix to get the bare name cage expects (HDMI-A-1, HDMI-A-2, DP-1, eDP-1, …), and passes it via `-o`. Falls back to letting cage pick if nothing is connected yet: the display may come up via HPD after cage starts, or this is a build/CI host with no display at all.
Caught while end-to-end testing on the rig: Pi 5 cable on HDMI-A-2 went to a black screen even though `cat /sys/class/drm/card1-HDMI-A-2/status` reported "connected" and cage / the viewer were running.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* feat(viewer): mpv from apt.raspberrypi.com on Pi 4 / Pi 5, hwdec auto-copy
Stock Debian Trixie's mpv 0.40 is compiled without `v4l2request` hwdec, so Pi 5's Hantro stateless decoder is invisible to it and mpv falls back to software decode for every H.264 / H.265 source. Pi 4's V4L2 M2M decoder is reachable via `v4l2m2m-copy` but mpv's `--hwdec=auto-safe` whitelist explicitly excludes that method, so auto-detect picked software there too.
Two changes, applied together because they only make sense together:
  * Pi 4 / Pi 5 viewer images now pull mpv (and the FFmpeg library family it depends on) from `archive.raspberrypi.com/debian trixie main`. The Pi-tuned build ships `v4l2request` hwdec (Pi 5) and a maintained `v4l2m2m-copy` (Pi 4). An apt-pin restricts the Pi repo to the mpv + libav* packages only, so curl / ca-certificates / etc. continue to come from stock Debian and the rest of the image stays on the same baseline.
  * `MPVMediaPlayer.play()` switches `--hwdec=auto-safe` → `--hwdec=auto-copy`. auto-copy is the same family but with a broader whitelist that *includes* the v4l2-family copy hwdecs. Net effect: x86 still picks vaapi-copy (unchanged), Pi 4 picks v4l2m2m-copy, Pi 5 picks v4l2request, arm64 falls through to software (no v4l2request in stock Debian mpv, no vendor-tuned Rockchip plugin in stock either; Tier-2 follow-up).
Plus an `ANTHIAS_DEBUG_DROPS=1` env knob: when set on the viewer container, mpv's stdout/stderr go to `/data/.anthias/mpv.log` (host-bound) instead of `/dev/null`, and `--no-terminal` is dropped so the status line ("AV: ... Dropped: N") is emitted. Lets us read per-asset frame-drop counts straight from the production viewer pipeline (no custom harness, no rebuild) during the test-grid runs. Default (unset) preserves the silent behaviour.
Also: drops the `cage -o <connector>` autodetect attempt: cage 0.1.x in Trixie doesn't accept `-o`, just `-m last`. Use that instead so cage opens on the most-recently-connected output regardless of HDMI-A-N enumeration order.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* fix(viewer): use deb-packaged Pi keyring for archive.raspberrypi.com
apt update against http://archive.raspberrypi.com/debian trixie was failing in the Pi 4 / Pi 5 viewer image builds:
Sub-process /usr/bin/sqv returned an error code (1):
Signing key on CF8A1AF502A2AA2D763BAE7E82B129927FA3303E is not bound:
No binding signature at time …
Policy rejected non-revocation signature (PositiveCertification) requiring second pre-image resistance
SHA1 is not considered secure since 2026-02-01
Pi's bare `raspberrypi.gpg.key` URL still serves the original 2012-vintage RSA 2048 key with SHA1 binding signatures that Trixie's sqv refuses to certify under the post-2026-02-01 crypto policy.
The deb-packaged keyring inside `raspberrypi-archive-keyring_2025.1+rpt1_all.deb` ships the *same* key fingerprint but with rebuilt binding signatures that sqv accepts; that's the keyring Pi OS Trixie itself installs, which is why `apt update` against this exact repo works on a real Pi 5 device today.
Fetch the deb directly with curl, extract its bundled `.pgp` keyring, and point `signed-by=` at the installed copy. The pin block restricts what packages the Pi repo can supply (mpv + libav* + ffmpeg + libpostproc, the FFmpeg family), so the rest of the image keeps its stock-Debian baseline.
Also extend the pin to cover libpostproc* and ffmpeg, since mpv's apt deps drag those into the Pi-tagged version on install; without the pin extension, apt rejected the resolve with "broken packages".
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* feat(viewer): per-codec hwdec on Pi via Lua hook
mpv 0.40's `--hwdec` accepts a single value at startup, so we can't ask it to try v4l2m2m-copy for H.264 *and* drm-copy for HEVC out of the box. The Pi-tuned mpv from archive.raspberrypi.com supports both hwdec methods but each covers a different codec subset:
  * v4l2m2m-copy: Pi 4's V3D V4L2 M2M decoder. H.264 works; Pi 5's Hantro G2 is V4L2-stateless-only so this no-ops there.
  * drm-copy: FFmpeg's `v4l2_request_hevc` hwaccel. HEVC only, works on both Pi 4 and Pi 5.
Add a small `on_load` Lua hook (inlined as `_PI_HWDEC_LUA`, written to /tmp on first play(), loaded with `--script=`) that checks `video-codec-name` and picks the right hwdec at file open. Net effect:
Pi 4 H.264 → v4l2m2m-copy (HW)
Pi 4 HEVC → drm-copy (HW)
Pi 5 H.264 → v4l2m2m-copy (no device, falls back to SW; only path until mpv re-adds v4l2_request_h264 hwdec)
Pi 5 HEVC → drm-copy (HW)
The base `--hwdec=auto-copy` startup value still applies on x86 / arm64 (vaapi-copy on Intel/AMD; software fall-back on Rockchip), where the hook isn't loaded.
Verified on real hardware:
$ mpv ... --script=/tmp/anthias-pi-hwdec.lua test_hevc.mp4
[pi-hwdec] codec=hevc -> hwdec=drm-copy
Using hardware decoding (drm-copy).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* fix(viewer,server): HW-decode everywhere on Pi 4 / Pi 5 / x86
The previous per-codec Lua hook in media_player.py was a silent no-op: mpv's video-codec-name property is empty at every script event before hwdec init (on_load, on_preloaded), so --hwdec=auto-copy leaked through. auto-copy's upstream whitelist excludes v4l2m2m-copy, so H.264 on Pi 4 fell back to software despite the V3D V4L2 M2M decoder being available.
Viewer (src/anthias_viewer/media_player.py)
- Replace the Lua hook with ffprobe-driven dispatch from Python at launch time. ffprobe is in the viewer image; the call is ~50 ms.
- Per-board mapping: Pi 4 → {h264: v4l2m2m-copy, hevc: drm-copy}; Pi 5 → {hevc: drm-copy}. Pi 5 H.264 falls back to auto-copy because mpv has no v4l2-request H.264 hwdec for the Hantro G1, and passing v4l2m2m-copy there just logs "Could not find a valid device" before SW-falling-back.
- Live-verified on Pi 4: "Using hardware decoding (v4l2m2m-copy)" for 1080p H.264 and "Using hardware decoding (drm-copy)" for HEVC at 1080p and 4K.
Asset processor (src/anthias_server/processing.py)
- Pi 5 profile drops H.264 from passthrough_video_codecs: Pi 5 has no mpv H.264 HW path, so H.264 uploads must transcode to HEVC at upload time to keep the HW-decode-everywhere contract.
- Pi 4 profile adds passthrough_video_max_pixels for H.264, capped at 1080p (1920*1080). 4K H.264 clears the codec gate but the V3D H.264 envelope tops at 1080p60, so the cap forces it through a libx265 re-encode at upload time. HEVC keeps no cap (the dedicated HEVC block handles 4Kp60).
- _ffprobe_summary now returns video_pixels alongside codec / container / audio_codec; _video_can_passthrough enforces the per-codec pixel cap when the profile declares one.
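The upload-time codec gate described in the asset-processor bullets can be sketched as a codec allow-list plus an optional per-codec pixel ceiling. This is only an illustration under stated assumptions: `Profile`, `can_passthrough`, `PI4` and `PI5` are hypothetical names, and the exact codec sets per board are guesses consistent with the commit text, not the contents of src/anthias_server/processing.py.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Profile:
    # Codecs a board may play without re-encoding (hypothetical shape).
    passthrough_video_codecs: frozenset
    # Optional per-codec ceiling on width * height; missing key = no cap.
    passthrough_video_max_pixels: dict = field(default_factory=dict)


def can_passthrough(profile: Profile, codec: str, video_pixels: int) -> bool:
    """Return True when an upload may skip the transcode step."""
    if codec not in profile.passthrough_video_codecs:
        return False
    cap = profile.passthrough_video_max_pixels.get(codec)
    return cap is None or video_pixels <= cap


# Assumed profiles: Pi 4 caps H.264 at 1080p, Pi 5 has no H.264 passthrough.
PI4 = Profile(frozenset({"h264", "hevc"}), {"h264": 1920 * 1080})
PI5 = Profile(frozenset({"hevc"}))
```

With these assumed profiles, a 4K H.264 upload on Pi 4 and any H.264 upload on Pi 5 both fail the gate and would be routed to a re-encode.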
Tests
- test_media_player.py: new per-board hwdec tests (Pi 4 H.264 → v4l2m2m-copy; Pi 5 H.264 → auto-copy; both → drm-copy for HEVC; auto-copy fallback when ffprobe fails; no probe on x86 / arm64).
- test_processing.py: matrix tests updated to include video_pixels; parametrised rows now exercise Pi 5 H.264-no-passthrough and the Pi 4 4K H.264 cap. New end-to-end tests prove _run_video_normalisation transcodes Pi 5 H.264 → HEVC and Pi 4 4K H.264 → HEVC.
Docs (docs/board-enablement.md, new)
- Goal + per-board HW-decode capability table.
- Asset processor codec policy spelled out as a contract.
- BBB test bed recipe (source clips, libx265 transcode commands, ANTHIAS_DEBUG_DROPS=1, mpv.log slicing).
Follow-up: Pi 5 4K HEVC HW
The Hantro G2 decoder can't allocate 4K dst buffers from Pi 5's default 64 MB CMA ("v4l2_request_hevc_start_frame: Failed to get dst buffer") and SW-falls-back. Adding cma=512M to the kernel cmdline does NOT work: the kernel takes the cmdline value over the device-tree linux,cma node, orphaning rpi-hevc-dec ("Failed to probe hardware -517") and unpopulating /dev/video*, which kills HEVC HW at every resolution. The right fix is a dtparam/dtoverlay in /boot/firmware/config.txt that resizes the existing DT-declared region without orphaning the codec's reserved-mem reference. Until that lands, the pi5 profile should downscale 4K → 1080p HEVC. Documented in cmdline.txt.j2 and docs/board-enablement.md.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* test(viewer,server): mock _probe_video_codec; fix mypy on Popen IO types
CI failures on the previous commit ( |
||
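The per-board hwdec mapping that commit 57b4f25c77 ends up with can be read as a lookup table with a fallback. The sketch below is a minimal illustration, not the code in src/anthias_viewer/media_player.py: `HWDEC_BY_BOARD` and `pick_hwdec` are hypothetical names, the device-type strings are the ones used in the commit text, and the codec is passed in directly instead of being probed with ffprobe.

```python
from typing import Optional

# Board -> codec -> mpv --hwdec value, as listed in the commit message.
# Boards missing from the table (x86, arm64) always get the fallback.
HWDEC_BY_BOARD: dict[str, dict[str, str]] = {
    "pi4-64": {"h264": "v4l2m2m-copy", "hevc": "drm-copy"},
    "pi5": {"hevc": "drm-copy"},
}

FALLBACK_HWDEC = "auto-copy"


def pick_hwdec(device_type: str, codec: Optional[str]) -> str:
    """Return the --hwdec value for a board and a probed codec name.

    `codec` is None when probing failed or was skipped.
    """
    if codec is None:
        return FALLBACK_HWDEC
    return HWDEC_BY_BOARD.get(device_type, {}).get(codec, FALLBACK_HWDEC)
```

Keeping the mapping as data rather than branching code makes the Pi 5 H.264 gap visible at a glance: the key is simply absent, so the fallback applies.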
|
|
f547642fc4 |
fix: e2e-test findings (host-agent venv, celery beat, asset GET 404) (#2881)
* fix(install): persistent host-agent venv (anthias-host-agent.service 203/EXEC)
PR #2843 switched the installer venv to a mktemp tmpdir cleaned up on EXIT, but anthias-host-agent.service's ExecStart still hardcodes /home/${USER}/installer_venv/bin/python. Every fresh install since that refactor leaves the unit in a status=203/EXEC restart loop with no Python at the configured path, and /api/v2/info then blocks ~80s on get_node_ip() waiting for the host_agent_ready key that will never appear.
Split the two venvs:
  * INSTALLER_VENV: still ephemeral mktemp, used by ansible-core during install/upgrade and torn down by the EXIT trap.
  * HOST_AGENT_VENV: new persistent venv at /home/${USER}/installer_venv (path kept stable so devices installed before the refactor don't need a unit rewrite), recreated from the host dep group on every install + upgrade so deps track pyproject.toml.
provision_host_agent_venv runs after install_ansible() and before run_ansible_playbook() so the venv exists before ansible's state: started fires the unit. On upgrade the unit is already loaded with the previous venv's in-memory interpreter, so the state: started no-op never picks up the new deps; restart explicitly when the unit is already active.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* fix(celery): switch beat to in-memory scheduler (Python 3.13 dbm.sqlite3 locking)
celery -B with the default PersistentScheduler stores its schedule via shelve. On Python 3.13, shelve defaults to dbm.sqlite3, which raises dbm.sqlite3.error: locking protocol intermittently under contention; observed on x86 but not pi4-64 in this build matrix, which is consistent with a benign-looking race specific to the amd64 docker layer's filesystem ordering. When Beat stalls, reconcile_stuck_processing and the other periodic tasks set up by setup_periodic_tasks stop firing, so stuck-in-is_processing assets never get re-dispatched.
setup_periodic_tasks defines every periodic task statically (no django-celery-beat / no dynamic schedule edits), so a non-persistent scheduler is sufficient. Switch to celery.beat.Scheduler in all three compose files (prod template + dev + test) and drop the --schedule /tmp/celerybeat-schedule flag that's now unused.
The telemetry cooldown comment is updated to reference the new flag; the actual 24h cooldown is still gated by the Redis TTL, which is the persisted source of truth.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* fix(api): return 404 (not 500) for unknown asset_id across v1/v1.1/v1.2/v2
AssetViewV{1,1_1,1_2,2}.get / put / patch / update and the shared DeleteAssetViewMixin / AssetContentViewMixin / ViewerCurrentAssetViewV1 all called Asset.objects.get(asset_id=...) bare. The Asset.DoesNotExist that fires for a deleted-or-typo'd id has no DRF exception handler registered, so it bubbled up as a 500 with the database traceback; caller sees a server error for what is structurally a missing resource.
AssetRecheckViewV2 already gets this right via filter(...).exists() + explicit 404; standardise the rest by routing the lookup through django.shortcuts.get_object_or_404 (DRF's exception handler converts the resulting Http404 to a clean 404 Response).
The new test_unknown_asset_id_returns_404 parametrises across every API version so a future view that reverts to Asset.objects.get bare trips immediately.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* fix(api): rename queryset → asset in ViewerCurrentAssetViewV1
get_object_or_404 returns a single Asset, not a queryset; the variable name was already misleading under the previous bare Asset.objects.get(...) call. Address Copilot review.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* fix(install): silence uv cross-filesystem hardlink warning
INSTALLER_VENV lands in /tmp (the mktemp -t default), while uv's cache lives at ~/.cache/uv on $HOME. On the typical Pi/Debian install /tmp is tmpfs and $HOME is the SD card, so uv's default hardlink mode fails for every wheel and falls back to a noisy "Failed to hardlink files; falling back to full copy" line. Set UV_LINK_MODE=copy on the install_ansible invocation so the fallback becomes the documented choice.
provision_host_agent_venv is unaffected: both its venv and the uv cache live on $HOME, so hardlinks work there.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* chore(compose): pass --remove-orphans on every up
Surfaced during e2e testing: after a compose recreate, anthias-server's up -d emitted "Found orphan containers ([anthias-anthias-viewer-run-…]) … you can run this command with the --remove-orphans flag to clean it up." These linger from earlier `docker compose run` invocations that created run-NNN sidecar containers; without --remove-orphans they just keep running and clutter `docker ps`.
Apply to both the prod upgrade path (upgrade_containers.sh) and the dev bring-up (start_development_server.sh).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
---------
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
|
||
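The uv hardlink warning in f547642fc4 comes down to a filesystem fact: a hardlink cannot cross a filesystem boundary, so a venv on tmpfs cannot link into a cache on the SD card. The sketch below shows one way to detect that situation by comparing device IDs. It is an illustration only: the commit sets `UV_LINK_MODE=copy` unconditionally for the installer venv, and `same_filesystem` / `choose_link_mode` are hypothetical helpers, not part of the installer.

```python
import os


def same_filesystem(path_a: str, path_b: str) -> bool:
    """Hardlinks only work when both paths share one filesystem (same st_dev)."""
    return os.stat(path_a).st_dev == os.stat(path_b).st_dev


def choose_link_mode(venv_dir: str, cache_dir: str) -> str:
    """Pick a value for UV_LINK_MODE (hypothetical helper).

    "hardlink" and "copy" are both link modes uv accepts.
    """
    return "hardlink" if same_filesystem(venv_dir, cache_dir) else "copy"
```

For example, `choose_link_mode("/tmp", os.path.expanduser("~"))` would return `"copy"` on a host where /tmp is tmpfs and the home directory is on another device.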
|
|
133ec78ff0 |
refactor(packaging): adopt src/ layout with split server/viewer packages (#2817)
* refactor(packaging): adopt src/ layout with split server/viewer packages
Move all Python source under src/ following modern packaging conventions.
Server, viewer, host-agent, and shared common code now live as four
top-level packages with clear excision boundaries — anthias_viewer can
be removed wholesale when the rewrite-out-of-Python lands without
touching the server.
src/anthias_common/ shared: errors, utils, internal_auth, device_helper
src/anthias_server/ Django app, REST API, Celery tasks, manage.py
lib/ server-only: auth, backup_helper, diagnostics, github, telemetry
src/anthias_viewer/ player runtime (was viewer/)
src/anthias_host_agent/ systemd-driven host shim (was host_agent.py)
tools/raspberry_pi_imager/ moved from repo root
tests/conftest.py moved from repo root
pyproject.toml gets [build-system], setuptools src/ discovery, and an
anthias-manage console script. Django AppConfigs keep label='anthias_app'
and label='api' so existing migration dependency tuples don't move.
BASE_DIR computed from parents[3] to keep templates/static at repo root.
mypy_path set to ["src", "stubs"] with explicit_package_bases.
Dockerfile templates set PYTHONPATH=/usr/src/app/src; bin/start_*.sh
and CI workflows use python -m anthias_server.manage / python -m
anthias_viewer instead of bare ./manage.py and python -m viewer.
Ansible host-agent unit invokes python -m anthias_host_agent.
Verified end-to-end in the docker test container:
- 430 unit tests pass (matches baseline)
- 7 integration tests pass, 5 skipped (matches baseline)
- ruff, mypy clean
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
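To see why `parents[3]` is the right index for BASE_DIR after the move, the sketch below walks up from the settings module path. It assumes the in-container checkout sits at /usr/src/app (inferred from PYTHONPATH=/usr/src/app/src) and uses the settings path named elsewhere in this log; the real settings module presumably starts from its own `__file__` rather than a literal path.

```python
from pathlib import PurePosixPath

# Assumed in-container location of the Django settings module.
settings_file = PurePosixPath(
    "/usr/src/app/src/anthias_server/django_project/settings.py"
)

# parents[0] = .../django_project
# parents[1] = .../anthias_server
# parents[2] = /usr/src/app/src
# parents[3] = /usr/src/app  (repo root, where templates/ and static/ stay)
base_dir = settings_file.parents[3]
```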
* style: ruff format the new src/ tree
The longer post-rename module paths (anthias_common.internal_auth vs
lib.internal_auth, etc.) pushed several import lines past 79 chars, so
ruff format had to wrap them. Apply that formatting and split the one
multi-import in anthias_viewer/__init__.py into per-symbol lines so the
existing # noqa: E402 sits on the `from` line where ruff expects it,
without needing a re-anchor when format wraps the parens.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* chore: realign sonar + gitignore comment to src/ layout
sonar-project.properties still pointed at the pre-refactor top-level
packages (anthias_app, anthias_django, api, lib, viewer, ...) and
their old per-file coverage.exclusions paths, which would have
produced empty Sonar runs and stale exclusions. Collapse sources to
`src` and rewrite the exclusions to the new src/anthias_*/ paths.
Also fix the stale path reference in .gitignore's comment for the
test DB (now src/anthias_server/django_project/settings.py).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* chore: gitignore .claude/ and untrack the lock file I just leaked
Previous commit accidentally pulled in .claude/scheduled_tasks.lock
because .claude was in .dockerignore but not .gitignore. Add the
pattern to .gitignore and drop the file from the index.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* chore(dockerignore): exclude pytest cache, __pycache__ dirs, and the local test DB
Three entries that were missing relative to the new src/ layout:
- .anthias-test.db (and -journal/-wal/-shm siblings) — created at the
repo root by src/anthias_server/django_project/settings.py when a
developer runs the host pytest suite. Without this exclude, the
next docker build COPY . bakes the file into /usr/src/app/.
- **/__pycache__ — *.py[co] only matched the .pyc/.pyo files, leaving
the empty cache directories to ship.
- .pytest_cache — host-side, regenerable.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
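The `__pycache__` point can be checked with a quick glob test. The sketch uses Python's `fnmatch` as a stand-in for Docker's ignore matcher (which follows Go's `filepath.Match` rules); that substitution is an assumption, but the character-class behaviour shown here is the same in both.

```python
from fnmatch import fnmatchcase

PATTERN = "*.py[co]"

# Which names does the old ignore pattern actually cover?
matches = {
    name: fnmatchcase(name, PATTERN)
    for name in ("views.pyc", "views.pyo", "views.py", "__pycache__")
}
```

The pattern covers the compiled files but not the directory that holds them, which is why a separate `**/__pycache__` entry is needed.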
* fix(urls): preserve 'anthias_app' URL namespace, not just the app label
Copilot caught that the import-rewrite swept up the URL namespace too:
app_name in src/anthias_server/app/urls.py changed from 'anthias_app'
to 'anthias_server.app', which leaves templates/login.html's
{% url 'anthias_app:login' %} pointing at a namespace that no longer
exists — NoReverseMatch at render time when an unauthenticated request
hits the login page.
The namespace is the same kind of stable user-facing identifier as the
AppConfig label (which we already kept as 'anthias_app'). Restore it,
and revert the two reverse() callers in lib/auth.py and app/views.py
that the rewrite changed in lockstep.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* fix(ci): update --confcutdir to the new tools/raspberry_pi_imager path
Copilot caught that the earlier sweep missed --confcutdir=raspberry_pi_imager
(no trailing slash) — replace_all of "raspberry_pi_imager/" only matched
path-with-slash forms. Without confcutdir, pytest walks back up looking
for conftests and discovers the repo-root tests/conftest.py, which
applies the Anthias-specific Django/Redis stubs to the rpi-imager test
run on the website-deploy workflow.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
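The missed `--confcutdir` rewrite is easy to reproduce: a search string that ends in a slash cannot match an occurrence without one. In the sketch below the first input line is a hypothetical path-with-slash form; the second is the flag quoted in the commit message.

```python
OLD = "raspberry_pi_imager/"
NEW = "tools/raspberry_pi_imager/"

lines = [
    "pytest raspberry_pi_imager/tests",   # hypothetical slash form, gets rewritten
    "--confcutdir=raspberry_pi_imager",   # no trailing slash, left untouched
]

rewritten = [line.replace(OLD, NEW) for line in lines]
```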
---------
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
|
||
|
|
0c2be6d066 |
We keep hitting rate limiting from Docker Hub - let's say goodbye (#2802)
* We keep hitting rate limiting from Docker Hub - let's say goodbye
* DRY things up
|
||
|
|
5e00c8ba25 |
refactor(docker): drop celery image, restore base apt layer dedup (#2776)
* refactor(docker): drop celery image, restore base apt layer dedup
- Delete Dockerfile.celery.j2; compose now runs celery on the
anthias-server image with a `command:` override.
- Make viewer extend Dockerfile.base.j2 (mirroring test); drop 17
packages duplicated between viewer and base_apt_dependencies, plus
4 within-list duplicates.
- Move `# syntax=docker/dockerfile:1.4` to line 1 of every rendered
Dockerfile. It previously lived in uv-builder.j2 line 1 and got
bumped mid-file for server by the bun-builder prelude, silently
disabling the 1.4 frontend and breaking cache-key parity with
viewer — the actual blocker for layer dedup.
- Collapse CI matrix from (board × service) to (board) so all
services for a board build on the same runner with the same
buildkit cache, producing byte-identical apt layer digests at the
registry.
- Add ENV DJANGO_SETTINGS_MODULE to the server image so the merged
image runs both server and celery CMDs.
- Update all five compose templates (prod, balena prod, balena dev,
dev, test) to redirect anthias-celery at the server image with a
command: override. dev compose pins an explicit `image:` tag so
both services share the locally-built SHA.
- Remove old anthias-celery / srly-ose-celery containers in
upgrade_containers.sh so the recreated container can take the name.
Verified end-to-end on x86: server and viewer apt layers share a
single digest; SHARED SIZE jumps from 132 MB to 1.216 GB; merged
image runs both workloads in compose (celery task round-trips
through Redis to SUCCESS).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
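The `# syntax=` bullet relies on a Dockerfile rule: parser directives are only honoured at the very top of the file, before any other comment, blank line, or instruction. A rendered template can be checked for that with a few lines of Python. This is a simplified sketch (it only accepts `syntax` on line 1 and ignores other directives such as `escape`), and `syntax_directive_on_line_one` is a hypothetical helper, not part of tools/image_builder.

```python
import re

# Matches "# syntax=<frontend>" with optional spaces around "#" and "=".
_SYNTAX = re.compile(r"#\s*syntax\s*=\s*\S+", re.IGNORECASE)


def syntax_directive_on_line_one(dockerfile_text: str) -> bool:
    """Return True if the first line of the Dockerfile is a syntax directive."""
    first_line = dockerfile_text.split("\n", 1)[0]
    return _SYNTAX.fullmatch(first_line.strip()) is not None


good = "# syntax=docker/dockerfile:1.4\nFROM scratch\n"
# A prelude stage rendered first pushes the directive down, where it is
# treated as an ordinary comment (stage name here is illustrative).
bad = "FROM scratch AS prelude\n# syntax=docker/dockerfile:1.4\nFROM scratch\n"
```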
* perf(docker): cache buildkit layers in GHCR registry across CI runs
Add a --cache-backend / $BUILDX_CACHE_BACKEND option to
tools.image_builder with two modes:
- `local` (default): writes to /tmp/.buildx-cache/<board>/.
Unchanged from before; right for local dev.
- `registry`: pushes BuildKit cache to
ghcr.io/screenly/anthias-<service>:buildcache-<board>. Reuses the
GHCR login already done by docker-build.yaml, no extra tokens or
third-party actions needed.
Wire CI to use registry mode on push events (master) so subsequent
runs of the same board pull cached layers — the ~825 MB extracted
apt install per service goes from ~3 min cold to a few seconds
warm. workflow_dispatch on a non-master branch falls back to local
mode (effectively no-cache) so manual runs can't pollute the master
cache.
Drop the old actions/cache@v5 step that mirrored
/tmp/.buildx-cache/<board> through actions/cache — registry cache
is per-step rather than one big tarball, so it survives the GitHub
Actions cache 10 GB-per-repo eviction better.
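The two cache modes map onto buildx `--cache-from` / `--cache-to` flags roughly as follows. This is a sketch under assumptions rather than the actual tools.image_builder code: the function name, the `local_dir` parameter, and the `mode=max` export setting are illustrative; only the registry ref pattern and the local path come from the description above.

```python
def buildx_cache_args(backend, service, board, local_dir="/tmp/.buildx-cache"):
    """Build the buildx cache flags for one (service, board) build."""
    if backend == "registry":
        # Ref pattern taken from the commit message.
        ref = f"ghcr.io/screenly/anthias-{service}:buildcache-{board}"
        return [
            f"--cache-from=type=registry,ref={ref}",
            # mode=max (export all layers) is an assumption, not confirmed.
            f"--cache-to=type=registry,ref={ref},mode=max",
        ]
    if backend == "local":
        path = f"{local_dir}/{board}"
        return [
            f"--cache-from=type=local,src={path}",
            f"--cache-to=type=local,dest={path},mode=max",
        ]
    raise ValueError(f"unknown cache backend: {backend!r}")
```

A caller could pick the backend with `os.environ.get("BUILDX_CACHE_BACKEND", "local")` and append the returned list to the `docker buildx build` command line.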
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* fix(image-builder): move local cache out of /tmp to user XDG cache dir
SonarCloud python:S5443 flagged the previous /tmp/.buildx-cache/
default as a security hotspot — `/tmp` is world-writable, so on a
multi-user host another account could in principle tamper with the
buildkit cache. Switch to $XDG_CACHE_HOME/anthias-buildx/<board>/
(default ~/.cache/anthias-buildx/), which is per-user by default
and follows XDG Base Directory convention.
CI is unaffected: docker-build.yaml uses --cache-backend=registry
on push events, which pushes cache to GHCR and never touches the
local path. Local dev users with stale state in
/tmp/.buildx-cache/<board>/ can rm it.
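Resolving the per-user cache directory could look like the sketch below. `local_cache_dir` and its `env` parameter are hypothetical names used for illustration; the fallback to `~/.cache` when `XDG_CACHE_HOME` is unset or empty follows the XDG Base Directory convention.

```python
import os
from pathlib import Path


def local_cache_dir(board, env=None):
    """Return <XDG cache home>/anthias-buildx/<board> for the given board."""
    env = os.environ if env is None else env
    # An unset or empty XDG_CACHE_HOME falls back to ~/.cache.
    base = env.get("XDG_CACHE_HOME") or str(Path.home() / ".cache")
    return Path(base) / "anthias-buildx" / board
```

Passing `env` explicitly keeps the helper easy to test without touching the real environment.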
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* fix(docker): correct cache-backend comments to match real behavior
Two doc fixes per Copilot review on #2776:
- tools/image_builder/__main__.py: the cache-backend rationale
block still referenced /tmp/.buildx-cache/<board>; update to
$XDG_CACHE_HOME/anthias-buildx/<board> so it matches the
implementation moved in
|
||
|
|
f421130b24 |
refactor(server): collapse nginx + websocket containers into uvicorn (#2757)
* refactor(server): collapse nginx + websocket containers into uvicorn
Replace the nginx + gunicorn + gevent-websocket trio with a single
uvicorn ASGI server inside `anthias-server`:
* HTTP, /static/, /anthias_assets/, /static_with_mime/, and /hotspot
are now served from Django (WhiteNoise + small file-serving views in
`anthias_app/views_files.py` that re-implement nginx's IP allowlists).
* WebSockets move from a separate gevent process talking ZMQ to Django
Channels with a Redis-backed channel layer, fanned out by celery via
`channel_layer.group_send`.
* TLS termination is handled by uvicorn directly when SSL_CERTFILE /
SSL_KEYFILE are set; `bin/enable_ssl.sh` now writes a compose
override (no longer ansible) and a companion `bin/disable_ssl.sh`
removes it. Cert + key live under `~/.anthias/ssl/`.
* `bin/upgrade_containers.sh` removes the legacy `anthias-nginx` and
`anthias-websocket` containers on upgrade so they don't linger.
* Drop `gunicorn`, `gevent`, `gevent-websocket`, and the `websocket`
uv group from `pyproject.toml`; add `channels`, `channels-redis`,
`daphne`, `uvicorn[standard]`, and `whitenoise`.
Notes on hardening: `--forwarded-allow-ips` defaults to off so the IP
allowlist can't be bypassed via a spoofed `X-Forwarded-For`; operators
behind a reverse proxy can opt in via the `FORWARDED_ALLOW_IPS` env
var. Backup uploads previously sized by nginx's `client_max_body_size
4G` are preserved by setting `DATA_UPLOAD_MAX_MEMORY_SIZE = None`.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* fix: address review feedback on uvicorn migration
* Drop USE_X_FORWARDED_HOST (inconsistent with the deliberate
--forwarded-allow-ips hardening; without a proxy, X-Forwarded-Host is
client-controlled).
* Remove daphne — uvicorn runs production and the test environment now
uses it too (bin/prepare_test_environment.sh).
* Replace _safe_join's parents-membership check with Path.is_relative_to.
* Drop AllowedHostsOriginValidator wrapper (no-op under ALLOWED_HOSTS=['*'])
and document where to put it back if hosts are ever locked down.
* Rename DOCKER_CIDR → DOCKER_BRIDGE_CIDR with a comment that this is
defense-in-depth, not a real perimeter (LAN clients via the published
port also appear in 172.16/12).
* Add anthias_app/tests.py covering the IP allowlists, mime override,
hotspot gating, and traversal/symlink rejection in _safe_join (17 tests).
* Note the single-worker ZmqPublisher bind constraint in start_server.sh
so a future scale-up doesn't EADDRINUSE on tcp://0.0.0.0:10001.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* fix(security): clear SonarCloud hotspots on uvicorn migration
* Restrict views_files.anthias_assets / static_with_mime / hotspot to
GET via @require_GET (Sonar S3752, x3): they are read-only file
servers and should reject other methods at the view boundary.
* Mark RFC1918 / Docker-bridge CIDR literals as NOSONAR S1313 (x4):
they are intentional, well-known private network ranges.
* Mark `http://*` in CSRF_TRUSTED_ORIGINS as NOSONAR S5332 with a
comment explaining devices ship over HTTP and operators opt into TLS
via bin/enable_ssl.sh.
Existing 17 view tests continue to pass.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* fix: clear remaining static-analysis findings
* `ruff format` reformatted the previous tests.py; CI's
`ruff format --check` now passes.
* CodeQL py/path-injection on _safe_join: rewrite using
os.path.realpath + os.path.commonpath, which CodeQL recognises as a
sanitiser for path-injection sinks. Behaviour is identical to the
Path.is_relative_to version (both reject `..` and symlink escapes;
the 17 tests in anthias_app/tests.py still pass).
* SonarCloud NOSONAR markers: switch to the codebase's bare `# NOSONAR`
form (matches host_agent.py and tests/test_backup_helper.py); the
earlier `# NOSONAR <rule>` form was not being honoured.
* Centralise the test-fixture IPs in module-level constants so S1313
is suppressed in one place rather than at every callsite.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* fix(security): inline path-injection check in views
CodeQL only treats os.path.commonpath as a sanitiser when the check
sits in the same function as the file-system sink — calling
_safe_join() from a separate function still leaves the open()/isfile()
sinks tainted (4 alerts on PR #2757).
Repeat the realpath + commonpath check inline in anthias_assets and
static_with_mime so CodeQL can prove the post-check path stays under
the configured root. _safe_join is kept for the SafeJoinTest unit
tests and as a documented helper.
Existing 17 tests in anthias_app/tests.py continue to pass.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* fix(security): use realpath+startswith path sanitiser for CodeQL
CodeQL's path-injection model recognises the canonical
`realpath(...).startswith(base + sep)` pattern but apparently not
`os.path.commonpath(...) == root` in this codepath. Switch the inline
check in anthias_assets and static_with_mime to startswith so the
analyser can prove the post-check path stays under the configured
root.
Behaviour is identical: traversal and symlink-escape still 404
(verified by SafeJoinTest + view tests).
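The realpath + startswith pattern can be sketched in isolation as below. This is not the code from anthias_app/views_files.py (there the check is inlined next to the file-system sink, as described above); `resolve_under_root` is a hypothetical standalone helper that shows only the containment logic.

```python
import os


def resolve_under_root(root, relative):
    """Return the resolved path if it stays under root, else None."""
    base = os.path.realpath(root)
    # realpath collapses `..` segments and follows symlinks before the check.
    candidate = os.path.realpath(os.path.join(base, relative))
    # The trailing separator stops /srv/assets-evil from matching /srv/assets.
    if not candidate.startswith(base + os.sep):
        return None
    return candidate
```

A view would turn a `None` result into a 404, which matches the traversal and symlink-escape behaviour the commit describes.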
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* fix: address Copilot review feedback
* lib/utils.py imported channels/asgiref at module level. The viewer
container imports lib.utils via viewer/__init__.py but its uv
dependency group does not ship channels, so the viewer would
ImportError on startup. Move the channels imports into
YoutubeDownloadThread.run() (server/celery-only path) so lib.utils
remains importable from the viewer.
* Drop the unused _safe_join() helper and its three SafeJoinTest
cases — the views inline a realpath+startswith sanitiser (CodeQL
needs the check in the same function as the sink), and the helper
was only being exercised in isolation. Add an equivalent
symlink-escape test against anthias_assets so the actual code path
used by the views is covered.
* Refresh the anthias_django/settings.py docstring + Django doc URLs
from /3.2/ → /4.2/ to match the pinned Django version.
15 view tests pass (was 17 — lost 3 SafeJoinTest + gained 1 symlink
test against the real view).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* docs: refresh architecture diagram for uvicorn migration
Drop the anthias-nginx and anthias-websocket nodes (and their edges)
from docs/d2/anthias-diagram-overview.d2 — the user now talks
directly to anthias-server (uvicorn handling HTTP + /ws), Celery
fans out asset-update events through the Redis-backed Channels
layer, and the viewer fetches media from anthias-server over HTTP.
Regenerate the SVG with d2 v0.7.1.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* fix: address Copilot SSL + CSRF / WS-origin feedback
* Dual uvicorn listeners when SSL is enabled (Copilot #1, #2). HTTP on
$HTTP_PORT (default 8080) for inter-container traffic — viewer +
webview hit anthias-server over plain HTTP on the Docker network and
cannot validate uvicorn's self-signed cert. HTTPS on $HTTPS_PORT
(default 8443) for external clients. bin/enable_ssl.sh now appends
443:8443 to the compose ports list (instead of using `!override` to
swap 80:8080 for 443:8080), so port 80 stays available for backward
compatibility and the Docker-network HTTP port keeps working.
* Drop CSRF_TRUSTED_ORIGINS = ['http://*', 'https://*'] (Copilot #3).
Verified via Django shell: those leading wildcards are ignored by
Django 4.2 (only subdomain wildcards like https://*.example.com are
honoured), so the setting was a no-op. Same-origin POSTs still pass
through Django's built-in Origin/Host check.
* Re-add channels.security.websocket.AllowedHostsOriginValidator to
the WebSocket router (Copilot #5). Currently a no-op under
ALLOWED_HOSTS=['*'], but tightening ALLOWED_HOSTS later will now
also tighten /ws.
Smoke test (dev + SSL override):
- HTTP http://localhost:8000/ -> 200
- HTTPS https://localhost:8443/ -> 200
- HTTP http://localhost:8443/ -> 000 (TLS-only, expected)
- internal http://localhost:8080/ -> 200
- 15 view tests still pass.
Note: Copilot #4 (Docker-bridge CIDR is bypassable via the published
port) is documented in views_files.py as defense-in-depth and matches
the original nginx posture; switching to app-layer auth is out of
scope for this PR.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* refactor(ssl): switch from in-uvicorn TLS to a Caddy sidecar
The previous SSL implementation gave anthias-server two uvicorn
listeners (HTTP + HTTPS) so the viewer/webview could keep talking
plain HTTP over the Docker network while external clients got TLS.
That dual-listener dance is non-zero overhead and complicates signal
handling. Switch to the standard reverse-proxy pattern instead.
When SSL is enabled by bin/enable_ssl.sh:
* anthias-server stays a single uvicorn listener on plain HTTP 8080
(no SSL_CERTFILE/SSL_KEYFILE knobs, no dual-port logic).
* A Caddy sidecar (caddy:2-alpine, only present when the override is
installed) terminates TLS on host port 443, redirects 80→443, and
reverse-proxies to anthias-server:8080 — so X-Forwarded-Proto /
X-Forwarded-For are forwarded as-is by Caddy.
* The override removes anthias-server's external port mapping
(`ports: !override []`), so all external traffic must enter through
Caddy and the IP allowlists in views_files.py see the original LAN
client IP rather than the docker-bridge gateway. Inter-container
traffic is unchanged.
* `FORWARDED_ALLOW_IPS=*` is set on anthias-server in the override —
safe because anthias-server is no longer reachable from outside the
Docker network — and `SECURE_PROXY_SSL_HEADER` is added in Django
settings so request.is_secure() returns True for HTTPS callers.
* When SSL is *not* enabled there is zero new container, zero new
config — the base compose file is untouched and Caddy isn't pulled
or run.
bin/disable_ssl.sh now also removes the anthias-caddy container
before deleting the override, so HTTPS-only state is fully reversed.
Smoke-tested with a temporary Caddy override:
- HTTPS via Caddy: 200
- HTTP via Caddy: 301 → https://...
- Direct anthias-server: refused (port mapping dropped by override)
- WebSocket upgrade: 101 Switching Protocols
- request.is_secure() with X-Forwarded-Proto=https: True
- 15 anthias_app view tests still pass.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* docs(views_files): document IP-allowlist threat model
Spell out exactly when the docker-bridge CIDR check is and isn't a
real perimeter:
* No-SSL default: anthias-server is published as 80:8080, so requests
arrive with REMOTE_ADDR set to the docker bridge gateway (172.x) and
LAN clients aren't actually excluded. Trying to plug the gap with
auth would be security theatre — credentials would travel in
plaintext over the LAN anyway.
* SSL via the Caddy sidecar: Caddy terminates TLS, rewrites
X-Forwarded-For, uvicorn honours it (FORWARDED_ALLOW_IPS=*), and the
check sees the real client IP — so the bypass is closed for any
deployment that actually cares about confidentiality.
This is documentation only; no behavioural change.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* feat(ssl): add --domain (auto Let's Encrypt) + drop openssl shim
bin/enable_ssl.sh now has three modes instead of two:
* Default (no args) — Caddy issues per-SNI certs lazily from its
built-in local CA via `tls internal { on_demand }`. Drops the
openssl self-signed-cert generation step entirely; Caddy persists
the CA in the anthias-caddy-data volume and rotates leaf certs
itself. Browsers still warn (CA is local) but no openssl/cert
hygiene is needed on the host.
* `--domain example.com [--email you@example.com] [--staging]` —
Caddy auto-issues + renews from Let's Encrypt. Caddy auto-creates
the HTTP→HTTPS redirect for hostname sites. Use `--staging` to point
at the ACME staging endpoint while testing, so the production rate
limits aren't burned.
* `--cert /path/to/cert.pem --key /path/to/key.pem [--domain ...]` —
unchanged: bring your own cert, Caddy serves it as-is with
`auto_https off`.
Verified:
- All three Caddyfiles pass `caddy validate`.
- Default mode end-to-end: HTTPS=200 with cert from "Caddy Local
Authority - ECC Intermediate", per-SNI SANs (DNS:localhost,
IP Address:192.168.99.99 etc.), HTTP→HTTPS=301, /ws upgrade=101,
anthias-server's external port mapping is dropped so direct access
is refused.
Docs (CLAUDE.md, docs/README.md, docs/developer-documentation.md)
updated to describe the Caddy sidecar instead of in-uvicorn TLS.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* fix: address self-review findings on PR #2757
* Gate SECURE_PROXY_SSL_HEADER on FORWARDED_ALLOW_IPS
(anthias_django/settings.py): without the gate, a client on a
plain-HTTP deploy could send `X-Forwarded-Proto: https` and flip
`request.is_secure()`. Django reads the header from META directly,
independent of uvicorn's --proxy-headers flag, so the previous
unconditional setting was actually exploitable in non-SSL mode
(secure-cookied sessions would drop on the next plain-HTTP request,
redirects would point at https:// URLs that don't exist).
Verified live: non-SSL → SECURE_PROXY_SSL_HEADER is None and
is_secure() with spoofed XFP=https returns False; SSL via Caddy
override → header is set and is_secure() returns True.
* Replace the isfile() pre-check + open() in anthias_assets and
static_with_mime with a try/except FileNotFoundError around open()
(anthias_app/views_files.py). Eliminates a (tiny but real) TOCTOU
window between the stat and the open. IsADirectoryError handled
too, since `realpath('/dir/')` resolves to the directory and open()
would otherwise 500.
* Comment FORWARDED_ALLOW_IPS=* assumption in bin/enable_ssl.sh: the
wildcard is only safe because the override drops anthias-server's
external port mapping, so any future edit that re-adds a host:port
publication has to either tighten the wildcard to Caddy's IP/CIDR
or unset it.
* Replace ANSI-C escape sequences in the Caddyfile generator with
plain multi-line strings. `read -r -d ''` was the first attempt
but it strips trailing newlines, which collapsed `auto_https off`
onto the same line as `}` in cert mode. Multi-line literals with
echo "$VAR" are unambiguous and Caddy validates all three modes
cleanly again.
* Add a docker-volume cleanup hint to bin/disable_ssl.sh: Caddy's
local CA persists in anthias_anthias-caddy-data so an enable →
disable → enable cycle reuses the same CA (intentional — browsers
that trusted it stay trusted), and operators who want a fresh CA
now have the exact `docker volume rm` command in the script's
output.
15 view tests still pass; default + SSL Caddyfiles still validate;
default + SSL endpoints still return 200 / 301 / 101 in smoke tests.
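The SECURE_PROXY_SSL_HEADER gate from the first bullet can be sketched as a settings fragment. This is an illustration under assumptions, not the contents of anthias_django/settings.py: `proxy_ssl_header` is a hypothetical helper, and treating any non-empty `FORWARDED_ALLOW_IPS` value as "a trusted proxy is present" is an assumed reading of the gate.

```python
import os


def proxy_ssl_header(env=None):
    """Trust X-Forwarded-Proto only when a forwarding proxy is configured."""
    env = os.environ if env is None else env
    if env.get("FORWARDED_ALLOW_IPS"):
        # Django's documented tuple form: (META key, value that means HTTPS).
        return ("HTTP_X_FORWARDED_PROTO", "https")
    # Django's default: never infer HTTPS from a client-supplied header.
    return None


SECURE_PROXY_SSL_HEADER = proxy_ssl_header()
```

With the variable unset, a spoofed `X-Forwarded-Proto: https` header has no effect on `request.is_secure()`.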
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
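The TOCTOU fix in the commit above (open first, handle failure afterwards) follows a common pattern, sketched here outside Django. `read_asset` is a hypothetical helper; the real views return HTTP responses rather than bytes, and the directory case is shown as it behaves on POSIX systems.

```python
def read_asset(path):
    """Return the file's bytes, or None if it is missing or a directory."""
    try:
        # No isfile() pre-check: the file cannot vanish between a stat and
        # the open, because open() is the only file-system call made.
        with open(path, "rb") as handle:
            return handle.read()
    except (FileNotFoundError, IsADirectoryError):
        # A view would map both cases to a 404 instead of a 500.
        return None
```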
* fix: address Copilot's host/MIME hardening feedback
Two security tightenings on top of the prior SECURE_PROXY_SSL_HEADER
gate (which Copilot flagged on a stale snapshot — that one's already
fixed in
|
||
|
|
188e3993d0 | Migrate web server back-end from Flask to Django (#2040) | ||
|
|
251308df20 |
Removes obsolete version attribute from non-Balena Docker Compose files (#2015)
* chore: clean up Docker Compose files
* Remove the version field from Compose files, except those used by the Balena fleets.
* Rename volume names
* fix: rename `anthias-data` back to `resin-data`
* fix: hide "Update Available" navbar for Balena instances |
||
|
|
ab8c1927a2 |
Refactors server scripts for development mode (#1952)
* refactor code for development mode
* remove unused imports
* trigger test workflow on `master` push |
||
|
|
8d21d2ba55 | More hostname fixes | ||
|
|
cf6fc75087 | Fix dev env | ||
|
|
64fcdef7dc | Refactors how test containers are built | ||
|
|
0fc7e9e69a | Make development experience less of a hassle and out-of-the-box. | ||
|
|
46a627aec3 |
Modify the compose file for test so that it can be used for development
as well. |
||
|
|
83afe40ff1 |
Bring back test suite runs on GitHub actions (#1616)
- Add a new GitHub workflow for running unit tests.
- Modify existing workflow for building images, so that it will only run if no tests are failing.
- "Comment out" (skip) failing Python unit tests in the meantime. Will be addressed in future PRs. |
||
|
|
56bae468e4 | Closes #1531. Fixes demo site. | ||
|
|
ac4056de52 | More developer mode fixes | ||
|
|
a389d94cbd | Fixes dev build | ||
|
|
f606514daf | Fixes docker-compose syntax | ||
|
|
f6baa1f52c | Adds cache-from directive | ||
|
|
f88a7786fc | Annotate images properly | ||
|
|
e82aad20c6 | Bumps up version for some docker compose envs | ||
|
|
c91a57e943 | Changes local post | ||
|
|
310d75ce27 | Brings developer/demo setup in line with new setup. | ||
|
|
8c6b50b51c | Fixes missing dependency | ||
|
|
f21fb31a34 | Fix: removes ports of redis | ||
|
|
b63c0cee47 | Edits: docker-compose.dev.yml | ||
|
|
35bd1d2510 | Edits: redis broker instead of rabbitmq | ||
|
|
b973bc0bcb | New: default assets | ||
|
|
d1a26a3dc0 | Celery: change redis broker to rabbitmq | ||
|
|
baca4b5d8a | Reworked: docker-compose | ||
|
|
935314c575 | Merge branch 'master' into celery |