Files
Anthias/lib/github.py
Viktor Petersson 93e5501847 ci: enforce strict mypy across the codebase (#2752)
* ci: enforce strict mypy across the codebase

Add a Python mypy job that runs on every PR (`.github/workflows/python-mypy.yaml`) and
the supporting type annotations to make `strict = true` pass cleanly across all 88
source files.

Configuration choices:
* `strict = true` with no global relaxations (no `disallow_subclassing_any = false`,
  no `disallow_untyped_decorators = false`, no `disable_error_code`,
  no `ignore_missing_imports`).
* `follow_imports = "normal"`.
* django-stubs + djangorestframework-stubs plugins; django_stubs_ext.monkeypatch()
  in settings so generic Django classes are subscriptable at runtime.
* Local stubs in `stubs/` for libraries that ship incomplete type info
  (drf_spectacular's view methods, redis-py's sync API).
* A scoped `[[tool.mypy.overrides]]` block lists 7 third-party libs without any
  type info (cec, geventwebsocket, hurry.filesize, pydbus, sh, splinter, vlc) so
  future stub releases will be noticed instead of silently ignored.

The two `# type: ignore` escape hatches that previously existed are gone:
`lib/utils.py` now imports `mplayer` from `sh` properly, and `tests/test_settings.py`
patches `os.getenv` via `mock.patch.object` instead of direct assignment.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* ci: fix mypy CI runtime deps and unused HttpResponse import

* `lib/auth.py`: drop the local `HttpResponse` import; only the string-form
  annotation needs it and the runtime `isinstance` check only uses `HttpRequest`
  (caught by ruff's F401).
* Add a `mypy` dependency group (extends `dev-host` with the runtime deps that
  `anthias_django.settings` touches) so the django-stubs plugin can introspect
  the app registry on a fresh CI runner. Skips the heavy native-extension deps
  (cec, netifaces, etc.) that aren't needed for type checking.
* python-mypy workflow: install via `uv pip install --group mypy` (consistent
  with other workflows) and create a dummy `~/.screenly/screenly.conf` so the
  settings module's first-import side effect succeeds.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix(security): migrate password hashing to PBKDF2 (CodeQL py/weak-sensitive-data-hashing)

CodeQL flagged the SHA256-based password hashing in lib/auth.py and api/views/v2.py
as a weak password hash (SHA256 is a fast hash, unsuitable for password storage).

Switch new passwords to Django's `make_password` (PBKDF2-SHA256, 600k iterations
by default in Django 4.2). Add `hash_password` / `verify_password` helpers in
lib/auth.py that:

* hash with PBKDF2 for any new password write,
* verify both Django-format hashes and legacy bare-SHA256 hex digests so existing
  config files still authenticate,
* opportunistically re-hash legacy SHA256 entries to PBKDF2 on a successful
  login, phasing out the weak format over time.

`settings.py` auto-migration now uses `hash_password` for first-load plaintext
passwords and recognises both legacy SHA256 and Django algorithm-prefixed
formats so it doesn't double-hash an already-stored hash.

Also fixes the stale ruff format check (5 files reformatted by `ruff format`)
that was breaking the python-lint job.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix: drop legacy SHA256 password verify and add docker-image-builder to mypy group

CodeQL kept flagging the SHA256 fallback path in `lib/auth.py:verify_password`
as `py/weak-sensitive-data-hashing`, even though it was scoped to a one-time
migration. Rather than suppress the alert, drop the legacy verify entirely so
no password material ever flows into hashlib.sha256.

Migration of existing installs now happens at settings load time:
* `settings.py` detects a 64-char hex (legacy SHA256) password hash on read
  and clears both the `password` field and `auth_backend`. The device stays
  reachable (no auth required) and the operator must re-set credentials via
  the web UI. A clear warning is logged so the change is visible.

Also fix the python-mypy CI job: include the `docker-image-builder` group in
the `mypy` group so `pygit2` and `python_on_whales` (imported by
`tools/image_builder/__main__.py`) resolve.

BREAKING: any Anthias install with a SHA256-format password in
`screenly.conf` will have basic auth disabled on first start of this version.
The operator must log in (no password required) and set a new password via
the UI to re-enable basic auth.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix: address review feedback (Copilot + manual review)

Critical:
* settings.py — `_get` previously called `hash_password()` at module-import
  time when it found a plaintext password. That imports Django's password
  hashers, which raises `ImproperlyConfigured` if reached before
  `django.setup()` (e.g. when `viewer/__init__.py` imports `settings`).
  Drop the auto-hash entirely and treat plaintext the same as legacy
  SHA256: clear it, disable basic auth, log an `error`-level warning, and
  persist the cleaned state via `_needs_save_after_load` so the warning
  doesn't repeat on every load.
* lib/backup_helper.py — `recover()` called `tar.extractall()` on an
  uploaded archive. A crafted backup with `../` paths or symlinks could
  overwrite arbitrary files under HOME. Add `_safe_extract()` that
  validates each member's resolved path stays inside the destination and
  rejects unsafe symlinks/hardlinks.
* api/helpers.py — `custom_exception_handler()` discarded DRF's default
  response and always returned 500, breaking 4xx propagation for
  `ValidationError`/`NotFound`/etc. Return DRF's response when present;
  fall back to 500 only when it's None.
* tests/test_settings.py — `broken_settings_should_raise_value_error`
  was missing the `test_` prefix and never actually ran. Renamed.

API correctness:
* api/views/mixins.py — replace bare `raise Exception(...)` for missing
  uploads / wrong file extensions / missing asset URI with DRF's
  `ValidationError` / `NotFound`, so callers get proper 4xx with a
  structured error body instead of a 500.
* anthias_app/helpers.py, api/serializers/{mixins,v1_1}.py — replace
  `assert video_duration is not None` with explicit raises. `assert` is
  stripped under `python -O`, which would silently turn the next call
  into an `AttributeError` on None.

Typing/stubs:
* host_agent.py — pass `decode_responses=True` to both `redis.Redis()`
  constructions and switch the channel name + command map keys from
  bytes to str. The local redis stub assumes decoded responses, and
  host_agent was the only caller violating that invariant.
* lib/auth.py — document the `R | HttpResponse` return-type collapse
  for DRF Response (mirrors Django's @login_required).
* api/serializers/__init__.py — comment why `UpdateAssetSerializer`'s
  fields are widened to `Field[Any, Any, Any, Any]` (so v2's overrides
  with different field types don't trip [assignment]).

Tests/CI:
* api/tests/test_v2_endpoints.py — also assert the stored hash starts
  with `pbkdf2_sha256$` so a regression to a weaker hasher is caught
  even if `verify_password()` itself were broken.
* .github/workflows/python-mypy.yaml — write a minimal valid
  `screenly.conf` (with section headers) instead of an empty file. The
  empty-file path only worked by accident of `_get`'s defensive
  try/except.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix(test): force re-import of `settings` in fake_settings()

The `fake_settings()` test helper only deleted `sys.modules['settings']`
*after* the yield (i.e. on clean exit). Once a prior test imported
`settings` cleanly, subsequent `import settings` calls returned the cache
without re-instantiating `AnthiasSettings`, so the fixture's config file
was silently ignored.

This was hidden because `test_broken_settings_should_raise_value_error`
was missing the `test_` prefix and never ran. After renaming it in the
previous commit it now runs and exposes the bug.

Pop the module before import (and again in `finally`) so each test gets
a fresh `AnthiasSettings()` instance bound to the fixture file.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix: address Copilot review feedback

- backup_helper: use dedicated BackupRecoverError instead of bare
  Exception so callers can map archive failures to 4xx responses.
- mixins: catch BackupRecoverError / TarError in RecoverViewMixin and
  re-raise as DRF ValidationError so bad uploads return 400, not 500.
- utils: drop top-level `from sh import ffprobe, mplayer`; resolve
  binaries lazily with sh.Command at call sites so a missing tool
  doesn't break module import.
- host_agent: import redis ConnectionError explicitly (mypy strict).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix(ci): unblock mypy and ruff format; sanitize recover error

- stubs/redis-stubs: export ConnectionError so host_agent's
  retry_if_exception_type(redis.ConnectionError) typechecks under the
  isolated mypy CI env (no real redis package installed).
- host_agent.py: switch to redis.ConnectionError (matches new stub).
- lib/utils.py: ruff format fix for the lazy sh.Command(...) call.
- api/views/mixins.py: don't echo backup-recover exception text into
  the API response (CodeQL py/stack-trace-exposure). Log server-side
  and return a generic 'Invalid backup archive.' message instead.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix(security): generate server-side name for backup recovery upload

Don't pass the client-supplied `file_upload.name` into `path.join('static', ...)`.
A crafted name (path separators, leading `../`, or an absolute path) could
write outside the static directory before the tarball is parsed. Use a
UUID-named `.tar.gz` instead — the original filename is never needed
again after the content-type check.

Addresses suppressed Copilot review comment on api/views/mixins.py.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix: address Copilot review round 2

- lib/utils.py: catch sh.CommandNotFound in get_video_duration (return
  None) and url_fails (skip streaming probe) so missing ffprobe/mplayer
  don't surface as 500s.
- lib/auth.py: decode basic-auth credentials with partition(':') so
  passwords containing ':' are accepted (RFC 7617).
- lib/backup_helper.py: tighten _safe_extract — reject non-regular tar
  members (symlinks, hardlinks, device nodes, FIFOs) and extract members
  individually after validation instead of calling extractall().
- tests/test_backup_helper.py: add coverage for path traversal,
  absolute-path, symlink, and FIFO rejection in _safe_extract.
- settings.py: fail fast in AnthiasSettings.__init__ when HOME is unset
  instead of silently rooting all config/state paths at the cwd.
- api/serializers/v1_1.py: raise ValidationError when 'duration' is
  missing/invalid for non-video assets instead of an unhandled KeyError.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* chore(test): mark tar fixture/extract calls as NOSONAR

The new SafeExtractTest builds and opens single-member tar archives
specifically to test the safe-extract guard. SonarCloud flags any
tarfile.open(...) call (rule python:S5042). Annotate the two test-
fixture call sites with NOSONAR — same pattern used elsewhere in this
repo (e.g. host_agent.py).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* style(ci): use 6-space step indentation in python-mypy.yaml

Match the majority style used by the other workflows in this repo.
The previous 4-space indent under `steps:` was valid YAML and ran
fine in CI, but consistency with build-webview.yaml,
build-balena-disk-image.yaml, deploy-website.yaml, etc. makes the
file easier to read alongside the rest.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix: address Copilot review round 3

- api/serializers/v1_1.py: raise DRF ValidationError instead of
  ValueError when video duration can't be determined, so clients get
  a 4xx with a field-level error rather than a 500.
- api/views/mixins.py: ensure the uploaded backup tarball is removed
  in all paths (success and failure). recover() only deletes on
  success; without explicit cleanup, rejected archives — including
  attacker-controlled ones — accumulate under static/.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* style: ruff format tests/test_backup_helper.py

Single line that fit within 79 chars but I had broken across three
lines during the merge resolution.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix: address Copilot review round 4

* Add partial PEP 561 stubs in `stubs/channels-stubs/` covering the
  `channels` API surface we actually use: `AsyncWebsocketConsumer`
  (with `as_asgi`), `ProtocolTypeRouter`, `URLRouter`,
  `AllowedHostsOriginValidator`, and `get_channel_layer`. Drop the
  `# type: ignore[misc]` from `AssetConsumer` and remove `channels.*`
  from the mypy `ignore_missing_imports` overrides — the override
  block is back to listing only libs that genuinely ship no type info.

* Delete `_safe_extract` and `_is_within_directory` from
  `lib/backup_helper.py`, plus the corresponding `SafeExtractTest` and
  `_build_archive_with` from `tests/test_backup_helper.py`. After the
  master merge, `recover()` validates each member through
  `_safe_tar_member` (skip-on-fail) and extracts inside the loop, so
  the older raise-on-fail helper became dead code with no caller.
  `RecoverLegacyTarballTest::test_recover_skips_path_traversal_member`
  already exercises the path-traversal guard end-to-end.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix: address Copilot review round 5

* `celery_tasks.py:cleanup()` — bail out with a logged error when HOME
  is unset instead of `path.join('', 'anthias_assets')` resolving to a
  relative path. The `find ... -delete` was never going to run on a
  Pi without HOME, but the silent fallback would have chewed through
  the celery worker's cwd if it ever did.

* `api/serializers/v1_1.py:CreateAssetSerializerV1_1.prepare_asset()` —
  fix the video duration handling so non-zero and omitted durations
  are no longer dropped on the floor. Previously only the magic
  `duration == 0` case set `asset['duration']`; a client posting
  `duration=30` for a video produced a serialized dict with no
  duration key, then the view did `Asset.objects.create(**data)` and
  the row got the field's default. Now: missing or 0 means "infer
  from the file via get_video_duration()"; any other integer is
  persisted as-is. Non-int values raise ValidationError up front.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-27 14:53:37 +01:00

244 lines
7.7 KiB
Python

import json
import logging
import os
import random
import string
from requests import exceptions
from requests import get as requests_get
from requests import post as requests_post
from lib.device_helper import parse_cpu_info
from lib.diagnostics import get_git_branch, get_git_hash, get_git_short_hash
from lib.utils import connect_to_redis, is_balena_app, is_ci, is_docker
from settings import settings
r = connect_to_redis()
# Availability and HEAD commit of the remote branch to be checked
# every 24 hours.
REMOTE_BRANCH_STATUS_TTL = 60 * 60 * 24
# Suspend all external requests if we enconter an error other than
# a ConnectionError for 5 minutes.
ERROR_BACKOFF_TTL = 60 * 5
# Availability of the cached Docker Hub hash
DOCKER_HUB_HASH_TTL = 10 * 60
# Google Analytics data
ANALYTICS_MEASURE_ID = 'G-S3VX8HTPK7'
ANALYTICS_API_SECRET = 'G8NcBpRIS9qBsOj3ODK8gw'
DEFAULT_REQUESTS_TIMEOUT = 1 # in seconds
def handle_github_error(
exc: exceptions.RequestException,
action: str,
) -> None:
# After failing, dont retry until backoff timer expires
r.set('github-api-error', action)
r.expire('github-api-error', ERROR_BACKOFF_TTL)
# Print a useful error message
if exc.response:
errdesc = exc.response.content
else:
errdesc = 'no data'
logging.error(
'%s fetching %s from GitHub: %s', type(exc).__name__, action, errdesc
)
def remote_branch_available(branch: str | None) -> bool | None:
if not branch:
logging.error('No branch specified. Exiting.')
return None
# Make sure we havent recently failed before allowing fetch
if r.get('github-api-error') is not None:
logging.warning('GitHub requests suspended due to prior error')
return None
# Check for cached remote branch status
remote_branch_cache = r.get('remote-branch-available')
if remote_branch_cache is not None:
return bool(remote_branch_cache == '1')
try:
resp = requests_get(
'https://api.github.com/repos/screenly/anthias/branches',
headers={
'Accept': 'application/vnd.github.loki-preview+json',
},
timeout=DEFAULT_REQUESTS_TIMEOUT,
)
resp.raise_for_status()
except exceptions.RequestException as exc:
handle_github_error(exc, 'remote branch availability')
return None
found = False
for github_branch in resp.json():
if github_branch['name'] == branch:
found = True
break
# Cache and return the result
if found:
r.set('remote-branch-available', '1')
else:
r.set('remote-branch-available', '0')
r.expire('remote-branch-available', REMOTE_BRANCH_STATUS_TTL)
return found
def fetch_remote_hash() -> tuple[str | None, bool]:
"""
Returns both the hash and if the status was updated
or not.
"""
branch = os.getenv('GIT_BRANCH')
if not branch:
logging.error('Unable to get local Git branch')
return None, False
get_cache = r.get('latest-remote-hash')
if not get_cache:
# Ensure the remote branch is available before trying
# to fetch the HEAD ref.
if not remote_branch_available(branch):
logging.error('Remote Git branch not available')
return None, False
try:
resp = requests_get(
f'https://api.github.com/repos/screenly/anthias/git/refs/heads/{branch}', # noqa: E501
timeout=DEFAULT_REQUESTS_TIMEOUT,
)
resp.raise_for_status()
except exceptions.RequestException as exc:
handle_github_error(exc, 'remote branch HEAD')
return None, False
logging.debug('Got response from GitHub: {}'.format(resp.status_code))
latest_sha = resp.json()['object']['sha']
r.set('latest-remote-hash', latest_sha)
# Cache the result for the REMOTE_BRANCH_STATUS_TTL
r.expire('latest-remote-hash', REMOTE_BRANCH_STATUS_TTL)
return latest_sha, True
return get_cache, False
def get_latest_docker_hub_hash(device_type: str | None) -> str | None:
"""
This function is useful for cases where latest changes pushed does not
trigger Docker image builds.
"""
url = 'https://hub.docker.com/v2/namespaces/screenly/repositories/anthias-server/tags' # noqa: E501
cached_docker_hub_hash = r.get('latest-docker-hub-hash')
if cached_docker_hub_hash:
try:
response = requests_get(url, timeout=DEFAULT_REQUESTS_TIMEOUT)
response.raise_for_status()
except exceptions.RequestException as exc:
logging.debug('Failed to fetch latest Docker Hub tags: %s', exc)
return None
data = response.json()
results = data['results']
reduced = [
result['name'].split('-')[0]
for result in results
if not result['name'].startswith('latest-')
and result['name'].endswith(f'-{device_type}')
]
if len(reduced) == 0:
logging.warning(
'No commit hash found for device type: %s', device_type
)
return None
docker_hub_hash = reduced[0]
r.set('latest-docker-hub-hash', docker_hub_hash)
r.expire('latest-docker-hub-hash', DOCKER_HUB_HASH_TTL)
# Results are sorted by date in descending order,
# so we can just return the first one.
return str(reduced[0])
return str(cached_docker_hub_hash) if cached_docker_hub_hash else None
def is_up_to_date() -> bool:
"""
Primitive update check. Checks local hash against GitHub hash for branch.
Returns True if the player is up to date.
"""
latest_sha, retrieved_update = fetch_remote_hash()
git_branch = get_git_branch()
git_hash = get_git_hash()
git_short_hash = get_git_short_hash()
get_device_id = r.get('device_id')
if not latest_sha:
logging.error('Unable to get latest version from GitHub')
return True
if not get_device_id:
device_id = ''.join(
random.choice(string.ascii_lowercase + string.digits)
for _ in range(15)
)
r.set('device_id', device_id)
else:
device_id = get_device_id
if retrieved_update:
if not settings['analytics_opt_out'] and not is_ci():
ga_base_url = 'https://www.google-analytics.com/mp/collect'
ga_query_params = f'measurement_id={ANALYTICS_MEASURE_ID}&api_secret={ANALYTICS_API_SECRET}' # noqa: E501
ga_url = f'{ga_base_url}?{ga_query_params}'
payload = {
'client_id': device_id,
'events': [
{
'name': 'version',
'params': {
'Branch': str(git_branch),
'Hash': str(git_short_hash),
'NOOBS': os.path.isfile('/boot/os_config.json'),
'Balena': is_balena_app(),
'Docker': is_docker(),
'Pi_Version': parse_cpu_info().get(
'model', 'Unknown'
),
},
}
],
}
headers = {'content-type': 'application/json'}
try:
requests_post(
ga_url, data=json.dumps(payload), headers=headers
)
except exceptions.ConnectionError:
pass
device_type = os.getenv('DEVICE_TYPE')
latest_docker_hub_hash = get_latest_docker_hub_hash(device_type)
return (latest_sha == git_hash) or (
latest_docker_hub_hash == git_short_hash
)