mirror of
https://github.com/cassandra/home-information.git
synced 2026-06-11 09:05:00 -04:00
* Implement media thumbnail previews for file attributes
* Replace PyMuPDF with pdf2image for PDF thumbnail generation
* Create utility file for thumbnail generation
* Refactor thumbnail generation logic
* Add backfill command for missing file thumbnails and update docker_entrypoint script
* Added poppler dependency (needed for pdf2image) to dev setup docs.
* Fixed file card grid alignment with uniform thumbnail heights.
* Fix thumbnail backfill command test: filename desync via direct path
The test saved a file at a hardcoded source_path, then created an
EntityAttribute with file_value=source_path. AttributeModel.save() calls
generate_unique_filename() to add a timestamp suffix to file_value.name,
which desyncs the attribute's stored path from the on-disk path —
file_value.size then raises FileNotFoundError inside the backfill command.
Pass the file bytes via ContentFile(..., name='...') instead. The model
now owns the full filename-generation lifecycle and the on-disk and
in-database paths stay in sync.
Applied symmetrically to the dry-run test for consistency, even though
its assertions happened to pass without seeing the underlying mismatch.
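The desync described above can be reproduced without Django. This is a toy simulation only: ``FakeAttribute`` and this ``generate_unique_filename()`` are hypothetical stand-ins written for illustration, not the project's ``AttributeModel`` or its real helper, and the timestamp format is an assumption.

```python
import os
import tempfile
import time
from typing import Optional, Tuple


def generate_unique_filename(name: str) -> str:
    """Hypothetical stand-in: add a timestamp suffix before the extension."""
    stem, ext = os.path.splitext(name)
    return f"{stem}-{int(time.time())}{ext}"


class FakeAttribute:
    """Toy model of an attribute whose save() renames its file value."""

    def __init__(self, media_root: str, name: str,
                 content: Optional[bytes] = None):
        self.media_root = media_root
        self.name = name
        # None simulates "the test already wrote the file at `name`".
        self.content = content

    @property
    def path(self) -> str:
        return os.path.join(self.media_root, self.name)

    @property
    def size(self) -> int:
        return os.path.getsize(self.path)

    def save(self) -> None:
        # The stored name always changes on save.
        self.name = generate_unique_filename(self.name)
        if self.content is not None:
            # The model owns the write, so disk and stored name agree.
            with open(self.path, "wb") as fh:
                fh.write(self.content)


def demo() -> Tuple[bool, int]:
    with tempfile.TemporaryDirectory() as root:
        # Old test shape: file written directly at a hardcoded path.
        with open(os.path.join(root, "manual.pdf"), "wb") as fh:
            fh.write(b"abc")
        broken = FakeAttribute(root, "manual.pdf")
        broken.save()
        try:
            broken.size
            desynced = False
        except FileNotFoundError:
            desynced = True

        # New test shape: hand the bytes to the model and let it name them.
        fixed = FakeAttribute(root, "manual.pdf", content=b"abc")
        fixed.save()
        return desynced, fixed.size
```

In the real test the second shape corresponds to passing ``ContentFile(..., name='...')`` as the file value.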
* Switch thumbnail backfill from Docker entrypoint to render-time lazy gen
The Docker entrypoint's call to ``backfill_attribute_thumbnails`` had
unbounded startup cost for users with many file attributes, paid
overhead on every container restart even when nothing needed
generation, and was a permanent ongoing entry point for what's really
a transient migration need.
Replace with synchronous lazy generation triggered the first time a
file card is rendered for an attribute that lacks a thumbnail. Each
file attribute pays the generation cost once, on first view, spread
across actual usage instead of forced into startup.
- ``package/docker_entrypoint.sh``: drop the backfill invocation.
- ``AttributeModel.ensure_thumbnail()``: model method that generates
if missing, no-op otherwise.
- ``{% ensure_thumbnail attribute %}``: template tag wrapper for
call-from-template ergonomics.
- ``file_card.html``: invoke the tag once at the top, before any
``has_thumbnail`` / ``thumbnail_url`` reads.
The ``backfill_attribute_thumbnails`` management command stays
available for users who want to warm the cache explicitly. Its
counters remain accurate because the new render-time path is
isolated to ``ensure_thumbnail()`` — ``has_thumbnail`` stays a
pure existence check.
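A minimal sketch of the generate-if-missing pattern described above, written as plain Python so it runs without Django. ``FileAttribute``, ``SUPPORTED_MIME_TYPES`` and the injected ``generate`` callable are hypothetical names for illustration; the real ``AttributeModel.ensure_thumbnail()`` may differ in signature and return value.

```python
from dataclasses import dataclass
from typing import Callable, Optional

# Assumed set of supported types, for illustration only.
SUPPORTED_MIME_TYPES = {"application/pdf", "image/png", "image/jpeg"}


@dataclass
class FileAttribute:
    """Toy stand-in for a file-backed attribute (not the real model)."""

    name: str
    mime_type: str
    thumbnail: Optional[bytes] = None

    @property
    def has_thumbnail(self) -> bool:
        # Pure existence check: reading it never triggers generation.
        return self.thumbnail is not None

    def ensure_thumbnail(
        self, generate: Callable[["FileAttribute"], Optional[bytes]]
    ) -> bool:
        """Generate the thumbnail if missing; return True if work was done."""
        if self.has_thumbnail:
            return False  # already present: no-op
        if self.mime_type not in SUPPORTED_MIME_TYPES:
            return False  # unsupported type: nothing to generate
        self.thumbnail = generate(self)
        return self.thumbnail is not None


calls = []


def fake_generate(attr: FileAttribute) -> bytes:
    """Stand-in generator that records each invocation."""
    calls.append(attr.name)
    return b"thumb:" + attr.name.encode()


pdf = FileAttribute("manual.pdf", "application/pdf")
first_view = pdf.ensure_thumbnail(fake_generate)   # pays the cost once
second_view = pdf.ensure_thumbnail(fake_generate)  # no-op on later renders
```

Keeping generation out of ``has_thumbnail`` is what lets a backfill command count "missing" attributes accurately: only the explicit ``ensure_thumbnail()`` call has side effects.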
* Harden PDF thumbnail generation against pathological input
The 20MB pre-render byte cap was the only protection against expensive
thumbnail generation, but PDF rendering cost doesn't correlate well
with file size — a small PDF can produce a multi-GB pixel buffer at
pdf2image's 200-DPI default, and a crafted PDF can hang the underlying
pdftoppm subprocess indefinitely.
Three protections, all on the PDF path:
- ``size=(640, 640)`` passed to ``convert_from_bytes()``: caps the
rasterized output dimensions, keeping pre-resize memory bounded.
640 gives roughly a 2x oversample of the 320x320 thumbnail for
LANCZOS quality without runaway buffers.
- ``timeout=30`` threaded through to the pdftoppm subprocess so
pathological content can't hang generation.
- 10MB per-mime-type source-byte cap for PDFs, separate from the
existing 20MB cap for images.
Constants exposed as class attributes on ``AttributeThumbnail`` so
they're discoverable and tunable in one place.
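The three protections above could be combined roughly as follows. This is a sketch under stated assumptions, not the project's ``AttributeThumbnail`` code: the constant and function names are hypothetical, "MB" is taken to mean 1024 * 1024 bytes, and rendering only the first page is an assumption. The ``size`` and ``timeout`` keyword arguments are real parameters of pdf2image's ``convert_from_bytes()``.

```python
from typing import Optional

# Hypothetical constant names; the values come from the commit message.
PDF_MAX_SOURCE_BYTES = 10 * 1024 * 1024
IMAGE_MAX_SOURCE_BYTES = 20 * 1024 * 1024
PDF_RENDER_SIZE = (640, 640)     # caps the rasterized output dimensions
PDF_RENDER_TIMEOUT_SECS = 30     # passed through to the pdftoppm subprocess
THUMBNAIL_SIZE = (320, 320)


def source_within_cap(mime_type: str, num_bytes: int) -> bool:
    """Apply the per-mime-type source-byte cap before any rendering."""
    if mime_type == "application/pdf":
        return num_bytes <= PDF_MAX_SOURCE_BYTES
    return num_bytes <= IMAGE_MAX_SOURCE_BYTES


def render_pdf_thumbnail(pdf_bytes: bytes) -> Optional[object]:
    """Best-effort render of a PDF to a PIL image, or None on failure."""
    if not source_within_cap("application/pdf", len(pdf_bytes)):
        return None

    # Imported lazily so the size check works without poppler installed.
    from pdf2image import convert_from_bytes
    from pdf2image.exceptions import (
        PDFPageCountError,
        PDFPopplerTimeoutError,
        PDFSyntaxError,
    )
    from PIL import Image

    try:
        pages = convert_from_bytes(
            pdf_bytes,
            first_page=1,               # assumption: first page only
            last_page=1,
            size=PDF_RENDER_SIZE,
            timeout=PDF_RENDER_TIMEOUT_SECS,
        )
    except (PDFPageCountError, PDFPopplerTimeoutError, PDFSyntaxError):
        return None
    if not pages:
        return None

    page = pages[0]
    # Downsample the bounded render to the final thumbnail size in place.
    page.thumbnail(THUMBNAIL_SIZE, Image.LANCZOS)
    return page
```

Checking the byte cap first means an oversized upload is rejected before pdftoppm is ever spawned.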
* Install poppler-utils in CI for pdf2image-dependent tests
pdf2image shells out to pdftoppm (from the poppler-utils package) to
rasterize PDFs. The Docker image already installs poppler-utils, but
the GitHub Actions workflow runner did not, so
test_generate_thumbnail_best_effort_pdf_success passed locally and in
Docker but failed in CI on the ``assertTrue(generated)`` check.
Add an apt-get install step for poppler-utils so CI exercises the
same PDF-rendering surface as production.
* Add tests for AttributeModel.ensure_thumbnail() lazy-generation hook
ensure_thumbnail() wraps AttributeThumbnail.generate_thumbnail_best_effort()
with three branches that aren't transitively covered by the existing
generator tests or by the backfill command tests:
- supported file, thumbnail missing -> generates
- thumbnail already present -> short-circuits before instantiating the
generator (verified by mocking AttributeThumbnail)
- unsupported mime type -> short-circuits at the supports check
Co-located with the existing generate_thumbnail_best_effort_* tests
so a future reader finds the lazy hook's coverage in the same place
as the generator's direct coverage.
---------
Co-authored-by: Anthony Cassandra <github@cassandra.org>
55 lines
1.6 KiB
Docker
# Pin specific Python version for consistency across platforms
FROM python:3.11.8-bookworm

# Install dependencies with curl for healthcheck
RUN apt-get update \
    && apt-get install -y --no-install-recommends \
        curl \
        supervisor \
        nginx \
        redis-server \
        redis-tools \
        poppler-utils \
    && mkdir -p /var/log/supervisor \
    && mkdir -p /etc/supervisor/conf.d \
    && rm -rf /var/lib/apt/lists/* \
    && pip install --upgrade pip

WORKDIR /src

ENV PYTHONDONTWRITEBYTECODE=1
ENV PYTHONUNBUFFERED=1
ENV PYTHONPATH=/src

EXPOSE 8000

VOLUME /data/database /data/media
RUN mkdir -p /data/database && mkdir -p /data/media

HEALTHCHECK --interval=30s --timeout=5s --start-period=10s --retries=3 \
    CMD curl -f http://localhost:8000/health || exit 1

# Assumes base.txt is all that is needed (ignores dev-specific dependencies)
COPY src/hi/requirements/base.txt /src/requirements.txt
RUN pip install --no-cache-dir --root-user-action=ignore -r requirements.txt

COPY package/docker_supervisord.conf /etc/supervisor/conf.d/hi.conf
COPY package/docker_nginx.conf /etc/nginx/sites-available/default

# Clean up nginx default configurations and ensure proper symlinks
RUN rm -f /etc/nginx/conf.d/default.conf \
    && rm -f /etc/nginx/sites-enabled/default \
    && ln -s /etc/nginx/sites-available/default /etc/nginx/sites-enabled/default \
    && nginx -t

COPY package/docker_entrypoint.sh /src/entrypoint.sh
RUN chmod +x /src/entrypoint.sh

COPY HI_VERSION /HI_VERSION
COPY src /src
RUN chmod +x /src/bin/docker-start-gunicorn.sh

ENTRYPOINT ["/src/entrypoint.sh"]

CMD ["/usr/bin/supervisord", "-c", "/etc/supervisor/supervisord.conf" ]