Python 3.11 is now the minimum supported version. This aligns with
the codebase's use of StrEnum (introduced in 3.11) and removes
compatibility shims that were only needed for older versions.
The pypdfium rasterizer was producing output images that differed by 1
pixel compared to Ghostscript due to floating-point precision issues in
dimension calculations.
Root cause:
- pypdfium used the harmonic mean of the x/y DPI to calculate a single
scale factor, losing the distinction between x and y DPI
- DPI was not rounded, whereas Ghostscript uses 6-decimal precision
- Rounding errors compounded when converting points to pixels
Solution:
1. Round DPI to 6 decimals to match Ghostscript's precision
2. Calculate expected output dimensions using separate x/y DPI values
3. Handle dimension swapping for 90°/270° rotations
4. Resize output image if off by 1-2 pixels (graceful correction)
This makes the output dimensions match Ghostscript's exactly while
staying minimally invasive: the image is resized only when necessary.
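The four steps can be sketched roughly as follows. This is an illustration under stated assumptions, not the code in _render_page_to_bitmap(): the function names are hypothetical, rounding to the nearest pixel is assumed, and the x/y DPI values are assumed to refer to the unrotated page axes.

```python
from __future__ import annotations

POINTS_PER_INCH = 72.0


def expected_pixel_size(
    width_pt: float,
    height_pt: float,
    xdpi: float,
    ydpi: float,
    rotation: int = 0,
) -> tuple[int, int]:
    """Return the expected (width, height) in pixels for a rendered page."""
    # Step 1: round DPI to 6 decimals, the precision attributed to Ghostscript.
    xdpi = round(xdpi, 6)
    ydpi = round(ydpi, 6)
    # Step 2: keep x and y separate instead of collapsing them to one scale.
    # Rounding to the nearest pixel is an assumption of this sketch.
    width_px = round(width_pt * xdpi / POINTS_PER_INCH)
    height_px = round(height_pt * ydpi / POINTS_PER_INCH)
    # Step 3: a 90 or 270 degree rotation swaps the output dimensions.
    if rotation % 360 in (90, 270):
        width_px, height_px = height_px, width_px
    return width_px, height_px


def needs_resize(
    actual: tuple[int, int],
    expected: tuple[int, int],
    tolerance: int = 2,
) -> bool:
    """Step 4: correct only small discrepancies (1 to `tolerance` pixels)."""
    dw = abs(actual[0] - expected[0])
    dh = abs(actual[1] - expected[1])
    return (dw, dh) != (0, 0) and max(dw, dh) <= tolerance


# US Letter (612 x 792 pt) at 300 DPI.
letter = expected_pixel_size(612, 792, 300, 300)               # (2550, 3300)
letter_rotated = expected_pixel_size(612, 792, 300, 300, 90)   # (3300, 2550)
```

Larger mismatches are deliberately left alone in this sketch, since they would point to a real rendering problem rather than a rounding artifact.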
Changes:
- Modified _render_page_to_bitmap() to calculate expected dimensions
- Modified _process_image_for_output() to correct small discrepancies
- Updated rasterize_pdf_page() to pass dimensions through pipeline
- Parametrized rotation tests to run with both rasterizers
All 45 rotation tests now pass with both pypdfium and Ghostscript.
Fixes test_rotated_skew_timeout with pypdfium rasterizer.
Migrate watcher.py and pdf_text_diff.py from typer to cyclopts for
CLI argument parsing. Update pyproject.toml to reflect the dependency
change in the watcher optional feature.
Establish clear separation between user-facing optional dependencies
and developer-only dependency groups:
**Optional Dependencies (user features):**
- watcher: File watching service for batch processing
- webservice: Streamlit-based web UI
- Installable via: uv sync --extra <name> or pip install ocrmypdf[<name>]
**Dependency Groups (developer tools):**
- test: Testing infrastructure (merged from test + extended_test)
- docs: Documentation building tools
- streamlit-dev: Enhanced Streamlit development tools
- dev: General development tools (mypy, ipykernel)
- Installable via: uv sync --group <name> (uv only, NOT pip)
Breaking changes for developers:
- pip install -e .[test] no longer works → use uv sync --group test
- pip install -e .[docs] no longer works → use uv sync --group docs
- pip install -e .[extended_test] removed → merged into test group
No breaking changes for end users:
- pip install ocrmypdf[watcher] still works
- pip install ocrmypdf[webservice] still works
Updated:
- CI/CD workflows to use uv sync --group test
- Docker images to exclude test dependencies
- Documentation to recommend uv with pip as fallback
- pyproject.toml with clear comments explaining both systems
- Update pipeline to use fpdf2 renderer as default
- Remove legacy hocrtransform PDF renderer (_font.py, _hocr.py,
pdf_renderer.py)
- Update CLI and options for fpdf2 renderer
- Add fpdf2 dependency to pyproject.toml
- Update graft module for fpdf2 multi-page rendering
In 0.12.1, Typer's packaging was significantly reorganized:
- `typer-slim` is the library (for `import typer`)
- `typer-slim[standard]` adds optional dependencies (currently `rich`
and `shellingham`, basically equivalent to the old `typer[all]`)
- `typer-cli` is the `typer` command-line tool
- `typer` is now basically a metapackage that brings in *all of the
above*, and it no longer has an `all` extra
Pip will warn about this and proceed,
```
WARNING: typer 0.12.1 does not provide the extra 'all'
```
but there are other tools that will fail hard when asked to resolve a
(now) nonexistent extra.
Since this project doesn’t need the `typer` command-line tool, it looks
like changing the dependency to `typer-slim[standard]` is the best way
forward.
See https://typer.tiangolo.com/release-notes/#0121 and
tiangolo/typer#785 for further discussion and details.
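For anyone who wants to check which extras an installed distribution actually declares, a small sketch using the standard-library importlib.metadata follows. The helper names are hypothetical, and extra names may appear in normalized form in the metadata, so treat the result as a hint rather than a resolver-grade answer.

```python
from __future__ import annotations

from importlib.metadata import PackageNotFoundError, metadata


def declared_extras(dist_name: str) -> set[str] | None:
    """Return the extras an installed distribution declares, or None if absent."""
    try:
        meta = metadata(dist_name)
    except PackageNotFoundError:
        return None
    # Each declared extra is recorded as a Provides-Extra metadata field;
    # get_all() returns None when the field is missing entirely.
    return set(meta.get_all("Provides-Extra") or [])


def has_extra(dist_name: str, extra: str) -> bool | None:
    """True/False if the distribution is installed, None if it is not."""
    extras = declared_extras(dist_name)
    return None if extras is None else extra in extras


# Example call; the result depends on what is installed in the environment.
typer_has_all = has_extra("typer", "all")
```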