When Ghostscript reports a DeviceN colorspace with an inappropriate
alternate, the resulting PDF/A may render blank in viewers such as Adobe
Reader (#1187). The error is gated on that Ghostscript warning, which is
the authoritative signal that the *output* is broken.
Previously the error message always told the user to "use
--color-conversion-strategy", which is confusing when they already set
one and it didn't help. Crucially, the warning persists for strategies
that don't actually normalize the colorspace -- notably
UseDeviceIndependentColor (confirmed in #1187) -- so silencing the error
for any non-default strategy would emit a silently-broken PDF/A.
Keep raising whenever Ghostscript still reports the warning, regardless
of strategy, but tailor the guidance: if no conversion was requested,
suggest RGB/CMYK/Gray; if a conversion was requested but the warning
persisted, say so and point at strategies that work or --output-type pdf.
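A minimal sketch of the guidance selection described above. The function name, the use of None to mean "no conversion requested", and the message wording are all assumptions for illustration, not the actual implementation.

```python
from typing import Optional

# Assumption: the strategies that normalize the colorspace, per the text above.
NORMALIZING_STRATEGIES = ("RGB", "CMYK", "Gray")


def devicen_guidance(warning_seen: bool, strategy: Optional[str]) -> Optional[str]:
    """Return guidance text if Ghostscript reported the warning, else None.

    `strategy` is None when the user requested no color conversion
    (hypothetical convention for this sketch).
    """
    if not warning_seen:
        # Happy path: no warning from Ghostscript, so no error is raised.
        return None
    working = ", ".join(NORMALIZING_STRATEGIES)
    if strategy is None:
        # No conversion requested: suggest strategies known to help.
        return f"Try --color-conversion-strategy with one of: {working}."
    # A conversion was requested but the warning persisted anyway.
    return (
        f"--color-conversion-strategy {strategy} was set, but Ghostscript "
        f"still reported the warning. Try one of: {working}, "
        "or use --output-type pdf."
    )
```

The key design point is that the decision to raise depends only on the warning, while the strategy only changes the wording.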
Add unit tests (mocked Ghostscript) covering the default case, the
warning-persists-despite-strategy case for both an ineffective strategy
and a normally-effective one, and the no-warning happy path.
Expose Ghostscript's -dJPEGQ and image downsampling switches as
advanced, plugin-scoped options for tuning PDF/A output, without
polluting the central OcrOptions registry. The optimizer's existing
--jpeg-quality remains the recommended JPEG quality control.
- GhostscriptOptions gains jpeg_quality and jpeg_maxdpi fields and CLI
args (advanced help text). jpeg_quality=0 is honored as Ghostscript's
maximum compression rather than being silently coerced to the default.
- _exec.ghostscript.generate_pdfa() forwards both values; when
jpeg_maxdpi is set, the downsample threshold is pinned at 1.0.
- _get_plugin_options falls back to extra_attrs for namespaced fields
so plugins can own their options without registering them centrally.
- Documentation explains the rationale: Ghostscript is the legacy path
(pypdfium + verapdf is preferred in v17+), the optimizer is the
supported file-size lever, and lowering quality is almost always a
better trade than downsampling.
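A rough sketch of how the two values might be turned into Ghostscript switches. The helper name is hypothetical, and the exact set of downsampling switches emitted by generate_pdfa() is an assumption; only -dJPEGQ and the 1.0 threshold come from the description above.

```python
from typing import List, Optional


def jpeg_args(
    jpeg_quality: Optional[int] = None,
    jpeg_maxdpi: Optional[int] = None,
) -> List[str]:
    """Build extra Ghostscript arguments (hypothetical helper)."""
    args: List[str] = []
    # Compare against None rather than truthiness so that 0 is honored
    # instead of being treated as "unset".
    if jpeg_quality is not None:
        args.append(f"-dJPEGQ={jpeg_quality}")
    if jpeg_maxdpi is not None:
        # Assumption: color and gray images are both downsampled.
        for kind in ("Color", "Gray"):
            args += [
                f"-dDownsample{kind}Images=true",
                f"-d{kind}ImageResolution={jpeg_maxdpi}",
                # Threshold pinned at 1.0, as stated above.
                f"-d{kind}ImageDownsampleThreshold=1.0",
            ]
    return args
```

With a threshold of 1.0, any image above the requested resolution becomes a downsampling candidate, which is presumably why the value is pinned rather than exposed.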
Ghostscript may fail when asked to rasterize at very low DPI values
(below 10 on either axis). This adds a workaround that uses a minimum
of 10 DPI for the Ghostscript call, then resizes the output image to
match the dimensions that would have resulted from the original low
DPI request.
Fixes #1612
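The low-DPI workaround above can be sketched as a small planning step. The function name and the page-size-in-inches inputs are assumptions for illustration; the actual resize would be done afterwards with an imaging library.

```python
from typing import Optional, Tuple

MIN_DPI = 10.0  # lowest DPI passed to Ghostscript, per the text above


def plan_rasterize(
    page_w_in: float, page_h_in: float, xdpi: float, ydpi: float
) -> Tuple[Tuple[float, float], Optional[Tuple[int, int]]]:
    """Return the DPI to pass to Ghostscript and an optional resize target.

    The resize target is the pixel size the original low-DPI request
    would have produced, or None when no clamping was needed.
    """
    gs_dpi = (max(xdpi, MIN_DPI), max(ydpi, MIN_DPI))
    if gs_dpi == (xdpi, ydpi):
        return gs_dpi, None
    # Clamp to at least 1 pixel so the resized image is never empty.
    target = (
        max(1, round(page_w_in * xdpi)),
        max(1, round(page_h_in * ydpi)),
    )
    return gs_dpi, target
```

For example, a 10 x 20 inch page requested at 4 DPI would be rasterized at 10 DPI and then resized to 40 x 80 pixels.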
Ghostscript 10.6 has a bug that truncates JPEG data by 1-15 bytes.
This adds detection and repair by comparing output images to input
images and restoring the original bytes when truncation is detected.
- Add a warning when Ghostscript 10.6+ is used with PDF/A output
- Add _repair_gs106_jpeg_corruption() to fix damaged JPEGs after
Ghostscript processing
- Add unit tests for the repair function
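A simplified sketch of the detect-and-restore idea. It assumes the damage is a pure tail truncation, so the output is a byte-identical prefix of the input; the real _repair_gs106_jpeg_corruption() may compare images differently. The function name here is hypothetical.

```python
def repair_truncated_jpeg(
    original: bytes, output: bytes, max_missing: int = 15
) -> bytes:
    """Return the original JPEG bytes if `output` looks truncated.

    Truncation is detected when `output` is a strict prefix of
    `original` that is short by 1 to `max_missing` bytes.
    """
    missing = len(original) - len(output)
    if 1 <= missing <= max_missing and original.startswith(output):
        # Output lost a few trailing bytes: restore the input data.
        return original
    # Not the known truncation pattern: leave the output untouched.
    return output
```

Restricting the repair to the 1-15 byte window keeps it from overwriting images that Ghostscript legitimately re-encoded.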