OCRmyPDF/src/ocrmypdf
James R. Barlow c993857752 Fix Form XObject cycle detection in image xref scan (#1321)
The 2024 guard against runaway recursion in _find_image_xrefs_container
only deduplicated image xrefs, but Form XObject xrefs are never added to
include_xrefs/exclude_xrefs, so a self-referential or DAG-shaped Form
graph re-entered every branch until the depth limit fired -- producing
the reported flood of warnings (and minutes-long hangs) on PowerPoint
exports.
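
To make the failure mode concrete, here is a minimal sketch on a toy graph. It is not the OCRmyPDF code: the dict-of-lists model, the function names and the depth limit of 10 are all assumptions chosen for illustration. It counts how often a Form is entered when nothing deduplicates Forms and only a depth limit stops the walk.

```python
from typing import Dict, List

# Hypothetical toy model: each key is a Form XObject, each value lists the
# Forms it references. This is not OCRmyPDF's data structure.
FormGraph = Dict[str, List[str]]


def count_entries_without_memo(
    graph: FormGraph, node: str, depth: int = 0, max_depth: int = 10
) -> int:
    """Count how many times any Form is entered when Forms are not deduplicated."""
    if depth > max_depth:
        return 0  # the depth limit is the only thing that stops the walk
    entries = 1
    for child in graph.get(node, []):
        entries += count_entries_without_memo(graph, child, depth + 1, max_depth)
    return entries


def layered_dag(levels: int) -> FormGraph:
    """Build a DAG in which both Forms on a level reference both Forms below."""
    graph: FormGraph = {"root": ["a0", "b0"]}
    for i in range(levels):
        below = [f"a{i + 1}", f"b{i + 1}"] if i + 1 < levels else []
        graph[f"a{i}"] = list(below)
        graph[f"b{i}"] = list(below)
    return graph


dag = layered_dag(8)  # 17 distinct Forms and no cycle at all
naive_entries = count_entries_without_memo(dag, "root")

selfref: FormGraph = {"loop": ["loop"]}  # a Form that references itself
selfref_entries = count_entries_without_memo(selfref, "loop")

print(len(dag), naive_entries, selfref_entries)
```

In this toy model the number of entries doubles with each level of the DAG even though no cycle exists, and the self-referential Form is re-entered until the limit fires.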

Thread a visited_forms set through the recursion so each Form XObject is
descended into at most once per document. With memoization in place the
depth limit is no longer a cycle defense, so demote its log to debug.
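
The idea of threading a visited set through the recursion can be sketched as follows. This is an illustration under assumptions, not the patched function: walk_forms and the dict-of-lists graph are hypothetical, and only the name visited_forms is taken from the message above.

```python
from typing import Dict, List, Optional, Set

# Hypothetical stand-in for Form XObject references, keyed by Form name.
FormGraph = Dict[str, List[str]]


def walk_forms(
    graph: FormGraph, node: str, visited_forms: Optional[Set[str]] = None
) -> List[str]:
    """Return Forms in the order they are first entered, each at most once."""
    if visited_forms is None:
        visited_forms = set()
    if node in visited_forms:
        return []  # already descended into this Form, so skip it
    visited_forms.add(node)
    entered = [node]
    for child in graph.get(node, []):
        # The same set is passed down, so siblings and ancestors share it.
        entered.extend(walk_forms(graph, child, visited_forms))
    return entered


# A graph with a two-Form cycle (A <-> B) and a self-reference (C -> C).
cyclic: FormGraph = {"A": ["B", "C"], "B": ["A", "C"], "C": ["C"]}
result = walk_forms(cyclic, "A")

# Reusing one set across several top-level calls extends the guarantee
# from "once per call" to "once per document".
shared: Set[str] = set()
first = walk_forms(cyclic, "A", shared)
second = walk_forms(cyclic, "B", shared)

print(result, first, second)
```

Because every Form is entered at most once, the walk terminates on cyclic input without relying on a depth limit.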

Add a regression test that synthesises a circular-Form PDF from the
existing formxobject.pdf fixture (no new binary fixture, no license
issues) and asserts zero "Recursion depth exceeded" warnings.
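
The shape of such a test can be sketched with the standard library alone. Everything here is assumed for illustration: the logger name, the scan function and the toy graph stand in for the real scan and the PDF fixture, and the actual test may capture logs differently. The sketch derives a circular graph from an acyclic one, then asserts that no warning-level "Recursion depth exceeded" record is emitted.

```python
import logging
from typing import Dict, List, Set

log = logging.getLogger("xref_scan_sketch")  # hypothetical logger name


class ListHandler(logging.Handler):
    """Collect log records in memory so a test can inspect them."""

    def __init__(self) -> None:
        super().__init__()
        self.records: List[logging.LogRecord] = []

    def emit(self, record: logging.LogRecord) -> None:
        self.records.append(record)


def scan(graph: Dict[str, List[str]], node: str, visited_forms: Set[str],
         depth: int = 0, max_depth: int = 10) -> None:
    """Toy scan: memoized descent, with the depth limit logged at debug level."""
    if depth > max_depth:
        log.debug("Recursion depth exceeded at %s", node)
        return
    if node in visited_forms:
        return
    visited_forms.add(node)
    for child in graph.get(node, []):
        scan(graph, child, visited_forms, depth + 1, max_depth)


def test_circular_forms_emit_no_warning() -> Set[str]:
    # Start from an acyclic "fixture" and add a back-edge to make it circular.
    acyclic = {"page": ["Fm0"], "Fm0": ["Fm1"], "Fm1": []}
    circular = {name: list(refs) for name, refs in acyclic.items()}
    circular["Fm1"].append("Fm0")

    handler = ListHandler()
    log.addHandler(handler)
    log.setLevel(logging.DEBUG)
    try:
        visited: Set[str] = set()
        scan(circular, "page", visited)
    finally:
        log.removeHandler(handler)

    warnings = [
        record for record in handler.records
        if record.levelno >= logging.WARNING
        and "Recursion depth exceeded" in record.getMessage()
    ]
    assert warnings == []
    assert visited == {"page", "Fm0", "Fm1"}
    return visited


test_circular_forms_emit_no_warning()
```

Deriving the circular input from an existing acyclic one mirrors the approach described above of reusing a fixture rather than adding a new binary file.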
2026-04-25 00:48:25 -07:00