OCRmyPDF

mirror of https://github.com/ocrmypdf/OCRmyPDF.git synced 2026-04-18 05:00:01 -04:00

Files

James R. Barlow c85c8941d3 Fix pdftotext word spacing by emitting single BT block per line

poppler's pdftotext does not carry Tz (horizontal scaling) across
BT/ET boundaries, so words emitted in separate BT blocks are
extracted on separate lines.
Replace per-word BT blocks (via fpdf2's cell/set_stretching API)
with a single BT block per line using raw PDF operators. Each
non-last word gets a trailing space with Tz calculated to span
exactly to the next word's start position.
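
The approach described in this commit message can be sketched roughly as follows. This is a minimal illustration, not OCRmyPDF's actual code: the `Word` class, `natural_width`, `escape_pdf_string`, and `line_to_bt_block` are hypothetical names, and the width calculation assumes a Courier-like monospaced font in which every glyph advances 0.6 em. It also assumes that the Tz value applies to the word together with its trailing space, so that the pair ends exactly where the next word starts. Only the PDF operators themselves (`BT`, `Tf`, `Td`, `Tz`, `Tj`, `ET`) are taken from the PDF specification.

```python
from dataclasses import dataclass


@dataclass
class Word:
    text: str
    x0: float  # left edge of the word box, in points
    x1: float  # right edge of the word box, in points


def natural_width(text: str, font_size: float) -> float:
    # Assumption: monospaced font with a 600/1000 em advance per glyph.
    return 0.6 * font_size * len(text)


def escape_pdf_string(text: str) -> str:
    # Backslash and parentheses must be escaped inside a PDF literal string.
    return text.replace("\\", "\\\\").replace("(", "\\(").replace(")", "\\)")


def line_to_bt_block(words, baseline_y, font_size, font_name="F1"):
    """Emit one BT ... ET block for a whole line of words (hypothetical helper)."""
    words = [w for w in words if w.text]  # skip empty words to avoid zero widths
    if not words:
        return ""
    ops = [
        "BT",
        f"/{font_name} {font_size:g} Tf",
        f"{words[0].x0:.2f} {baseline_y:.2f} Td",
    ]
    for i, word in enumerate(words):
        if i < len(words) - 1:
            # Non-last word: append a space and stretch "word " so that the
            # text position lands on the next word's start.
            text = word.text + " "
            target = words[i + 1].x0 - word.x0
        else:
            # Last word: stretch the word alone to fill its own box.
            text = word.text
            target = word.x1 - word.x0
        tz = 100.0 * target / natural_width(text, font_size)
        ops.append(f"{tz:.2f} Tz")  # horizontal scaling, in percent
        ops.append(f"({escape_pdf_string(text)}) Tj")
    ops.append("ET")
    return "\n".join(ops)


line = [Word("Hello", 72.0, 108.0), Word("world", 120.0, 156.0)]
block = line_to_bt_block(line, baseline_y=700.0, font_size=10)
print(block)
```

Because every word is shown inside the same text object, the text position advances continuously from one `Tj` to the next and no Tz value has to survive an `ET`. A real implementation would take glyph widths from the embedded font's metrics rather than a fixed 0.6 em.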

2026-02-11 00:42:10 -08:00

images

…

advanced.md

…

api.md

…

apiref.md

…

batch.md

…

cloud.md

…

conf.py

…

contributing.md

…

cookbook.md

…

design_notes.md

…

docker.md

…

errors.md

…

index.md

…

installation.md

…

introduction.md

…

jbig2.md

…

languages.md

…

maintainers.md

…

optimizer.md

…

pdfsecurity.md

…

performance.md

…

plugins.md

…

release_notes.md

Fix pdftotext word spacing by emitting single BT block per line

2026-02-11 00:42:10 -08:00