OCRmyPDF

mirror of https://github.com/ocrmypdf/OCRmyPDF.git synced 2026-05-05 21:27:37 -04:00

Files

James R. Barlow a4f07756a5 tesseract caching: don't transcode tesseract's output, hash source file

For sanity's sake, deal with tesseract streams in binary without
transcoding (via universal_newlines, etc.). The only differences are
printing messages regarding spoofing.

Also hash the source file so that changes to the cache mechanism
invalidate the old cache automatically. That is probably too aggressive,
but it is simple and safer than the previous approach.

2016-10-28 16:44:12 -07:00
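
The two ideas in this commit message can be sketched roughly as follows. This is not the code in tesseract_cache.py. It is a minimal illustration that assumes "source file" means the caching script's own source. The names `cache_key` and `run_binary` are hypothetical.

```python
import hashlib
import subprocess
from pathlib import Path


def cache_key(script_path, args):
    """Hypothetical cache key: hash the caching script's own source plus the
    command-line arguments, so any edit to the script yields new keys and
    old cache entries are never matched again."""
    h = hashlib.sha256()
    h.update(Path(script_path).read_bytes())
    for arg in args:
        h.update(b"\0")  # separator so ["ab", "c"] and ["a", "bc"] differ
        h.update(arg.encode("utf-8"))
    return h.hexdigest()


def run_binary(cmd):
    """Run a command and return (returncode, stdout, stderr) as raw bytes.
    No universal_newlines/text mode is requested, so nothing is decoded
    or newline-translated on the way through."""
    proc = subprocess.run(cmd, stdout=subprocess.PIPE, stderr=subprocess.PIPE)
    return proc.returncode, proc.stdout, proc.stderr
```

Hashing the whole script is coarse, since even a comment change discards every cached result, which matches the "probably too aggressive" trade-off the message accepts in exchange for simplicity.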

qpdf_dummy_return2.py

Add Tesseract spoofing

2015-12-17 11:36:47 -08:00

tesseract_big_image_error.py

Improve some documentation for tests

2016-08-26 15:04:08 -07:00

tesseract_cache.py

tesseract caching: don't transcode tesseract's output, hash source file

2016-10-28 16:44:12 -07:00

tesseract_crash.py

Improve some documentation for tests