James R. Barlow
731e6792c7
Add test cases for Ghostscript PDF/A warnings
2016-12-03 00:32:09 -08:00
James R. Barlow
a4f07756a5
tesseract caching: don't transcode tesseract's output, hash source file
...
For sanity's sake, deal with tesseract streams in binary without
transcoding (via universal_newlines, etc.). The only differences are
printing messages regarding spoofing.
Also hash the source file so that changes to the cache mechanism
invalidate the old cache automatically. That is probably too aggressive,
but it is simple, and safer than the previous approach.
2016-10-28 16:44:12 -07:00
James R. Barlow
cc7e328358
Improve some documentation for tests
2016-08-26 15:04:08 -07:00
James R. Barlow
322085933b
unpaper: fix check for missing and old versions, add test case
2016-03-10 15:37:09 -08:00
James R. Barlow
8246cc0538
Gracefully recover from tesseract's failure to process very large images
...
Also add test cases to check this
2016-02-20 04:53:23 -08:00
James R. Barlow
ac71c3be63
4.0.2rc1 - release notes, add missing file caught by Travis
2016-02-20 03:36:37 -08:00
James R. Barlow
b907234d5c
Update tesseract spoofing to cache orientation and script detection checks
...
No cache: 269 s
With cache: 144 s
test_oversample[tesseract] now fails; all other tests pass
2016-02-08 02:21:56 -08:00
James R. Barlow
43b0faa830
Bug in tesseract_noop spoof: produced wrong page sizes
...
Now checks input image to ensure the implied page size of its .hocr file
matches the rest of the PDF.
2016-02-04 18:48:22 -08:00
James R. Barlow
3b53e9adac
Use tesseract cache for -psm
2016-01-11 17:22:50 -08:00
James R. Barlow
09782242c8
Adjust test cases to use cache and noop more effectively
...
This reduces total execution time to 164s on my machine, down from
about double that.
2015-12-17 14:00:17 -08:00
James R. Barlow
9ec4aa039d
Add tesseract caching to speed up tests
2015-12-17 12:52:12 -08:00
James R. Barlow
7313a77c2a
Implement pdf renderer side of tess spoof
2015-12-17 11:41:54 -08:00
James R. Barlow
45113676a3
Add Tesseract spoofing
2015-12-17 11:36:47 -08:00