James R. Barlow
2846d46bb8
Remove .coveragerc and fold into setup.cfg
2021-01-06 03:58:18 -08:00
James R. Barlow
0b3a526049
Partial fix crash on 'userunit' None (#700)
...
Our method of getting data from pdfminer would silently consume a StopIteration
if pdfminer returned no processed pages, leading to an odd error message.
We now handle the error from pdfminer properly and return a more
descriptive error of our own.
It would be possible for ocrmypdf to repair the file before sending it to
pdfminer, but this seems to be rare enough that we won't do that yet.
2021-01-01 01:11:32 -08:00
James R. Barlow
bd0f005861
tests: tag tests that need pngquant, jbig2enc
2020-12-30 01:58:57 -08:00
James R. Barlow
72fa347c38
tests: skip metadata test for two pikepdf versions that warn incorrectly
2020-12-29 01:47:52 -08:00
James R. Barlow
babc76fa74
tests: assert that most patched functions are called
...
We were not actually checking whether the functions we patched were called
when expected.
2020-12-28 23:58:33 -08:00
James R. Barlow
81602cf420
Fix test not patching properly after Ghostscript polling change
2020-12-27 16:01:50 -08:00
James R. Barlow
bb258fc99c
pdfinfo: Refactor pageinfo dictionary into a class
2020-12-24 01:47:53 -08:00
James R. Barlow
3675ae918c
Fix certain invalid page ranges causing exception
...
Closes #686
2020-12-22 01:22:14 -08:00
James R. Barlow
f11bb53e61
Change prefix of temporary folders
...
Shouldn't really use a name that suggests a connection to GitHub.
2020-12-07 21:51:46 -08:00
James R. Barlow
3cba50bfbd
windows: look in registry for Tesseract and Ghostscript
2020-12-04 13:21:54 -08:00
James R. Barlow
ce0e0ecd4d
Decouple tqdm from progressbar setup
2020-12-04 13:20:28 -08:00
James R. Barlow
7e1223c12c
ghostscript: add output tracing
2020-11-29 14:53:35 -08:00
James R. Barlow
895fddd85e
Replace most uses of universal_newlines with text
...
The parameters are equivalent, but the latter is better named. Since
Python 3.6 doesn't support text=, we use our wrapper to add it there.
This applies to subprocess.run.
2020-11-07 00:48:08 -08:00
James R. Barlow
3707af3b74
Change pdf.root to pdf.Root
2020-11-03 01:30:31 -08:00
James R. Barlow
b0dcaa7512
v11.3.0 release notes
2020-10-24 03:19:32 -07:00
James R. Barlow
e8285b1d10
Add test to confirm rasterize_pdf_page rotates correctly
2020-10-24 03:10:59 -07:00
James R. Barlow
bfe4a5b329
Tidy a log message
2020-09-25 00:17:57 -07:00
James R. Barlow
e6a7b58863
Merge branch 'de-gpl'
2020-08-12 12:20:38 -07:00
James R. Barlow
9b641055e1
Fix KeyError: 'dpi' when using --threshold on image to PDF
...
Fixes #607
2020-08-07 02:21:02 -07:00
James R. Barlow
bed74501fc
Fix test breakage in validation
...
Broken in commit 4cc0dc
2020-08-05 01:35:26 -07:00
James R. Barlow
aa0ec40102
Change license of all GPLv3 files to MPL-2.0
...
https://github.com/jbarlow83/OCRmyPDF/issues/600
2020-08-05 00:44:42 -07:00
James R. Barlow
7263702de9
Remove gs.py (spoofers entirely removed) and update copyright
2020-07-29 16:31:47 -07:00
James R. Barlow
44149ad319
Disable test_error_trap for Leptonica < 1.79
...
The old error trap seems unreliable in the first place, so it is difficult
to set up a test for it.
2020-07-20 21:12:00 -07:00
James R. Barlow
5cbbff8472
For Leptonica 1.79+ use leptSetStderrHandler
...
Lock-free and considerably less dangerous to stderr messages.
2020-07-19 03:40:33 -07:00
James R. Barlow
86a73191b0
Plugin manager: accept Path(plugin)
2020-06-30 04:17:30 -07:00
James R. Barlow
66337813e6
Spell runslow correctly
2020-06-22 23:32:09 -07:00
James R. Barlow
eb5a211e72
New hocrtransform test isn't platform stable - mark runslow
2020-06-22 16:59:59 -07:00
James R. Barlow
06ab114aa8
Update test cache
2020-06-22 16:31:34 -07:00
James R. Barlow
1257419465
test_hocrtransform: this test is worth not caching
2020-06-22 16:31:06 -07:00
James R. Barlow
30404f53f0
Add test to sanity check our pdf renderers
2020-06-22 16:18:38 -07:00
James R. Barlow
f4cb424451
Support input/output streams at API level
2020-06-22 02:02:18 -07:00
James R. Barlow
fef14778d5
Fix missing f-string in log message
2020-06-22 01:17:16 -07:00
James R. Barlow
48e2750551
Fix some tests that were failing in Docker
2020-06-21 01:48:13 -07:00
James R. Barlow
ebfe4f0d29
Fix issue #582 - PDF/A acquires title "Untitled" after conversion
2020-06-20 02:01:16 -07:00
James R. Barlow
892db88f0e
test_two_languages: use narrower test
2020-06-12 14:33:02 -07:00
James R. Barlow
eeb44f78cc
Fix tests that failed on other platforms from previous fix
2020-06-12 12:59:46 -07:00
James R. Barlow
393c5a9ea4
Fix error on -l lang1+lang2
2020-06-12 12:10:29 -07:00
James R. Barlow
c6b9a49cbb
Fix tests that fail in CI
2020-06-10 17:08:00 -07:00
James R. Barlow
872bafad4b
Reinstate quick test for text/no text
...
Partial revert of commit 991db17
2020-06-10 12:00:52 -07:00
James R. Barlow
64891c2fc3
Pre-release delinting
2020-06-09 15:27:14 -07:00
James R. Barlow
fe156db41d
Merge branch 'release/v10' into trialmerge
2020-06-09 15:12:56 -07:00
James R. Barlow
0f942fb714
Rename ocrmypdf.exec -> ocrmypdf._exec
2020-06-09 14:59:09 -07:00
James R. Barlow
be8ca589d4
Move ocrmypdf.exec.run and friends to ocrmypdf.subprocess
2020-06-09 14:53:10 -07:00
James R. Barlow
3b6f6782f0
Remove tesseract_env, --tesseract-env
2020-06-09 00:39:53 -07:00
James R. Barlow
21c0e045cb
Remove _OCRMYPDF_TEST_PATH environment variable
2020-06-09 00:30:13 -07:00
James R. Barlow
ebbf68bd08
The big payoff: abolishing spoofing machinery
2020-06-09 00:08:20 -07:00
James R. Barlow
2059e916da
Convert all ghostscript spoofs to test plugins
2020-06-09 00:00:25 -07:00
James R. Barlow
7b9025f397
Convert generate_pdfa to plugin
2020-06-08 22:28:38 -07:00
James R. Barlow
b109445215
Move Ghostscript rasterize_pdf to plugin
2020-06-08 17:10:27 -07:00
James R. Barlow
a9a473f2e5
Convert all tesseract cache usages to plugin
2020-06-05 17:55:18 -07:00