James R. Barlow
2846d46bb8
Remove .coveragerc and fold into setup.cfg
2021-01-06 03:58:18 -08:00
James R. Barlow
47ef1914d4
v11.4.4 release notes
v11.4.4
2021-01-01 01:39:24 -08:00
James R. Barlow
df157552f3
Make ocrmypdf.ocr take a threading lock
2021-01-01 01:37:09 -08:00
James R. Barlow
0b3a526049
Partial fix crash on 'userunit' None (#700)
...
Our method of getting data from pdfminer would silently consume a StopIteration
if pdfminer returned no processed pages, leading to an odd error message.
We now handle this error from pdfminer properly and return a more
descriptive error of our own.
It would be possible for ocrmypdf to repair the file before sending it to
pdfminer, but this seems to be rare enough that we won't do that yet.
2021-01-01 01:11:32 -08:00
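The failure mode described in the commit above (a StopIteration being swallowed when no pages come back) can be illustrated with a minimal, self-contained sketch. It does not use pdfminer or reproduce OCRmyPDF's actual code; `first_page`, `first_page_checked`, the `filename` parameter, and the sample `documents` list are all hypothetical names chosen for illustration. It only shows how a stray StopIteration can end an enclosing iteration silently, and how converting it to a descriptive exception avoids that.

```python
def first_page(pages):
    """Return the first page; raises StopIteration if there are none."""
    return next(iter(pages))


def first_page_checked(pages, filename="<input>"):
    """Like first_page, but report the empty case with a descriptive error."""
    try:
        return next(iter(pages))
    except StopIteration:
        # Hypothetical error type and message, for illustration only.
        raise ValueError(f"{filename}: no pages could be processed") from None


# Hypothetical stand-in data: the second "document" yields no pages.
documents = [["a1", "a2"], [], ["c1"]]

# map() treats the StopIteration raised for the empty document as the end
# of its own iteration, so the remaining documents are silently dropped.
silently_truncated = list(map(first_page, documents))

# With the checked variant, the problem surfaces as an explicit error.
try:
    list(map(first_page_checked, documents))
except ValueError as exc:
    message = str(exc)
```

In this sketch the unchecked version stops after the first document without raising anything, which is the kind of quiet misbehavior that tends to produce a confusing error later on.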
James R. Barlow
1e80d412fa
tesseract: fix typing of some optional arguments
2021-01-01 00:46:00 -08:00
James R. Barlow
df6e106203
concurrent: simplify results loop
2021-01-01 00:44:46 -08:00
James R. Barlow
bd0f005861
tests: tag tests that need pngquant, jbig2enc
v11.4.3
2020-12-30 01:58:57 -08:00
James R. Barlow
6ba4b7b3f3
ci: temporarily disable pngquant on Windows
...
Looks like a packaging error; choco complains of bad hashes.
2020-12-30 01:40:56 -08:00
James R. Barlow
2c11349ee8
Merge branch 'master' of github.com:jbarlow83/OCRmyPDF
2020-12-29 21:40:46 -08:00
James R. Barlow
b0afef09ef
v11.4.3 release notes
2020-12-29 21:40:35 -08:00
James R. Barlow
72fa347c38
tests: skip metadata test for two pikepdf versions that warn incorrectly
2020-12-29 01:47:52 -08:00
James R. Barlow
96d68c2413
pipeline: refactor metadata_fixup
2020-12-29 01:47:32 -08:00
James R. Barlow
babc76fa74
tests: assert that most patched functions are called
...
We were not actually checking whether the functions we patched were called
when expected.
2020-12-28 23:58:33 -08:00
Tim Gates
dc06990e5d
docs: fix simple typo, instsalled -> installed (#704)
...
There is a small typo in docs/installation.rst.
Should read `installed` rather than `instsalled`.
2020-12-28 15:28:34 -08:00
James R. Barlow
0ff0d2f8d1
Remove PDF/A overprint debug message
...
Since we currently log all of a process's output at debug level, it's
redundant to log this separate message.
2020-12-27 16:19:05 -08:00
James R. Barlow
81602cf420
Fix test not patching properly after Ghostscript polling change
2020-12-27 16:01:50 -08:00
James R. Barlow
607e2d7e81
v11.4.2 release notes
v11.4.2
2020-12-27 03:29:35 -08:00
James R. Barlow
b01d9e07e8
Deal with missing pthread_sigmask on Cygwin
...
Closes #701
2020-12-27 02:24:00 -08:00
James R. Barlow
91db94cf2e
watcher: fix OCR_LOGLEVEL env var not processed
...
Closes #702
2020-12-27 02:02:44 -08:00
James R. Barlow
416df803d4
pdfinfo: stricter typing
2020-12-24 22:39:00 -08:00
James R. Barlow
037b96ca16
pdfinfo: refactor to eliminate RawPageInfo
2020-12-24 02:57:44 -08:00
James R. Barlow
bb258fc99c
pdfinfo: Refactor pageinfo dictionary into a class
2020-12-24 01:47:53 -08:00
James R. Barlow
4b8ccbe8cb
v11.4.1 release notes
v11.4.1
2020-12-22 01:41:15 -08:00
James R. Barlow
ab1ff3331b
misc: synology fix
...
Accept user-contributed fix. Not testable.
Close #690.
2020-12-22 01:38:41 -08:00
James R. Barlow
3675ae918c
Fix certain invalid page ranges causing exception
...
Closes #686
2020-12-22 01:22:14 -08:00
James R. Barlow
0ba32b96b7
Revert "v11.4.0 release notes - remove change not actually implemented"
...
This reverts commit ad202693b3.
Temporary folder prefix was actually changed in commit f11bb53e.
2020-12-22 00:47:25 -08:00
James R. Barlow
add64e4fa2
docs: com.github.ocrmypdf -> ocrmypdf.io
2020-12-22 00:46:42 -08:00
James R. Barlow
7fe2954ede
Change wheel tag to py36, update package_data to include py.typed
2020-12-12 16:49:04 -08:00
James R. Barlow
ad202693b3
v11.4.0 release notes - remove change not actually implemented
...
Remove a change that was pushed back to a future release.
2020-12-12 16:27:38 -08:00
James R. Barlow
594ef83551
v11.4.0 release notes
v11.4.0
2020-12-11 15:09:49 -08:00
James R. Barlow
78b71618c1
Fix BufferedReader TypeError
2020-12-11 14:19:20 -08:00
James R. Barlow
b8aa89e1ec
Fix log message queue flooding on certain files
...
Fixes #692
2020-12-11 14:14:21 -08:00
James R. Barlow
b4c1f66bc1
typing: tidy up
2020-12-11 14:14:21 -08:00
James R. Barlow
5172dbde8d
subprocess: use more mypy-friendly syntax
2020-12-11 14:14:21 -08:00
James R. Barlow
d2908640c6
pdfa: help mypy figure out a type
2020-12-11 14:14:21 -08:00
James R. Barlow
997bf7578d
hocrtransform: fix exception if no div ocr_page object
2020-12-11 14:14:21 -08:00
James R. Barlow
043258242c
hocrtransform: trivial typing
2020-12-11 14:14:21 -08:00
James R. Barlow
156d5d9a9c
cli: typing
2020-12-11 14:14:21 -08:00
James R. Barlow
0b7e52fb5e
api: parse cmdline in more type friendly way
2020-12-11 14:14:21 -08:00
James R. Barlow
a5feef07d0
Declare ocrmypdf as typed
2020-12-11 14:14:21 -08:00
James R. Barlow
f11bb53e61
Change prefix of temporary folders
...
Shouldn't really use a name that suggests a connection to GitHub.
2020-12-07 21:51:46 -08:00
James R. Barlow
68a57a7839
Add feature to generate hocr-pdf with visible debug text
2020-12-04 17:38:48 -08:00
James R. Barlow
4194430dc1
Begin next release notes
2020-12-04 13:28:04 -08:00
James R. Barlow
a707c56fae
docs: improve windows instructions
2020-12-04 13:21:54 -08:00
James R. Barlow
3cba50bfbd
windows: look in registry for Tesseract and Ghostscript
2020-12-04 13:21:54 -08:00
James R. Barlow
ed5e17d0a4
completions: consider *.PDF and some images too
2020-12-04 13:20:35 -08:00
James R. Barlow
ce0e0ecd4d
Decouple tqdm from progressbar setup
2020-12-04 13:20:28 -08:00
James R. Barlow
7e1223c12c
ghostscript: add output tracing
2020-11-29 14:53:35 -08:00
James R. Barlow
b83d7f6d1a
subprocess: refactor and add run_polling_stderr
2020-11-29 14:36:03 -08:00
James R. Barlow
80e957908a
tesseract: fix run call with logs_errors_to_stdout
2020-11-29 14:25:46 -08:00