James R. Barlow
997bf7578d
hocrtransform: fix exception if no div ocr_page object
2020-12-11 14:14:21 -08:00
James R. Barlow
043258242c
hocrtransform: trivial typing
2020-12-11 14:14:21 -08:00
James R. Barlow
156d5d9a9c
cli: typing
2020-12-11 14:14:21 -08:00
James R. Barlow
0b7e52fb5e
api: parse cmdline in more type friendly way
2020-12-11 14:14:21 -08:00
James R. Barlow
a5feef07d0
Declare ocrmypdf as typed
2020-12-11 14:14:21 -08:00
James R. Barlow
f11bb53e61
Change prefix of temporary folders
Shouldn't really use a name that suggests a connection to GitHub.
2020-12-07 21:51:46 -08:00
James R. Barlow
68a57a7839
Add feature to generate hocr-pdf with visible debug text
2020-12-04 17:38:48 -08:00
James R. Barlow
4194430dc1
Begin next release notes
2020-12-04 13:28:04 -08:00
James R. Barlow
a707c56fae
docs: improve windows instructions
2020-12-04 13:21:54 -08:00
James R. Barlow
3cba50bfbd
windows: look in registry for Tesseract and Ghostscript
2020-12-04 13:21:54 -08:00
James R. Barlow
ed5e17d0a4
completions: consider *.PDF and some images too
2020-12-04 13:20:35 -08:00
James R. Barlow
ce0e0ecd4d
Decouple tqdm from progressbar setup
2020-12-04 13:20:28 -08:00
James R. Barlow
7e1223c12c
ghostscript: add output tracing
2020-11-29 14:53:35 -08:00
James R. Barlow
b83d7f6d1a
subprocess: refactor and add run_polling_stderr
2020-11-29 14:36:03 -08:00
James R. Barlow
80e957908a
tesseract: fix run call with logs_errors_to_stdout
2020-11-29 14:25:46 -08:00
James R. Barlow
f0e7bea8ba
docs: remove redundant statement
2020-11-27 13:54:36 -08:00
James R. Barlow
0cdb9bd04a
docs: remove description of how OMP_THREAD_LIMIT is managed
2020-11-23 12:36:04 -08:00
James R. Barlow
8224d89bc6
v11.3.4 release notes
v11.3.4
2020-11-18 11:57:28 -08:00
James R. Barlow
a2bbbe2a26
v11.3.4 release notes
2020-11-18 11:56:29 -08:00
James R. Barlow
43f41863fa
check_pdf: document how we handle linearization
2020-11-18 11:54:07 -08:00
James R. Barlow
d71e50e83d
Fix "readLinearizationData for file that is not linearized"
pikepdf 2.1.0 throws wrong type of exception in this case, so special-case it.
Closes #680
Closes #681
2020-11-18 11:52:17 -08:00
James R. Barlow
1f598da3c1
ghostscript: better docs and comments
2020-11-18 11:34:17 -08:00
James R. Barlow
d0cdbd5e1c
watcher: include uppercase .PDF too
2020-11-12 02:29:47 -08:00
James R. Barlow
5c56f61209
unpaper: type hints
2020-11-11 02:59:37 -08:00
James R. Barlow
9bec85470a
Merge branch 'master' of github.com:jbarlow83/OCRmyPDF
2020-11-10 04:08:05 -08:00
James R. Barlow
a03863a17d
docs: fix link to docker image
2020-11-10 04:08:01 -08:00
James R. Barlow
22cd9b2364
docs: fix csv-table errors
2020-11-10 04:07:49 -08:00
pretentious7
4fc7d6d93e
fix typo "charcter" -> "character" (#673)
2020-11-09 16:53:02 -08:00
James R. Barlow
71f0e7f545
v11.3.3 release notes
v11.3.3
2020-11-07 00:53:33 -08:00
James R. Barlow
895fddd85e
Replace most uses of universal_newlines with text
The parameters are equivalent, but the latter is better named. Since
Python 3.6 doesn't support text=, we use our wrapper to add it
there.
This applies to subprocess.run.
2020-11-07 00:48:08 -08:00
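The text= versus universal_newlines= point above can be illustrated with a minimal sketch. The `run` function below is a hypothetical stand-in, not the project's actual wrapper: it assumes only that `subprocess.run` accepts `universal_newlines=` on every supported Python version, and forwards a `text=` argument under that older name.

```python
import subprocess
import sys


def run(args, *, text=None, **kwargs):
    """Hypothetical wrapper: accept text= and forward it as universal_newlines=.

    subprocess.run() gained text= in Python 3.7 as an alias for
    universal_newlines=, so forwarding under the older name also works on 3.6.
    """
    if text is not None:
        kwargs["universal_newlines"] = text
    return subprocess.run(args, **kwargs)


# With text=True, captured output is decoded to str.
decoded = run(
    [sys.executable, "-c", "print('hello')"],
    stdout=subprocess.PIPE,
    text=True,
)

# Without it, captured output stays as bytes.
raw = run(
    [sys.executable, "-c", "print('hello')"],
    stdout=subprocess.PIPE,
)
```

Callers can then write `text=True` everywhere, and only the wrapper needs to know about the older spelling.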
James R. Barlow
5a59e4d543
unpaper: don't use universal_newlines=True
There's no specific reason to do this. We can log binary output equally
well.
2020-11-07 00:18:27 -08:00
James R. Barlow
b51abf2249
azure: Fix indentation mistake
2020-11-04 12:19:35 -08:00
James R. Barlow
6d3f9ff15a
api: rework ocr() slightly to simplify variable handling
2020-11-03 17:10:52 -08:00
James R. Barlow
5d1d1a712b
docs: more details about macOS API changes
Due to fork->spawn
2020-11-03 17:09:58 -08:00
James R. Barlow
6d5f8133e0
docs: show ifmain guard in example
2020-11-03 15:28:33 -08:00
James R. Barlow
13018d3d5c
ci: Extend test matrix to Python 3.9
2020-11-03 04:15:14 -08:00
James R. Barlow
14a85f9473
Fix pinned dependencies
v11.3.2
2020-11-03 04:12:47 -08:00
James R. Barlow
d22a1b3367
v11.3.2 release notes (2)
Since we never tagged it, fix other things.
2020-11-03 02:03:25 -08:00
James R. Barlow
b913e5dfef
ghostscript: don't repeat log in debug
Subprocess already does this for us.
2020-11-03 01:45:06 -08:00
James R. Barlow
dd8a5a4c72
Fix log domain names
ocrmypdf.subprocess.subprocess.ghostscript -> ocrmypdf.subprocess.ghostscript
2020-11-03 01:44:35 -08:00
James R. Barlow
36e9a54f02
Remove extraneous page rotation
This was added in commit b5ccbfd but seems to have been ill-advised.
2020-11-03 01:34:28 -08:00
James R. Barlow
3707af3b74
Change pdf.root to pdf.Root
2020-11-03 01:30:31 -08:00
James R. Barlow
ced7ad9164
unpaper: round off DPI
2020-11-03 01:14:57 -08:00
James R. Barlow
54bbbfdeb3
Fix UnboundLocalError when considering ImageMasks for optimization
Uncovered by test file in issue 667, although unrelated to that issue.
2020-11-03 01:08:14 -08:00
James R. Barlow
7f73a6ed1e
Some Python 3.9 fixes
2020-11-03 00:45:47 -08:00
James R. Barlow
dce206d3dc
Fix pre-commit for Py3.9
2020-11-03 00:20:25 -08:00
James R. Barlow
9304c856cf
Merge branch 'master' of github.com:jbarlow83/OCRmyPDF
2020-11-02 02:47:36 -08:00
James R. Barlow
e5df98cbdf
v11.3.2 release notes
2020-11-02 02:43:32 -08:00
James R. Barlow
19bf3aeb00
api: improve typing
2020-11-02 02:33:34 -08:00
James R. Barlow
e86be0031c
unpaper: fix process output handling
With the ocrmypdf.subprocess wrapper, logging the output here
is redundant and loses the page number context.
2020-11-02 01:07:41 -08:00