James R. Barlow
71f0e7f545
v11.3.3 release notes
v11.3.3
2020-11-07 00:53:33 -08:00
James R. Barlow
895fddd85e
Replace most uses of universal_newlines with text
...
The parameters are equivalent, but the latter is better named. Since
Python 3.6 doesn't support text=, we use our wrapper to add it there.
This applies to subprocess.run.
2020-11-07 00:48:08 -08:00
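As an illustration of the commit above, here is a minimal sketch of a wrapper that accepts text= and forwards it as universal_newlines=, which subprocess.run understands on Python 3.6 and later. The name run_text is hypothetical and is not taken from the OCRmyPDF source; the project's actual wrapper may differ.

```python
import subprocess
import sys


def run_text(args, **kwargs):
    """Hypothetical wrapper: accept text= and pass it on as universal_newlines=."""
    if "text" in kwargs:
        # universal_newlines is the older spelling of the same option
        kwargs["universal_newlines"] = kwargs.pop("text")
    return subprocess.run(args, **kwargs)


# Run a child Python process and capture its stdout as str rather than bytes.
result = run_text(
    [sys.executable, "-c", "print('hello')"],
    stdout=subprocess.PIPE,
    text=True,
    check=True,
)
print(repr(result.stdout))
```

Callers can then write text=True everywhere, and only the wrapper needs to know about the older parameter name.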
James R. Barlow
5a59e4d543
unpaper: don't use universal_newlines=True
...
There's no specific reason to do this. We can log binary output equally
well.
2020-11-07 00:18:27 -08:00
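To illustrate logging binary output without text mode, here is a sketch that captures bytes from a child process and decodes them leniently just before logging. The logger name and the log_output helper are assumptions made for this example, not OCRmyPDF code.

```python
import logging
import subprocess
import sys

log = logging.getLogger("example.subprocess")  # hypothetical logger name


def log_output(data: bytes) -> list:
    """Decode captured bytes leniently and log them line by line."""
    # errors="replace" keeps logging from failing on invalid UTF-8
    lines = data.decode("utf-8", errors="replace").splitlines()
    for line in lines:
        log.debug(line)
    return lines


# The child writes one line containing a byte that is not valid UTF-8.
proc = subprocess.run(
    [sys.executable, "-c", "import sys; sys.stdout.buffer.write(b'ok\\xff\\n')"],
    stdout=subprocess.PIPE,
    stderr=subprocess.STDOUT,
    check=True,
)
lines = log_output(proc.stdout)
```

Decoding at the logging step means a tool that emits unexpected bytes cannot raise UnicodeDecodeError inside subprocess.run itself.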
James R. Barlow
b51abf2249
azure: Fix indentation mistake
2020-11-04 12:19:35 -08:00
James R. Barlow
6d3f9ff15a
api: rework ocr() slightly to simplify variable handling
2020-11-03 17:10:52 -08:00
James R. Barlow
5d1d1a712b
docs: more details about macOS API changes
...
Due to fork->spawn
2020-11-03 17:09:58 -08:00
James R. Barlow
6d5f8133e0
docs: show ifmain guard in example
2020-11-03 15:28:33 -08:00
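The two documentation commits above concern the fork->spawn change: since Python 3.8, macOS starts worker processes with "spawn", which re-imports the main module in every child. The sketch below shows the general shape of an ifmain guard using only the standard library. It does not call OCRmyPDF, and compute_roots is a hypothetical function name.

```python
import math
import multiprocessing


def compute_roots(values):
    """Map math.sqrt over values using spawned worker processes."""
    # Request "spawn" explicitly so the behavior is the same on every platform.
    ctx = multiprocessing.get_context("spawn")
    with ctx.Pool(processes=2) as pool:
        return pool.map(math.sqrt, values)


if __name__ == "__main__":
    # Without this guard, each spawned child would re-run the line below
    # while importing the main module, and Python would raise RuntimeError.
    print(compute_roots([1.0, 4.0, 9.0]))  # prints [1.0, 2.0, 3.0]
```

The same pattern applies to a script that calls a library which starts worker processes internally: the call belongs under the guard, not at module top level.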
James R. Barlow
13018d3d5c
ci: Extend test matrix to Python 3.9
2020-11-03 04:15:14 -08:00
James R. Barlow
14a85f9473
Fix pinned dependencies
v11.3.2
2020-11-03 04:12:47 -08:00
James R. Barlow
d22a1b3367
v11.3.2 release notes (2)
...
Since we never tagged it, fix other things.
2020-11-03 02:03:25 -08:00
James R. Barlow
b913e5dfef
ghostscript: don't repeat log in debug
...
Subprocess already does this for us.
2020-11-03 01:45:06 -08:00
James R. Barlow
dd8a5a4c72
Fix log domain names
...
ocrmypdf.subprocess.subprocess.ghostscript -> ocrmypdf.subprocess.ghostscript
2020-11-03 01:44:35 -08:00
James R. Barlow
36e9a54f02
Remove extraneous page rotation
...
This was added in commit b5ccbfd but seems to have been ill-advised.
2020-11-03 01:34:28 -08:00
James R. Barlow
3707af3b74
Change pdf.root to pdf.Root
2020-11-03 01:30:31 -08:00
James R. Barlow
ced7ad9164
unpaper: round off DPI
2020-11-03 01:14:57 -08:00
James R. Barlow
54bbbfdeb3
Fix UnboundLocalError when considering ImageMasks for optimization
...
Uncovered by test file in issue 667, although unrelated to that issue.
2020-11-03 01:08:14 -08:00
James R. Barlow
7f73a6ed1e
Some Python 3.9 fixes
2020-11-03 00:45:47 -08:00
James R. Barlow
dce206d3dc
Fix pre-commit for Py3.9
2020-11-03 00:20:25 -08:00
James R. Barlow
9304c856cf
Merge branch 'master' of github.com:jbarlow83/OCRmyPDF
2020-11-02 02:47:36 -08:00
James R. Barlow
e5df98cbdf
v11.3.2 release notes
2020-11-02 02:43:32 -08:00
James R. Barlow
19bf3aeb00
api: improve typing
2020-11-02 02:33:34 -08:00
James R. Barlow
e86be0031c
unpaper: fix process output handling
...
With the ocrmypdf.subprocess wrapper, logging the output here
is redundant and loses the page number context.
2020-11-02 01:07:41 -08:00
James R. Barlow
6425977998
unpaper: use pnm instead of png
...
Some users reported problems with PNG recently; try PNM.
Fixes #665
Fixes #667
2020-11-02 01:05:56 -08:00
James R. Barlow
d57df2d980
subprocess: support programs that write their messages to stdout
2020-11-02 01:00:59 -08:00
James R. Barlow
664d0c7969
Document configure_debug_logging
2020-11-02 00:59:00 -08:00
James R. Barlow
a354663ee1
Fix typo in API documentation
2020-11-02 00:58:28 -08:00
Graham Miln
b21b048ec4
Add macOS brew language support (#615)
...
Note `brew` command for installing additional languages on macOS.
2020-10-30 01:09:06 -07:00
James R. Barlow
709c65b41a
v11.3.1 release notes
v11.3.1
2020-10-27 23:11:11 -07:00
James R. Barlow
67f99c5bb7
Endorse pdfminer.six 20201018
2020-10-27 23:09:45 -07:00
James R. Barlow
d55e673d9c
Fix warning about --pdfa-image-compression argument at wrong times
...
Closes #663
2020-10-27 23:09:45 -07:00
James R. Barlow
21b90d2d14
Endorse pikepdf 2.x
2020-10-27 23:09:45 -07:00
Edward Betts
2def7e3392
Use % for percentage in string format (#643)
2020-10-27 23:09:14 -07:00
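For context on the commit above, Python's format specification has a % presentation type that multiplies a ratio by 100 and appends the percent sign. A small sketch follows; the function name and the message wording are hypothetical, not taken from the changed code.

```python
def describe_savings(before_bytes, after_bytes):
    """Format the fraction of bytes saved as a percentage."""
    ratio = 1 - after_bytes / before_bytes
    # The % type scales by 100 and appends "%"; .1 keeps one decimal place.
    return f"Reduced size by {ratio:.1%}"


message = describe_savings(1000, 750)
print(message)  # prints: Reduced size by 25.0%
```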
James R. Barlow
b0dcaa7512
v11.3.0 release notes
v11.3.0
2020-10-24 03:19:32 -07:00
James R. Barlow
e8285b1d10
Add test to confirm rasterize_pdf_page rotates correctly
2020-10-24 03:10:59 -07:00
James R. Barlow
5ba56adb53
Fix page rotation issue (again)
...
Commit 1327ab3 introduced a fix for a regression, which was reported
in #581, #634. It appears that the actual cause of this issue was
default parameters to rasterize_pdf_page in pluggy not working as
expected, causing a default rotation=0 even when a rotation was needed.
As such the OCR image was generated with the wrong orientation,
causing the initial regression and the fix in commit 1327ab3.
Now that the real problem is identified, it's apparent that the logic
prior to 1327ab3 was sound and we can revert to it, since it fixes
all known cases including #658.
This reverts 1327ab3 except for retaining improvements to rotation output.
2020-10-24 02:45:21 -07:00
James R. Barlow
ca735278e0
setup: Version pluggy better
2020-10-24 02:35:41 -07:00
James R. Barlow
b5ccbfdf25
Fix hookspec of rasterize_pdf_page to remove default parameters
2020-10-24 02:35:18 -07:00
James R. Barlow
8c35d6e6e4
Fix debug log messages being suppressed from child processes
2020-10-22 02:20:06 -07:00
James R. Barlow
d1e0c81eda
Ensure worker_pdf is closed after gathering info in a thread
...
This is hacky, uses global state, but it does improve the situation for now.
2020-10-22 00:38:24 -07:00
James R. Barlow
10c8e4f8b4
Only create debug.log when running from command line
...
When used as a library ocrmypdf shouldn't make policy decisions, like where to
put a log file. Unsurprisingly, creating it causes problems for library users
because we deleted the temporary folder which held the log file and made no
effort to move it to a new location.
Also update the documentation to better describe how an application should
handle this.
Closes #657
2020-10-20 01:29:36 -07:00
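As a sketch of what "the application decides where the log goes" can look like, the example below attaches a file handler to the top-level "ocrmypdf" logger using only the standard logging module. The logger names follow the log domains mentioned elsewhere in this history. The helper attach_file_log and the temporary directory are assumptions for illustration, and this is not the library's own configure_debug_logging.

```python
import logging
import tempfile
from pathlib import Path


def attach_file_log(logger_name, log_path):
    """Hypothetical helper: send a logger's DEBUG output to a chosen file."""
    logger = logging.getLogger(logger_name)
    logger.setLevel(logging.DEBUG)
    handler = logging.FileHandler(log_path, encoding="utf-8")
    handler.setFormatter(logging.Formatter("%(name)s %(levelname)s %(message)s"))
    logger.addHandler(handler)
    return handler


# The application, not the library, picks the log location.
log_file = Path(tempfile.mkdtemp()) / "debug.log"
handler = attach_file_log("ocrmypdf", log_file)

# Records from child loggers propagate up to the "ocrmypdf" handler.
logging.getLogger("ocrmypdf.subprocess.ghostscript").debug("example message")

# Detach and close so the file is flushed and can be moved or deleted.
logging.getLogger("ocrmypdf").removeHandler(handler)
handler.close()
contents = log_file.read_text(encoding="utf-8")
```

Because the handler is owned by the application, the log file outlives any temporary folder the library creates and removes during a run.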
James R. Barlow
6be2242c21
Describe "OCR" step as "Image processing" when --tesseract-timeout=0
...
Fixes #647
2020-10-08 01:03:42 -07:00
James R. Barlow
204c9d6ae1
Fix inverted colors during JBIG2 optimization on paletted images
...
Fixes #640
v11.2.1
2020-10-07 04:08:50 -07:00
James R. Barlow
6eb393590b
v11.2.0 release notes
...
Change v11.1.3 to v11.2.0 since it contains functional changes.
v11.2.0
2020-10-06 03:24:31 -07:00
James R. Barlow
07c6654057
v11.1.3 release notes
2020-10-06 03:22:48 -07:00
James R. Barlow
4e15eb8d14
Fix image optimization discarding image masks and soft masks associated with PNGs
...
Fixes #648
2020-10-06 03:20:54 -07:00
James R. Barlow
8b01ab8ad2
Better type checking on ocrmypdf.ocr(plugins=...)
2020-10-05 15:02:34 -07:00
James R. Barlow
e0a522ad50
Document the example plugin
2020-10-05 15:01:44 -07:00
James R. Barlow
a1a8788c5a
Merge branch 'master' of github.com:jbarlow83/OCRmyPDF
v11.1.2
2020-09-29 02:46:27 -07:00
James R. Barlow
cccdc178c3
v11.1.2 release notes
2020-09-29 02:46:18 -07:00
James R. Barlow
4eacb3454f
hOCR: write text in correct order
...
Fixes #642
2020-09-29 02:45:11 -07:00