James R. Barlow
d55e673d9c
Fix warning about --pdfa-image-compression argument at wrong times
...
Closes #663
2020-10-27 23:09:45 -07:00
James R. Barlow
21b90d2d14
Endorse pikepdf 2.x
2020-10-27 23:09:45 -07:00
Edward Betts
2def7e3392
Use % for percentage in string format (#643)
2020-10-27 23:09:14 -07:00
James R. Barlow
b0dcaa7512
v11.3.0 release notes
v11.3.0
2020-10-24 03:19:32 -07:00
James R. Barlow
e8285b1d10
Add test to confirm rasterize_pdf_page rotates correctly
2020-10-24 03:10:59 -07:00
James R. Barlow
5ba56adb53
Fix page rotation issue (again)
...
Commit 1327ab3 introduced a fix for a regression, which was reported
in #581 and #634. It appears that the actual cause of this issue was
default parameters to rasterize_pdf_page in pluggy not working as
expected, causing a default rotation=0 even when a rotation was needed.
As such the OCR image was generated with the wrong orientation,
causing the initial regression and the fix in commit 1327ab3.
Now that the real problem is identified, it's apparent that the logic
prior to 1327ab3 was sound and we can revert to it, since it fixes
all known cases including #658.
This reverts 1327ab3 except for retaining improvements to rotation output.
2020-10-24 02:45:21 -07:00
James R. Barlow
ca735278e0
setup: Version pluggy better
2020-10-24 02:35:41 -07:00
James R. Barlow
b5ccbfdf25
Fix hookspec of rasterize_pdf_page to remove default parameters
2020-10-24 02:35:18 -07:00
James R. Barlow
8c35d6e6e4
Fix debug log messages being suppressed from child processes
2020-10-22 02:20:06 -07:00
James R. Barlow
d1e0c81eda
Ensure worker_pdf is closed after gathering info in a thread
...
This is hacky, uses global state, but it does improve the situation for now.
2020-10-22 00:38:24 -07:00
James R. Barlow
10c8e4f8b4
Only create debug.log when running from command line
...
When used as a library ocrmypdf shouldn't make policy decisions, like where to
put a log file. Unsurprisingly, creating it causes problems for library users
because we deleted the temporary folder which held the log file and made no
effort to move it to a new location.
Also update the documentation to better describe how an application should
handle this.
Closes #657
2020-10-20 01:29:36 -07:00
James R. Barlow
6be2242c21
Describe "OCR" step as "Image processing" when --tesseract-timeout=0
...
Fixes #647
2020-10-08 01:03:42 -07:00
James R. Barlow
204c9d6ae1
Fix inverted colors during JBIG2 optimization on paletted images
...
Fixes #640
v11.2.1
2020-10-07 04:08:50 -07:00
James R. Barlow
6eb393590b
v11.2.0 release notes
...
Change v11.1.3 to v11.2.0 since it contains functional changes.
v11.2.0
2020-10-06 03:24:31 -07:00
James R. Barlow
07c6654057
v11.1.3 release notes
2020-10-06 03:22:48 -07:00
James R. Barlow
4e15eb8d14
Fix image optimization discarding image masks and soft masks associated with PNGs
...
Fixes #648
2020-10-06 03:20:54 -07:00
James R. Barlow
8b01ab8ad2
Better type checking on ocrmypdf.ocr(plugins=...)
2020-10-05 15:02:34 -07:00
James R. Barlow
e0a522ad50
Document the example plugin
2020-10-05 15:01:44 -07:00
James R. Barlow
a1a8788c5a
Merge branch 'master' of github.com:jbarlow83/OCRmyPDF
v11.1.2
2020-09-29 02:46:27 -07:00
James R. Barlow
cccdc178c3
v11.1.2 release notes
2020-09-29 02:46:18 -07:00
James R. Barlow
4eacb3454f
hOCR: write text in correct order
...
Fixes #642
2020-09-29 02:45:11 -07:00
Jimit Dholakia
82b8b41e80
docs: Add 'unpaper' optional dependency for Ubuntu 18.04 (#639)
2020-09-25 11:54:31 -07:00
James R. Barlow
581c5020ab
v11.1.1 release notes
v11.1.1
2020-09-25 00:28:38 -07:00
James R. Barlow
3ef8872a1e
pngquant driver: refactor, use streams instead of temporary files
2020-09-25 00:18:02 -07:00
James R. Barlow
28eec73eed
Tighten unpaper-args validation to exclude . and ..
...
Just in case
2020-09-25 00:18:02 -07:00
James R. Barlow
bfe4a5b329
Tidy a log message
2020-09-25 00:17:57 -07:00
James R. Barlow
29097837d6
Release notes typo
2020-09-19 00:49:36 -07:00
James R. Barlow
a40361db3c
Remove unpaper from macOS build
...
Homebrew seems to be having issues with its deps?
v11.1.0
2020-09-17 03:38:48 -07:00
James R. Barlow
8b29e3cbab
Merge commit '9a6cd95e5fe2826d40861229aaa0431b76e302e7'
2020-09-17 03:34:35 -07:00
James R. Barlow
b170be120b
v11.1.0 release notes
2020-09-17 03:21:06 -07:00
Suyash Behera
9a6cd95e5f
load zlib before liblept on windows (#633)
...
fixes #631
2020-09-17 03:14:42 -07:00
James R. Barlow
d464d3122e
Use img2pdf to create optimized PNG images
...
Fixes #629, #620
2020-09-17 03:11:26 -07:00
James R. Barlow
1327ab37d4
Fix page rotation regression
...
Fixes #634, #581
2020-09-17 02:57:00 -07:00
James R. Barlow
67553fc5c6
Display page numbers in log messages when grafting
2020-09-17 01:20:50 -07:00
James R. Barlow
306a903854
Remove unused function log_page_orientations
2020-09-17 01:20:02 -07:00
James R. Barlow
b93cf51c0f
Disable pikepdf mmap
...
Infrequently we can reproduce this error:
terminating with uncaught exception of type std::runtime_error: pybind11_object_dealloc(): Tried to deallocate unregistered instance!
The error is probably related to pybind11 issue #2252 and a bunch of
other related issues. Until that is resolved in pybind11 and pikepdf
we will disable the pikepdf mmap interface.
2020-09-16 23:48:55 -07:00
James R. Barlow
6b994221c6
Remove Python 3.7 from build since homebrew removed it
2020-09-16 23:44:18 -07:00
James R. Barlow
8b5b02e0d8
Expand documentation of filter_page_image
2020-09-14 14:36:17 -07:00
James R. Barlow
624df9bb23
Extend example plugin with example of mono conversion
2020-09-14 14:35:50 -07:00
James R. Barlow
fa06ea3600
v11.0.2 release notes
v11.0.2
2020-09-08 02:38:57 -07:00
James R. Barlow
31994258fb
metadata fixup: don't try to update original PDF's metadata with docinfo
2020-09-08 02:35:16 -07:00
James R. Barlow
1f15ecbca5
Add "Postprocessing" message as a hint for long Ghostscript runs
2020-09-08 02:34:10 -07:00
James R. Barlow
bcf5657e5c
Reorganize issue templates
2020-08-26 17:11:52 -07:00
jbarlow83
2ae028bf38
Update issue templates
2020-08-26 17:03:09 -07:00
James R. Barlow
b51a5887e5
v11.0.1 release notes
v11.0.1
2020-08-17 23:25:31 -07:00
James R. Barlow
fc523e837c
Clarify that the GPL-3 portion of pdfa.py was removed
...
Removal was in 8c90f7c97.
pdfa.py now has no special licensing and falls under the
"Files: *" clause of debian/copyright.
2020-08-17 23:23:31 -07:00
James R. Barlow
caeba76a61
Approve img2pdf 0.4 as it passes tests
2020-08-14 01:34:04 -07:00
James R. Barlow
cd35216f21
setup: blacklist pdfminer.six 20200720
...
NotAllowedError is going to be removed
99f0c09869
2020-08-12 13:10:08 -07:00
James R. Barlow
e6a7b58863
Merge branch 'de-gpl'
v11.0.0
2020-08-12 12:20:38 -07:00
James R. Barlow
56184a762f
Issue template: Give stronger hints about sample input files
2020-08-12 12:12:37 -07:00