James R. Barlow
6376f77b8c
Refactor, remove trigonometry
2018-05-02 12:30:34 -07:00
James R. Barlow
e27e614ed9
Fixed rotation hard case
2018-05-02 01:32:11 -07:00
James R. Barlow
b0c04704a1
Fixed all but one rotation case
2018-05-02 01:24:21 -07:00
James R. Barlow
6bb6bf8323
Fix correction angle used from wrong page
2018-05-02 01:00:30 -07:00
James R. Barlow
e22fe8aefc
Silence debug messages
2018-05-01 23:51:54 -07:00
James R. Barlow
76276f61e5
Split out rotation related tests
2018-05-01 23:51:35 -07:00
James R. Barlow
bfd26e6ec6
Tests: confirm OCR layer copied
2018-05-01 23:16:41 -07:00
James R. Barlow
d787e1ea0f
ghostscript.py not saved in last commit
...
Given the importance of the previous commit, confirmed that all tests still pass once this file is saved.
Test results are unchanged by this commit.
2018-05-01 22:59:22 -07:00
James R. Barlow
b5d7e9cbb0
Fix all issues with rotations
...
All tests now pass
2018-05-01 22:50:20 -07:00
James R. Barlow
f3b6d9dcdf
Fix a comment about Tesseract behavior in certain versions
2018-05-01 21:31:09 -07:00
James R. Barlow
a9abe13185
Remove the old tesseract pdf_renderer
2018-05-01 17:31:34 -07:00
James R. Barlow
6b315e8315
Add ability to disable cache
2018-05-01 15:52:00 -07:00
James R. Barlow
37677de884
Fix regressions: pdfa.ps not used, PDF/A failures, handling of text layers with no font
2018-05-01 15:51:46 -07:00
James R. Barlow
c7387de325
Fix auto rotate
2018-05-01 15:18:28 -07:00
James R. Barlow
2495b1e038
Refactor find font, get test cases working again
2018-05-01 14:48:41 -07:00
James R. Barlow
073ee52ce7
Use hocr and weave; eliminate old combine layers and merge pages
2018-05-01 14:21:53 -07:00
James R. Barlow
54150a14e9
Further elimination of tesseract renderer special casing
...
We don't need to keep a "skip page" around anymore since
skipping means just not grafting on the text layer.
2018-05-01 13:36:20 -07:00
James R. Barlow
88ff091cce
Unify tesseract and sandwich renderer paths
...
Since the new weaving method copies the font and content
stream from the Tesseract PDF, it doesn't matter if Tesseract
happens to have an image or not.
If Tesseract is text-only capable we use that feature for efficiency,
but ignore the image either way.
2018-05-01 13:24:20 -07:00
James R. Barlow
e87a5776f1
Remove now-unnecessary code to rotate pages
...
Track only the decision to change rotation.
2018-05-01 13:01:25 -07:00
James R. Barlow
0806ce6406
Fix rotation for unsplit (modulo --rotate-pages)
2018-04-30 20:58:42 -07:00
James R. Barlow
6409894a71
feature/unsplit-try-imagerotate
2018-04-30 20:48:59 -07:00
James R. Barlow
e7286f6129
Unsplit now works with multipage, --force-ocr
2018-04-30 14:46:20 -07:00
James R. Barlow
2ab94b3151
unsplit: it's alive
...
First successful file output.
2018-04-28 01:57:41 -07:00
James R. Barlow
7ee90890ec
Add copying of essential information from Tesseract textonly
2018-04-27 23:19:08 -07:00
James R. Barlow
8d2a917676
Page unsplit, development
2018-04-25 21:56:43 -07:00
James R. Barlow
44b4afa534
Begin conversion from page splitting to page markers
2018-04-23 22:57:50 -07:00
James R. Barlow
775be3933c
Cherrypick merge_pages unification
2018-04-20 23:08:15 -07:00
Hugo
d761d80750
Use more standard __version__ rather than PILLOW_VERSION (#257)
2018-04-19 23:35:32 -07:00
James R. Barlow
0b10db91be
Fix regression: Disable Ghostscript JPEG passthrough entirely
v6.1.5
2018-04-17 17:00:24 -07:00
James R. Barlow
1a516b2af9
Fix regression: time stamp test suite failures
2018-04-17 16:59:21 -07:00
James R. Barlow
076363d78e
Disable JPEG passthrough for Ghostscript 9.23
...
Seems to corrupt JPEGs involved in image masks?
2018-04-17 16:31:03 -07:00
James R. Barlow
5fde214290
Update notes for v6.1.5
2018-04-17 15:23:35 -07:00
James R. Barlow
a620724d6a
Fix PDF/A validation failure due to timezone being omitted from /ModDate
2018-04-17 15:16:48 -07:00
James R. Barlow
7368399f8b
Clarify license of two test files - https://github.com/jbarlow83/OCRmyPDF/issues/254
2018-04-17 11:56:36 -07:00
James R. Barlow
34c78a892a
Fix list table for tests/resources
...
[ci skip]
2018-04-15 23:52:19 -07:00
James R. Barlow
9d28879505
Update Ubuntu 14.04 instructions
...
Closes #252
2018-04-14 17:30:33 -07:00
James R. Barlow
2482296e2b
hocr: avoid division by zero
...
Issue #253 - PDF that produces the error is not available, but if font_width
is zero, chances are the text is nonprinting characters, so suppress it.
2018-04-14 17:24:21 -07:00
James R. Barlow
7fc897e6dc
Fix NameError 'ghostscript'
v6.1.4
2018-04-12 21:24:05 -07:00
James R. Barlow
9b731d63b8
Set Ghostscript -sColorConversionStrategy the way old/new versions expect
2018-04-12 16:28:48 -07:00
James R. Barlow
10aa59f674
v6.1.4 fix test suite regression with Ghostscript 9.23
2018-04-12 15:16:54 -07:00
James R. Barlow
1f7837e7b1
v6.1.4 release notes update
2018-04-12 00:55:45 -07:00
James R. Barlow
ba0535e3fb
Update test cache to account for unpaper --layout none change
2018-04-12 00:48:21 -07:00
James R. Barlow
49fa7f6b5c
tesseract_cache: don't reveal host system file paths in manifest file
2018-04-12 00:47:28 -07:00
James R. Barlow
c95db246d4
v6.1.4 merge
2018-04-11 15:58:00 -07:00
James R. Barlow
1ba93371ce
docs: Update installation to reflect qpdf 7.0.0 requirement
2018-04-11 15:40:50 -07:00
James R. Barlow
fedbbdb575
Travis: compile qpdf from source
...
The older version in Travis's Ubuntu 14.04 can't pass the test suite anymore.
2018-04-11 15:40:45 -07:00
James R. Barlow
85ebba72bc
Fix setup.py syntax
2018-04-10 18:30:48 -07:00
James R. Barlow
b6cd436d5d
setup: Blacklist Pillow 5.1.0 on macOS
...
https://github.com/python-pillow/Pillow/issues/3068
2018-04-10 18:15:37 -07:00
James R. Barlow
ec170c7e1e
Travis: use setup.py for requirements, don't override with .txt
2018-04-10 17:52:19 -07:00
James R. Barlow
3d69b46fca
Release notes
2018-04-10 15:53:02 -07:00