James R. Barlow
bf96171b65
Ignore whether or not textonly_pdf was used in cache
...
The difference doesn't matter in 7.0.0 anymore.
2018-06-23 02:58:26 -07:00
James R. Barlow
b81daf71d1
Regenerate test cache
2018-06-23 02:02:58 -07:00
James R. Barlow
faad1fc58a
Reactivate two tests that weren't using their fixtures properly
2018-06-23 01:54:09 -07:00
James R. Barlow
6f48181a56
Disable a pylint
2018-06-23 01:53:04 -07:00
James R. Barlow
807c8b0726
Trailing whitespace
2018-06-23 01:51:19 -07:00
James R. Barlow
b0dbaeafc5
Cleanup unused imports
2018-06-23 01:47:53 -07:00
James R. Barlow
2530d1791b
Fix several pylint errors and warnings
2018-06-23 00:54:22 -07:00
James R. Barlow
94150f414a
Remove qpdf.merge
...
We no longer need to merge pages this way. Much of the functionality
was there to implement page splitting without hitting ulimit which
will be fixed in qpdf > 8.0.2. The tests were expensive to run.
Also remove pytest-timeout since it breaks the Linux build.
2018-06-23 00:45:03 -07:00
James R. Barlow
76e7e8dbbb
Replace several uses of str(path) with fspath(path)
...
Helps make it more explicit. Did not do this to tests because use of paths
is more involved there.
2018-06-22 21:00:47 -07:00
James R. Barlow
9e765ddf46
Rename _optimize to optimize.py
2018-06-22 17:51:57 -07:00
James R. Barlow
73431d9761
Remove obsolete _naive_find_text
2018-06-13 14:00:50 -07:00
James R. Barlow
45cb4525cf
Remove other references to PyMuPDF
2018-06-13 01:02:53 -07:00
James R. Barlow
9608b22d34
Remove all uses of PyPDF2 except PDF/A check
...
Leave PDF/A check alone for now, since pikepdf has no equivalent.
2018-05-26 02:07:18 -07:00
James R. Barlow
78a686ecb4
Consider qpdf behavior on algo4 a pass
...
qpdf opens files with null user password, so do the same.
2018-05-25 00:33:31 -07:00
James R. Barlow
0a04a60f69
Document need for pdfinfo to be pickleable
2018-05-24 22:24:13 -07:00
James R. Barlow
68d8642988
Found out this test was extremely slow - no reason to actually use a large file
2018-05-24 22:22:51 -07:00
James R. Barlow
16f70ff054
Main changeset for pikepdf-based refactor of pdfinfo
2018-05-24 22:22:01 -07:00
James R. Barlow
786a2ad65a
Make optimize test do a little more
2018-05-18 17:50:39 -07:00
James R. Barlow
0c279b01a4
Fix test failure on missing JobContext
2018-05-17 01:16:58 -07:00
James R. Barlow
3b820ffa7b
test_metadata: change from xfail to skipif without fitz
2018-05-17 00:14:57 -07:00
James R. Barlow
5e20d1d554
metadata: Fix failing test on __getitem__['/CreationDate']
2018-05-16 13:46:07 -07:00
James R. Barlow
6171de41bf
optimize: move a lot of image scanning code to pikepdf
2018-05-14 22:21:53 -07:00
James R. Barlow
3254315127
Update test cache
2018-05-11 12:19:50 -07:00
James R. Barlow
ca297fd26b
Update tests
2018-05-11 02:33:44 -07:00
James R. Barlow
72253d09fa
Add arguments to control optimization
2018-05-10 22:23:24 -07:00
James R. Barlow
24b0adfacc
Merge branch 'master' into develop
2018-05-10 20:54:55 -07:00
James R. Barlow
acc6698ab3
Make XML metadata test actually work
2018-05-10 20:37:10 -07:00
James R. Barlow
606d3e6aa1
Remove tests that exercise obsolete features (tesseract, -g)
2018-05-10 20:33:32 -07:00
James R. Barlow
687a7954d6
test_main: uses leptonica
2018-05-10 19:05:31 -07:00
James R. Barlow
abed8e034e
Add metadata preservation test from stash
2018-05-10 16:43:28 -07:00
James R. Barlow
b8f3ead541
Remove tesseract renderer entirely
...
Grafting lets us work with older Tesseract versions as if they could use
sandwich, so there is no point in keeping it. It's been deprecated for a
long time now anyway.
2018-05-10 14:06:13 -07:00
James R. Barlow
9226f8a5d1
Trap PDF/A-3 errors on old Ghostscript
2018-05-04 15:29:43 -07:00
James R. Barlow
7cf83c77ca
Merge branch 'feature/pdfa3'
2018-05-03 16:45:57 -07:00
James R. Barlow
8a9f174f63
Fix XMP validation issue with /CreationDate
...
Related to previous validation issue. If the /CreationDate had no
timezone, Ghostscript also creates invalid metadata. Work around this.
Also fix up PDF date decoding, and transcode dates to standardize them.
2018-05-03 16:30:20 -07:00
James R. Barlow
76276f61e5
Split out rotation related tests
2018-05-01 23:51:35 -07:00
James R. Barlow
bfd26e6ec6
Tests: confirm OCR layer copied
2018-05-01 23:16:41 -07:00
James R. Barlow
b5d7e9cbb0
Fix all issues with rotations
...
All tests now pass
2018-05-01 22:50:20 -07:00
James R. Barlow
a9abe13185
Remove the old tesseract pdf_renderer
2018-05-01 17:31:34 -07:00
James R. Barlow
6b315e8315
Add ability to disable cache
2018-05-01 15:52:00 -07:00
James R. Barlow
2131ad4670
Fix --remove-background error on PDFs with colormapped images
...
It's unclear how exactly a colormapped image gets to this spot given the
tendency of other image processing tools to flatten such images, but
someone made it happen, so now we make sure the image is okay.
Closes #262
2018-04-27 17:21:01 -07:00
James R. Barlow
219fe2155b
test_pageinfo: remove duplicate import
2018-04-27 17:16:42 -07:00
James R. Barlow
0934905493
Don't suppress error message from config_notfound
...
Since it showed up in s390x bionic
2018-04-25 21:58:18 -07:00
James R. Barlow
df87e21c85
Add support for PDF/A-3
...
No ability to attach files however
2018-04-20 00:06:55 -07:00
Hugo
d761d80750
Use more standard __version__ rather than PILLOW_VERSION (#257)
2018-04-19 23:35:32 -07:00
James R. Barlow
0b10db91be
Fix regression: Disable Ghostscript JPEG passthrough entirely
2018-04-17 17:00:24 -07:00
James R. Barlow
1a516b2af9
Fix regression: time stamp test suite failures
2018-04-17 16:59:21 -07:00
James R. Barlow
7368399f8b
Clarify license of two test files - https://github.com/jbarlow83/OCRmyPDF/issues/254
2018-04-17 11:56:36 -07:00
James R. Barlow
34c78a892a
Fix list table for tests/resources
...
[ci skip]
2018-04-15 23:52:19 -07:00
James R. Barlow
10aa59f674
v6.1.4 fix test suite regression with Ghostscript 9.23
2018-04-12 15:16:54 -07:00
James R. Barlow
ba0535e3fb
Update test cache to account for unpaper --layout none change
2018-04-12 00:48:21 -07:00