45 Commits

Author SHA1 Message Date
James R. Barlow
1976dc6f30 Fix issue #121 “pop from empty list” (content stream parsing error) 2017-01-26 17:24:40 -08:00
James R. Barlow
097a69d07f pageinfo: fix “decimal.InvalidOperation: quantize result has too many digits”
And add new test case for this.
2016-12-08 16:04:14 -08:00
James R. Barlow
949d2ff1c2 v4.3.1 release notes 2016-11-07 14:36:08 -08:00
James R. Barlow
cc9c0d819e Add test case for documents that get rotated incorrectly after deskew 2016-11-07 14:15:03 -08:00
James R. Barlow
fdd9b8b8ce Optimize some of the test resources to reduce file sizes
Mostly by reducing RGB -> monochrome and applying JBIG2 compression
2016-11-07 14:01:23 -08:00
James R. Barlow
a86805f0d9 Remove possibly non-free page from "multipage.pdf" 2016-10-27 15:56:43 -07:00
James R. Barlow
013c5a369f Replace redacted file with an OCR-able file 2016-10-07 12:45:22 -07:00
James R. Barlow
6baf8668a6 Replace non-free file milk.pdf with free equivalent 2016-10-06 13:10:28 -07:00
James R. Barlow
4ba2962c56 Comment on non-free files 2016-10-05 16:48:16 -07:00
James R. Barlow
4dad09cc91 resources/README: replace the other large table with a list table 2016-10-05 16:38:51 -07:00
James R. Barlow
825c0f8b2a Note that milk.pdf is non-free, start using list-tables 2016-09-10 14:44:00 -07:00
James R. Barlow
9ca29c787b Update description of masks.pdf to reflect what it actually tests 2016-09-01 21:21:14 -07:00
James R. Barlow
bf89e38c69 Add milk.pdf test case 2016-08-31 11:42:21 -07:00
James R. Barlow
d25397e2b0 Add test case for PDFs with masks and stencil masks 2016-08-26 15:03:27 -07:00
James R. Barlow
fef35e4eb2 Fix handling of DPI for rare case of JPEG recompression after deskew/clean
This test is exercised by page 4 of multipage.pdf. If all images are
JPEGs, and one of deskew/clean removes DPI information, make sure that
we can get the right information back and that the DPI stays square.
2016-07-29 01:34:52 -07:00
James R. Barlow
8f77576dc4 Fix non-square image resolution for "hocr" case; use img2pdf 0.2.1
Tesseract renderer not immediately fixable.
2016-07-28 16:43:51 -07:00
jbarlow83
1bacf35a2c Update license information for encrypted_algo4.pdf 2016-06-24 14:25:15 -07:00
James R. Barlow
b4a734fc0d Test case for "algorithm 4" test
Algorithm 4 -> PDF version 1.6
2016-06-23 13:21:26 -07:00
James R. Barlow
f3e06b2dbd Add bookmarks to file for more testing 2016-02-29 00:05:07 -08:00
James R. Barlow
323b9a5f8e Add other missing files 2016-02-20 05:34:21 -08:00
James R. Barlow
cab381a339 Add JPEG 2000 test case 2016-02-20 05:13:19 -08:00
James R. Barlow
8246cc0538 Gracefully recover from tesseract's failure to process very large images
And test cases to check this
2016-02-20 04:53:23 -08:00
James R. Barlow
812fd745b6 Remove redundant line from resources 2016-02-16 14:29:56 -08:00
James R. Barlow
1224af1780 Update test resources to address files with unknown source
-Remove Test_Issue_28.pdf (inherited from fritz-hh, source unknown)
-Replace missing_docinfo.pdf (received from user, but it's a printout of
a website; unclear status, so created a new PDF with the same effect)
-Others are okay
2016-02-16 00:28:28 -08:00
James R. Barlow
265d2ce39b Better skewed image 2016-02-08 23:44:46 -08:00
James R. Barlow
3569c76c0f Also include cardinal.pdf 2016-02-08 15:23:04 -08:00
James R. Barlow
9058dedfbe New tests for ccitt, jbig2 encodings 2016-01-19 13:01:56 -08:00
James R. Barlow
102bd07019 Check for encrypted PDF and complain appropriately 2015-12-17 10:37:54 -08:00
James R. Barlow
fd4a227ccb Force this file to stop thinking it was modified 2015-09-13 17:53:01 -07:00
James R. Barlow
3d26257710 Add test cases for additional image formats 2015-08-28 04:51:11 -07:00
James R. Barlow
b376672dbc Bug fix: exception thrown if input PDF was missing DocumentInfo block 2015-08-24 01:23:30 -07:00
James R. Barlow
630e6cbf1e pip chokes on Unicode filenames? 2015-08-18 23:56:30 -07:00
James R. Barlow
cc161780df Replace fileinput with regular open-replace
fileinput is supposed to save time in these cases but it's not capable
of doing both in-place rewrites and working with a non-ascii encoding.
This was not noticed until characters outside of ASCII were picked up
by tesseract and saved in a HOCR file. Rework some surrounding code as
well and add multilingual test cases.
2015-08-18 23:27:50 -07:00
James R. Barlow
85af0f0d03 Add test case for blank PDF page 2015-08-14 00:46:50 -07:00
James R. Barlow
9247ea00bf Improve ruffus exception handling
ruffus swallows the return code if, in the process of handling an exception,
we hit an error in ruffus' own code, which can happen. So pick through
its error stack and find out if there's an interesting return code in
there.  Had to use eval() of all things.

Also suppress the stack trace for normal error conditions that don't
need one.
2015-08-11 02:19:46 -07:00
James R. Barlow
2744dafb74 New test case: ensure metadata is preserved from input to output 2015-08-05 17:09:38 -07:00
James R. Barlow
e35526192c More test cases 2015-07-28 03:02:35 -07:00
James R. Barlow
bea57bdded More test cases for other parameters 2015-07-28 02:31:18 -07:00
James R. Barlow
8aced0b6d3 More testing: JPEG 2015-07-27 00:25:43 -07:00
James R. Barlow
5440d988fc Make this PDF a whole image page
Originally it had a smaller image centred in a page, which is not quite
supported.
2015-07-26 18:32:50 -07:00
James R. Barlow
3684f278ed Add some pageinfo test cases; found problem with inline images 2015-07-26 15:24:42 -07:00
Jim Barlow
77bd35c3c7 Remove duplicate test folder 2015-07-25 01:00:40 -07:00
Jim Barlow
6d5d8be708 New test: check skew 2015-07-22 04:00:59 -07:00
Jim Barlow
ce2dbdf372 Add another test 2015-07-22 03:16:19 -07:00
Jim Barlow
ec8a35a7a6 Basic test cases 2015-07-22 02:59:25 -07:00