James R. Barlow
1976dc6f30
Fix issue #121 “pop from empty list” (content stream parsing error)
2017-01-26 17:24:40 -08:00
James R. Barlow
097a69d07f
pageinfo: fix “decimal.InvalidOperation: quantize result has too many digits”
...
And add new test case for this.
2016-12-08 16:04:14 -08:00
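The error named in this commit can be reproduced with the standard `decimal` module: `quantize` raises `InvalidOperation` when the rounded result would need more digits than the current context precision allows (28 by default). The sketch below only illustrates that failure mode and one possible workaround. The helper name `quantize_safely` and the fallback precision are hypothetical, and this is not necessarily how the commit fixed pageinfo.

```python
from decimal import Decimal, InvalidOperation, localcontext


def quantize_safely(value, exp=Decimal("0.01"), fallback_prec=60):
    """Quantize value to the exponent of exp, retrying with more precision.

    Hypothetical helper for illustration only.
    """
    try:
        return value.quantize(exp)
    except InvalidOperation:
        # The result needs more digits than the context precision permits,
        # so retry inside a temporary context with a larger precision.
        with localcontext() as ctx:
            ctx.prec = fallback_prec
            return value.quantize(exp)


big = Decimal("1e30")
try:
    big.quantize(Decimal("0.01"))  # would need 33 digits, default prec is 28
    raised = False
except InvalidOperation:
    raised = True

result = quantize_safely(big)
```

Raising the precision is only one option; clamping or rejecting implausibly large values before quantizing would be another.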
James R. Barlow
949d2ff1c2
v4.3.1 release notes
2016-11-07 14:36:08 -08:00
James R. Barlow
cc9c0d819e
Add test case for documents that get rotated incorrectly after deskew
2016-11-07 14:15:03 -08:00
James R. Barlow
fdd9b8b8ce
Optimize some of the test resources to reduce file sizes
...
Mostly by reducing RGB -> monochrome and applying JBIG2 compression
2016-11-07 14:01:23 -08:00
James R. Barlow
a86805f0d9
Remove possibly non-free page from "multipage.pdf"
2016-10-27 15:56:43 -07:00
James R. Barlow
013c5a369f
Replace redacted file with an OCR-able file
2016-10-07 12:45:22 -07:00
James R. Barlow
6baf8668a6
Replace non-free file milk.pdf with free equivalent
2016-10-06 13:10:28 -07:00
James R. Barlow
4ba2962c56
Comment on non-free files
2016-10-05 16:48:16 -07:00
James R. Barlow
4dad09cc91
resources/README: replace the other large table with a list table
2016-10-05 16:38:51 -07:00
James R. Barlow
825c0f8b2a
Note that milk.pdf is non-free, start using list-tables
2016-09-10 14:44:00 -07:00
James R. Barlow
9ca29c787b
Update description of masks.pdf to reflect what it actually tests
2016-09-01 21:21:14 -07:00
James R. Barlow
bf89e38c69
Add milk.pdf test case
2016-08-31 11:42:21 -07:00
James R. Barlow
d25397e2b0
Add test case for PDFs with masks and stencil masks
2016-08-26 15:03:27 -07:00
James R. Barlow
fef35e4eb2
Fix handling of DPI for rare case of JPEG recompression after deskew/clean
...
This case is exercised by page 4 of multipage.pdf. If all images are
JPEGs, and one of deskew/clean removes DPI information, make sure that
we can get the right information back and that the DPI stays square.
2016-07-29 01:34:52 -07:00
James R. Barlow
8f77576dc4
Fix non-square image resolution for "hocr" case; use img2pdf 0.2.1
...
Tesseract renderer not immediately fixable.
2016-07-28 16:43:51 -07:00
jbarlow83
1bacf35a2c
Update license information for encrypted_algo4.pdf
2016-06-24 14:25:15 -07:00
James R. Barlow
b4a734fc0d
Test case for "algorithm 4" test
...
Algorithm 4 -> PDF version 1.6
2016-06-23 13:21:26 -07:00
James R. Barlow
f3e06b2dbd
Add bookmarks to file for more testing
2016-02-29 00:05:07 -08:00
James R. Barlow
323b9a5f8e
Add other missing files
2016-02-20 05:34:21 -08:00
James R. Barlow
cab381a339
Add JPEG 2000 test case
2016-02-20 05:13:19 -08:00
James R. Barlow
8246cc0538
Gracefully recover from tesseract's failure to process very large images
...
And test cases to check this
2016-02-20 04:53:23 -08:00
James R. Barlow
812fd745b6
Remove redundant line from resources
2016-02-16 14:29:56 -08:00
James R. Barlow
1224af1780
Update test resources to address files with unknown source
...
-Remove Test_Issue_28.pdf (inherited from fritz-hh, source unknown)
-Replace missing_docinfo.pdf (received from user, but it's a printout of
a website; unclear status, so created a new PDF with the same effect)
-Others are okay
2016-02-16 00:28:28 -08:00
James R. Barlow
265d2ce39b
Better skewed image
2016-02-08 23:44:46 -08:00
James R. Barlow
3569c76c0f
Also include cardinal.pdf
2016-02-08 15:23:04 -08:00
James R. Barlow
9058dedfbe
New tests for ccitt, jbig2 encodings
2016-01-19 13:01:56 -08:00
James R. Barlow
102bd07019
Check for encrypted PDF and complain appropriately
2015-12-17 10:37:54 -08:00
James R. Barlow
fd4a227ccb
Force this file to stop thinking it was modified
2015-09-13 17:53:01 -07:00
James R. Barlow
3d26257710
Add test cases for additional image formats
2015-08-28 04:51:11 -07:00
James R. Barlow
b376672dbc
Bug fix: exception thrown if input PDF was missing DocumentInfo block
2015-08-24 01:23:30 -07:00
James R. Barlow
630e6cbf1e
pip chokes on Unicode filenames?
2015-08-18 23:56:30 -07:00
James R. Barlow
cc161780df
Replace fileinput with regular open-replace
...
fileinput is supposed to save time in these cases but it's not capable
of doing both in-place rewrites and working with a non-ascii encoding.
This was not noticed until characters outside of ASCII were picked up
by tesseract and saved in a HOCR file. Rework some surrounding code as
well and add multilingual test cases.
2015-08-18 23:27:50 -07:00
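A minimal sketch of the "regular open-replace" pattern this commit message describes: read the whole file with an explicit encoding, substitute, and write it back with the same encoding. The function name `replace_in_file` and its parameters are hypothetical and are not taken from the project's code.

```python
import os
import tempfile


def replace_in_file(path, old, new, encoding="utf-8"):
    """Rewrite the file at path, replacing old with new.

    Hypothetical helper: the explicit encoding lets non-ASCII text,
    such as OCR output in an hOCR file, survive the round trip.
    """
    with open(path, "r", encoding=encoding) as f:
        text = f.read()
    with open(path, "w", encoding=encoding) as f:
        f.write(text.replace(old, new))


# Usage: a small file containing characters outside ASCII.
fd, tmp_path = tempfile.mkstemp(suffix=".hocr")
os.close(fd)
with open(tmp_path, "w", encoding="utf-8") as f:
    f.write("<span>Größe: naïve café</span>\n")

replace_in_file(tmp_path, "café", "thé")

with open(tmp_path, "r", encoding="utf-8") as f:
    rewritten = f.read()
os.remove(tmp_path)
```

This version holds the whole file in memory, which is acceptable for small per-page files; a very large file would call for a streaming rewrite to a temporary file instead.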
James R. Barlow
85af0f0d03
Add test case for blank PDF page
2015-08-14 00:46:50 -07:00
James R. Barlow
9247ea00bf
Improve ruffus exception handling
...
ruffus swallows the return code if, in the process of handling an exception,
we hit an error in ruffus' own code, which can happen. So pick through
its error stack and find out if there's an interesting return code in
there. Had to use eval() of all things.
Also suppress the stack trace for normal error conditions that don't
need one.
2015-08-11 02:19:46 -07:00
James R. Barlow
2744dafb74
New test case: ensure metadata is preserved from input to output
2015-08-05 17:09:38 -07:00
James R. Barlow
e35526192c
More test cases
2015-07-28 03:02:35 -07:00
James R. Barlow
bea57bdded
More test cases for other parameters
2015-07-28 02:31:18 -07:00
James R. Barlow
8aced0b6d3
More testing: JPEG
2015-07-27 00:25:43 -07:00
James R. Barlow
5440d988fc
Make this PDF a whole image page
...
Originally it had a smaller image centred in a page, which is not quite
supported.
2015-07-26 18:32:50 -07:00
James R. Barlow
3684f278ed
Add some pageinfo test cases; found problem with inline images
2015-07-26 15:24:42 -07:00
Jim Barlow
77bd35c3c7
Remove duplicate test folder
2015-07-25 01:00:40 -07:00
Jim Barlow
6d5d8be708
New test: check skew
2015-07-22 04:00:59 -07:00
Jim Barlow
ce2dbdf372
Add another test
2015-07-22 03:16:19 -07:00
Jim Barlow
ec8a35a7a6
Basic test cases
2015-07-22 02:59:25 -07:00