Commit Graph

708 Commits

Author SHA1 Message Date
James R. Barlow
6dbaebdc0c Merge branch 'master' into feature/drop-3.7 2022-09-15 23:00:27 -07:00
James R. Barlow
2e937dee9f Refactor cache manifest creation 2022-08-19 00:19:38 -07:00
James R. Barlow
f4155dca77 tests: convert all uses of multipage.pdf to fixture 2022-08-11 01:13:10 -07:00
James R. Barlow
545cd031b0 Replace public domain graph.pdf and derivatives with licensed version 2022-08-11 01:09:00 -07:00
James R. Barlow
c5359bd990 jbig2 is from linn now 2022-08-06 15:36:20 -07:00
James R. Barlow
7f77308846 Remake palette.pdf using baiona-colormapped file 2022-08-06 15:35:47 -07:00
James R. Barlow
79db985181 Improve encryption tests; drop some public domain resources
Generate the encrypted files we need and remove special test files we retained for this.

Replace jbig2.pdf based on congress.jpg with version based on ccitt.pdf.
2022-08-06 14:37:45 -07:00
James R. Barlow
8412de9344 Merge branch 'master' into feature/drop-3.7 2022-08-06 14:04:23 -07:00
James R. Barlow
4104904a1e Document reason for suppressing some third-party deprecation warnings 2022-08-04 04:13:02 -07:00
James R. Barlow
1a0a797ca6 Remove our @deprecated decorator and use standard package 2022-08-04 04:00:25 -07:00
James R. Barlow
d591a3e059 resources readme: remove license and copyright info
Better to not repeat ourselves and present this info in one place.
2022-08-04 03:42:22 -07:00
James R. Barlow
4b9ea40a0c spdx: move identifiers to files that support them
If the apparent license changed, take this commit as correct.
2022-08-04 03:26:54 -07:00
James R. Barlow
acc70036cc Set minimum Tesseract to 4.1.1 2022-08-02 15:20:29 -07:00
James R. Barlow
67773da309 Drop support for Ghostscript <9.50 2022-08-02 15:01:10 -07:00
James R. Barlow
8a3b82e364 Make Python 3.8 minimum requirement 2022-08-02 14:46:01 -07:00
James R. Barlow
580822a6a2 Fix Windows ghostscript path scanning 2022-08-02 14:39:23 -07:00
James R. Barlow
5fe3102e4e tests: new test to confirm correct printing of tesseract install advice 2022-08-01 12:31:37 -07:00
James R. Barlow
5b57520c98 tests: simplify some validation tests 2022-08-01 12:31:05 -07:00
James R. Barlow
30e4198f3a tests: fix test_validation when chi_sim not installed 2022-08-01 02:47:39 -07:00
James R. Barlow
ba372e5841 Reorganize validation to fix exception when Tesseract not installed
The existing logic would call an OCR plugin's get_languages function before
allowing the plugin to check if its dependencies were available. This caused
an exception if Tesseract was not installed, when we were supposed to issue
an error message advising the user to install Tesseract.
2022-08-01 02:04:09 -07:00
James R. Barlow
80ed2117cc Change to SPDX license tracking 2022-07-28 01:10:07 -07:00
James R. Barlow
dc6f1a266a Modernize type annotations 2022-07-23 00:39:24 -07:00
James R. Barlow
a5efc4af9b unpaper: replace input pnm with png
Unpaper or its underlying libraries don't seem to accept pnms with an
odd integer width, although it's not clear if this is the issue at all.

In any case, keeping the image a PNG works around the issue. unpaper
only accepted PNM input in the past, which is why we send it PNM.
Since it now accepts PNG, we might as well use PNG.

Unpaper can write PNG as output too, but this added a few seconds to
the test suite and was not committed.

Related issues:

https://github.com/ocrmypdf/OCRmyPDF/issues/887

https://github.com/ocrmypdf/OCRmyPDF/issues/665

https://github.com/unpaper/unpaper/issues/82
2022-07-03 15:32:16 -07:00
James R. Barlow
61600111d3 test_pdfinfo: refactor by extracting fixtures 2022-06-18 16:29:57 -07:00
James R. Barlow
17a5b8b43c Refactor reporting of optimization failures 2022-06-13 01:30:15 -07:00
James R. Barlow
13d11e76e5 optimize plugin: solve linearization and "is optimization enabled?" issues 2022-06-13 00:59:41 -07:00
James R. Barlow
61069660a2 Move optimization options to plugin 2022-06-12 02:42:16 -07:00
James R. Barlow
3d4f80639d Remove test that is now always skipped 2022-06-12 00:31:01 -07:00
James R. Barlow
b17fb61389 Configure pylint in pyproject and delint 2022-06-12 00:30:44 -07:00
James R. Barlow
0ac15dd0b2 Suppress libxmp DeprecationWarning during test 2022-06-01 00:46:16 -07:00
James R. Barlow
33cdabaf65 tests: account for test that expected pngquant for windows 2022-05-26 13:52:22 -07:00
James R. Barlow
5d0cc0a092 tests: Extract some test fixtures for better clarity 2022-05-26 00:57:31 -07:00
James R. Barlow
6c427f82ea Add test case for corrupt ICC profiles 2022-05-26 00:41:19 -07:00
James R. Barlow
b00fe3dc5d pytest.skip() - remove kwarg entirely, to avoid breaking older pytest and to avoid warnings from newer pytest 2022-04-14 20:15:00 -07:00
James R. Barlow
e6aa3a4299 tests: explain why CacheOcrEngine needs lock 2022-04-05 16:16:51 -07:00
James R. Barlow
43302d7e12 Fix pytest.warns() on older pytest
Thanks @QuLogic
2022-04-05 16:02:50 -07:00
James Barlow
776ada6713 Upgrade pre-commit and associated tools; various lints 2022-04-03 20:53:01 -07:00
James Barlow
dfe31a2f6d Add lock to certain "with patch" cases
Switching to --use-threads seems to have broken tests that assumed they could
monkeypatch things. That's odd, though, since while we can have multiple
worker threads, we should never have parallel tests in the same process.
2022-04-03 17:22:04 -07:00
James Barlow
0c43963d69 Fix pytest deprecation warnings 2022-04-03 13:30:58 -07:00
James Barlow
f29fe7f23e Fix Pillow deprecation warnings 2022-04-03 13:30:50 -07:00
James R. Barlow
13917c051c Disable oom killer test for --use-threads 2022-03-13 01:02:28 -08:00
James R. Barlow
514038d4ec optimize: recognize and produce [/FlateDecode /DCTDecode] images 2022-02-08 00:38:08 -08:00
James R. Barlow
3b406112d0 ghostscript: improve test coverage of error cases 2022-01-25 23:45:47 -08:00
James R. Barlow
2d0ac4707c Use better img2pdf settings where possible while supporting old versions
Fixes #894
2022-01-14 11:55:54 -08:00
James R. Barlow
ea69e868ed unpaper: issue warning if image too large to clean 2022-01-11 10:44:38 -08:00
James R. Barlow
ee21bf9ef6 Update cache 2021-12-13 20:45:30 -08:00
James R. Barlow
d48254d477 Fix issue with attempting to deskew a blank page on Tesseract 5
Closes #868
2021-12-10 21:48:09 -08:00
James R. Barlow
13af3252ff tests: simplify run_ocrmypdf API 2021-12-06 17:00:25 -08:00
James R. Barlow
6910c48b81 Fix test_outputtype_none on Windows and cleanup docs 2021-12-06 15:38:38 -08:00
James R. Barlow
e642dd4b35 Fix kill signal on Windows 2021-12-06 15:38:32 -08:00