James R. Barlow
67773da309
Drop support for Ghostscript <9.50
2022-08-02 15:01:10 -07:00
James R. Barlow
8a3b82e364
Make Python 3.8 minimum requirement
2022-08-02 14:46:01 -07:00
James R. Barlow
580822a6a2
Fix Windows ghostscript path scanning
2022-08-02 14:39:23 -07:00
James R. Barlow
5fe3102e4e
tests: new test to confirm correct printing of tesseract install advice
2022-08-01 12:31:37 -07:00
James R. Barlow
5b57520c98
tests: simplify some validation tests
2022-08-01 12:31:05 -07:00
James R. Barlow
30e4198f3a
tests: fix test_validation when chi_sim not installed
2022-08-01 02:47:39 -07:00
James R. Barlow
ba372e5841
Reorganize validation to fix exception when Tesseract not installed
...
The existing logic would call an OCR plugin's get_languages function before
allowing the plugin to check if its dependencies were available. This caused
an exception if Tesseract was not installed, when we were supposed to issue
an error message advising the user to install Tesseract.
2022-08-01 02:04:09 -07:00
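Note: the ordering described in the commit above can be sketched roughly as follows. This is only an illustration, not OCRmyPDF's actual plugin API: FakeOcrPlugin, check_dependencies, MissingDependencyError and validate are hypothetical names, and only get_languages is named in the commit message.

```python
from __future__ import annotations


class MissingDependencyError(Exception):
    """Raised when an external program the plugin needs is not installed."""


class FakeOcrPlugin:
    """Hypothetical stand-in for an OCR plugin (not the real plugin API)."""

    def __init__(self, installed: bool) -> None:
        self.installed = installed

    def check_dependencies(self) -> None:
        # Hypothetical hook: lets the plugin report a missing program clearly.
        if not self.installed:
            raise MissingDependencyError(
                "Tesseract does not appear to be installed; please install it."
            )

    def get_languages(self) -> set[str]:
        # Querying languages needs the external program, so calling this
        # first would fail with a low-level error when it is missing.
        if not self.installed:
            raise FileNotFoundError("tesseract")
        return {"eng", "deu"}


def validate(plugin: FakeOcrPlugin, requested: set[str]) -> str:
    """Return "ok" or a user-facing error message."""
    try:
        # Dependency check comes first, so a missing program yields
        # install advice rather than an exception from get_languages().
        plugin.check_dependencies()
    except MissingDependencyError as e:
        return str(e)
    missing = requested - plugin.get_languages()
    if missing:
        return f"Missing language data: {', '.join(sorted(missing))}"
    return "ok"
```

With this ordering, validate(FakeOcrPlugin(installed=False), {"eng"}) returns the install advice instead of raising FileNotFoundError.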
James R. Barlow
80ed2117cc
Change to SPDX license tracking
2022-07-28 01:10:07 -07:00
James R. Barlow
dc6f1a266a
Modernize type annotations
2022-07-23 00:39:24 -07:00
James R. Barlow
a5efc4af9b
unpaper: replace input pnm with png
...
Unpaper or its underlying libraries don't seem to accept PNMs with an
odd integer width, although it's not clear whether this is the actual issue.
In any case, keeping the image as a PNG works around the problem. unpaper
only accepted PNM input in the past, which is why we sent it PNM.
Since it now accepts PNG, we might as well use PNG.
Unpaper can write PNG as output too, but this added a few seconds to
the test suite and was not committed.
Related issues:
https://github.com/ocrmypdf/OCRmyPDF/issues/887
https://github.com/ocrmypdf/OCRmyPDF/issues/665
https://github.com/unpaper/unpaper/issues/82
2022-07-03 15:32:16 -07:00
James R. Barlow
61600111d3
test_pdfinfo: refactor by extracting fixtures
2022-06-18 16:29:57 -07:00
James R. Barlow
17a5b8b43c
Refactor reporting of optimization failures
2022-06-13 01:30:15 -07:00
James R. Barlow
13d11e76e5
optimize plugin: solve linearization and "is optimization enabled?" issues
2022-06-13 00:59:41 -07:00
James R. Barlow
61069660a2
Move optimization options to plugin
2022-06-12 02:42:16 -07:00
James R. Barlow
3d4f80639d
Remove test that is now always skipped
2022-06-12 00:31:01 -07:00
James R. Barlow
b17fb61389
Configure pylint in pyproject and delint
2022-06-12 00:30:44 -07:00
James R. Barlow
0ac15dd0b2
Suppress libxmp DeprecationWarning during test
2022-06-01 00:46:16 -07:00
James R. Barlow
33cdabaf65
tests: account for test that expected pngquant for windows
2022-05-26 13:52:22 -07:00
James R. Barlow
5d0cc0a092
tests: Extract some test fixtures for better clarity
2022-05-26 00:57:31 -07:00
James R. Barlow
6c427f82ea
Add test case for corrupt ICC profiles
2022-05-26 00:41:19 -07:00
James R. Barlow
b00fe3dc5d
pytest.skip() - remove kwarg entirely, to avoid breaking older pytest and to avoid warnings from newer pytest
2022-04-14 20:15:00 -07:00
James R. Barlow
e6aa3a4299
tests: explain why CacheOcrEngine needs lock
2022-04-05 16:16:51 -07:00
James R. Barlow
43302d7e12
Fix pytest.warns() on older pytest
...
Thanks @QuLogic
2022-04-05 16:02:50 -07:00
James Barlow
776ada6713
Upgrade pre-commit and associated tools; various lints
2022-04-03 20:53:01 -07:00
James Barlow
dfe31a2f6d
Add lock to certain "with patch" cases
...
Switching to --use-threads seems to have broken tests that assumed they could
monkeypatch things. That's odd, though, since while we can have multiple
worker threads, we should never have parallel tests in the same process.
2022-04-03 17:22:04 -07:00
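Note: a minimal sketch of guarding a "with patch" block with a lock, in the spirit of the commit above. It assumes the standard library's unittest.mock; Engine, _patch_lock, version_while_patched and run_patched_in_threads are hypothetical names, not identifiers from the OCRmyPDF test suite.

```python
from __future__ import annotations

import threading
from concurrent.futures import ThreadPoolExecutor
from unittest.mock import patch


class Engine:
    """Hypothetical object that a test wants to monkeypatch."""

    @staticmethod
    def version() -> str:
        return "real"


# One process-wide lock: patching replaces an attribute shared by all threads,
# so only one thread at a time may hold a patch on Engine.version.
_patch_lock = threading.Lock()


def version_while_patched(fake: str) -> str:
    """Patch Engine.version under the lock and return what the caller sees."""
    with _patch_lock:
        with patch.object(Engine, "version", return_value=fake):
            return Engine.version()


def run_patched_in_threads(fakes: list[str]) -> list[str]:
    """Run the patched call from several threads, preserving input order."""
    with ThreadPoolExecutor(max_workers=4) as executor:
        return list(executor.map(version_while_patched, fakes))
```

Because the patch is applied and undone while the lock is held, each thread observes only its own fake value, and Engine.version() returns "real" again once all threads finish.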
James Barlow
0c43963d69
Fix pytest deprecation warnings
2022-04-03 13:30:58 -07:00
James Barlow
f29fe7f23e
Fix Pillow deprecation warnings
2022-04-03 13:30:50 -07:00
James R. Barlow
13917c051c
Disable oom killer test for --use-threads
2022-03-13 01:02:28 -08:00
James R. Barlow
514038d4ec
optimize: recognize and produce [/FlateDecode /DCTDecode] images
2022-02-08 00:38:08 -08:00
James R. Barlow
3b406112d0
ghostscript: improve test coverage of error cases
2022-01-25 23:45:47 -08:00
James R. Barlow
2d0ac4707c
Use better img2pdf settings where possible while supporting old versions
...
Fixes #894
2022-01-14 11:55:54 -08:00
James R. Barlow
ea69e868ed
unpaper: issue warning if image too large to clean
2022-01-11 10:44:38 -08:00
James R. Barlow
ee21bf9ef6
Update cache
2021-12-13 20:45:30 -08:00
James R. Barlow
d48254d477
Fix issue with attempting to deskew a blank page on Tesseract 5
...
Closes #868
2021-12-10 21:48:09 -08:00
James R. Barlow
13af3252ff
tests: simplify run_ocrmypdf API
2021-12-06 17:00:25 -08:00
James R. Barlow
6910c48b81
Fix test_outputtype_none on Windows and cleanup docs
2021-12-06 15:38:38 -08:00
James R. Barlow
e642dd4b35
Fix kill signal on Windows
2021-12-06 15:38:32 -08:00
James R. Barlow
9de06f62ee
Use Python executors instead of pools
...
ProcessPool/ThreadPool don't have the ability to notice when a child worker
was terminated. ProcessPoolExecutor and ThreadPoolExecutor do notice and
provide better error messages.
Add tests to check.
2021-12-06 15:38:27 -08:00
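Note: the behavior this commit relies on can be sketched as below. The function name worker_death_is_reported is hypothetical, and the explicit "fork" start method is an assumption made only to keep the sketch short (it is POSIX only); this is not how OCRmyPDF itself sets up its executors.

```python
import multiprocessing
import os
from concurrent.futures import ProcessPoolExecutor
from concurrent.futures.process import BrokenProcessPool


def worker_death_is_reported() -> bool:
    """Return True if the executor reports an abruptly terminated worker."""
    # Assumption: "fork" avoids the __main__ guard that "spawn" would need.
    ctx = multiprocessing.get_context("fork")
    with ProcessPoolExecutor(max_workers=1, mp_context=ctx) as executor:
        # os._exit ends the worker process immediately, without cleanup,
        # which stands in here for a worker being killed from outside.
        future = executor.submit(os._exit, 1)
        try:
            future.result(timeout=30)
        except BrokenProcessPool:
            # The executor noticed the dead worker and failed the future.
            return True
    return False
```

The pending future fails with BrokenProcessPool, which gives the caller a concrete exception to turn into an error message.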
James R. Barlow
8fdcb15b4e
tests: improve typing and remove some legacy code
2021-12-06 15:38:27 -08:00
James R. Barlow
4c1ff1086c
tess cache: don't include full platform - could be sensitive
2021-12-06 15:38:26 -08:00
James R. Barlow
f91faf9795
Add new argument --tesseract-thresholding to control tesseract thresholding where available
...
Also add missing test for --tesseract-oem
2021-12-06 15:38:14 -08:00
James R. Barlow
c75ff4687a
Turning on Ghostscript interpolation changes this test
...
Seems acceptable. We don't normally use Ghostscript to downsample PDFs
like is happening in this test.
2021-11-15 16:36:24 -08:00
James R. Barlow
acc9d58c39
Skip no language test for Tess 5
2021-11-13 01:37:27 -08:00
James R. Barlow
e3126d2806
Adjust test to support Tesseract 5 working harder to find its files
2021-11-13 01:16:35 -08:00
James R. Barlow
f51164aff8
Upgrade test version of pymupdf
2021-11-13 00:53:41 -08:00
James R. Barlow
6f58a14351
pdfa: remove deprecated pkg_resources based access and tests
2021-11-13 00:52:03 -08:00
James R. Barlow
7ba04267b1
Remove shims to support old versions of pikepdf < 4
2021-11-13 00:43:20 -08:00
James R. Barlow
380b981763
Remove most Python 3.6 special casing
2021-11-13 00:27:48 -08:00
James R. Barlow
5abfb14c2a
Remove leptonica and cffi
2021-11-13 00:06:35 -08:00
James R. Barlow
036afc4d88
Update cache, related to previous apparently
2021-11-12 23:57:50 -08:00