Commit Graph

2602 Commits

Author SHA1 Message Date
James R. Barlow
c9bd87254e A few minor typing issues 2020-06-22 02:31:53 -07:00
James R. Barlow
f4cb424451 Support input/output streams at API level 2020-06-22 02:02:18 -07:00
James R. Barlow
fef14778d5 Fix missing f-string in log message 2020-06-22 01:17:16 -07:00
James R. Barlow
86ec63f215 Decouple plugin manager forking from PdfContext/Pagecontext 2020-06-22 01:16:59 -07:00
James R. Barlow
5b10ec9d39 jobcontext.PdfContext: remove dead code, add annotations 2020-06-22 00:34:58 -07:00
James R. Barlow
800c75c4e5 Bump requirements (mainly for Docker's benefit) 2020-06-21 01:58:53 -07:00
James R. Barlow
24d64b04c3 Update Docker to Ubuntu 20.04 and jbig2-latest 2020-06-21 01:48:31 -07:00
James R. Barlow
48e2750551 Fix some tests that were failing in Docker 2020-06-21 01:48:13 -07:00
James R. Barlow
e182c5f63e Update and sync .dockerignore, .gitignore
Also blacklist .* and whitelist the ones we want.
2020-06-21 01:25:59 -07:00
James R. Barlow
06d52326db Fix deleted path in .coveragerc 2020-06-21 01:24:23 -07:00
James R. Barlow
ebfe4f0d29 Fix issue #582 - PDF/A acquires title "Untitled" after conversion 2020-06-20 02:01:16 -07:00
James R. Barlow
ad22977c84 v10.1.1 release notes 2020-06-17 14:45:32 -07:00
James R. Barlow
6ac50646f0 Fix OMP_THREAD_LIMIT rounded down to 0 in some cases 2020-06-17 14:43:19 -07:00
James R. Barlow
24b6a4ad50 v10.1.0 notes (tag: v10.1.0) 2020-06-16 00:55:28 -07:00
James R. Barlow
e802896d4d unpaper: use PNG input where possible
Unpaper accepts PNG as input now, so avoid generating a huge temporary PPM file if we can.
If we must create a PNG, compress it lightly to keep our temp usage down.
2020-06-16 00:50:18 -07:00
James R. Barlow
0b5a20e593 coverage: ignore type checking 2020-06-15 15:55:39 -07:00
James R. Barlow
642998ead6 sync: refactor preprocess image filtering 2020-06-15 15:26:41 -07:00
James R. Barlow
698aab4f75 Add a lot of type annotations 2020-06-15 15:20:50 -07:00
James R. Barlow
34231ac667 sync: refactor intermediate image production 2020-06-15 15:02:28 -07:00
James R. Barlow
ddedf7cd2e For --clean-final, use same image as --clean if possible 2020-06-15 13:48:49 -07:00
James R. Barlow
9d127d354c docs: improve description of plugins 2020-06-15 12:51:49 -07:00
James R. Barlow
2d2a4894ab Some corrections to release notes 2020-06-15 12:51:28 -07:00
James R. Barlow
862861e3ca Fix error message in logging from repeated filtering
If logging somehow triggers PageNumberFilter multiple times, it would fail on the second occurrence.
2020-06-13 14:50:58 -07:00
James R. Barlow
892db88f0e test_two_languages: use narrower test (tag: v10.0.1) 2020-06-12 14:33:02 -07:00
James R. Barlow
eeb44f78cc Fix tests that failed on other platforms from previous fix 2020-06-12 12:59:46 -07:00
James R. Barlow
863835f660 v10.0.1 release notes 2020-06-12 12:11:21 -07:00
James R. Barlow
393c5a9ea4 Fix error on -l lang1+lang2 2020-06-12 12:10:29 -07:00
James R. Barlow
c6b9a49cbb Fix tests that fail in CI (tag: v10.0.0) 2020-06-10 17:08:00 -07:00
James R. Barlow
17a4831745 v10 release notes and dependencies 2020-06-10 14:27:47 -07:00
James R. Barlow
7caf1e85ff info: change "Scan" message 2020-06-10 12:11:37 -07:00
James R. Barlow
f59a757e8b info: tidy handling of content streams 2020-06-10 12:09:24 -07:00
James R. Barlow
872bafad4b Reinstate quick test for text/no text
Partial revert of commit 991db17
2020-06-10 12:00:52 -07:00
James R. Barlow
8599400445 Only do page analysis on pages we will do OCR on 2020-06-10 11:33:27 -07:00
James R. Barlow
b6eebadf05 Use pikepdf.open with block to manage PdfInfo 2020-06-10 11:32:46 -07:00
James R. Barlow
a4e88eb8f0 Simplify plugin_manager pickling 2020-06-10 00:41:19 -07:00
James R. Barlow
f6257c2183 subprocess: lru_cache version checks 2020-06-10 00:32:06 -07:00
James R. Barlow
64891c2fc3 Pre-release delinting 2020-06-09 15:27:14 -07:00
James R. Barlow
fe156db41d Merge branch 'release/v10' into trialmerge 2020-06-09 15:12:56 -07:00
James R. Barlow
0f942fb714 Rename ocrmypdf.exec -> ocrmypdf._exec 2020-06-09 14:59:09 -07:00
James R. Barlow
be8ca589d4 Move ocrmypdf.exec.run and friends to ocrmypdf.subprocess 2020-06-09 14:53:10 -07:00
James R. Barlow
3b6f6782f0 Remove tesseract_env, --tesseract-env 2020-06-09 00:39:53 -07:00
James R. Barlow
21c0e045cb Remove _OCRMYPDF_TEST_PATH environment variable 2020-06-09 00:30:13 -07:00
James R. Barlow
ebbf68bd08 The big payoff: abolishing spoofing machinery 2020-06-09 00:08:20 -07:00
James R. Barlow
2059e916da Convert all ghostscript spoofs to test plugins 2020-06-09 00:00:25 -07:00
James R. Barlow
c22f245606 Plugins must return not-None if they intend to stop builtin 2020-06-08 23:48:45 -07:00
James R. Barlow
7b9025f397 Convert generate_pdfa to plugin 2020-06-08 22:28:38 -07:00
James R. Barlow
b109445215 Move Ghostscript rasterize_pdf to plugin 2020-06-08 17:10:27 -07:00
James R. Barlow
fd1cd8e50a docs: explain --rotate-pages-threshold 2020-06-08 07:46:55 -07:00
James R. Barlow
c6c70c2171 docs: Ubuntu 20.04 install instructions 2020-06-08 07:42:13 -07:00
James R. Barlow
a9a473f2e5 Convert all tesseract cache usages to plugin 2020-06-05 17:55:18 -07:00