OCRmyPDF

mirror of https://github.com/ocrmypdf/OCRmyPDF.git synced 2026-05-07 22:24:43 -04:00

Author	SHA1	Message	Date
Hugo	d761d80750	Use more standard __version__ rather than PILLOW_VERSION (#257)	2018-04-19 23:35:32 -07:00
James R. Barlow	0b10db91be	Fix regression: Disable Ghostscript JPEG passthrough entirely	2018-04-17 17:00:24 -07:00
James R. Barlow	1a516b2af9	Fix regression: time stamp test suite failures	2018-04-17 16:59:21 -07:00
James R. Barlow	7368399f8b	Clarify license of two test files - https://github.com/jbarlow83/OCRmyPDF/issues/254	2018-04-17 11:56:36 -07:00
James R. Barlow	34c78a892a	Fix list table for tests/resources [ci skip]	2018-04-15 23:52:19 -07:00
James R. Barlow	10aa59f674	v6.1.4 fix test suite regression with Ghostscript 9.23	2018-04-12 15:16:54 -07:00
James R. Barlow	ba0535e3fb	Update test cache to account for unpaper --layout none change	2018-04-12 00:48:21 -07:00
James R. Barlow	49fa7f6b5c	tesseract_cache: don't reveal host system file paths in manifest file	2018-04-12 00:47:28 -07:00
James R. Barlow	7a1cd39b21	Fix creation date metadata lost from input Closes #247	2018-04-02 17:53:39 -07:00
James R. Barlow	4f6bffb477	Update copyrights	2018-03-31 11:54:38 -07:00
James R. Barlow	8d9be43c60	test_bookmarks_preserved won't raise ImportError any more Due to trapping this in ocrmypdf.lib	2018-03-28 23:22:55 -07:00
James R. Barlow	40ef4f0bbe	Add new argument --skip-repair to skip the repair step	2018-03-28 00:54:58 -07:00
James R. Barlow	5becfcf8ea	Refactor fitz ImportError trap	2018-03-27 21:38:02 -07:00
James R. Barlow	a9bd494cc0	Merge branch 'optional-fitz'	2018-03-27 13:36:33 -07:00
James R. Barlow	6a4df78bc0	Add _naive_find_text to search for text when fitz is not available	2018-03-27 13:36:17 -07:00
James R. Barlow	530eae3898	Fix test_main missing file_claims_pdfa	2018-03-26 15:33:53 -07:00
James R. Barlow	3e444f6a90	Make fitz optional	2018-03-26 13:22:09 -07:00
James R. Barlow	45dbff6401	Fix table of contents not preserved in PDF/A	2018-03-26 02:23:19 -07:00
James R. Barlow	bc56b8e058	Move metadata tests to new test_metadata	2018-03-26 01:49:25 -07:00
James R. Barlow	746969207a	Remove deprecated --pdf-renderer tess4, which was renamed to sandwich Should have been cut in v6.0.0	2018-03-26 01:17:22 -07:00
James R. Barlow	230d301268	conftest: py3.5 path issue	2018-03-25 00:52:45 -07:00
James R. Barlow	a2d00f5f1d	tess cache: fix tess3 error for -psm instead of --psm	2018-03-25 00:43:02 -07:00
James R. Barlow	8c1c61f207	test cache: fix Path + str error	2018-03-25 00:02:03 -07:00
James R. Barlow	77476965ae	test cache: use .bin extension, fix .gitignore .gitattributes	2018-03-24 23:54:16 -07:00
James R. Barlow	ca51514046	Add test cache	2018-03-24 23:50:41 -07:00
James R. Barlow	8975b72a01	Fix test_testonly_pdf generating an output file in pwd	2018-03-24 22:34:35 -07:00
James R. Barlow	874ec6a87f	Add missing fixture to test_unpaper	2018-03-24 22:24:14 -07:00
James R. Barlow	909eaeeead	spoof: Allow tesseract cache to share cache Previous incarnation was only suitable for generating a local cache where the suite was executed repeatedly. Now the cache ignores differences, so it can be checked into Github and shared.	2018-03-24 22:17:36 -07:00
James R. Barlow	c138161fae	Tests: more cleanup	2018-03-24 15:35:57 -07:00
James R. Barlow	e48590d66c	Refactor out unpaper-specific tests	2018-03-24 15:21:44 -07:00
James R. Barlow	5b1c8541fc	Review some skipped tests to make sure reasons still valid	2018-03-24 15:13:23 -07:00
James R. Barlow	e5e011021b	Remove the OCRMYPDF_program environment variables Really, this was just replicating the functionality of the PATH environment variable, and users probably do that anyway.	2018-03-24 15:09:08 -07:00
James R. Barlow	11d74dea09	Remove the OCRMYPDF_program environment variables Really, this was just replicating the functionality of the PATH environment variable, and users probably do that anyway.	2018-03-24 15:07:02 -07:00
James R. Barlow	6756016572	Add license notice to all files Source files to GPL3 Exceptions: -tests/spoof/* to MIT -hocrtransform.py -_unicodefun.py Test resources to CC BY-SA 4.0 except when otherwise noted. Add GPL license.	2018-03-24 02:33:24 -07:00
James R. Barlow	d700154e0e	Fix regressions after --skip-text improvements	2018-03-24 02:24:45 -07:00
James R. Barlow	8159cc6b88	Skip one test that fails for qpdf 8.0.[0,1], due to qpdf regression	2018-03-09 07:57:22 -08:00
James R. Barlow	4046766ca5	Fix Python 3.5 test suite failure on symlinks Did not account for API difference in pathlib	2018-03-02 16:57:46 -08:00
James R. Barlow	74ca736333	Issue #223: improve text of encrypted PDF error message	2018-02-27 15:08:22 -08:00
James R. Barlow	8ab8132411	lint: unused variables, wildcard imports	2018-02-24 12:48:52 -08:00
James R. Barlow	45c7bd9a60	lint: Remove shebangs from non-executable files	2018-02-24 12:38:58 -08:00
James R. Barlow	e7bcb95635	Fix pylint errors	2018-02-24 11:59:01 -08:00
James R. Barlow	3de83627a9	Handle output to /dev/null or directory (#219) Previously we threw an exception if the output name was a directory (only after doing OCR) and would trigger a PermissionError on trying to flip permission bits of /dev/null due to shutil.copyfile implementation. Instead of copying file use shutil.copyfileobj which should also respect umask etc.	2018-02-19 22:15:07 -08:00
James R. Barlow	a9da839c39	Add vector-only PDF test case	2018-02-08 00:17:35 -08:00
James R. Barlow	1dfc32d7e6	Preserve "text as curves" vector content Never updated the checking logic to deal with a pure vector file with no text that needs an OCR layer. This is doable, so allow it.	2018-02-07 16:05:48 -08:00
James R. Barlow	019513696b	Ghostscript spoof scripts did not report their --version correctly	2018-01-10 17:08:14 -08:00
James R. Barlow	ad7a4476db	hugemono.pdf needs --max-image-mpixels to pass with Pillow 5.0	2018-01-10 16:55:18 -08:00
James R. Barlow	4812b20fb2	Fix tesseract_noop.py generating wrong size of output PDF in tests This caused trouble before with test_deskew	2018-01-10 16:35:31 -08:00
James R. Barlow	882fc2257c	Add --max-image-mpixels argument to support Pillow 5.0	2018-01-10 15:43:59 -08:00
James R. Barlow	91b42cbfa8	Fix issue in sandwich renderer when skipping OCR on a rotated and deskewed page If OCR is skipped due to --tesseract-timeout or similar, and the skip page is rotated with /Rotate, and the skip page was deskewed or had other image processing, then the skip page was created with the wrong dimensions causing the output page to be cropped.	2018-01-09 00:17:53 -08:00
James R. Barlow	da11fd17ee	qpdf dummy: needs to return version now	2017-11-29 14:35:37 -08:00

268 Commits