From 467b7f016337e014bdb45cacc7c8ee186209aede Mon Sep 17 00:00:00 2001
From: "James R. Barlow"
Date: Thu, 26 Jan 2017 12:29:11 -0800
Subject: [PATCH] Update docs for eventual v4.4 release

---
 RELEASE_NOTES.rst     | 17 +++++++++++++++++
 docs/index.rst        |  1 +
 docs/installation.rst |  2 +-
 docs/security.rst     |  2 +-
 4 files changed, 20 insertions(+), 2 deletions(-)

diff --git a/RELEASE_NOTES.rst b/RELEASE_NOTES.rst
index 7545e4bf..19a1cece 100644
--- a/RELEASE_NOTES.rst
+++ b/RELEASE_NOTES.rst
@@ -3,6 +3,23 @@ RELEASE NOTES
 
 OCRmyPDF uses `semantic versioning `_.
 
+v4.4:
+=====
+
+- Tesseract 4.00 is now supported on an experimental basis.
+
+  + A new rendering option ``--pdf-renderer tess4`` exploits Tesseract 4's new text-only output PDF mode. See the documentation on PDF Renderers for details.
+  + The ``--tesseract-oem`` argument allows control over the Tesseract 4 OCR
+    engine mode.
+  + Fixed poor performance with Tesseract 4.00 on Linux
+
+- Fixed an issue that caused corruption of output to stdout in some cases
+- Removed test for Pillow JPEG and PNG support, as the minimum supported version of Pillow now enforces this
+- Significant code reorganization to make OCRmyPDF re-entrant and improve performance. All changes should be backward compatible for the v4.x series.
+
+  + However, OCRmyPDF's dependency "ruffus" is not re-entrant, so no Python API is available. Scripts should continue to use the command line interface.
+
+
 v4.3.5:
 =======
 
diff --git a/docs/index.rst b/docs/index.rst
index 7e0ce7b6..91ffd39e 100644
--- a/docs/index.rst
+++ b/docs/index.rst
@@ -20,6 +20,7 @@ Contents:
    installation
    languages
    cookbook
+   renderers
    security
    errors
 
diff --git a/docs/installation.rst b/docs/installation.rst
index 224c2262..1473357c 100644
--- a/docs/installation.rst
+++ b/docs/installation.rst
@@ -262,7 +262,7 @@ where /c/Users/sampleuser is a Unix representation of the Windows path C:\\Users
 Installing HEAD revision from sources
 -------------------------------------
 
-If you have ``git`` and ``python3.4`` or ``python3.5`` installed, you can install from source. When the ``pip`` installer runs,
+If you have ``git`` and Python 3.4 or newer installed, you can install from source. When the ``pip`` installer runs,
 it will alert you if dependencies are missing.
 
 To install the HEAD revision from sources in the current Python 3 environment:
diff --git a/docs/security.rst b/docs/security.rst
index 53ad59bc..b42a9ae2 100644
--- a/docs/security.rst
+++ b/docs/security.rst
@@ -14,7 +14,7 @@ PDF is a rich, complex file format. The official PDF 1.7 specification, ISO 3200
 
 In short, PDFs `may contain viruses `_.
 
-This `article `_ describes a method which allows potentially hostile PDFs to be viewed and rasterized safely in a disposable virtual machine. A trusted PDF created in this manner is converted to images and loses all information making it searchable. OCRmyPDF could be used restore searchability.
+This `article `_ describes a high-paranoia method which allows potentially hostile PDFs to be viewed and rasterized safely in a disposable virtual machine. A trusted PDF created in this manner is converted to images and loses all information making it searchable and losing all compression. OCRmyPDF could be used restore searchability.
 
 How OCRmyPDF processes PDFs
 ---------------------------