Fix numerous documentation build problems

James R. Barlow
2025-01-03 12:23:42 -08:00
parent cfebf1dc8b
commit 74a84b6ae9
6 changed files with 24 additions and 37 deletions

@@ -244,34 +244,34 @@ As of June 2024, the Tesseract page segmentation modes are:
+-----+----------------------------------------------------------------------------------+
| ID | Description |
+=====+==================================================================================+
-| 0 | Orientation and script detection (OSD) only. |
+| 0 | Orientation and script detection (OSD) only. |
+-----+----------------------------------------------------------------------------------+
-| 1 | Automatic page segmentation with OSD. |
+| 1 | Automatic page segmentation with OSD. |
+-----+----------------------------------------------------------------------------------+
| 2 | Automatic page segmentation, but no OSD, or OCR. (not implemented) |
+-----+----------------------------------------------------------------------------------+
| 3 | Fully automatic page segmentation, but no OSD. (Default) |
+-----+----------------------------------------------------------------------------------+
-| 4 | Assume a single column of text of variable sizes. |
+| 4 | Assume a single column of text of variable sizes. |
+-----+----------------------------------------------------------------------------------+
-| 5 | Assume a single uniform block of vertically aligned text. |
+| 5 | Assume a single uniform block of vertically aligned text. |
+-----+----------------------------------------------------------------------------------+
-| 6 | Assume a single uniform block of text. |
+| 6 | Assume a single uniform block of text. |
+-----+----------------------------------------------------------------------------------+
-| 7 | Treat the image as a single text line. |
+| 7 | Treat the image as a single text line. |
+-----+----------------------------------------------------------------------------------+
-| 8 | Treat the image as a single word. |
+| 8 | Treat the image as a single word. |
+-----+----------------------------------------------------------------------------------+
-| 9 | Treat the image as a single word in a circle. |
+| 9 | Treat the image as a single word in a circle. |
+-----+----------------------------------------------------------------------------------+
-| 10 | Treat the image as a single character. |
+| 10 | Treat the image as a single character. |
+-----+----------------------------------------------------------------------------------+
-| 11 | Sparse text. Find as much text as possible in no particular order. |
+| 11 | Sparse text. Find as much text as possible in no particular order. |
+-----+----------------------------------------------------------------------------------+
| 12 | Sparse text with OSD. |
+-----+----------------------------------------------------------------------------------+
| 13 | Raw line. Treat the image as a single text line, bypassing hacks that are |
-| | Tesseract-specific. |
+| | Tesseract-specific. |
+-----+----------------------------------------------------------------------------------+
Modes 0, 1, 2, and 12 (all of those that enable orientation and script detection)

@@ -9,27 +9,12 @@ API reference
This page summarizes the rest of the public API. Generally speaking this
should be mainly of interest to plugin developers.
-ocrmypdf
-========
+ocrmypdf.api
+============
-.. autoclass:: ocrmypdf.PageContext
+.. automodule:: ocrmypdf.api
:members:
-.. autoclass:: ocrmypdf.PdfContext
-:members:
-.. autoclass:: ocrmypdf.Verbosity
-:members:
-:undoc-members:
-.. autofunction:: ocrmypdf.configure_logging
-.. autofunction:: ocrmypdf.ocr
-.. autofunction:: ocrmypdf.pdf_to_hocr
-.. autofunction:: ocrmypdf.hocr_to_ocr_pdf
ocrmypdf.exceptions
===================

@@ -92,6 +92,7 @@ if on_rtd:
MOCK_MODULES = [
'pikepdf',
'pikepdf.canvas',
'pikepdf.models',
'pikepdf.models.metadata',
]
@@ -108,7 +109,7 @@ version = '.'.join(release.split('.')[:2])
#
# This is also used if you do content translation via gettext catalogs.
# Usually you set "language" from the command line for these cases.
-language = None
+language = 'en'
# There are two options for replacing |today|: either, you set today to some
# non-false value, then it is used:
@@ -158,19 +159,18 @@ todo_include_todos = False
# -- Options for HTML output ----------------------------------------------
-import sphinx_rtd_theme
+import sphinx_rtd_theme # noqa: F401
# The theme to use for HTML and HTML Help pages. See the documentation for
# a list of builtin themes.
#
html_theme = 'sphinx_rtd_theme'
-html_theme_path = [sphinx_rtd_theme.get_html_theme_path()]
# Theme options are theme-specific and customize the look and feel of a theme
# further. For a list of options available for each theme, see the
# documentation.
#
-html_theme_options = {'display_version': False}
+html_theme_options = {}
# Add any paths that contain custom themes here, relative to this directory.
# html_theme_path = []
@@ -198,7 +198,7 @@ html_theme_options = {'display_version': False}
# Add any paths that contain custom static files (such as style sheets) here,
# relative to this directory. They are copied after the builtin static files,
# so a file named "default.css" will overwrite the builtin "default.css".
-html_static_path = ['_static']
+# html_static_path = ['_static']
# Add any extra paths that contain custom files (such as robots.txt or
# .htaccess) here, relative to this directory. These files are copied

@@ -35,7 +35,7 @@ execute the image:
docker run hello-world
.. list-table:: Docker images
-:width: 30 20 50
+:widths: 30 20 50
:header-rows: 1
* - Image

@@ -68,8 +68,8 @@ to what languages it should search for. Multiple languages can be
requested using either ``-l eng+fra`` (English and French) or
``-l eng -l fra``.
-Archlinux
-------
+Arch Linux
+----------
.. code-block:: bash

@@ -2,6 +2,8 @@
..
.. SPDX-License-Identifier: CC-BY-SA-4.0
+.. _security:
===================
PDF security issues
===================