docs: Improve ocrmypdf.api

James R. Barlow
2025-11-10 15:58:47 -08:00
parent 2f72f8e94a
commit a385cd967d
2 changed files with 6 additions and 5 deletions

.gitignore vendored

@@ -46,4 +46,5 @@ docs/_templates/
docs/Makefile
src/ocrmypdf/_version.py
.idea/
.idea/
.aider*

View File

@@ -9,14 +9,14 @@ OCR operations programmatically without using the command line interface.
Main Functions:
    ocr(): The primary function for OCR processing. Takes an input PDF or image
        file and produces an OCR'd PDF with searchable text.
    configure_logging(): Set up logging to match the command line interface
        behavior, with support for progress bars and colored output.
Experimental Functions:
    _pdf_to_hocr(): Extract text from PDF pages and save as hOCR files for
        manual editing before final PDF generation.
    _hocr_to_ocr_pdf(): Convert hOCR files back to a searchable PDF after
        manual text corrections.
@@ -26,10 +26,10 @@ at a time. For parallel processing, use multiple Python processes.
Example:
    import ocrmypdf
    # Configure logging (optional)
    ocrmypdf.configure_logging(ocrmypdf.Verbosity.default)
    # Perform OCR
    ocrmypdf.ocr('input.pdf', 'output.pdf', language='eng')
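
The docstring's advice to use multiple Python processes for parallel work could be sketched roughly as below. This is a minimal illustration, not part of the commit: the helper names `plan_jobs`, `ocr_one`, and `ocr_many` are hypothetical, and the only OCRmyPDF call assumed is `ocrmypdf.ocr(input, output, language='eng')` as shown in the docstring example.

```python
from concurrent.futures import ProcessPoolExecutor
from pathlib import Path


def plan_jobs(inputs, output_dir):
    """Pair each input file with an output path of the same name in output_dir.

    Hypothetical helper: the naming scheme is an assumption for illustration.
    """
    out = Path(output_dir)
    return [(str(Path(p)), str(out / Path(p).name)) for p in inputs]


def ocr_one(job):
    """Run OCR on a single (input, output) pair inside a worker process."""
    # Imported here so that each worker process loads the library itself.
    import ocrmypdf

    src, dst = job
    # Same call shape as the docstring example above.
    ocrmypdf.ocr(src, dst, language='eng')
    return dst


def ocr_many(inputs, output_dir, max_workers=2):
    """OCR several files, each handled by a separate Python process."""
    jobs = plan_jobs(inputs, output_dir)
    with ProcessPoolExecutor(max_workers=max_workers) as pool:
        # pool.map preserves input order in its results.
        return list(pool.map(ocr_one, jobs))
```

A call such as `ocr_many(['a.pdf', 'b.pdf'], 'out')` would then process each file in its own worker process. On platforms that start workers by spawning a fresh interpreter, that call should sit under an `if __name__ == '__main__':` guard.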