Comprehensive test coverage for the new hocrtransform components:
- test_ocr_element.py: Tests for BoundingBox, Baseline, FontInfo, and
  OcrElement dataclass methods (iter_by_class, find_by_class,
  get_text_recursive, words/lines/paragraphs properties)
- test_hocr_parser.py: Tests for parsing hOCR files, including
  page/paragraph/line/word extraction, RTL text, rotated text,
  different line types (header, caption), font info, and edge cases
- test_pdf_renderer.py: Tests for PDF rendering, including text
  extraction verification, page sizing, multi-line content,
  text direction, baseline handling, textangle rotation, word breaks,
  debug options, and image overlay
Also fixes the x_font regex pattern so that it no longer captures the
trailing semicolon that separates properties in an hOCR title attribute.
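A minimal sketch of the kind of bug the x_font fix addresses. Both
patterns below are hypothetical stand-ins, not the patterns in the
codebase, and the sample title string is made up for illustration:

```python
import re

# Hypothetical "before" pattern: \S+ also swallows the ';' separator.
OLD_X_FONT = re.compile(r"x_font\s+(\S+)")

# Hypothetical "after" pattern: stop at whitespace or a semicolon.
NEW_X_FONT = re.compile(r"x_font\s+([^\s;]+)")


def extract_font(title, pattern=NEW_X_FONT):
    """Return the font name from an hOCR-style title string, or None."""
    match = pattern.search(title)
    return match.group(1) if match else None


# Illustrative title attribute; properties are separated by ';'.
sample_title = "bbox 0 0 100 50; x_font Arial; x_fsize 12"
```

With the old pattern the captured name keeps the separator ("Arial;");
with the new one it does not. Font names containing spaces would need a
different pattern, which this sketch does not attempt to handle.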
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>