4075 Commits

Author SHA1 Message Date
James R. Barlow
c540967429 docs: Update release notes 2025-12-23 15:44:44 -08:00
James R. Barlow
195344d307 Reinstate "Work around Ghostscript 10.6.0 JPEG encoding issue by forcing optimization."
This reverts commit fc30cb8903.
It turns out that both fixes were necessary.
2025-12-23 15:41:34 -08:00
James R. Barlow
de63d6eac9 Merge remote-tracking branches 'origin/dependabot/github_actions/actions/download-artifact-7', 'origin/dependabot/github_actions/actions/upload-artifact-6', 'origin/dependabot/github_actions/sigstore/gh-action-sigstore-python-3.2.0' and 'origin/dependabot/github_actions/actions/checkout-6' 2025-12-23 15:06:50 -08:00
James R. Barlow
6ada11ddae docs: Update release notes 2025-12-23 15:05:49 -08:00
James R. Barlow
fc30cb8903 Revert "Work around Ghostscript 10.6.0 JPEG encoding issue by forcing optimization."
This reverts commit f4c6c8121b.

The issue is now resolved by correcting the encoding issue directly.
2025-12-23 15:03:51 -08:00
James R. Barlow
01a3706281 docs: Add release notes for v16.13.0 2025-12-23 15:01:22 -08:00
James R. Barlow
e613db6a82 Fix Ghostscript 10.6 JPEG corruption by repairing truncated images
Ghostscript 10.6 has a bug that truncates JPEG data by 1-15 bytes.
This adds detection and repair by comparing output images to input
images and restoring the original bytes when truncation is detected.

- Add warning when GS 10.6+ is used with PDF/A output
- Add _repair_gs106_jpeg_corruption() to fix damaged JPEGs after
  Ghostscript processing
- Add unit tests for the repair function
2025-12-23 14:56:24 -08:00
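
The truncation check described in e613db6a82 can be sketched roughly as follows. This is a minimal illustration, not the code from the commit: the function name `restore_if_truncated` and the `max_missing` parameter are hypothetical, and the sketch assumes the damaged stream is a strict prefix of the original that is short by 1 to 15 bytes.

```python
def restore_if_truncated(original: bytes, output: bytes, max_missing: int = 15) -> bytes:
    """Return the original JPEG bytes if `output` looks like a truncated copy.

    Hypothetical helper (not OCRmyPDF's API): `output` is treated as truncated
    when it is a strict prefix of `original` and is short by 1..max_missing bytes.
    """
    missing = len(original) - len(output)
    if 1 <= missing <= max_missing and original.startswith(output):
        # Output matches the input up to the cut-off point: restore the input bytes.
        return original
    # Identical, re-encoded, or too different to be a simple truncation: leave as-is.
    return output
```

Requiring the prefix match, rather than only comparing lengths, keeps the repair from overwriting images that were legitimately re-encoded to a slightly smaller size.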
James R. Barlow
742a4bac17 Make rotation test more robust 2025-12-23 11:20:57 -08:00
James R. Barlow
4c1ef0b471 Also process art and bleed boxes 2025-12-23 11:20:41 -08:00
James R. Barlow
eace567f7b Test and fix page box issues 2025-12-23 11:19:51 -08:00
dependabot[bot]
cdf956ffc4 Bump actions/download-artifact from 6 to 7
Bumps [actions/download-artifact](https://github.com/actions/download-artifact) from 6 to 7.
- [Release notes](https://github.com/actions/download-artifact/releases)
- [Commits](https://github.com/actions/download-artifact/compare/v6...v7)

---
updated-dependencies:
- dependency-name: actions/download-artifact
  dependency-version: '7'
  dependency-type: direct:production
  update-type: version-update:semver-major
...

Signed-off-by: dependabot[bot] <support@github.com>
2025-12-15 10:02:30 +00:00
dependabot[bot]
c6b21d4dea Bump actions/upload-artifact from 5 to 6
Bumps [actions/upload-artifact](https://github.com/actions/upload-artifact) from 5 to 6.
- [Release notes](https://github.com/actions/upload-artifact/releases)
- [Commits](https://github.com/actions/upload-artifact/compare/v5...v6)

---
updated-dependencies:
- dependency-name: actions/upload-artifact
  dependency-version: '6'
  dependency-type: direct:production
  update-type: version-update:semver-major
...

Signed-off-by: dependabot[bot] <support@github.com>
2025-12-15 10:02:24 +00:00
dependabot[bot]
f673da9ab9 Bump sigstore/gh-action-sigstore-python from 3.1.0 to 3.2.0
Bumps [sigstore/gh-action-sigstore-python](https://github.com/sigstore/gh-action-sigstore-python) from 3.1.0 to 3.2.0.
- [Release notes](https://github.com/sigstore/gh-action-sigstore-python/releases)
- [Changelog](https://github.com/sigstore/gh-action-sigstore-python/blob/main/CHANGELOG.md)
- [Commits](https://github.com/sigstore/gh-action-sigstore-python/compare/v3.1.0...v3.2.0)

---
updated-dependencies:
- dependency-name: sigstore/gh-action-sigstore-python
  dependency-version: 3.2.0
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
2025-12-08 10:02:38 +00:00
rugk
8d715c4157 docs: fix and clarify podman usage instructions (#1601)
* docs: fix and clarify podman usage instructions

* the full reference `jbarlow83/ocrmypdf-alpine` as in the other commands may fix an issue if you do not have `ocrmypdf` already downloaded locally
* also clarified the command at the end for usage when SELinux is enabled

* docs: clarify difference between SELinux and rootless user mapping
2025-12-01 13:07:09 -08:00
dependabot[bot]
0f3c7765aa Bump actions/checkout from 5 to 6
Bumps [actions/checkout](https://github.com/actions/checkout) from 5 to 6.
- [Release notes](https://github.com/actions/checkout/releases)
- [Changelog](https://github.com/actions/checkout/blob/main/CHANGELOG.md)
- [Commits](https://github.com/actions/checkout/compare/v5...v6)

---
updated-dependencies:
- dependency-name: actions/checkout
  dependency-version: '6'
  dependency-type: direct:production
  update-type: version-update:semver-major
...

Signed-off-by: dependabot[bot] <support@github.com>
2025-11-24 11:31:33 +00:00
Chris Mayo
9dbce33ee6 Update Changelog URL (#1597)
Renamed in:
d1a45e4a ("Convert remaining rst -> md", 2025-04-17)
2025-11-16 23:10:48 -08:00
James R. Barlow
54ce09496c v16.12.0 release notes v16.12.0 2025-11-11 13:48:06 -08:00
James R. Barlow
f4c6c8121b Work around Ghostscript 10.6.0 JPEG encoding issue by forcing optimization.
Not an ideal fix, but it improves an issue affecting numerous users.

Fixes #1585.
2025-11-10 17:01:02 -08:00
James R. Barlow
057eaff36d Skip devnull testing on Windows
No longer seems to work - Windows Server 2025 change, perhaps? Doesn't really matter.
2025-11-10 16:57:30 -08:00
James R. Barlow
b88d63bdf7 Add Python 3.14 to test matrix 2025-11-10 16:10:01 -08:00
James R. Barlow
a385cd967d docs: Improve ocrmypdf.api 2025-11-10 15:58:47 -08:00
James R. Barlow
2f72f8e94a ghostscript: Disable subset fonts
For at least the PDF associated with this issue, disabling subset
fonts prevents Ghostscript from mangling the encoding when it is usable but not well-formed.

Fixes #1592
2025-11-10 15:58:14 -08:00
James R. Barlow
ee47e986f3 docs: Improve module-level docstring for OCRmyPDF Python API
Co-authored-by: aider (anthropic/claude-sonnet-4-20250514) <aider@aider.chat>
2025-11-10 10:33:26 -08:00
James R. Barlow
e44063da15 Update Dockerfile versions
tesseract-ocr/alex-p does not have a Tesseract 5 for Ubuntu 25.10 so we use 25.04 for now.

Ubuntu 25.04 gets us Ghostscript 10.05 which avoids issues in older versions.

Remove comment about now-legacy Alpine versions not working properly. Alpine provides Ghostscript 10.05.1.

Fixes #1587
2025-11-09 15:20:55 -08:00
James R. Barlow
abc2d41e2d Require recent pikepdf to fix check_pdf_syntax issue 2025-10-29 11:40:51 -07:00
James R. Barlow
38d60ea89b optimize: don't put flate on large jpegs unless compression is high
Putting flate on very large JPEGs can cause performance problems in PDF viewers, subjectively anyway.
2025-10-29 11:39:20 -07:00
James R. Barlow
35ec90af44 Merge remote-tracking branches 'origin/dependabot/github_actions/sigstore/gh-action-sigstore-python-3.1.0', 'origin/dependabot/github_actions/actions/upload-artifact-5' and 'origin/dependabot/github_actions/actions/download-artifact-6' 2025-10-28 13:40:08 -07:00
James R. Barlow
aa1cc8ae04 Update packages 2025-10-27 17:07:14 -07:00
dependabot[bot]
eaceb66030 Bump sigstore/gh-action-sigstore-python from 3.0.1 to 3.1.0
Bumps [sigstore/gh-action-sigstore-python](https://github.com/sigstore/gh-action-sigstore-python) from 3.0.1 to 3.1.0.
- [Release notes](https://github.com/sigstore/gh-action-sigstore-python/releases)
- [Changelog](https://github.com/sigstore/gh-action-sigstore-python/blob/main/CHANGELOG.md)
- [Commits](https://github.com/sigstore/gh-action-sigstore-python/compare/v3.0.1...v3.1.0)

---
updated-dependencies:
- dependency-name: sigstore/gh-action-sigstore-python
  dependency-version: 3.1.0
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
2025-10-27 11:08:50 +00:00
dependabot[bot]
b1dcc2c445 Bump actions/upload-artifact from 4 to 5
Bumps [actions/upload-artifact](https://github.com/actions/upload-artifact) from 4 to 5.
- [Release notes](https://github.com/actions/upload-artifact/releases)
- [Commits](https://github.com/actions/upload-artifact/compare/v4...v5)

---
updated-dependencies:
- dependency-name: actions/upload-artifact
  dependency-version: '5'
  dependency-type: direct:production
  update-type: version-update:semver-major
...

Signed-off-by: dependabot[bot] <support@github.com>
2025-10-27 11:02:08 +00:00
dependabot[bot]
ab3855af48 Bump actions/download-artifact from 5 to 6
Bumps [actions/download-artifact](https://github.com/actions/download-artifact) from 5 to 6.
- [Release notes](https://github.com/actions/download-artifact/releases)
- [Commits](https://github.com/actions/download-artifact/compare/v5...v6)

---
updated-dependencies:
- dependency-name: actions/download-artifact
  dependency-version: '6'
  dependency-type: direct:production
  update-type: version-update:semver-major
...

Signed-off-by: dependabot[bot] <support@github.com>
2025-10-27 10:47:34 +00:00
James R. Barlow
5c6cc4031f Merge remote-tracking branch 'origin/dependabot/github_actions/astral-sh/setup-uv-7' 2025-10-25 12:10:01 -07:00
James R. Barlow
f181307e50 v16.11.1 release notes v16.11.1 2025-10-16 10:59:13 +02:00
James R. Barlow
b213efb030 Account for new deskew output error message from recent Tesseract
Fixes #1576
2025-10-16 09:50:03 +02:00
James R. Barlow
f59e68911f Drop macos-13 (now unsupported by Apple) 2025-10-13 15:10:28 +02:00
dependabot[bot]
9605656a2f Bump astral-sh/setup-uv from 6 to 7
Bumps [astral-sh/setup-uv](https://github.com/astral-sh/setup-uv) from 6 to 7.
- [Release notes](https://github.com/astral-sh/setup-uv/releases)
- [Commits](https://github.com/astral-sh/setup-uv/compare/v6...v7)

---
updated-dependencies:
- dependency-name: astral-sh/setup-uv
  dependency-version: '7'
  dependency-type: direct:production
  update-type: version-update:semver-major
...

Signed-off-by: dependabot[bot] <support@github.com>
2025-10-13 10:40:31 +00:00
James R. Barlow
599fb1a1f6 Fix test_semfree (skip Python 3.14)
This feature is now deprecated and won't be fixed for Python 3.14. Instead we just use threads on platforms that don't support semaphores.

Closes #1558
2025-09-14 13:02:33 -07:00
James R. Barlow
9a2c0cf6ff v16.11.0 release notes v16.11.0 2025-09-12 00:08:11 -07:00
James R. Barlow
414d80fc16 Deprecate semfree and don't auto activate it
Instead the standard executor will fall back to threads.

semfree caused test failures with Py3.14:
https://github.com/ocrmypdf/OCRmyPDF/issues/1558

In retrospect and with emerging Python tech like freethreading, semfree is becoming less necessary. We can use threads for the time being.

A consequence is that performance may be lower on Lambda and Termux when we are using threads and not shelling out work.
2025-09-11 17:13:04 -07:00
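
The fallback described in 414d80fc16 might look something like the sketch below. It is an assumption about the general pattern, not the project's executor code: `semaphores_available` and `choose_executor` are hypothetical names, and the probe simply tries to create a multiprocessing semaphore, which can fail on platforms such as those the commit mentions.

```python
from concurrent.futures import Executor, ProcessPoolExecutor, ThreadPoolExecutor


def semaphores_available() -> bool:
    """Probe whether this platform can create multiprocessing semaphores."""
    try:
        # Raises ImportError when the platform lacks a working sem_open().
        import multiprocessing.synchronize  # noqa: F401
        from multiprocessing import Semaphore

        Semaphore(1)  # may raise OSError where shared memory is unavailable
    except (ImportError, OSError):
        return False
    return True


def choose_executor(max_workers: int, use_threads: bool = False) -> Executor:
    """Hypothetical helper: prefer processes, fall back to threads."""
    if use_threads or not semaphores_available():
        return ThreadPoolExecutor(max_workers=max_workers)
    return ProcessPoolExecutor(max_workers=max_workers)
```

With threads, CPU-bound work done in Python itself is serialized by the GIL on standard builds, which is consistent with the commit's note that performance may be lower when work is not shelled out to subprocesses.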
James R. Barlow
7ca4ae4e16 Merge branch 'feature/pdfa-naming' 2025-09-11 16:37:53 -07:00
James R. Barlow
7e7e2f2e91 Raw value in pdfa XML block uses upper case codes, so account for this 2025-09-08 12:46:26 -07:00
clach04
d07231a7aa Doc typo plugins.md (#1568) 2025-09-08 12:07:51 -07:00
dependabot[bot]
0e831db9f4 Bump actions/setup-python from 5 to 6 (#1569)
Bumps [actions/setup-python](https://github.com/actions/setup-python) from 5 to 6.
- [Release notes](https://github.com/actions/setup-python/releases)
- [Commits](https://github.com/actions/setup-python/compare/v5...v6)

---
updated-dependencies:
- dependency-name: actions/setup-python
  dependency-version: '6'
  dependency-type: direct:production
  update-type: version-update:semver-major
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2025-09-08 12:07:28 -07:00
5HT2
650ca1c65b docs: Update screencast demo output to have corrected references to PDF/A compliance levels
See a7b0c0df6c for more information
2025-08-31 20:54:08 +01:00
5HT2
a7b0c0df6c fix(src): Refactor CLI help references to PDF/A compliance levels
Please see [RFC8118 4.](https://datatracker.ietf.org/doc/html/rfc8118#section-4) for examples regarding the PDF/A compliance naming scheme.
Please see [RFC8118 [ISOPDFA]](https://datatracker.ietf.org/doc/html/rfc8118#ref-ISOPDFA) for more complete information regarding the PDF/A compliance naming scheme.
2025-08-31 20:37:41 +01:00
5HT2
d735791524 fix(src): Refactor valid_part_conforms for PDF/A compliance levels 2025-08-31 20:32:30 +01:00
dependabot[bot]
66308c2813 Bump actions/download-artifact from 4 to 5 (#1557)
Bumps [actions/download-artifact](https://github.com/actions/download-artifact) from 4 to 5.
- [Release notes](https://github.com/actions/download-artifact/releases)
- [Commits](https://github.com/actions/download-artifact/compare/v4...v5)

---
updated-dependencies:
- dependency-name: actions/download-artifact
  dependency-version: '5'
  dependency-type: direct:production
  update-type: version-update:semver-major
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2025-08-18 13:43:34 -07:00
dependabot[bot]
d81de57bbc Bump actions/checkout from 4 to 5 (#1560)
Bumps [actions/checkout](https://github.com/actions/checkout) from 4 to 5.
- [Release notes](https://github.com/actions/checkout/releases)
- [Changelog](https://github.com/actions/checkout/blob/main/CHANGELOG.md)
- [Commits](https://github.com/actions/checkout/compare/v4...v5)

---
updated-dependencies:
- dependency-name: actions/checkout
  dependency-version: '5'
  dependency-type: direct:production
  update-type: version-update:semver-major
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2025-08-18 13:43:10 -07:00
Alina Bürge
a9a8b39dba Fix the use of the plugin_manager argument (#1555) 2025-08-18 13:00:39 -07:00
Stuart Henderson
fd5b8132ae add OpenBSD info to readme (#1554) 2025-08-18 12:49:21 -07:00