James R. Barlow
5acf21651f
ruff lint and format
2026-01-13 01:50:57 -08:00
James R. Barlow
4c7086c609
Replace typer with cyclopts CLI library in misc scripts
...
Migrate watcher.py and pdf_text_diff.py from typer to cyclopts for
CLI argument parsing. Update pyproject.toml to reflect the dependency
change in the watcher optional feature.
2026-01-13 00:43:14 -08:00
James R. Barlow
16c2604a07
Remove lossy JBIG2 support, retain lossless JBIG2 only
...
Lossy JBIG2 has been removed due to well-documented risks of character
substitution errors (e.g., 6/8 confusion). The --jbig2-lossy and
--jbig2-page-group-size arguments are now deprecated and ignored with
a warning.
Changes:
- Remove jbig2_lossy and jbig2_page_group_size from OCROptions
- Simplify optimize.py to use single-image JBIG2 encoding only
(no symbol dictionaries/JBIG2Globals)
- Remove convert_group() from jbig2enc.py
- Deprecate CLI args with warnings for backward compatibility
- Update documentation to explain lossless-only JBIG2
2025-12-23 02:45:07 -08:00
James R. Barlow
1f493ba789
refactor: post-AI code cleanup
2025-12-21 12:21:47 -08:00
5HT2
650ca1c65b
docs: Update screencast demo output to have corrected references to PDF/A compliance levels
...
See a7b0c0df6c for more information
2025-08-31 20:54:08 +01:00
James R. Barlow
4fc0c3a0d5
Add watcher test, such as it is
2025-08-13 01:04:58 -07:00
PunkPangolin
ee3da07710
Add appstream metainfo file + screenshot (#1462)
...
* Add io.ocrmypdf.ocrmypdf.metainfo.xml
* Create sample_screenshot.png
* Better screenshot
* Add screenshot to metainfo
* Move into /misc/flatpak
* Add screenshot URL
* Add icon and categories to metainfo
* Use installed icon instead of remote
* Add keywords to metainfo, change summary closer to Flathub Guidelines
2025-05-27 00:42:47 -07:00
James R. Barlow
6f16d0130a
Clarify that ocrmypdf-compare is a testing tool
2025-04-15 00:03:14 -07:00
James R. Barlow
d84c47816c
webservice: promote pages to primary option
2025-04-06 01:07:47 -07:00
James R. Barlow
6de6749062
webservice: fix download button downloads wrong file
2025-02-26 18:42:50 -08:00
James R. Barlow
b5bc1d209c
Remove ttyd
2025-02-26 14:53:13 -08:00
James R. Barlow
f02353686d
s/input/output
2025-01-04 12:18:07 -08:00
James R. Barlow
073a434ab3
Fix webservice interactions with Docker
2025-01-04 12:09:32 -08:00
James R. Barlow
55e7177dbe
Present similar interface in webservice.py
2025-01-04 01:04:58 -08:00
James R. Barlow
36c82e0659
Add debugging helper scripts
2025-01-01 18:03:15 -08:00
James R. Barlow
dd6ed4c5f8
Switch to streamlit based web app
2025-01-01 17:26:22 -08:00
James R. Barlow
a1b8113d56
Add bisect script
2024-11-08 11:09:13 -08:00
James R. Barlow
d5ff7f7db9
batch: fix issues flagged by ruff
2024-05-21 01:52:57 -07:00
James R. Barlow
579cef3649
watcher: Ensure output files are .pdf
2024-05-21 01:51:30 -07:00
James R. Barlow
065bddbc6c
Reformat with ruff format
2024-04-07 00:25:32 -07:00
NilsRo
feeb9f213f
batch example: added archive, small corrections and optimizations (#1277)
...
* Added archive, small corrections
Added a function to archive originals and avoid calling ocrmypdf if they are already PDF/A.
* Added Copyright
2024-03-18 13:22:24 -07:00
James R. Barlow
8d30cff4ef
Undo future annotations from watcher.py till Typer fixes its issue
...
Fixes #1258
2024-02-20 19:14:39 -08:00
James R. Barlow
3a3635f7f9
Python 3.10 cleanup, manual fixes
2024-02-14 12:48:17 -08:00
James R. Barlow
f69267bb67
watcher: restore ability to read json from file or command line string
2023-11-07 18:05:29 -08:00
James R. Barlow
55566d9830
Fix watcher.py kwarg error
2023-11-05 13:58:24 -08:00
James R. Barlow
52d99732b1
Fix mistakes with watcher loglevel handling
2023-10-28 00:47:40 -07:00
James R. Barlow
c6be3ba076
watcher: Improve parameter validation
2023-10-20 20:11:00 -07:00
James R. Barlow
0565cb0b10
misc/watcher.py: use Typer and dotenv to improve ease of use
2023-10-20 19:56:39 -07:00
James R. Barlow
dc49906704
Improve wait_for_file_ready loop
2023-10-20 19:55:50 -07:00
James R. Barlow
0388c23ae7
Merge branch 'feature/jbig2thresh' into v15
2023-09-21 00:07:05 -07:00
James R. Barlow
be12f7a728
Make fish completion a bit smarter
2023-09-20 14:45:22 -07:00
James R. Barlow
e3c813fc67
Added support for changing color conversion strategy
2023-09-20 01:08:15 -07:00
James R. Barlow
330352aeed
Update completions for jbig2 threshold
2023-09-17 14:47:46 -07:00
Srikar Sundaram
4bee7355e9
Change skip-ocr to skip-text (#1146)
2023-09-14 17:22:34 -07:00
James R. Barlow
a6ce35b13a
Add argument to override digital signatures
2023-08-12 01:31:36 -07:00
James R. Barlow
e44a57aec0
Try a screencast/terminal demo
2023-06-20 00:48:42 -07:00
James R. Barlow
33b70be7d5
ruff: more fixes, mainly missing docstrings
2023-04-14 02:16:38 -07:00
James R. Barlow
4924b11b6b
Additional ruff fixes
2023-04-14 01:25:16 -07:00
James R. Barlow
9b8d14d16e
Accept most of ruff's delinting
2023-04-14 00:45:34 -07:00
comzine
2685f910b1
watcher: added setting RETRIES_LOADING_FILE to avoid giving up too early (#1063)
2023-01-25 17:36:54 -08:00
Doug Rinckes
d09f61d4fe
log completion message (#1044)
...
This logs the "done" message if neither delete nor archive options are set.
2022-12-14 17:24:41 -08:00
James R. Barlow
7da4e6ca7f
Address some linter warnings
2022-09-21 00:05:12 -07:00
James R. Barlow
4b9ea40a0c
spdx: move identifiers to files that support them
...
If the apparent license changed, take this commit as correct.
2022-08-04 03:26:54 -07:00
James R. Barlow
80ed2117cc
Change to SPDX license tracking
2022-07-28 01:10:07 -07:00
James R. Barlow
dc6f1a266a
Modernize type annotations
2022-07-23 00:39:24 -07:00
Julius Bullinger
7cabbb125f
watcher: Add an option to archive processed originals (#951)
...
* watcher: Add an option to archive processed originals
This adds a feature from existing OCRmyPDF watchdog Docker containers like meyay/ocrmypdf-batch and unze/ocrmypdf-watchdog. With this option, the input directory can be kept clean from already processed files, without losing the originals.
* docs: Improve watcher.py's Docker parameters documentation
2022-06-17 15:17:03 -07:00
James Barlow
776ada6713
Upgrade pre-commit and associated tools; various lints
2022-04-03 20:53:01 -07:00
James R. Barlow
0323738ada
ocrmypdf.fish: fix indents
...
[ci skip]
2021-12-06 15:38:27 -08:00
FPille
aae5591f7e
Update ocrmypdf.bash completion
...
Squashed commit of the following:
commit 974de2e8ccad7fd34694f2c3a7a17c64bb52cdab
Merge: a8d7f969 ee04aa72
Author: James R. Barlow <james@purplerock.ca>
Date: Sat Dec 4 20:22:50 2021 -0800
Merge branch 'update_bash-completion' of git://github.com/FPille/OCRmyPDF into FPille-update_bash-completion
commit ee04aa7225
Author: FPille <f.pille@gmail.com>
Date: Thu Oct 14 11:09:23 2021 +0200
update
commit 76f64537aa
Author: FPille <f.pille@gmail.com>
Date: Thu Oct 14 11:04:10 2021 +0200
updated and descriptions for arguments and choices added
deprecated arguments removed
bug fix: typo "_init_completion" instead of "_init_completions"
commit de9b93e852
Merge: c23374de 42713b77
Author: Frank <50119297+FPille@users.noreply.github.com>
Date: Thu Oct 14 08:08:11 2021 +0200
Merge branch 'jbarlow83:master' into master
commit c23374de81
Merge: 40b2ebcb c409fa58
Author: Frank <50119297+FPille@users.noreply.github.com>
Date: Wed May 26 20:31:00 2021 +0200
Merge branch 'jbarlow83:master' into master
commit 40b2ebcb37
Merge: 79c84eef 7e388f59
Author: Frank <50119297+FPille@users.noreply.github.com>
Date: Sat Jun 1 11:09:07 2019 +0200
Merge pull request #1 from jbarlow83/master
update master
2021-12-06 15:38:26 -08:00
James R. Barlow
f91faf9795
Add new argument --tesseract-thresholding to control tesseract thresholding where available
...
Also add missing test for --tesseract-oem
2021-12-06 15:38:14 -08:00