From ed632ae366bb44a075a0ece91cc597cb200040d7 Mon Sep 17 00:00:00 2001
From: "James R. Barlow"
Date: Sun, 19 Jun 2022 01:01:33 -0700
Subject: [PATCH] docs: update batch to avoid suggesting Docker volumes [ci skip]

---
 docs/batch.rst | 9 ++++++---
 1 file changed, 6 insertions(+), 3 deletions(-)

diff --git a/docs/batch.rst b/docs/batch.rst
index 9003f7b1..9ceef054 100644
--- a/docs/batch.rst
+++ b/docs/batch.rst
@@ -42,12 +42,15 @@ place, and printing each filename in between runs:
 
     find . -printf '%p\n' -name '*.pdf' -exec ocrmypdf '{}' '{}' \;
 
-Alternatively, with a docker container (mounts a volume to the container
-where the PDFs are stored):
+Alternatively, with a Docker container and streaming the file through
+standard input and output:
 
 .. code-block:: bash
 
-    find . -printf '%p\n' -name '*.pdf' -exec docker run --rm -v : jbarlow83/ocrmypdf '/{}' '/{}' \;
+    find . -name '*.pdf' -print0 | xargs -0 | while read pdf; do
+        pdfout=$(mktemp)
+        docker run --rm -i jbarlow83/ocrmypdf - - <$pdf >$pdfout && cp $pdfout $pdf
+    done
 
 This only runs one ``ocrmypdf`` process at a time. This variation uses
 ``find`` to create a directory list and ``parallel`` to parallelize runs