mirror of
https://github.com/ocrmypdf/OCRmyPDF.git
synced 2026-05-04 12:48:02 -04:00
Debian now has a few disadvantages:
- there is no convenient PPA for Debian tesseract 4.0, but there is for Ubuntu
- Ubuntu sets locale to UTF-8 automatically, removing the need to do this

All three ocrmypdf docker images are now based on a common Ubuntu 16.10 image, derived from the one used to build ocrmypdf-tess4. -polyglot now differs from -tess4 only by opting into the tess4 PPA.

Both Ubuntu 16.10 and Debian stretch use tesseract 3.04.01 now, making the sharp.ttf patch unnecessary. /etc/apt/sources has been unused for a while now; both have newer Ghostscripts.
19 lines
500 B
Docker
# OCRmyPDF polyglot
#
# VERSION 4.4.2
FROM jbarlow83/ocrmypdf:latest
MAINTAINER James R. Barlow <jim@purplerock.ca>

USER root

# Update system and install our dependencies
RUN apt-get update && apt-get install -y --no-install-recommends \
    tesseract-ocr-all

RUN apt-get autoremove -y && apt-get clean -y

USER docker

# Must use array form of ENTRYPOINT
# Non-array form does not append other arguments, because that is "intuitive"
ENTRYPOINT ["/application/docker-wrapper.sh"]
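
The comment about the array form of ENTRYPOINT can be illustrated outside Docker. The sketch below is a simulation, not part of this image: `wrapper.sh` is a hypothetical stand-in for `/application/docker-wrapper.sh`, and `input.pdf` and `output.pdf` are made-up arguments. It assumes Docker's documented behavior that the array (exec) form runs the entrypoint directly with the `docker run` arguments appended, while the non-array (shell) form runs the entrypoint string through `/bin/sh -c`, so the extra arguments end up as positional parameters of that shell rather than arguments of the wrapper.

```shell
#!/bin/sh
# Hypothetical stand-in for /application/docker-wrapper.sh:
# it only reports how many arguments it received.
tmpdir=$(mktemp -d)
wrapper="$tmpdir/wrapper.sh"
cat > "$wrapper" <<'EOF'
#!/bin/sh
echo "wrapper got $# argument(s)"
EOF
chmod +x "$wrapper"

# Array (exec) form: the entrypoint is executed directly and the
# arguments are appended to its argument list.
exec_form_output=$("$wrapper" input.pdf output.pdf)

# Non-array (shell) form: the entrypoint string is handed to `sh -c`.
# The extra arguments become $0 and $1 of that shell, and the wrapper
# itself is started with no arguments.
shell_form_output=$(/bin/sh -c "$wrapper" input.pdf output.pdf)

echo "array form:     $exec_form_output"
echo "non-array form: $shell_form_output"

rm -rf "$tmpdir"
```

With the array form the wrapper sees both file names; with the non-array form it sees none, which is why the Dockerfile above uses the array form.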