Commit Graph

219 Commits

Author SHA1 Message Date
Dmitry Smirnov
8d928d525f codespell: spelling corrections
Signed-off-by: Dmitry Smirnov <onlyjob@member.fsf.org>
2019-11-13 08:15:00 +11:00
Peter Hunt
dcf3c742b1 Split up create config handling of namespaces and security
As it stands, createconfig is a huge struct. This works fine when the only caller is when we create a container with a fully created config. However, if we wish to share code for security and namespace configuration, a single large struct becomes unweildy, as well as difficult to configure with the single createConfigToOCISpec function.

This PR breaks up namespace and security configuration into their own structs, with the eventual goal of allowing the namespace/security fields to be configured by the pod create cli, and allow the infra container to share this with the pod's containers.

Signed-off-by: Peter Hunt <pehunt@redhat.com>
2019-11-07 21:23:23 -05:00
OpenShift Merge Robot
3ec9ee090e Merge pull request #4466 from giuseppe/notmpcopyup
mount: add new options nocopyup|copyup for tmpfs
2019-11-07 21:23:54 +01:00
Giuseppe Scrivano
4e5e9dbec2 mount: add new options nocopyup|copyup for tmpfs
add a way to disable tmpcopyup for tmpfs.

Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
2019-11-07 18:24:02 +01:00
Jakub Filak
2497b6c77b podman: add support for specifying MAC
I basically copied and adapted the statements for setting IP.

Closes #1136

Signed-off-by: Jakub Filak <jakub.filak@sap.com>
2019-11-06 16:22:19 +01:00
Giuseppe Scrivano
b8514ca6f3 namespaces: by default create cgroupns on cgroups v2
change the default on cgroups v2 and create a new cgroup namespace.

When a cgroup namespace is used, processes inside the namespace are
only able to see cgroup paths relative to the cgroup namespace root
and not have full visibility on all the cgroups present on the
system.

The previous behaviour is maintained on a cgroups v1 host, where a
cgroup namespace is not created by default.

Closes: https://github.com/containers/libpod/issues/4363

Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
2019-11-05 17:29:01 +01:00
Valentin Rothberg
11c282ab02 add libpod/config
Refactor the `RuntimeConfig` along with related code from libpod into
libpod/config.  Note that this is a first step of consolidating code
into more coherent packages to make the code more maintainable and less
prone to regressions on the long runs.

Some libpod definitions were moved to `libpod/define` to resolve
circular dependencies.

Signed-off-by: Valentin Rothberg <rothberg@redhat.com>
2019-10-31 17:42:37 +01:00
Valentin Rothberg
fb5367f295 seccomp: use github.com/seccomp/containers-golang
Use the github.com/seccomp/containers-golang library instead of the
docker package.  The docker package has changed and silently broke
on F31.

Signed-off-by: Valentin Rothberg <rothberg@redhat.com>
2019-10-30 11:43:29 +01:00
Nalin Dahyabhai
a4a70b4506 bump containers/image to v5.0.0, buildah to v1.11.4
Move to containers/image v5 and containers/buildah to v1.11.4.

Replace an equality check with a type assertion when checking for a
docker.ErrUnauthorizedForCredentials in `podman login`.

Signed-off-by: Nalin Dahyabhai <nalin@redhat.com>
2019-10-29 13:35:18 -04:00
OpenShift Merge Robot
674dc2bc75 Merge pull request #4228 from giuseppe/detect-no-systemd-session
rootless: detect no system session with --cgroup-manager=systemd
2019-10-24 01:20:25 +02:00
Matthew Heon
57eaea9539 Image volumes should not be mounted noexec
This matches Docker more closely, but retains the more important
protections of nosuid/nodev.

Fixes #4318

Signed-off-by: Matthew Heon <matthew.heon@pm.me>
2019-10-23 12:09:22 -04:00
Giuseppe Scrivano
13fe146840 rootless: detect no system session with --cgroup-manager=systemd
if the cgroup manager is set to systemd, detect if dbus is available,
otherwise fallback to --cgroup-manager=cgroupfs.

Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
2019-10-23 09:26:54 +02:00
Matthew Heon
0d623914d0 Add support for anonymous volumes to podman run -v
Previously, when `podman run` encountered a volume mount without
separate source and destination (e.g. `-v /run`) we would assume
that both were the same - a bind mount of `/run` on the host to
`/run` in the container. However, this does not match Docker's
behavior - in Docker, this makes an anonymous named volume that
will be mounted at `/run`.

We already have (more limited) support for these anonymous
volumes in the form of image volumes. Extend this support to
allow it to be used with user-created volumes coming in from the
`-v` flag.

This change also affects how named volumes created by the
container but given names are treated by `podman run --rm` and
`podman rm -v`. Previously, they would be removed with the
container in these cases, but this did not match Docker's
behaviour. Docker only removed anonymous volumes. With this patch
we move to that model as well; `podman run -v testvol:/test` will
not have `testvol` survive the container being removed by `podman
rm -v`.

The sum total of these changes let us turn on volume removal in
`--rm` by default.

Fixes: #4276

Signed-off-by: Matthew Heon <matthew.heon@pm.me>
2019-10-17 13:18:17 -04:00
OpenShift Merge Robot
a8993bab78 Merge pull request #4233 from mheon/fix_cc
Allow giving path to Podman for cleanup command
2019-10-12 19:26:37 +02:00
Matthew Heon
f00e1e0223 Allow giving path to Podman for cleanup command
For non-Podman users of Libpod, we don't want to force the exit
command to use ARGV[0], which probably does not support a cleanup
command.

Signed-off-by: Matthew Heon <matthew.heon@pm.me>
2019-10-11 14:28:41 -04:00
OpenShift Merge Robot
9f1f4ef19e Merge pull request #4235 from giuseppe/no-pids-cgroupfs
rootless: do not set PIDs limit if --cgroup-manager=cgroupfs
2019-10-11 16:45:46 +02:00
Giuseppe Scrivano
5036b6a9fb rootless: do not set PIDs limit if --cgroup-manager=cgroupfs
even if the system is using cgroups v2, rootless is not able to setup
limits when the cgroup-manager is not systemd.

Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
2019-10-11 13:34:51 +02:00
Giuseppe Scrivano
3ba3e1c751 systemd: expect full path /usr/sbin/init
"init" is a quite common name for the command executed in a container
image and Podman ends up using the systemd mode also when not
required.

Be stricter on enabling the systemd mode and not enable it
automatically when the basename is "init" but expect the full path
"/usr/sbin/init".

Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
2019-10-09 23:38:45 +02:00
OpenShift Merge Robot
c817ea1b33 Merge pull request #4032 from rhatdan/pids-limit
Setup a reasonable default for pids-limit 4096
2019-10-07 15:01:27 -07:00
Daniel J Walsh
118cf1fc63 Setup a reasonable default for pids-limit 4096
CRI-O defaults to 1024 for the maximum pids in a container.  Podman
should have a similar limit. Once we have a containers.conf, we can
set the limit in this file, and have it easily customizable.

Currently the documentation says that -1 sets pids-limit=max, but -1 fails.
This patch allows -1, but also indicates that 0 also sets the max pids limit.

Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
2019-10-04 16:09:13 -04:00
Miloslav Trmač
d3f59bedb3 Update c/image to v4.0.1 and buildah to 1.11.3
This requires updating all import paths throughout, and a matching
buildah update to interoperate.

I can't figure out the reason for go.mod tracking
	github.com/containers/image v3.0.2+incompatible // indirect
((go mod graph) lists it as a direct dependency of libpod, but
(go list -json -m all) lists it as an indirect dependency),
but at least looking at the vendor subdirectory, it doesn't seem
to be actually used in the built binaries.

Signed-off-by: Miloslav Trmač <mitr@redhat.com>
2019-10-04 20:18:23 +02:00
Giuseppe Scrivano
4ad2cd54a1 rootless: allow cgroupfs manager on cgroups v2
if there are no resources specified, make sure the OCI resources block
is empty so that the OCI runtime won't complain.

Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
2019-10-02 17:07:14 +02:00
Matthew Heon
d89414b1f0 Handle conflict between volumes and --read-only-tmpfs
When a named volume is mounted on any of the tmpfs filesystems
created by read-only tmpfs, it caused a conflict that was not
resolved prior to this.

Fixes BZ1755119

Signed-off-by: Matthew Heon <matthew.heon@pm.me>
2019-09-24 15:57:17 -04:00
Gabi Beyer
5813c8246e rootless: Rearrange setup of rootless containers
In order to run Podman with VM-based runtimes unprivileged, the
network must be set up prior to the container creation. Therefore
this commit modifies Podman to run rootless containers by:
  1. create a network namespace
  2. pass the netns persistent mount path to the slirp4netns
     to create the tap inferface
  3. pass the netns path to the OCI spec, so the runtime can
     enter the netns

Closes #2897

Signed-off-by: Gabi Beyer <gabrielle.n.beyer@intel.com>
2019-09-24 11:01:28 +02:00
Matthew Heon
720d8c9e3f Clean destination paths during mount generation
We identify and resolve conflicts in paths using destination path
matches. We require exact matches, largely for performance
reasons (we use maps to efficiently access, keyed by
destination). This usually works fine, until you get mounts that
are targetted at /output and /output/ - the same path, but not
the same string.

Use filepath.Clean() aggressively to try and solve this.

Signed-off-by: Matthew Heon <matthew.heon@pm.me>
2019-09-19 11:09:59 -04:00
OpenShift Merge Robot
799aa7022b Merge pull request #4034 from rhatdan/relabel
Add 'relabel' to --mount options
2019-09-17 13:02:23 +02:00
Daniel J Walsh
405ef9bc56 Add 'relabel' to --mount options
Currently if a user specifies a --mount option, their is no way to tell SELinux
to relabel the mount point.

This patch addes the relabel=shared and relabel=private options.

Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
2019-09-16 09:56:43 -04:00
Danila Kiver
c06661f041 Check for rootless before checking cgroups version in spec_test.
Signed-off-by: Danila Kiver <danila.kiver@mail.ru>
2019-09-15 21:28:13 +03:00
Danila Kiver
8ac57b48e1 Skip spec_test for rootless envs without cgroup v2.
Signed-off-by: Danila Kiver <danila.kiver@mail.ru>
2019-09-14 00:22:16 +03:00
Matthew Heon
c2284962c7 Add support for launching containers without CGroups
This is mostly used with Systemd, which really wants to manage
CGroups itself when managing containers via unit file.

Signed-off-by: Matthew Heon <matthew.heon@pm.me>
2019-09-10 10:52:37 -04:00
OpenShift Merge Robot
ab44484bec Merge pull request #3876 from mheon/fix_mount_flags
Allow suid, exec, dev mount options to cancel nosuid/noexec/nodev
2019-09-04 22:43:41 +02:00
Giuseppe Scrivano
759ca2cfc6 spec: provide custom implementation for getDevices
provide an implementation for getDevices that skip unreadable
directories for the current user.

Based on the implementation from runc/libcontainer.

Closes: https://github.com/containers/libpod/issues/3919

Signed-off-by: Giuseppe Scrivano <giuseppe@scrivano.org>
Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
2019-09-02 13:27:47 +02:00
Giuseppe Scrivano
b101a8d366 spec: do not set devices cgroup when rootless
eBPF requires to be root in the init namespace.

Signed-off-by: Giuseppe Scrivano <giuseppe@scrivano.org>
2019-09-02 13:03:20 +02:00
Giuseppe Scrivano
ba1c57030f rootless: bind mount devices instead of creating them
when running in rootless mode, --device creates a bind mount from the
host instead of specifying the device in the OCI configuration.  This
is required as an unprivileged user cannot use mknod, even when root
in a user namespace.

Closes: https://github.com/containers/libpod/issues/3905

Signed-off-by: Giuseppe Scrivano <giuseppe@scrivano.org>
2019-09-02 13:03:19 +02:00
Matthew Heon
96812dc490 Fix addition of mount options when using RO tmpfs
For read-only containers set to create tmpfs filesystems over
/run and other common destinations, we were incorrectly setting
mount options, resulting in duplicate mount options.

Signed-off-by: Matthew Heon <matthew.heon@pm.me>
2019-08-28 14:28:18 -04:00
Matthew Heon
5bdd97f77f Set base mount options for bind mounts from base system
If I mount, say, /usr/bin into my container - I expect to be able
to run the executables in that mount. Unconditionally applying
noexec would be a bad idea.

Before my patches to change mount options and allow exec/dev/suid
being set explicitly, we inferred the mount options from where on
the base system the mount originated, and the options it had
there. Implement the same functionality for the new option
handling.

There's a lot of performance left on the table here, but I don't
know that this is ever going to take enough time to make it worth
optimizing.

Signed-off-by: Matthew Heon <matthew.heon@pm.me>
2019-08-28 14:28:18 -04:00
Matthew Heon
d45595d9cc Don't double-process tmpfs options
We already process the options on all tmpfs filesystems during
final addition of mounts to the spec. We don't need to do it
before that in parseVolumes.

Signed-off-by: Matthew Heon <matthew.heon@pm.me>
2019-08-28 14:28:18 -04:00
Matthew Heon
02264d597f Add support for 'exec', 'suid', 'dev' mount flags
Previously, we explicitly set noexec/nosuid/nodev on every mount,
with no ability to disable them. The 'mount' command on Linux
will accept their inverses without complaint, though - 'noexec'
is counteracted by 'exec', 'nosuid' by 'suid', etc. Add support
for passing these options at the command line to disable our
explicit forcing of security options.

This also cleans up mount option handling significantly. We are
still parsing options in more than one place, which isn't good,
but option parsing for bind and tmpfs mounts has been unified.

Fixes: #3819
Fixes: #3803

Signed-off-by: Matthew Heon <matthew.heon@pm.me>
2019-08-28 14:28:18 -04:00
Ashley Cui
2eda50cb31 Remove --tmpfs size default
Docker has unlimited tmpfs size where Podman had it set to 64mb. Should be standard between the two.
Remove noexec default

Signed-off-by: Ashley Cui <ashleycui16@gmail.com>
2019-08-14 09:42:33 -04:00
Giuseppe Scrivano
0ecf0aa1b8 storage: drop unused geteuid check
it is always running with euid==0 at this point.

Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
2019-08-12 12:30:20 +02:00
OpenShift Merge Robot
5701fe6689 Merge pull request #3744 from mheon/fix_command
When populating CMD, do not include Entrypoint
2019-08-08 14:32:27 +02:00
OpenShift Merge Robot
8776a577bf Merge pull request #3738 from mheon/mount_opts_bools
Allow --ro=[true|false] with mount flag
2019-08-08 14:20:29 +02:00
Matthew Heon
c0a124ea89 Allow --ro=[true|false] with mount flag
The 'podman run --mount' flag previously allowed the 'ro' option
to be specified, but was missing the ability to set it to a bool
(as is allowed by docker). Add that. While we're at it, allow
setting 'rw' explicitly as well.

Fixes #2980

Signed-off-by: Matthew Heon <matthew.heon@pm.me>
2019-08-07 10:03:01 -04:00
Peter Hunt
a602e44e74 refer to container whose namespace we share
Signed-off-by: Peter Hunt <pehunt@redhat.com>
2019-08-07 09:53:39 -04:00
Peter Hunt
a87fb78dd1 Properly share UTS namespaces in a pod
Sharing a UTS namespace means sharing the hostname. Fix situations where a container in a pod didn't properly share the hostname of the pod.

Signed-off-by: Peter Hunt <pehunt@redhat.com>
2019-08-07 08:44:08 -04:00
Matthew Heon
28b545d04c When populating CMD, do not include Entrypoint
Previously, we use CreateConfig's Command to populate container
Command (which is used as CMD for Inspect and Commit).
Unfortunately, CreateConfig's Command is the container's full
command, including a prepend of Entrypoint - so we duplicate
Entrypoint for images that include it.

Maintain a separate UserCommand in CreateConfig that does not
include the entrypoint, and use that instead.

Fixes #3708

Signed-off-by: Matthew Heon <matthew.heon@pm.me>
2019-08-06 16:11:42 -04:00
baude
97b84dedf3 Revert "rootless: Rearrange setup of rootless containers"
This reverts commit 80dcd4bebc.

Signed-off-by: baude <bbaude@redhat.com>
2019-08-06 09:51:38 -05:00
OpenShift Merge Robot
e2f38cdaa4 Merge pull request #3310 from gabibeyer/rootlessKata
rootless: Rearrange setup of rootless containers ***CIRRUS: TEST IMAGES***
2019-08-05 14:26:04 +02:00
OpenShift Merge Robot
e3240daa47 Merge pull request #3551 from mheon/fix_memory_leak
Fix memory leak with exit files
2019-08-02 03:44:43 +02:00
Matthew Heon
6bbeda6da5 Pass on events-backend config to cleanup processes
Signed-off-by: Matthew Heon <matthew.heon@pm.me>
2019-08-01 12:37:24 -04:00