Commit Graph

192 Commits

Author SHA1 Message Date
Danila Kiver
c06661f041 Check for rootless before checking cgroups version in spec_test.
Signed-off-by: Danila Kiver <danila.kiver@mail.ru>
2019-09-15 21:28:13 +03:00
Danila Kiver
8ac57b48e1 Skip spec_test for rootless envs without cgroup v2.
Signed-off-by: Danila Kiver <danila.kiver@mail.ru>
2019-09-14 00:22:16 +03:00
Matthew Heon
c2284962c7 Add support for launching containers without CGroups
This is mostly used with Systemd, which really wants to manage
CGroups itself when managing containers via unit file.

Signed-off-by: Matthew Heon <matthew.heon@pm.me>
2019-09-10 10:52:37 -04:00
OpenShift Merge Robot
ab44484bec Merge pull request #3876 from mheon/fix_mount_flags
Allow suid, exec, dev mount options to cancel nosuid/noexec/nodev
2019-09-04 22:43:41 +02:00
Giuseppe Scrivano
759ca2cfc6 spec: provide custom implementation for getDevices
provide an implementation for getDevices that skip unreadable
directories for the current user.

Based on the implementation from runc/libcontainer.

Closes: https://github.com/containers/libpod/issues/3919

Signed-off-by: Giuseppe Scrivano <giuseppe@scrivano.org>
Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
2019-09-02 13:27:47 +02:00
Giuseppe Scrivano
b101a8d366 spec: do not set devices cgroup when rootless
eBPF requires to be root in the init namespace.

Signed-off-by: Giuseppe Scrivano <giuseppe@scrivano.org>
2019-09-02 13:03:20 +02:00
Giuseppe Scrivano
ba1c57030f rootless: bind mount devices instead of creating them
when running in rootless mode, --device creates a bind mount from the
host instead of specifying the device in the OCI configuration.  This
is required as an unprivileged user cannot use mknod, even when root
in a user namespace.

Closes: https://github.com/containers/libpod/issues/3905

Signed-off-by: Giuseppe Scrivano <giuseppe@scrivano.org>
2019-09-02 13:03:19 +02:00
Matthew Heon
96812dc490 Fix addition of mount options when using RO tmpfs
For read-only containers set to create tmpfs filesystems over
/run and other common destinations, we were incorrectly setting
mount options, resulting in duplicate mount options.

Signed-off-by: Matthew Heon <matthew.heon@pm.me>
2019-08-28 14:28:18 -04:00
Matthew Heon
5bdd97f77f Set base mount options for bind mounts from base system
If I mount, say, /usr/bin into my container - I expect to be able
to run the executables in that mount. Unconditionally applying
noexec would be a bad idea.

Before my patches to change mount options and allow exec/dev/suid
being set explicitly, we inferred the mount options from where on
the base system the mount originated, and the options it had
there. Implement the same functionality for the new option
handling.

There's a lot of performance left on the table here, but I don't
know that this is ever going to take enough time to make it worth
optimizing.

Signed-off-by: Matthew Heon <matthew.heon@pm.me>
2019-08-28 14:28:18 -04:00
Matthew Heon
d45595d9cc Don't double-process tmpfs options
We already process the options on all tmpfs filesystems during
final addition of mounts to the spec. We don't need to do it
before that in parseVolumes.

Signed-off-by: Matthew Heon <matthew.heon@pm.me>
2019-08-28 14:28:18 -04:00
Matthew Heon
02264d597f Add support for 'exec', 'suid', 'dev' mount flags
Previously, we explicitly set noexec/nosuid/nodev on every mount,
with no ability to disable them. The 'mount' command on Linux
will accept their inverses without complaint, though - 'noexec'
is counteracted by 'exec', 'nosuid' by 'suid', etc. Add support
for passing these options at the command line to disable our
explicit forcing of security options.

This also cleans up mount option handling significantly. We are
still parsing options in more than one place, which isn't good,
but option parsing for bind and tmpfs mounts has been unified.

Fixes: #3819
Fixes: #3803

Signed-off-by: Matthew Heon <matthew.heon@pm.me>
2019-08-28 14:28:18 -04:00
Ashley Cui
2eda50cb31 Remove --tmpfs size default
Docker has unlimited tmpfs size where Podman had it set to 64mb. Should be standard between the two.
Remove noexec default

Signed-off-by: Ashley Cui <ashleycui16@gmail.com>
2019-08-14 09:42:33 -04:00
Giuseppe Scrivano
0ecf0aa1b8 storage: drop unused geteuid check
it is always running with euid==0 at this point.

Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
2019-08-12 12:30:20 +02:00
OpenShift Merge Robot
5701fe6689 Merge pull request #3744 from mheon/fix_command
When populating CMD, do not include Entrypoint
2019-08-08 14:32:27 +02:00
OpenShift Merge Robot
8776a577bf Merge pull request #3738 from mheon/mount_opts_bools
Allow --ro=[true|false] with mount flag
2019-08-08 14:20:29 +02:00
Matthew Heon
c0a124ea89 Allow --ro=[true|false] with mount flag
The 'podman run --mount' flag previously allowed the 'ro' option
to be specified, but was missing the ability to set it to a bool
(as is allowed by docker). Add that. While we're at it, allow
setting 'rw' explicitly as well.

Fixes #2980

Signed-off-by: Matthew Heon <matthew.heon@pm.me>
2019-08-07 10:03:01 -04:00
Peter Hunt
a602e44e74 refer to container whose namespace we share
Signed-off-by: Peter Hunt <pehunt@redhat.com>
2019-08-07 09:53:39 -04:00
Peter Hunt
a87fb78dd1 Properly share UTS namespaces in a pod
Sharing a UTS namespace means sharing the hostname. Fix situations where a container in a pod didn't properly share the hostname of the pod.

Signed-off-by: Peter Hunt <pehunt@redhat.com>
2019-08-07 08:44:08 -04:00
Matthew Heon
28b545d04c When populating CMD, do not include Entrypoint
Previously, we use CreateConfig's Command to populate container
Command (which is used as CMD for Inspect and Commit).
Unfortunately, CreateConfig's Command is the container's full
command, including a prepend of Entrypoint - so we duplicate
Entrypoint for images that include it.

Maintain a separate UserCommand in CreateConfig that does not
include the entrypoint, and use that instead.

Fixes #3708

Signed-off-by: Matthew Heon <matthew.heon@pm.me>
2019-08-06 16:11:42 -04:00
baude
97b84dedf3 Revert "rootless: Rearrange setup of rootless containers"
This reverts commit 80dcd4bebc.

Signed-off-by: baude <bbaude@redhat.com>
2019-08-06 09:51:38 -05:00
OpenShift Merge Robot
e2f38cdaa4 Merge pull request #3310 from gabibeyer/rootlessKata
rootless: Rearrange setup of rootless containers ***CIRRUS: TEST IMAGES***
2019-08-05 14:26:04 +02:00
OpenShift Merge Robot
e3240daa47 Merge pull request #3551 from mheon/fix_memory_leak
Fix memory leak with exit files
2019-08-02 03:44:43 +02:00
Matthew Heon
6bbeda6da5 Pass on events-backend config to cleanup processes
Signed-off-by: Matthew Heon <matthew.heon@pm.me>
2019-08-01 12:37:24 -04:00
Daniel J Walsh
e7aca5568a Use buildah/pkg/parse volume parsing rather then internal version
We share this code with buildah, so we should eliminate the podman
version.

Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
2019-08-01 03:48:16 -04:00
Gabi Beyer
80dcd4bebc rootless: Rearrange setup of rootless containers
In order to run Podman with VM-based runtimes unprivileged, the
network must be set up prior to the container creation. Therefore
this commit modifies Podman to run rootless containers by:
  1. create a network namespace
  2. pass the netns persistent mount path to the slirp4netns
     to create the tap inferface
  3. pass the netns path to the OCI spec, so the runtime can
     enter the netns

Closes #2897

Signed-off-by: Gabi Beyer <gabrielle.n.beyer@intel.com>
2019-07-30 23:28:52 +00:00
Daniel J Walsh
141c7a5165 Vendor in buildah 1.9.2
Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
2019-07-30 16:48:18 -04:00
Giuseppe Scrivano
1d72f651e4 podman: support --userns=ns|container
allow to join the user namespace of another container.

Closes: https://github.com/containers/libpod/issues/3629

Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
2019-07-25 23:04:55 +02:00
baude
db826d5d75 golangci-lint round #3
this is the third round of preparing to use the golangci-lint on our
code base.

Signed-off-by: baude <bbaude@redhat.com>
2019-07-21 14:22:39 -05:00
OpenShift Merge Robot
2254a35d3a Merge pull request #3593 from giuseppe/rootless-privileged-devices
rootless: add host devices with --privileged
2019-07-18 19:50:22 +02:00
Giuseppe Scrivano
350ede1eeb rootless: add rw devices with --privileged
when --privileged is specified, add all the devices that are usable by
the user.

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1730773

Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
2019-07-18 17:07:50 +02:00
Giuseppe Scrivano
0b57e77d7c libpod: support for cgroup namespace
allow a container to run in a new cgroup namespace.

When running in a new cgroup namespace, the current cgroup appears to
be the root, so that there is no way for the container to access
cgroups outside of its own subtree.

By default it uses --cgroup=host to keep the previous behavior.

To create a new namespace, --cgroup=private must be provided.

Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
2019-07-18 10:32:25 +02:00
Matthew Heon
c91bc31570 Populate inspect with security-opt settings
We can infer no-new-privileges. For now, manually populate
seccomp (can't infer what file we sourced from) and
SELinux/Apparmor (hard to tell if they're enabled or not).

Signed-off-by: Matthew Heon <mheon@redhat.com>
2019-07-17 16:48:38 -04:00
Matthew Heon
1e3e99f2fe Move the HostConfig portion of Inspect inside libpod
When we first began writing Podman, we ran into a major issue
when implementing Inspect. Libpod deliberately does not tie its
internal data structures to Docker, and stores most information
about containers encoded within the OCI spec. However, Podman
must present a CLI compatible with Docker, which means it must
expose all the information in 'docker inspect' - most of which is
not contained in the OCI spec or libpod's Config struct.

Our solution at the time was the create artifact. We JSON'd the
complete CreateConfig (a parsed form of the CLI arguments to
'podman run') and stored it with the container, restoring it when
we needed to run commands that required the extra info.

Over the past month, I've been looking more at Inspect, and
refactored large portions of it into Libpod - generating them
from what we know about the OCI config and libpod's (now much
expanded, versus previously) container configuration. This path
comes close to completing the process, moving the last part of
inspect into libpod and removing the need for the create
artifact.

This improves libpod's compatability with non-Podman containers.
We no longer require an arbitrarily-formatted JSON blob to be
present to run inspect.

Fixes: #3500

Signed-off-by: Matthew Heon <matthew.heon@pm.me>
2019-07-17 16:48:38 -04:00
Giuseppe Scrivano
2f0ed531c7 spec: rework --ulimit host
it seems enough to not specify any ulimit block to maintain the host
limits.

Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
2019-07-17 13:01:21 +02:00
OpenShift Merge Robot
9d87945005 Merge pull request #3563 from giuseppe/fix-single-mapping-rootless
spec: fix userns with less than 5 gids
2019-07-12 22:31:37 +02:00
OpenShift Merge Robot
62352b280b Merge pull request #3537 from QiWang19/volumeabs
fix bug convert volume host path to absolute
2019-07-12 22:12:21 +02:00
Giuseppe Scrivano
d74db186a8 spec: fix userns with less than 5 gids
when the container is running in a user namespace, check if gid=5 is
available, otherwise drop the option gid=5 for /dev/pts.

Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
2019-07-12 11:35:03 +02:00
OpenShift Merge Robot
2b64f88446 Merge pull request #3491 from giuseppe/rlimit-host
podman: add --ulimit host
2019-07-11 21:35:37 +02:00
baude
e053e0e05e first pass of corrections for golangci-lint
Signed-off-by: baude <bbaude@redhat.com>
2019-07-10 15:52:17 -05:00
Qi Wang
f50f91079a fix bug convert volume host path to absolute
fix #3504 If --volume host:dest host is not a named volume, convert the host to a absolute directory path.

Signed-off-by: Qi Wang <qiwan@redhat.com>
2019-07-10 10:26:57 -04:00
Giuseppe Scrivano
fb88074e68 podman: add --ulimit host
add a simple way to copy ulimit values from the host.

if --ulimit host is used then the current ulimits in place are copied
to the container.

Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
2019-07-08 19:22:54 +02:00
Giuseppe Scrivano
825506d8f8 spec: move cgo stuff to their own file
so it can build without cgo since seccomp requires it.

Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
2019-07-02 16:41:03 +02:00
Giuseppe Scrivano
5d25a4793d util: drop IsCgroup2UnifiedMode and use it from cgroups
Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
2019-06-26 13:17:04 +02:00
baude
dd81a44ccf remove libpod from main
the compilation demands of having libpod in main is a burden for the
remote client compilations.  to combat this, we should move the use of
libpod structs, vars, constants, and functions into the adapter code
where it will only be compiled by the local client.

this should result in cleaner code organization and smaller binaries. it
should also help if we ever need to compile the remote client on
non-Linux operating systems natively (not cross-compiled).

Signed-off-by: baude <bbaude@redhat.com>
2019-06-25 13:51:24 -05:00
Matthew Heon
30f24bb760 Add tests for cached and delegated mounts
Signed-off-by: Matthew Heon <matthew.heon@pm.me>
2019-06-19 09:57:33 -04:00
Matthew Heon
8e5b294ac3 Allow (but ignore) Cached and Delegated volume options
These are only used on OS X Docker, and ignored elsewhere - but
since they are ignored, they're guaranteed to be safe everywhere,
and people are using them.

Fixes: #3340

Signed-off-by: Matthew Heon <matthew.heon@pm.me>
2019-06-18 17:02:20 -04:00
OpenShift Merge Robot
90e3c9002b Merge pull request #3328 from mheon/storage_opts_for_cleanup
When creating exit command, pass storage options on
2019-06-15 00:18:53 +02:00
Matthew Heon
b2bdbf331e When creating exit command, pass storage options on
We made changes earlier that empty storage options when setting
storage driver explicitly. Unfortunately, this breaks rootless
cleanup commands, as they lose the fuse-overlayfs mount program
path.

Fix this by passing along the storage options to the cleanup
process.

Also, fix --syslog, which was broken a while ago (probably when
we broke up main to add main_remote).

Fixes #3326

Signed-off-by: Matthew Heon <mheon@redhat.com>
2019-06-13 15:19:17 -04:00
Giuseppe Scrivano
23efe4cb81 storage: support --mount type=bind,bind-nonrecursive
add support for not recursive bind mounts.

Closes: https://github.com/containers/libpod/issues/3314

Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
2019-06-13 15:32:45 +02:00
Giuseppe Scrivano
97f4818ce1 storage: fix typo
Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
2019-06-13 11:29:07 +02:00