Commit Graph

1236 Commits

Author SHA1 Message Date
OpenShift Merge Robot
d09edd2820 Merge pull request #19043 from dgibson/fix19021
pasta: Remove some leftover code from pasta bats tests
2023-06-29 16:22:30 +02:00
David Gibson
e4efd709d9 Revert^3 "pasta: Use two connections instead of three in TCP range forward tests"
This reverts commit c2a24abc0d, which
itself reverted 1c08f2edac, which
reverted e33f4e0bc7.

The original e33f4e0bc7 "pasta: Use two connections instead of three
in TCP range forward tests" was a workaround to avoid intermittent
errors in CI where the pasta networking port range forwarding tests
would fail.  It was reverted and unreverted when we thought we'd fixed
the problem, but that turned out not to be the case.

We're now much more confident that we've genuinely found and fixed (or
at least, worked around) the underlying problem, so we revert it again.

Link: https://github.com/containers/podman/issues/17287

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2023-06-29 16:15:02 +10:00
David Gibson
17cd5aecbb pasta: Workaround occasional socat failures in CI
With a number of the port range forwarding tests, we've seen occasional
failures where the sending socat fails with an EINTR on connect().  This
was mitigated by e33f4e0bc7 "pasta: Use two connections instead of three
in TCP range forward tests" (which has been reverted and un-reverted
several times).  However, this did not eliminate the problem, for example
see [0].

For the failing tests we are using the socat address "EXEC:printf x" to
make socat invoke printf(1) to generate a single byte of data to transfer.
Closer analysis shows that the SIGCHLD as the printf process ends is
occasionally intersecting with the connect() call causing this failure.

This is arguably a bug in socat, to not handle this race one way or
another.  However, we can easily workaround the problem by using a
temporary file with the data to transfer, rather than invoking printf every
time.  Do this, to avoid the flakiness of these tests.

[0]
https://github.com/containers/podman/issues/17287#issuecomment-1611855165

Closes: https://github.com/containers/podman/issues/17287

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2023-06-29 15:53:13 +10:00
David Gibson
13c7d05cc1 pasta: Remove some leftover code from pasta bats tests
https://github.com/containers/podman/pull/19021 fixed bugs with the pasta
networking tests not working on hosts with multiple interfaces.  Alas, the
patch left in some stale code that generates spurious error messages for
the IPv6 case.  This is sort of harmless - later code overrides what's done
here and the tests can pass anyway.  However if a test fails for some other
reason it means we get a misleading irrelevant error message.

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2023-06-29 12:51:44 +10:00
Daniel J Walsh
b6e636cbe2 Remove 'inspecting object' from inspect errors
This is just useless noise and gets us closer to what
Docker returns.

Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
2023-06-28 08:19:37 -04:00
OpenShift Merge Robot
be49741dc7 Merge pull request #19021 from dgibson/bug19007
pasta: Fix pasta tests to work on hosts with multiple interfaces
2023-06-28 13:06:43 +02:00
David Gibson
fe8355be7f pasta: Fix pasta tests to work on hosts with multiple interfaces
At various points the pasta bats tests need to know the name of the
interface that pasta will use by default, and the host addresses it will
use by default.  Currently we use the pre-existing helper functions
ether_get_name and ipv[46]_get_addr_global to retreive that.

However, those just pick the first non-loopback interface or address, which
may not be the one that pasta uses if there are multiple connected host
interfaces.

Replace those helpers with local ones which examine the routing table to
more closely match pasta's internal logic about which interface to select.
This allows the tests to run successfully on a host with multiple
interfaces.

Closes: https://github.com/containers/podman/issues/19007

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2023-06-28 13:12:33 +10:00
Daniel J Walsh
bcb89fc8b2 Fix readonly=false failure
There was a huge cut and paste of mount options which were not constent
in parsing tmpfs, bind and volume mounts.  Consolidated into a single
function to guarantee all parse the same.

Fixes: https://github.com/containers/podman/issues/18995

Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
2023-06-27 16:57:21 -04:00
Paul Holzinger
6eaf8a271d tests: fix "Storing signatures" check
After[1] c/image no longer prints "Storing signatures" so we should
not check for it.

[1] https://github.com/containers/image/pull/2001

Signed-off-by: Paul Holzinger <pholzing@redhat.com>
2023-06-27 18:04:42 +02:00
OpenShift Merge Robot
c2d46acdea Merge pull request #18980 from vrothberg/bz-2216700
make image listing more resilient
2023-06-26 22:42:37 +02:00
OpenShift Merge Robot
68f71f49d6 Merge pull request #19002 from giuseppe/skip-devices-userns
specgen: raise error with --device-cgroup-rule in a userns
2023-06-26 22:34:54 +02:00
Giuseppe Scrivano
0220f33384 specgen, rootless: raise error with --device-cgroup-rule
we were silently ignoring --device-cgroup-rule in rootless mode.  Make
sure an error is returned if the user tries to use it.

Closes: https://github.com/containers/podman/issues/18698

Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
2023-06-26 17:36:55 +02:00
Valentin Rothberg
db37d66cd1 make image listing more resilient
Handle more TOCTOUs operating on listed images.  Also pull in
containers/common/pull/1520 and containers/common/pull/1522 which do the
same on the internal layer tree.

Fixes: https://bugzilla.redhat.com/show_bug.cgi?id=2216700
Signed-off-by: Valentin Rothberg <vrothberg@redhat.com>
2023-06-26 16:34:26 +02:00
Ed Santiago
dde6bcbca3 system tests: add and use _prefetch
Add new _prefetch helper for fetching and caching images.
Use it in a few places, most importantly 120-load.bats
where our teardown() now runs 'rmi -af'.

Reason: in #17911 we discovered that podman save + load do
not actually preserve the image: annotations and other metadata
are lost. This means that a test which runs after 120-load.bats
is operating on a different $IMAGE than a test which runs before.

This is not a problem except in very obscure corner cases, like
one fixed in #18542, but it seems irresponsible to just handwave
that issue away

The _prefetch function uses skopeo for fetching and saving
images, because skopeo preserves digests and metadata.

[Side note for posterity: I tried amending basic_setup() to
always rmi -a + prefetch, instead of the current images -a +
rmi unwanted ones. That slowed down system tests by 10 minutes,
presumably because loads are much slower than queries. I reverted
that change and am documenting it as a reminder of why we do things
the way we do.]

Signed-off-by: Ed Santiago <santiago@redhat.com>
2023-06-26 06:51:01 -06:00
Valentin Rothberg
1398cbce8a container wait: support health states
Support two new wait conditions, "healthy" and "unhealthy".  This
further paves the way for integrating sdnotify with health checks which
is currently being tracked in #6160.

Fixes: #13627
Signed-off-by: Valentin Rothberg <vrothberg@redhat.com>
2023-06-23 14:16:32 +02:00
Ed Santiago
00292ae1c4 systests: test instrumentation
for #18514: if we get a timeout in teardown(), run and show
the output of podman system locks

for #18831: if we hit unmount/EINVAL, nothing will ever work
again, so signal all future tests to skip.

Signed-off-by: Ed Santiago <santiago@redhat.com>
2023-06-21 11:12:32 -06:00
Valentin Rothberg
47e0557d57 auto update: restart instead of stop+start
Commit f131eaa74a changed restart to a stop+start motivated by
comments in the systemd man pages that restart behaves different than
stop+start, for instance, that it keeps certain resources open and
treats timers differently.  Yet, the actually fix for #17607 in the very
same commit was dealing with an ENOENT of the CID file on container
removal.

As it turns out in in #18926, changing to stop+start regressed on
restarting dependencies when auto updating a systemd unit.  Hence, move
back to using restart to make sure that dependent systemd units are
restarted as well.

An alternative could be recommending to use `BindsTo=` in Quadlet files
but this seems less common than `Requires=` and hence more risky to
cause issues on user sites.

Fixes: #18926
Signed-off-by: Valentin Rothberg <vrothberg@redhat.com>
2023-06-19 09:37:21 +02:00
OpenShift Merge Robot
719e3228b1 Merge pull request #18900 from Luap99/pasta
pasta: use code from c/common
2023-06-16 02:40:07 -04:00
Paul Holzinger
5ffbfd937d pasta: use code from c/common
The code was moved to c/common so use that instead. Also add tests for
the new pasta_options config field. However there is one outstanding
problem[1]: pasta rejects most options when set more than once. Thus it is
impossible to overwrite most of them on the cli. If we cannot fix this
in pasta I need to make further changes in c/common to dedup the
options.

[1] https://archives.passt.top/passt-dev/895dae7d-3e61-4ef7-829a-87966ab0bb3a@redhat.com/

Signed-off-by: Paul Holzinger <pholzing@redhat.com>
2023-06-15 16:14:49 +02:00
Lokesh Mandvekar
3efaffae43 New command: podmansh
This commit creates a new command `podmansh` command which can be used by
administrators to provide a confined shell to their users.

The user will only have access to the volumes and capabilities for that
user.

Co-authored-by: Paul Holzinger <pholzing@redhat.com>
Co-authored-by: Daniel Walsh <dwalsh@redhat.com>
Co-authored-by: Petr Lautrbach <lautrbach@redhat.com>
Co-authored-by: Ed Santiago <santiago@redhat.com>

Signed-off-by: Lokesh Mandvekar <lsm5@fedoraproject.org>
2023-06-15 08:14:12 -04:00
Daniel J Walsh
c28a43efd7 Verify podman pull dup image only prints id once
Fixes: #18647

Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
2023-06-13 22:07:29 -04:00
Toshiki Sonoda
6f821634ad libpod: Podman info output more network information
podman info prints the network information about binary path,
package version, program version and DNS information.

Fixes: #18443

Signed-off-by: Toshiki Sonoda <sonoda.toshiki@fujitsu.com>
2023-06-13 11:19:29 +09:00
Valentin Rothberg
faa2689dcd 250-systemd.bats: remove outdated comment
Remove an outdated comment on the absence of exit-code propagation when
running K8s workloads in systemd.  The `podman-kube@` systemd template
is using default restart policy of the system.  The exit-code
propagation is tested in other tests, so we can keep the logic as is.

Signed-off-by: Valentin Rothberg <vrothberg@redhat.com>
2023-06-12 13:18:50 +02:00
Ed Santiago
992093ae91 filters: better handling of id=
For filter=id=XXX (containers, pods) and =ctr-ids=XXX (pods):

  if XXX is only hex characters, treat it as a PREFIX
  otherwise, treat it as a REGEX

Add tests. Update documentation. And fix an incorrect help message.

Fixes: #18471

Signed-off-by: Ed Santiago <santiago@redhat.com>
2023-06-07 05:29:06 -06:00
OpenShift Merge Robot
eec15a108a Merge pull request #18657 from arizvisa/GH-18120
Added the "--out" parameter and fixed an issue with "--noout" which prevented stdout from being written to.
2023-06-05 14:34:21 -04:00
Stefano Brivio
cf9bc25bbc pasta: Test handling of unknown protocols
Test that pasta generates a sensible error message if asked to forward a
protocol it doesn't understand.

Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2023-06-05 19:32:54 +10:00
OpenShift Merge Robot
a7e23d341d Merge pull request #18756 from Luap99/tz
libpod: fix timezone handling
2023-06-01 14:16:20 -04:00
OpenShift Merge Robot
50f934587f Merge pull request #18758 from Luap99/systemd-restart
test/system: quadlet use correct systemd restart policy
2023-06-01 07:52:02 -04:00
Ed Santiago
543b809495 systests: fixes for coping with extra systemd image
We _usually_ have only one image in store, $IMAGE, but it's
perfectly fine to also have $SYSTEMD_IMAGE also. Fix a few
tests so they can handle that condition.

And, cleanup:
 - remove a no-longer-useful test ("podman load NEWNAME",
   functionality that was removed 2+ years ago in #8877)
 - reorder some tests in the image-mount test, to make
   them safer and easier to understand
 - use no-such-image, not no-such-container, in image-mount test.
   Computer don't care, but this human felt confused for a sec.

Signed-off-by: Ed Santiago <santiago@redhat.com>
2023-06-01 11:04:31 +02:00
Paul Holzinger
34c258b419 libpod: fix timezone handling
The current way of bind mounting the host timezone file has problems.
Because /etc/localtime in the image may exist and is a symlink under
/usr/share/zoneinfo it will overwrite the targetfile. That confuses
timezone parses especially java where this approach does not work at
all. So we end up with an link which does not reflect the actual truth.

The better way is to just change the symlink in the image like it is
done on the host. However because not all images ship tzdata we cannot
rely on that either. So now we do both, when tzdata is installed then
use the symlink and if not we keep the current way of copying the host
timezone file in the container to /etc/localtime.

Also note that we need to rebuild the systemd image to include tzdata in
order to test this as our images do not contain the tzdata by default.

Fixes https://bugzilla.redhat.com/show_bug.cgi?id=2149876

Signed-off-by: Paul Holzinger <pholzing@redhat.com>
2023-06-01 11:04:13 +02:00
OpenShift Merge Robot
249f0463eb Merge pull request #18721 from Cydox/fix-ulimit-pr
fix ulimit issue
2023-05-31 16:53:49 -04:00
Paul Holzinger
4173f942f1 test/system: quadlet use correct systemd restart policy
Systemd doesn't support `never` and logs a warning, systemd uses no as
default so we do not have to specify it at all.

Check systemd.service(5) for the systemd docs.

Fixes #18743

Signed-off-by: Paul Holzinger <pholzing@redhat.com>
2023-05-31 18:50:16 +02:00
OpenShift Merge Robot
b9bdfea8e7 Merge pull request #18752 from edsantiago/log_k8s_race
systests: minimize race-condition window
2023-05-31 10:23:30 -04:00
OpenShift Merge Robot
0d7702bd93 Merge pull request #18744 from edsantiago/quadlet_race
systests: fix race in quadlet tests
2023-05-31 10:20:52 -04:00
Ed Santiago
0372bf4bdd systests: minimize race-condition window
Reduce sleep-loop time in logs test, from 1s to 0.1s,
to make 'podman stop' take effect more quickly. With 1s,
and testing with 1s resolution, we get flakes.

Fixes: #17826

Signed-off-by: Ed Santiago <santiago@redhat.com>
2023-05-31 06:38:17 -06:00
Ed Santiago
1a34e1f855 systests: fix improper backgrounding of run_podman
run_podman cannot be backgrounded. Use $PODMAN instead.

Signed-off-by: Ed Santiago <santiago@redhat.com>
2023-05-31 06:20:35 -06:00
Jan Hendrik Farr
f097728891 set max ulimits for rootless on each start
Signed-off-by: Jan Hendrik Farr <github@jfarr.cc>
2023-05-31 09:20:31 +00:00
Ed Santiago
72d4cede29 systests: fix race in quadlet tests
The new exit-code propagation test is racy: 'podman wait' can
fail if the service container has already been cleaned up by
systemd.

Solution: run the inspect and wait tests opportunistically, i.e.,
only if those commands succeed. If they fail, confirm that they
fail with ENOSUCHCONTAINER. This may silently lose us some
coverage ... but none of it is important. The important
test, systemctl final status, remains.

Also, as drive-bys:
 - add a FIXME comment documenting another race condition
   that I'm not bothering to fix right now

 - give distinct names to unit files, for readability in
   test failures

Fixes: #18732

Signed-off-by: Ed Santiago <santiago@redhat.com>
2023-05-30 13:38:51 -06:00
Paul Holzinger
370e1132ce completion: fix panic in simplePathJoinUnix()
When we do path completion in images a user could try to complete a
simple relative path, e.g. podman run $IMAGE e... should complete to etc
if this path exists in the image. Right now we panic in this case as the
current check didn't account for an empty string in simplePathJoinUnix().
In such a case return the path directly because we can not alter what
the user typed on the cli and must return a path without slash as well
in order for the shell to suggest the completion.

Fixes https://bugzilla.redhat.com/show_bug.cgi?id=2209809

Signed-off-by: Paul Holzinger <pholzing@redhat.com>
2023-05-30 16:09:19 +02:00
OpenShift Merge Robot
e7dc5074a3 Merge pull request #18681 from Luap99/reexec-signals
pkg/rootless: correctly handle proxy signals on reexec
2023-05-27 17:19:58 -04:00
Paul Holzinger
6bc52c9c5e pkg/rootless: correctly handle proxy signals on reexec
There are quite a lot of places in podman were we have some signal
handlers, most notably libpod/shutdown/handler.go.

However when we rexec we do not want any of that and just send all
signals we get down to the child obviously. So before we install our
signal handler we must first reset all others with signal.Reset().

Also while at it fix a problem were the joinUserAndMountNS() code path
would not forward signals at all. This code path is used when you have
running containers but the pause process was killed.

Fixes #16091
Given that signal handlers run in different goroutines parallel it would
explain why it flakes sometimes in CI. However to my understanding this
flake can only happen when the pause process is dead before we run the
podman command. So the question still is what kills the pause process?

Signed-off-by: Paul Holzinger <pholzing@redhat.com>
2023-05-25 16:48:15 +02:00
Valentin Rothberg
29f7c494ee Quadlet: kube: use ExecStopPost
Use ExecStopPost instead of ExecStop to make sure containers, pods, etc.
are all cleaned up even in case of an error.

Signed-off-by: Valentin Rothberg <vrothberg@redhat.com>
2023-05-25 14:46:35 +02:00
Valentin Rothberg
6487d9c11a Quadlet: kube: add ExitCodePropagation field
Add a new field `ExitCodePropagation` field to allow for configuring the
newly added functionality of controlling how the main PID of a kube
service exits.

Jira: issues.redhat.com/browse/RUN-1776
Signed-off-by: Valentin Rothberg <vrothberg@redhat.com>
2023-05-25 14:46:35 +02:00
Valentin Rothberg
08b0d93ea3 kube play: exit-code propagation
Implement means for reflecting failed containers (i.e., those having
exited non-zero) to better integrate `kube play` with systemd.  The
idea is to have the main PID of `kube play` exit non-zero in a
configurable way such that systemd's restart policies can kick in.

When using the default sdnotify-notify policy, the service container
acts as the main PID to further reduce the resource footprint.  In that
case, before stopping the service container, Podman will lookup the exit
codes of all non-infra containers.  The service will then behave
according to the following three exit-code policies:

 - `none`: exit 0 and ignore containers (default)
 - `any`: exit non-zero if _any_ container did
 - `all`: exit non-zero if _all_ containers did

The upper values can be passed via a hidden `kube play
--service-exit-code-propagation` flag which can be used by tests and
later on by Quadlet.

In case Podman acts as the main PID (i.e., when at least one container
runs with an sdnotify-policy other than "ignore"), Podman will continue
to wait for the service container to exit and reflect its exit code.

Note that this commit also fixes a long-standing annoyance of the
service container exiting non-zero.  The underlying issue was that the
service container had been stopped with SIGKILL instead of SIGTERM and
hence exited non-zero.  Fixing that was a prerequisite for the exit-code
propagation to work but also improves the integration of `kube play`
with systemd and hence Quadlet with systemd.

Jira: issues.redhat.com/browse/RUN-1776
Signed-off-by: Valentin Rothberg <vrothberg@redhat.com>
2023-05-25 14:46:34 +02:00
OpenShift Merge Robot
688e6dbef1 Merge pull request #18640 from HirazawaUi/add-pasta-to-podman-info
podman: Add pasta to podman info
2023-05-25 06:55:04 -04:00
binghongtao
977b3cdbf6 podman: Add pasta to podman info
[NO NEW TESTS NEEDED]

Fixes: #18561

Signed-off-by: binghongtao <695097494plus@gmail.com>
2023-05-25 00:39:52 +08:00
Ed Santiago
373919ca0a Revert "test/system/255-auto-update.bats: add debug logs"
RHEL gating tests failing, because (sigh) journalctl doesn't
work rootless on RHEL.

I think the flake is fixed anyway, so we don't need this.

This reverts commit ba141adce4.

Signed-off-by: Ed Santiago <santiago@redhat.com>
2023-05-24 07:41:57 -06:00
OpenShift Merge Robot
acad53ad64 Merge pull request #18653 from edsantiago/unlinkat-ebusy-bail
TEMPORARY(?) instrumentation for unlinkat-ebusy
2023-05-23 06:36:11 -04:00
OpenShift Merge Robot
ca7d0128b2 Merge pull request #18619 from vyasgun/pr/events-volume-name
fix: event --filter volume=vol-name should compare the event name with volume name
2023-05-23 02:42:57 -04:00
Ed Santiago
94c65a659c TEMPORARY(?) instrumentation for unlinkat-ebusy
Instrument system tests in hopes of tracking down #17216,
the unlinkat-ebusy-hosed flake.

Oh, also, timestamp.awk: timestamps have always been UTC, but
add a 'Z' to make it unambiguous.

Signed-off-by: Ed Santiago <santiago@redhat.com>
2023-05-22 10:34:37 -06:00