This is an optimized version of ostree_repo_prune() specialized for
archive mode repos. It is faster and uses less memory so that we can
prune larger repos (like flathub) in a realistic timeframe.
The primary reason it is faster is that it creates and uses a
`.commitmeta2` file for each commit, containing information about what
objects are reachable from that commit. This means incremental prunes
need only traverse over newly created commits.
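
As a rough illustration of why a per-commit cache makes incremental prunes cheap, the sketch below stores each commit's reachable set in a file and reuses it on later runs. The JSON encoding, the `.reachable.json` suffix and the `traverse` callback are hypothetical stand-ins; the real `.commitmeta2` format is not described in this message.

```python
import json
import os
import tempfile


def reachable_for_commit(cache_dir, commit, traverse):
    """Return the objects reachable from `commit`, reusing a cached result.

    `traverse` is a hypothetical callback that does the expensive traversal.
    """
    cache_path = os.path.join(cache_dir, commit + ".reachable.json")
    try:
        with open(cache_path) as f:
            return set(json.load(f))
    except FileNotFoundError:
        pass  # No cache yet: fall through to the slow path.

    objects = set(traverse(commit))
    # Write to a temporary file and rename it into place, so that an
    # interrupted run never leaves a truncated cache file behind.
    fd, tmp_path = tempfile.mkstemp(dir=cache_dir)
    with os.fdopen(fd, "w") as f:
        json.dump(sorted(objects), f)
    os.replace(tmp_path, cache_path)
    return objects


def all_reachable(cache_dir, commits, traverse):
    """Union of the reachable sets of all the given commits."""
    reachable = set()
    for commit in commits:
        reachable |= reachable_for_commit(cache_dir, commit, traverse)
    return reachable
```

On a second run only commits without a cache file reach `traverse`, which is the property the incremental prune relies on.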
Secondly, it uses the variant parser compiled accessors for the
various GVariants that are involved in the prune, which is quite a bit
faster, especially if the repo is very large.
It also merges the scan-for-all-objects and prune-unreachable-objects
phases, which means that we don't have to allocate a hashtable for
all the objects in the entire repo, saving a lot of memory.
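
A minimal sketch of the merged phases, assuming a hypothetical `iter_objects` generator that yields object names one at a time and a hypothetical `delete` callback: each object is checked against the reachable set as it is found, so no table of every object in the repo is ever built.

```python
def prune_unreachable(iter_objects, reachable, delete):
    """Delete every object that is not in `reachable`; return the count.

    `iter_objects` and `delete` are hypothetical callbacks standing in
    for walking the objects directory and unlinking a file.
    """
    deleted = 0
    for name in iter_objects():
        # Decide immediately, instead of collecting all names first.
        if name not in reachable:
            delete(name)
            deleted += 1
    return deleted
```

Memory use is then bounded by the size of the reachable set rather than by the total number of objects.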
To save memory the hashtable of reachable objects, which can be quite
big on a big repo, points to a custom, very compact format for object
names.
Additionally, it scans for reachable objects twice: first with a
shared lock, and then again (if anything changed) with an exclusive
lock. This allows us to avoid holding an exclusive lock during the
slowest part of the prune.
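
The two-pass idea can be sketched as follows, using `flock()` on an ordinary file as a stand-in for the repo lock. The `scan`, `snapshot` and `delete_unreachable` callbacks are hypothetical, and comparing snapshots is only one possible way to detect that "anything changed".

```python
import fcntl


def prune_with_two_scans(lock_file, scan, snapshot, delete_unreachable):
    """Run the slow scan under a shared lock, then prune under an exclusive one."""
    # Pass 1: slow scan. Other readers and writers that also take a
    # shared lock are not blocked while this runs.
    fcntl.flock(lock_file, fcntl.LOCK_SH)
    try:
        before = snapshot()
        reachable = scan()
    finally:
        fcntl.flock(lock_file, fcntl.LOCK_UN)

    # Pass 2: take the exclusive lock and rescan only if the repo
    # changed in the meantime. With cached per-commit data this rescan
    # is expected to be cheap.
    fcntl.flock(lock_file, fcntl.LOCK_EX)
    try:
        if snapshot() != before:
            reachable = scan()
        delete_unreachable(reachable)
    finally:
        fcntl.flock(lock_file, fcntl.LOCK_UN)
```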
Unfortunately there are currently no public APIs for the ostree repo
locks. We really need to take an exclusive lock during the whole prune,
or parallel modifications (say a commit) might get their newly
written objects deleted. To work around this we have a minimal custom
implementation of an exclusive lock. Once the public API is available
we can start using that.
I created a repo with a lot of small commits to test this. It has 9M
objects, and pruning with depth=10 deletes 2M of them.
The original performance looks like:

  Finding reachable objects: 287 seconds
  Pruning unreachable: 69 seconds

Just using the pregenerated reachable data:

  Finding reachable objects: 15 seconds
  Pruning unreachable: 69 seconds

The final optimized prune (using pregenerated data):

  Finding reachable objects: 12 seconds
  Pruning unreachable: 51 seconds

The above are with the page caches cleaned. On a second run the
performance increase is even more noticeable.
As a comparison to the above, finding the reachable objects in the
actual flathub repo took 22 hours, but with the pregenerated reachable data
only 39 minutes.
scan-build points out that bytes isn't read after it is assigned. While
this is not actually true (scan-build doesn't understand
__attribute__((__cleanup__)), which frees bytes), it's true that we
should ideally have an assertion here.
Signed-off-by: Simon McVittie <smcv@collabora.com>
scan-build complained that rest_argv_start could be used uninitialized,
because it can't see that rest_argc >= 2 implies that rest_argv_start
got initialized at the same time rest_argc was set. Make this easier
to understand.
Signed-off-by: Simon McVittie <smcv@collabora.com>
scan-build detected that res was written but never read. Presumably
the use of ref here (carried over from the previous test) is a
copy/paste error.
Signed-off-by: Simon McVittie <smcv@collabora.com>
We always set match_len before using it, discarding the result of this
assignment. Detected by scan-build.
Signed-off-by: Simon McVittie <smcv@collabora.com>
This hasn't done anything useful since 0978826c: it just takes a
new ref to the installation, and then releases that ref without doing
anything with it. Detected by scan-build.
Signed-off-by: Simon McVittie <smcv@collabora.com>
scan-build has a lot of false positives for this codebase because it
doesn't understand __attribute__((__cleanup__)) or GLib's GError
convention, but it seems to have been right about these.
Signed-off-by: Simon McVittie <smcv@collabora.com>
scan-build detected that mark_op_resolved() can be called with
op->resolved_commit == commit, in which case we incorrectly freed the
string before allocating the new copy.
Signed-off-by: Simon McVittie <smcv@collabora.com>
scan-build detected that response_size is uninitialized here, presumably
a typo for response_data_size.
Signed-off-by: Simon McVittie <smcv@collabora.com>
The intention here seems to have been that failing to close the http
stream provokes a warning but does not make the function fail, but we
were setting the wrong error, resulting in a NULL dereference if closing
the http stream somehow fails.
Signed-off-by: Simon McVittie <smcv@collabora.com>
In D-Bus, handles are defined to be unsigned, but in GVariant, for some
reason they're signed. Make sure they aren't negative, which could
result in a NULL dereference for fds.
A handle used in the conventional way will never legitimately be
negative (in GVariant's interpretation) or have its high bit set
(in D-Bus' interpretation), because file descriptors are signed 32-bit
integers, so an array of distinct file descriptors can never be long
enough for the distinction between signed and unsigned to matter.
In practice fds are limited by the kernel to several orders of
magnitude fewer than that anyway.
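
A minimal sketch of the range check, written in Python for brevity (the real code is C). The function name is hypothetical. Python happens to show the hazard well, because a negative index would silently select an element from the end of the list rather than fail.

```python
def fd_for_handle(handle, fds):
    """Return the fd that `handle` indexes in `fds`, or raise ValueError.

    `handle` is the signed 32-bit value read from a GVariant 'h' item.
    """
    # Reject negative values explicitly: fds[-1] would "work" in Python,
    # and in C it would read outside the array.
    if handle < 0 or handle >= len(fds):
        raise ValueError("handle %d is out of range" % handle)
    return fds[handle]
```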
Fixes: 3ebf371f "run: Allow caller to replace /app and/or /usr"
Signed-off-by: Simon McVittie <smcv@collabora.com>
Like $XDG_RUNTIME_DIR/app/$FLATPAK_ID, this is shared between all
instances of the app, except for subsandboxed instances created by
flatpak-spawn --sandbox or equivalent. Unlike
$XDG_RUNTIME_DIR/app/$FLATPAK_ID, it does not exist at an equivalent
path on the host and in the sandboxed app.
Resolves: https://github.com/flatpak/flatpak/issues/4120
Signed-off-by: Simon McVittie <smcv@collabora.com>
If XDG_RUNTIME_DIR is under app control, as it will be with #4120, we
don't want to be mounting pieces of filesystem directly into it, because
that will mean that the app could create a symlink that will cause us
to create a mount point for it at the target of the symlink, potentially
elsewhere in the host filesystem.
Instead, we mount them in /run/flatpak, which is a per-instance
directory entirely controlled by Flatpak; and then create (relative)
symlinks in XDG_RUNTIME_DIR, pointing into /run/flatpak.
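
As a sketch of the resulting layout, the helper below builds the bwrap arguments for one socket: a bind mount under /run/flatpak plus a relative symlink in XDG_RUNTIME_DIR. The function name, its parameters and the choice of `--ro-bind` are assumptions for illustration, not the code in this commit.

```python
import posixpath


def runtime_dir_socket_args(host_path, name, runtime_dir,
                            private_dir="/run/flatpak"):
    """Return bwrap arguments exposing `host_path` as `runtime_dir/name`."""
    mount_point = posixpath.join(private_dir, name)
    link_path = posixpath.join(runtime_dir, name)
    # The symlink target is relative to the directory containing the link.
    target = posixpath.relpath(mount_point, start=runtime_dir)
    return [
        "--ro-bind", host_path, mount_point,   # mount under Flatpak's control
        "--symlink", target, link_path,        # link only, no mount point here
    ]
```

For example, with `runtime_dir="/run/user/1000"` and `name="wayland-0"` the symlink target comes out as `../../flatpak/wayland-0`.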
In this commit, we still know that the XDG_RUNTIME_DIR is a
per-instance tmpfs, so we can safely create the symlinks using
the --symlink option. In a subsequent commit this will change to
creating them in a shared XDG_RUNTIME_DIR, if any.
Signed-off-by: Simon McVittie <smcv@collabora.com>
If it doesn't start with wayland-, or if it contains a directory
separator, then it's probably not something we want to be creating in
the sandbox's XDG_RUNTIME_DIR.
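
The check described above amounts to something like the following sketch (the function name is hypothetical, and the real implementation is in C):

```python
def is_acceptable_wayland_display(name):
    """True if `name` looks like a plain wayland-* socket name."""
    # Must carry the conventional prefix and must not be a path.
    return name.startswith("wayland-") and "/" not in name
```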
Signed-off-by: Simon McVittie <smcv@collabora.com>
There's no real reason why this has to be in the XDG_RUNTIME_DIR: it's
located via environment variable AT_SPI_BUS_ADDRESS.
Signed-off-by: Simon McVittie <smcv@collabora.com>
There's no real reason why this has to be in the XDG_RUNTIME_DIR:
nothing looks for it via XDG_RUNTIME_DIR, it's located via environment
variable SSH_AUTH_SOCK.
Signed-off-by: Simon McVittie <smcv@collabora.com>
There's no real reason why this needs to be in XDG_RUNTIME_DIR: nothing
relies on it being there, and applications find it via environment
variable XAUTHORITY.
Signed-off-by: Simon McVittie <smcv@collabora.com>
It's a bit simpler to get a per-app XDG_RUNTIME_DIR safely if we avoid
putting this in there. Nothing relies on it being in the
XDG_RUNTIME_DIR.
Signed-off-by: Simon McVittie <smcv@collabora.com>
If we call this after flatpak_run_add_app_info_args(), then the
garbage-collection code will have a chance to run, cleaning up after a
previous instance of the same app.
In a previous implementation of #4093 that also implemented #4120, we
had to allocate the per-app directory this early to avoid shadowing the
XDG_RUNTIME_DIR allocated in flatpak_run_add_app_info_args(), but I'm
taking a different approach to that now.
Signed-off-by: Simon McVittie <smcv@collabora.com>
Similar to /tmp, applications might well use /dev/shm as an IPC
rendezvous between instances, which wouldn't have worked without
--device=shm until now.
Because /dev/shm has specific characteristics (in particular it's
meant to always be a tmpfs), we offload the actual storage into a
subdirectory of the real /dev/shm. Because /dev/shm is a shared
directory between all uids, we have to be extra-careful how we
do this, which is why the test coverage here is important.
This is done on an opt-in basis because of its extra complexity.
Signed-off-by: Simon McVittie <smcv@collabora.com>
This allows apps that use /tmp as an IPC rendezvous point, such as those
that embed Chromium-derived browsers, to communicate between instances;
this would not previously have worked without --filesystem=/tmp, which
is a significant weakening of the sandbox.
It also allows /tmp to be shared with subsandboxes (if they are not
sandboxed more strictly).
The temporary directory is actually created in XDG_RUNTIME_DIR,
to avoid it becoming visible to unrelated apps that happen to have
--filesystem=/tmp.
Signed-off-by: Simon McVittie <smcv@collabora.com>
A subsequent commit will need to look at the FlatpakExports before
we are ready to append their arguments to the FlatpakBwrap.
Signed-off-by: Simon McVittie <smcv@collabora.com>
flatpak_context_get_exports_full() previously copied the interface of
flatpak_context_export(), which appended entries to a caller-supplied
GString, but it's a more GLib-style API if we use an "out" argument.
Signed-off-by: Simon McVittie <smcv@collabora.com>
This combines the functionality of flatpak_context_get_exports() and its
open-coded version in flatpak_context_append_bwrap_filesystem().
Signed-off-by: Simon McVittie <smcv@collabora.com>
If we want to provide a per-app-ID XDG_RUNTIME_DIR (#4120) or a
per-app-ID /tmp or /dev/shm (#4093) then we'll need somewhere to put
them. Unlike $XDG_RUNTIME_DIR/app/$FLATPAK_ID, this should be somewhere
that is *not* accessible to the app, so that we can trust its contents.
Signed-off-by: Simon McVittie <smcv@collabora.com>
Previously, this only had to consider two situations: either an instance
is still running (alive), or it is not (dead).
When we start sharing directories between all instances of a particular
app-ID (#4120, #4093), we'll also need to consider whether instances
share an app-ID, expanding the test to three situations: either an
instance is still running (alive), or it has exited but shares its
app-ID with a different instance that is still running (the app is
alive but the instance is dead, abbreviated here as alive_dead),
or it has exited and does not share its app-ID with any running
instances (dead).
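
The three-way classification can be sketched like this. The data model (a mapping from instance ID to an `(app_id, is_running)` pair) and all names are assumptions made for the example.

```python
from enum import Enum


class InstanceState(Enum):
    ALIVE = "alive"            # the instance is still running
    ALIVE_DEAD = "alive_dead"  # instance exited, but its app still runs
    DEAD = "dead"              # instance exited, no instance of the app runs


def classify_instances(instances):
    """Map each instance ID to its InstanceState.

    `instances` maps instance ID -> (app_id, is_running).
    """
    # First collect the app-IDs that still have a running instance.
    running_apps = {app_id for app_id, running in instances.values() if running}
    states = {}
    for instance_id, (app_id, running) in instances.items():
        if running:
            states[instance_id] = InstanceState.ALIVE
        elif app_id in running_apps:
            states[instance_id] = InstanceState.ALIVE_DEAD
        else:
            states[instance_id] = InstanceState.DEAD
    return states
```

Only the `DEAD` case means that directories shared per app-ID are safe to clean up.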
Signed-off-by: Simon McVittie <smcv@collabora.com>
If an argument takes a value, and the value is empty, then it's
misleading to quote `{"--foo", "--empty", "", "--bar"}` as
`--foo --empty --bar`. It's better to get `--foo --empty '' --bar`.
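
The desired behaviour matches what Python's `shlex.quote()` does for an empty string, which makes for a compact illustration. This is a sketch of the intended output only; Flatpak's own quoting code is written in C.

```python
import shlex


def quote_argv(argv):
    """Join an argument vector into a shell-style string for display."""
    # shlex.quote("") returns "''", so empty arguments stay visible.
    return " ".join(shlex.quote(arg) for arg in argv)


print(quote_argv(["--foo", "--empty", "", "--bar"]))
# prints: --foo --empty '' --bar
```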
Signed-off-by: Simon McVittie <smcv@collabora.com>
The output might be written to the pipe by `flatpak --help` and/or read
from the pipe by `head -2` in more than one batch. If `head -2` reads
the first two lines before `flatpak --help` has written everything, it
will exit, causing the pipe to have no process at the read end. This
results in `flatpak --help` being killed by `SIGPIPE` next time it tries
to write to the pipe, because it has not opted out of this behaviour
(as shell tools usually shouldn't).
We're running under `set -o pipefail`, so this causes a nonzero exit
status that makes the test fail. Worse, this failure is intermittent,
because `head -2` *usually* doesn't exit until `flatpak --help` has
already written out everything it is going to write - it depends on
the precise behaviour of read(), write() and kernel scheduling.
We know that `flatpak --help` output is not *that* long, so it's OK
for `flatpak --help` not to be terminated early: we can send it all
into an intermediate file, and then run `head` on the file.
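
The same pattern, sketched in Python rather than shell: let the producer run to completion with its output going to a temporary file, and only then take the first lines. The helper name is hypothetical, and any command can stand in for `flatpak --help`.

```python
import subprocess
import tempfile


def first_lines(argv, count=2):
    """Run `argv` to completion, then return its first `count` output lines."""
    with tempfile.TemporaryFile(mode="w+") as out:
        # The producer writes to a file, not a pipe, so it can never be
        # killed by SIGPIPE, however early we stop reading.
        subprocess.run(argv, stdout=out, check=True)
        out.seek(0)
        return [line.rstrip("\n") for _, line in zip(range(count), out)]
```

The cost is that the producer is never terminated early, which is acceptable when its output is known to be short.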
Signed-off-by: Simon McVittie <smcv@collabora.com>
The pressure-vessel container tool in Steam will want to use this, to
replace /usr with a Steam Runtime container supplied by the Steam CDN,
instead of using the same Flatpak runtime that is used to run the Steam
client and non-containerized games.
If a custom /usr is used, the "official" Flatpak runtime is still the
one reflected in the metadata. It is also mounted at /run/parent,
with all its extensions, so that pressure-vessel has the option of using
its graphics drivers (by populating the custom /usr with symlinks into
/run/parent and/or /run/host).
When doing this, we need to put an empty directory on /app, because
the real /app expects to be run on top of the real runtime. It would
also be reasonable to substitute a custom replacement for /app, so
I've included support for that too.
Partially addresses #3797.
Signed-off-by: Simon McVittie <smcv@collabora.com>
When we add a way to specify a different /usr for a subsandbox, we'll
want to mount the "official" runtime elsewhere and avoid adding it
to the LD_LIBRARY_PATH.
Signed-off-by: Simon McVittie <smcv@collabora.com>
This has no practical effect (assuming environment variables are unique),
but it makes it easier to find an environment variable of interest
in a very long bwrap command-line.
Signed-off-by: Simon McVittie <smcv@collabora.com>
This makes them easier to deal with when debugging. Otherwise, it's easy
for the bundled arguments to wrap across 50 or more lines, and with
linebreaks in arbitrary positions that becomes very hard to read.
Signed-off-by: Simon McVittie <smcv@collabora.com>
The only functional change here is that we consistently use
flatpak_get_real_xdg_runtime_dir(), instead of a mixture of
the versions with and without realpath().
Signed-off-by: Simon McVittie <smcv@collabora.com>