This is an optimized version of ostree_repo_prune() specialized for
archive mode repos. It is faster and uses less memory so that we can
prune larger repos (like flathub) in a realistic timeframe.
The primary reason it is faster is that it creates and uses a
`.commitmeta2` file for each commit, containing information about what
objects are reachable from that commit. This means incremental prunes
need only traverse over newly created commits.
Secondly, it uses the variant parser compiled accessors for the
various GVariants that are involved in the prune which is quite a bit
faster, especially if the repo is very large.
It also merges the scan-for-all-objects and prune-unreachable-objects
phases, which means that we don't have to allocate a hashtable for
all the objects in the entire repo, saving a lot of memory.
To save memory the hashtable of reachable objects, which can be quite
big on a big repo, points to a custom, very compact format for object
names.
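As a rough illustration of why a compact key helps, the sketch below packs
an object name into 33 bytes instead of a 64-character hex string plus
type. The `CompactObjectName` layout and the helper names are hypothetical,
chosen for this example only; the format actually used by the prune code
may differ.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical compact key: 32 raw SHA-256 bytes plus a one-byte object
 * type. This is an assumed layout for illustration, not the real one. */
typedef struct {
  uint8_t csum[32];
  uint8_t objtype;
} CompactObjectName;

/* Returns the value of one hex digit, or -1 if c is not a hex digit. */
static int
hex_nibble (char c)
{
  if (c >= '0' && c <= '9') return c - '0';
  if (c >= 'a' && c <= 'f') return c - 'a' + 10;
  if (c >= 'A' && c <= 'F') return c - 'A' + 10;
  return -1;
}

/* Converts a 64-character hex checksum into the compact form.
 * Returns 0 on success, -1 if the input is malformed. */
int
compact_object_name_from_hex (const char *hex, uint8_t objtype,
                              CompactObjectName *out)
{
  assert (hex != NULL && out != NULL);

  if (strlen (hex) != 64)
    return -1;

  for (size_t i = 0; i < 32; i++)
    {
      int hi = hex_nibble (hex[2 * i]);
      int lo = hex_nibble (hex[2 * i + 1]);

      if (hi < 0 || lo < 0)
        return -1;
      out->csum[i] = (uint8_t) ((hi << 4) | lo);
    }

  out->objtype = objtype;
  return 0;
}
```

With this layout each key is roughly half the size of the textual form,
which adds up across millions of reachable objects.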
Additionally, it scans for reachable objects twice: first with a shared
lock, and then again (if anything changed) with an exclusive lock. This
allows us to avoid holding an exclusive lock during the slowest part of
the prune.
Unfortunately there are currently no public APIs for the ostree repo
locks. We really need to take an exclusive lock during the whole prune
or parallel modifications (say a commit) might get their newly
written objects deleted. To work around this we have a minimal custom
implementation of an exclusive lock. Once the public API is available
we can start using that.
I created a repo with a lot of small commits to test this. It has 9M,
and pruning with depth=10 deletes 2M of them.
The original performance looks like:
Finding reachable objects: 287 seconds
Pruning unreachable: 69 seconds
Just using the pregenerated reachable data:
Finding reachable objects: 15 seconds
Pruning unreachable: 69 seconds
The final optimized prune (using pregenerated data):
Finding reachable objects: 12 seconds
Pruning unreachable: 51 seconds
The above are with the page caches cleaned; on a second run the performance
increase is even more noticeable.
As a comparison to the above, finding the reachable objects in the
actual flathub repo took 22 hours, but with the pregenerated reachable data
only 39 minutes.
We always set match_len before using it, discarding the result of this
assignment. Detected by scan-build.
Signed-off-by: Simon McVittie <smcv@collabora.com>
scan-build has a lot of false positives for this codebase because it
doesn't understand __attribute__((__cleanup__)) or GLib's GError
convention, but it seems to have been right about these.
Signed-off-by: Simon McVittie <smcv@collabora.com>
scan-build detected that mark_op_resolved() can be called with
op->resolved_commit == commit, in which case we incorrectly freed the
string before allocating the new copy.
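A minimal sketch of the safe ordering, using plain libc rather than GLib:
copy the new value first, then free the old one, so that passing the
currently stored pointer is harmless. The `Op` struct and the
`op_set_resolved_commit()` name are simplified stand-ins for this example,
not the real types.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Simplified stand-in for the operation structure (assumption). */
typedef struct {
  char *resolved_commit;
} Op;

/* Replaces op->resolved_commit with a copy of commit.
 * Safe even when commit == op->resolved_commit, because the copy is
 * made before the old string is freed. */
void
op_set_resolved_commit (Op *op, const char *commit)
{
  char *copy = NULL;

  assert (op != NULL);

  if (commit != NULL)
    {
      size_t len = strlen (commit) + 1;

      copy = malloc (len);
      if (copy == NULL)
        abort ();
      memcpy (copy, commit, len);
    }

  free (op->resolved_commit);
  op->resolved_commit = copy;
}
```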
Signed-off-by: Simon McVittie <smcv@collabora.com>
The intention here seems to have been that failing to close the http
stream provokes a warning but does not make the function fail, but we
were setting the wrong error, resulting in a NULL dereference if closing
the http stream somehow fails.
Signed-off-by: Simon McVittie <smcv@collabora.com>
Like $XDG_RUNTIME_DIR/app/$FLATPAK_ID, this is shared between all
instances of the app, except for subsandboxed instances created by
flatpak-spawn --sandbox or equivalent. Unlike
$XDG_RUNTIME_DIR/app/$FLATPAK_ID, it does not exist at an equivalent
path on the host and in the sandboxed app.
Resolves: https://github.com/flatpak/flatpak/issues/4120
Signed-off-by: Simon McVittie <smcv@collabora.com>
If XDG_RUNTIME_DIR is under app control, as it will be with #4120, we
don't want to be mounting pieces of filesystem directly into it, because
that will mean that the app could create a symlink that will cause us
to create a mount point for it at the target of the symlink, potentially
elsewhere in the host filesystem.
Instead, we mount them in /run/flatpak, which is a per-instance
directory entirely controlled by Flatpak; and then create (relative)
symlinks in XDG_RUNTIME_DIR, pointing into /run/flatpak.
In this commit, we still know that the XDG_RUNTIME_DIR is a
per-instance tmpfs, so we can safely create the symlinks using
the --symlink option. In a subsequent commit this will change to
creating them in a shared XDG_RUNTIME_DIR, if any.
Signed-off-by: Simon McVittie <smcv@collabora.com>
If it doesn't start with wayland- or contains a directory separator,
then it's probably not something we want to be creating in the
sandbox's XDG_RUNTIME_DIR.
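The check described above can be sketched as follows. The function name
is hypothetical and this is only an illustration of the rule, not the
code added by this commit.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Returns true if name looks like a plain Wayland socket name:
 * it starts with "wayland-" and contains no directory separator. */
bool
wayland_display_name_is_safe (const char *name)
{
  assert (name != NULL);

  return strncmp (name, "wayland-", strlen ("wayland-")) == 0
         && strchr (name, '/') == NULL;
}
```

For example, `wayland-0` passes, while `../wayland-0` and
`wayland-0/../x` are both rejected.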
Signed-off-by: Simon McVittie <smcv@collabora.com>
There's no real reason why this has to be in the XDG_RUNTIME_DIR: it's
located via environment variable AT_SPI_BUS_ADDRESS.
Signed-off-by: Simon McVittie <smcv@collabora.com>
There's no real reason why this has to be in the XDG_RUNTIME_DIR:
nothing looks for it via XDG_RUNTIME_DIR, it's located via environment
variable SSH_AUTH_SOCK.
Signed-off-by: Simon McVittie <smcv@collabora.com>
There's no real reason why this needs to be in XDG_RUNTIME_DIR: nothing
relies on it being there, and applications find it via environment
variable XAUTHORITY.
Signed-off-by: Simon McVittie <smcv@collabora.com>
It's a bit simpler to get a per-app XDG_RUNTIME_DIR safely if we avoid
putting this in there. Nothing relies on it being in the
XDG_RUNTIME_DIR.
Signed-off-by: Simon McVittie <smcv@collabora.com>
If we call this after flatpak_run_add_app_info_args(), then the
garbage-collection code will have a chance to run, cleaning up after a
previous instance of the same app.
In a previous implementation of #4093 that also implemented #4120, we
had to allocate the per-app directory this early to avoid shadowing the
XDG_RUNTIME_DIR allocated in flatpak_run_add_app_info_args(), but I'm
taking a different approach to that now.
Signed-off-by: Simon McVittie <smcv@collabora.com>
Similar to /tmp, applications might well use /dev/shm as an IPC
rendezvous between instances, which wouldn't have worked without
--device=shm until now.
Because /dev/shm has specific characteristics (in particular it's
meant to always be a tmpfs), we offload the actual storage into a
subdirectory of the real /dev/shm. Because /dev/shm is a shared
directory between all uids, we have to be extra-careful how we
do this, which is why the test coverage here is important.
This is done on an opt-in basis because of its extra complexity.
Signed-off-by: Simon McVittie <smcv@collabora.com>
This allows apps that use /tmp as an IPC rendezvous point, such as those
that embed Chromium-derived browsers, to communicate between instances;
this would not previously have worked without --filesystem=/tmp, which
is a significant weakening of the sandbox.
It also allows /tmp to be shared with subsandboxes (if they are not
sandboxed more strictly).
The temporary directory is actually created in XDG_RUNTIME_DIR,
to avoid it becoming visible to unrelated apps that happen to have
--filesystem=/tmp.
Signed-off-by: Simon McVittie <smcv@collabora.com>
A subsequent commit will need to look at the FlatpakExports before
we are ready to append their arguments to the FlatpakBwrap.
Signed-off-by: Simon McVittie <smcv@collabora.com>
flatpak_context_get_exports_full() previously copied the interface of
flatpak_context_export(), which appended entries to a caller-supplied
GString, but it's a more GLib-style API if we use an "out" argument.
Signed-off-by: Simon McVittie <smcv@collabora.com>
This combines the functionality of flatpak_context_get_exports() and its
open-coded version in flatpak_context_append_bwrap_filesystem().
Signed-off-by: Simon McVittie <smcv@collabora.com>
If we want to provide a per-app-ID XDG_RUNTIME_DIR (#4120) or a
per-app-ID /tmp or /dev/shm (#4093) then we'll need somewhere to put
them. Unlike $XDG_RUNTIME_DIR/app/$FLATPAK_ID, this should be somewhere
that is *not* accessible to the app, so that we can trust its contents.
Signed-off-by: Simon McVittie <smcv@collabora.com>
If an argument takes a value, and the value is empty, then it's
misleading to quote `{"--foo", "--empty", "", "--bar"}` as
`--foo --empty --bar`. It's better to get `--foo --empty '' --bar`.
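To illustrate the intended behaviour, here is a small libc-only sketch of
a quoting routine that treats the empty string as needing quotes. The
function names and the set of "plain" characters are assumptions made for
this example; the real code may rely on different helpers and rules.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Nonzero if arg can be shown unquoted: non-empty and made only of
 * characters from a conservative allow-list (an assumption). */
static int
arg_is_plain (const char *arg)
{
  if (*arg == '\0')
    return 0;   /* an empty argument must be shown as '' */

  for (; *arg != '\0'; arg++)
    if (!isalnum ((unsigned char) *arg) && strchr ("-_=/.,:@+", *arg) == NULL)
      return 0;

  return 1;
}

/* Appends src to the NUL-terminated buf of capacity size, truncating
 * silently if there is not enough room. */
static void
append (char *buf, size_t size, const char *src)
{
  size_t used = strlen (buf);

  if (used + 1 < size)
    strncat (buf, src, size - used - 1);
}

/* Renders a NULL-terminated argv into buf in shell-like syntax. */
void
quote_argv (const char *const *argv, char *buf, size_t size)
{
  assert (argv != NULL && buf != NULL && size > 0);
  buf[0] = '\0';

  for (size_t i = 0; argv[i] != NULL; i++)
    {
      if (i > 0)
        append (buf, size, " ");

      if (arg_is_plain (argv[i]))
        {
          append (buf, size, argv[i]);
          continue;
        }

      /* Single-quote the argument; an embedded ' becomes '\'' */
      append (buf, size, "'");
      for (const char *p = argv[i]; *p != '\0'; p++)
        {
          char one[2] = { *p, '\0' };

          append (buf, size, *p == '\'' ? "'\\''" : one);
        }
      append (buf, size, "'");
    }
}
```

Calling `quote_argv()` on `{"--foo", "--empty", "", "--bar", NULL}` fills
the buffer with `--foo --empty '' --bar`.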
Signed-off-by: Simon McVittie <smcv@collabora.com>
The pressure-vessel container tool in Steam will want to use this, to
replace /usr with a Steam Runtime container supplied by the Steam CDN,
instead of using the same Flatpak runtime that is used to run the Steam
client and non-containerized games.
If a custom /usr is used, the "official" Flatpak runtime is still the
one reflected in the metadata. It is also mounted at /run/parent,
with all its extensions, so that pressure-vessel has the option of using
its graphics drivers (by populating the custom /usr with symlinks into
/run/parent and/or /run/host).
When doing this, we need to put an empty directory on /app, because
the real /app expects to be run on top of the real runtime. It would
also be reasonable to substitute a custom replacement for /app, so
I've included support for that too.
Partially addresses #3797.
Signed-off-by: Simon McVittie <smcv@collabora.com>
When we add a way to specify a different /usr for a subsandbox, we'll
want to mount the "official" runtime elsewhere and avoid adding it
to the LD_LIBRARY_PATH.
Signed-off-by: Simon McVittie <smcv@collabora.com>
This has no practical effect (assuming environment variables are unique),
but it makes it easier to find an environment variable of interest
in a very long bwrap command-line.
Signed-off-by: Simon McVittie <smcv@collabora.com>
This makes them easier to deal with when debugging. Otherwise, it's easy
for the bundled arguments to wrap across 50 or more lines, and with
linebreaks in arbitrary positions that becomes very hard to read.
Signed-off-by: Simon McVittie <smcv@collabora.com>
The only functional change here is that we consistently use
flatpak_get_real_xdg_runtime_dir(), instead of a mixture of
the versions with and without realpath().
Signed-off-by: Simon McVittie <smcv@collabora.com>
This localizes knowledge of the internal structure of
$XDG_RUNTIME_DIR/.flatpak into the flatpak-instance module.
Signed-off-by: Simon McVittie <smcv@collabora.com>
This is a step towards being able to build Flatpak using Meson, which
is becoming widely available even in LTS distributions. Meson's
built-in support for subprojects expects to find them in ./subprojects
at top level.
Signed-off-by: Simon McVittie <smcv@collabora.com>
For reasons unknown, libarchive appears to generate broken gnutar format
tar archives when the archive contains files that are larger than 2 GB.
This commit switches to the pax format to work around this.
This should be a better default, as it also removes the 256-character
filename length limitation and matches what other libraries are doing:
for example, Python 3.8 switched to the pax format by default as well.
See https://pagure.io/fedora-infrastructure/issue/9840
We sometimes set a custom per-thread mainloop and then spin it
manually to fake a sync call on a thread using async calls. Primarily
this happens with the soup streaming calls. In this case, eventually
we finish the main loop iteration (because, say, the download is done)
so we stop iterating the mainloop and return from the fake sync code.
However, that might not necessarily be the only thing queued on the
main context. I ran into a situation where it seems like libsoup made
a call to a thread pool during the async call, and the next time I
used soup async everything froze. It looks like some threaded
soup service returned a response on the old context, and since
that was never handled (because that context is now dead), it no longer
works.
To solve this situation we're now iterating the main context until
there are no pending sources before killing the main context.
We're calling async soup APIs with SOUP_SESSION_USE_THREAD_CONTEXT
set, which means that libsoup async APIs will run async callbacks on
the loop of the thread-default main context. We then manually spin
this main context, because we're supposed to look like a sync call and
the async behaviour is purely internal.
This is not really right, because normally there isn't any custom
mainloop context registered, which means we're spinning the main thread
context on some other thread, as well as queuing soup sources on
it. This can't be any good!
Rather than doing this we actually create and push our own main
context that we then spin isolated from the default mainloop.
Previously, the polkit query was always interactive, even if the
`FlatpakDir` was operating in non-interactive mode (for example, for a
background update in gnome-software). Make the interactivity match the
interactivity of the `FlatpakDir`.
Do the same for the `mct_manager_get_app_filter()` call, although this
is less important since under normal conditions it will never prompt the
user.
This should hopefully stop polkit prompts appearing periodically when
background updates are being done while logged in as a non-privileged
user with parental controls set to prevent application installation.
Signed-off-by: Philip Withnall <pwithnall@endlessos.org>
This is either a malicious/compromised app trying to do an attack, or
a mistake that will break handling of %f, %u and so on. Either way,
if we refuse to export the .desktop file, resulting in installation
failing, then it makes the rejection more obvious than quietly
removing the magic tokens.
Signed-off-by: Simon McVittie <smcv@collabora.com>
If we add new features analogous to file forwarding later, we might
find that we need a different magic token. Let's reserve the whole
@@* namespace so we can call it @@something-else.
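Reserving the namespace amounts to a prefix check on each argument. A
minimal sketch follows; the function name is hypothetical and the real
check lives in the .desktop file export code.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Returns true if arg falls in the reserved @@* namespace, i.e. it is
 * "@@" itself or any token that begins with "@@". */
bool
exec_arg_is_reserved_token (const char *arg)
{
  assert (arg != NULL);

  return strncmp (arg, "@@", 2) == 0;
}
```

Under this rule `@@`, `@@u` and a future `@@something-else` are all
treated as reserved, while `%f` or a single-`@` argument is not.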
Signed-off-by: Simon McVittie <smcv@collabora.com>
When the portal's Spawn method is used with the environment cleared,
it's very likely that the "flatpak run" that ends up being run will be
in an environment without UTF-8 support.
If one of the files or directories we try to expose to the sub-sandbox
contains UTF-8/non-ASCII characters, then "flatpak run" would fail with:
error: Invalid byte sequence in conversion input
This is caused by GOption trying to parse the --filesystem option for
flatpak: when using the G_OPTION_ARG_CALLBACK argument type, GOption
splits the option name from its value and tries to convert the value
to UTF-8, which fails because the environment has no UTF-8 support.
It won't however do that if we tell the option parser that the value is
a filename using G_OPTION_FLAG_FILENAME, so set it.