mirror of
https://github.com/syncthing/syncthing.git
synced 2026-03-24 09:13:36 -04:00
fix(systemd): add back chown allowed syscalls (#10605)
fix(systemd): Add back chown allowed syscalls

IFF the user enables the `syncOwnership` feature AND sets `AmbientCapabilities=CAP_CHOWN CAP_FOWNER` as the docs in https://docs.syncthing.net/users/autostart.html#permissions state, THEN syncthing needs to use the `chown` syscall.

PR #10421 added a comprehensive sandbox that breaks `syncOwnership`. In PR #10602 we fixed one part, which is expanding the default `CapabilityBoundingSet` (see the PR for details).

But there's a very subtle bug that this PR fixes. PR #10421 sets the following properties:

    SystemCallFilter=@system-service
    SystemCallFilter=~@privileged io_uring_enter io_uring_register io_uring_setup

(Systemd merges `SystemCallFilter` values; we had to set the property twice because to negate syscalls, the whole list has to start with `~`.)

The goal was to allow all syscalls in the `@system-service` set, BUT disallow any `@privileged` syscalls and the `io_uring*` syscalls. But the sets are not disjoint; `chown` is in both `@system-service` and in `@privileged`, so it is removed from the allow list by the second property value.

This property is also parsed in a very peculiar way. From systemd docs:

> If you specify both types of this option (i.e. allow-listing and
> deny-listing), the first encountered will take precedence and will
> dictate the default action (termination or approval of a system call).
> Then the next occurrences of this option will add or delete the listed
> system calls from the set of the filtered system calls, depending of its
> type and the default action. (For example, if you have started with an
> allow list rule for read() and write(), and right after it add a deny
> list rule for write(), then write() will be removed from the set.)

Not only does the order of `SystemCallFilter` properties matter (later ones can undo effects of prior ones), but the _type_ of the _first_ property sets the overall behavior of the syscall filter: if the first `SystemCallFilter` value is an allow list, then all syscalls that are not specified are disallowed by default (and the reverse if the first value is a deny list).

Of course, this is completely different from how other allow/deny lists are implemented in systemd; for example, `IPAddress[Allow|Deny]` properties don't work like this at all. >:(

Since this complexity has bitten us once, we're removing the additional deny list of syscalls and sticking with just `SystemCallFilter=@system-service`. This leaves some privileged syscalls in the allow list.

Other options would require entering the "deny list by default" mode, and deny lists are less secure than allow lists in general because they have to be maintained (the kernel always adds new syscalls).

The rest of the sandbox (capability bounds) should be sufficient.

Fixes #10603

Signed-off-by: Val Markovic <val@markovic.io>
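The merging rule described in the commit message can be modeled with a short sketch. The Python below is a toy model under stated assumptions, not systemd's implementation: the contents of `GROUPS` are illustrative stand-ins for the real syscall sets, and `expand`, `merge_filters`, and `is_allowed` are hypothetical helper names introduced here.

```python
# Toy model of how multiple SystemCallFilter= values combine.
# ASSUMPTION: group contents are illustrative, NOT systemd's real definitions.
GROUPS = {
    "@system-service": {
        "read", "write", "chown",
        "io_uring_enter", "io_uring_register", "io_uring_setup",
    },
    "@privileged": {"chown", "mount"},
}


def expand(names):
    """Expand @group names into syscalls; plain names stand for themselves."""
    out = set()
    for name in names:
        out |= GROUPS.get(name, {name})
    return out


def merge_filters(values):
    """Return (mode, listed) for a sequence of SystemCallFilter= values.

    The first value fixes the mode: 'allow' means only listed syscalls pass,
    'deny' means only listed syscalls are blocked. Later values of the same
    type add to the listed set; values of the other type remove from it.
    """
    mode = None
    listed = set()
    for value in values:
        kind = "deny" if value.startswith("~") else "allow"
        names = expand(value.lstrip("~").split())
        if mode is None:
            mode = kind
        if kind == mode:
            listed |= names
        else:
            listed -= names
    return mode, listed


def is_allowed(syscall, values):
    """Check whether a syscall passes the merged filter."""
    mode, listed = merge_filters(values)
    return syscall in listed if mode == "allow" else syscall not in listed


# The two properties set by PR #10421, and the single one kept by this commit.
OLD = [
    "@system-service",
    "~@privileged io_uring_enter io_uring_register io_uring_setup",
]
NEW = ["@system-service"]

print(is_allowed("chown", OLD))  # chown is stripped by the deny value
print(is_allowed("chown", NEW))  # chown stays in the allow list
```

In this model, a filter that starts with a deny value, such as `["~@privileged"]`, lets through any syscall name it has never heard of, which is the maintenance concern the commit message raises about deny lists.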
@@ -128,13 +128,6 @@ ProcSubset=pid
 # System call allow-list. `@system-service` is a systemd-provided category that
 # allows common syscalls needed for system services.
 SystemCallFilter=@system-service
-# Explicitly disallow @privileged syscalls. Syncthing fails to start if we also
-# disallow @resources (which `systemd-analyze` is unhappy about).
-# Also disallow io_uring syscalls which are as of 2025 a significant source of
-# kernel exploits.
-# We do not include io_uring_enter2 because it's just a wrapper for
-# io_uring_enter and systemd issues a warning.
-SystemCallFilter=~@privileged io_uring_enter io_uring_register io_uring_setup
 # Return EPERM when a disallowed syscall is made instead of killing the process.
 SystemCallErrorNumber=EPERM
 # Digits from left to right; disallow creation of files with: