We've always, since the introduction of conflicts, had the policy that
deletes lose against any other change, for safety's sake. This is a
problem, however, because it means the sort order of versions is not a
total order.
That is, given two versions `A` and `B` that are currently in conflict,
we will sort them in a given order (let's say `A, B`, so `A < B` for
ordering purposes: we say "A wins over B" or "A is newer than B") and
consider the first in the list the winner. The loser (who has `B` on
disk) will process the conflict at some point and move the file to a
conflict copy and announce `A'` as the resolved conflict. The winner
(with `A` on disk) doesn't do anything.
However, if `A` is deleted the ordering changes. We still have `A < B`
and, of course, `Adel < A` (this is not even a conflict, just linear
order). In most sane systems this would imply the ordering `Adel < A <
B`, however in our case we in fact have `B < Adel` because any version
wins over a deleted one, so there is no logical ordering at all of the
files at this point. `Adel < A < B < Adel ???` In practice the deleted
version may end up at the head or the tail of the list, depending on the
order we do the compares.
Hence, at this point, "whatever" happens and it's not guaranteed to make
any sense. 😬
I propose that we resolve this my simply letting deletes be versions
like anything else and maintain a total ordering based on just version
vectors with the existing tie breakers like always. That means a delete
can win in a conflict situation, and the result should be that the file
is moved to a conflict copy on the losing device. I think this retains
the data safety to almost the same degree as previously, while removing
probably an entire class of strange out of sync bugs...
---
(A potential wrinkle here is that, ideally, we wouldn't even create the
conflict copy when the delete and the losing version represent the same
data -- same as when we handle normal modification conflicts. However,
the deleted FileInfo doesn't carry any information on what the contents
were, so we can't do that right now. A possible future extension would
be to carry the block list hash of the deleted data in the deleted
FileInfo and use that for this purpose, but I don't want to complicate
this PR with that. The block list hash itself also isn't a
protocol-defined thing at the moment, it's something implementation
dependent that we just use locally.)
This makes a couple of backwards compatible changes to the
ClusterConfig:
- Remove the `ignore_permissions` and `ignore_delete` booleans which
we've never read or used for anything
- Remove the `disable_temp_indexes` boolean and option entirely. We did
use this one, and about 1% of users have set the option. The only thing
it does is inhibits sending of periodical DownloadProgress messages
while downloading data, which is a minuscule bandwidth optimisation
given that we're already sending data at the time.
- Change the `read_only` boolean (which indicated send-only folders) to
an enum `FolderType`, where the values zero and one match the existing
usage. Again, we don't actually use this value, but I can see that we
might want to and then it makes more sense for it to be more
comprehensive.
- Change the `paused` boolean to an enum `StopReason`, where zero
indicates not stopped and one indicates paused, exactly the same wire
representation as previously but leaves space for additional stop
reasons (errors etc).
While it doesn't hurt, it's unnecessary since the big protobuf
modernisation, that also introduced types separate from the generated
ones for internal use. Those fields are already dropped when converting
to the wire in protocol.
* main:
feat(config): expose folder and device info as metrics (fixes#9519) (#10148)
chore: add issue types to GitHub issue templates
build: remove schedule from PR metadata job
chore(protocol): only allow enc. password changes on cluster config (#10145)
chore(protocol): don't start connection routines a second time (#10146)
In practice we already always call SetPassword and ClusterConfig
together. However it's not just "sensible" to do that, it's required: If
the passwords change, the remote device needs to know about that to
check that the enc. setup is valid/consistent (e.g. tokens match,
folder-type is appropriate, ...).
And with the passwords set later, there's no point in adding them as
part of creating a new connection.
This is a "followup" (if one can call it that 4 years later :) ) to
resp. fix for the following commit:
924b96856f
Co-authored-by: Jakob Borg <jakob@kastelo.net>
* main:
refactor: use slices package for sorting (#10136)
build: handle multiple general release notes
build: no need to build on the branches that just trigger tags
* main:
build: use specific token for pushing release tags
fix(gui): update `uncamel()` to handle strings like 'IDs' (fixes#10128) (#10131)
refactor: use slices package for sort (#10132)
build: process for automatic release tags (#10133)
chore(gui, man, authors): update docs, translations, and contributors
The sort package is still used in places that were not trivial to
change. Since Go 1.21 slices package can be uswed for sort. See
https://go.dev/doc/go1.21#slices
### Purpose
Make some progress with the migration to a more up-to-date syntax.
* main:
fix(syncthing): ensure both config and data dirs exist at startup (fixes#10126) (#10127)
fix(versioner): fix perms of created folders (fixes#9626) (#10105)
refactor: use slices.Contains to simplify code (#10121)
The copier routine refactor resulted in bad buffer pool handling,
putting a buffer back into the pool twice. This simplifies and removes
the danger prone Upgrade() method.
Flattened the copier code more. Also removing and moving some
parameters/return values to simplify things. Generally rely less on
return values, e.g. by handling errors right away and using `state` to
do the right thing (e.g. abort on failure).
Supposed to be a refactor without any behaviour changes, except for
fixing a tiny regression on folder order: We used to try copying from
the same folder first, but lost that property at some point (also sent a
PR fixing only that, I'd merge that first making this refactor only).
Where `folderFilesystems` and `folders` is built, there's a comment
spelling out the purpose: To have the same folder first, as that's the
most likely to get hits. Plus a copy is possibly more efficient than
from another folder, e.g. if that's on a different filesystem. We lost
that behaviour during some unrelated change.
(Also sneaking in a comment fix on yesterdays change.)
This is a draft because I haven't adjusted all the tests yet, I'd like
to get feedback on the change overall first, before spending time on
that.
In my opinion the main win of this change is in it's lower complexity
resp. fewer moving parts. It should also be faster as it only does one
query instead of two, but I have no idea if that's practically
relevant.
This also mirrors the v1 DB, where a block map key had the name
appended. Not that this is an argument for the change, it was mostly
reassuring me that I might not be missing something key here
conceptually (I might still be of course, please tell me :) ).
And the change isn't mainly intrinsically motivated, instead it came
up while fixing a bug in the copier. And the nested nature of that code
makes the fix harder, and "un-nesting" it required me to understand
what's happening. This change fell out of that.
This adds a simple delay to the process for starting the pull, by
default one second. In practice this means we're likely to wait for
initial index transfer, or multiple messages sent as part of a larger
change. This is better because we're more likely to have the whole
change for the purpose of handling renames etc, and also it's more
efficient to do one larger puller iteration instead of multiple while
also processing changes.
It does however introduce a certain amount of delay into the sync
process, so it can be tuned down or turned off entirely.
This changes the database structure to use one database per folder, with
a small main database to coordinate. Reverts the prior change to buffer
all files in memory when pulling, meaning there is now a phase where the
WAL file will grow significantly, at least for initial sync of folders
with many directories.
---------
Co-authored-by: bt90 <btom1990@googlemail.com>
* main:
fix(config): properly apply defaults when reading folder configuration (#10034)
chore(model): add metric for total number of conflicts (#10037)
build: replace underscore in Debian version (#10032)
Switch the database from LevelDB to SQLite, for greater stability and
simpler code.
Co-authored-by: Tommy van der Vorst <tommy@pixelspark.nl>
Co-authored-by: bt90 <btom1990@googlemail.com>
We've had weak/rolling hashing in the code for quite a while. It was a
popular request for a while, based on the belief that rsync does this
and we should too. However, the benefit is quite small; we save on
average about 0.8% of transferred blocks over the population as a whole:
<img width="974" alt="Screenshot 2025-03-28 at 17 09 02"
src="https://github.com/user-attachments/assets/bbe10dea-f85e-4043-9823-7cef1220b4a2"
/>
This would be fine if the cost was comparably low, however the downside
of attempting rolling hash matching is that we (by default) do a
complete file read on the destination in order to look for matches
before we starting pulling blocks for the file. For any larger file this
means a sometimes long, I/O-intensive pause before the file starts
syncing, for usually no benefit.
I propose we simply rip off the bandaid and save the effort.
### Purpose
This is a [new function](https://pkg.go.dev/slices@go1.21.0#Contains)
added in the go1.21 standard library, which can make the code more
concise and easy to read.
### Testing
Describe what testing has been done, and how the reviewer can test the
change
if new tests are not included.
### Screenshots
If this is a GUI change, include screenshots of the change. If not,
please
feel free to just delete this section.
### Documentation
If this is a user visible change (including API and protocol changes),
add a link here
to the corresponding pull request on https://github.com/syncthing/docs
or describe
the documentation changes necessary.
## Authorship
Your name and email will be added automatically to the AUTHORS file
based on the commit metadata.
Signed-off-by: dashangcun <907225865@qq.com>
Co-authored-by: Jakob Borg <jakob@kastelo.net>
Currently, this just results in a very ambiguous `setting metadata: lookup
failed` while it could report what it's looking up and why it failed
(not found, etc).
This was broken in the Protobuf refactor. The reason is that with the
previous library, generated types would have a JSON marshalling method
that would automatically return strings for enums. In the current
library you need to use the jsonpb marshaller for that, but these are
hand crafted structs so we can't do that. The easy solution is to just
use strings directly, since this is an API-only type anyway.
At a high level, this is what I've done and why:
- I'm moving the protobuf generation for the `protocol`, `discovery` and
`db` packages to the modern alternatives, and using `buf` to generate
because it's nice and simple.
- After trying various approaches on how to integrate the new types with
the existing code, I opted for splitting off our own data model types
from the on-the-wire generated types. This means we can have a
`FileInfo` type with nicer ergonomics and lots of methods, while the
protobuf generated type stays clean and close to the wire protocol. It
does mean copying between the two when required, which certainly adds a
small amount of inefficiency. If we want to walk this back in the future
and use the raw generated type throughout, that's possible, this however
makes the refactor smaller (!) as it doesn't change everything about the
type for everyone at the same time.
- I have simply removed in cold blood a significant number of old
database migrations. These depended on previous generations of generated
messages of various kinds and were annoying to support in the new
fashion. The oldest supported database version now is the one from
Syncthing 1.9.0 from Sep 7, 2020.
- I changed config structs to be regular manually defined structs.
For the sake of discussion, some things I tried that turned out not to
work...
### Embedding / wrapping
Embedding the protobuf generated structs in our existing types as a data
container and keeping our methods and stuff:
```
package protocol
type FileInfo struct {
*generated.FileInfo
}
```
This generates a lot of problems because the internal shape of the
generated struct is quite different (different names, different types,
more pointers), because initializing it doesn't work like you'd expect
(i.e., you end up with an embedded nil pointer and a panic), and because
the types of child types don't get wrapped. That is, even if we also
have a similar wrapper around a `Vector`, that's not the type you get
when accessing `someFileInfo.Version`, you get the `*generated.Vector`
that doesn't have methods, etc.
### Aliasing
```
package protocol
type FileInfo = generated.FileInfo
```
Doesn't help because you can't attach methods to it, plus all the above.
### Generating the types into the target package like we do now and
attaching methods
This fails because of the different shape of the generated type (as in
the embedding case above) plus the generated struct already has a bunch
of methods that we can't necessarily override properly (like `String()`
and a bunch of getters).
### Methods to functions
I considered just moving all the methods we attach to functions in a
specific package, so that for example
```
package protocol
func (f FileInfo) Equal(other FileInfo) bool
```
would become
```
package fileinfos
func Equal(a, b *generated.FileInfo) bool
```
and this would mostly work, but becomes quite verbose and cumbersome,
and somewhat limits discoverability (you can't see what methods are
available on the type in auto completions, etc). In the end I did this
in some cases, like in the database layer where a lot of things like
`func (fv *FileVersion) IsEmpty() bool` becomes `func fvIsEmpty(fv
*generated.FileVersion)` because they were anyway just internal methods.
Fixes#8247
I came accross this in another context and didn't investigate fully, but
literally ten lines above this code, in another method, we say that
filesets _must_ be created under the lock. It's either one or the other
and I'm taking the safer route here.
---------
Co-authored-by: Simon Frei <freisim93@gmail.com>
### Purpose
As discussed in #9686
Syncthing currently does not check folderstate on remote device before
pulling. If no devices have a valid folderstate (i.e all devices have
the folder paused) it will still attempt to pull. On large folders this
will cause a hanging "Syncing" status.
This checks whether at least one connected device has the file available
and has a valid folderstate.
### Testing
Tested locally on multiple devices.
We're new to Go (all our stuff is Python) so please bear with!
Interested if there may be a better place to slot this in.
Thanks,
Jon
---------
Co-authored-by: Simon Frei <freisim93@gmail.com>