6 Commits

Author SHA1 Message Date
Jakob Borg
9ee208b441 chore(sqlite): use normalised tables for file names and versions (#10383)
This changes the files table to use normalisation for the names and
versions. The idea is that these are often common between all remote
devices, and repeating an integer is more efficient than repeating a
long string. A new benchmark bears this out; for a database with 100k
files shared between 31 devices, with some worst case assumption on
version vector size, the database is reduced in size by 50% and the test
finishes quicker:

    Current:
        db_bench_test.go:322: Total size: 6263.70 MiB
    --- PASS: TestBenchmarkSizeManyFilesRemotes (1084.89s)

    New:
        db_bench_test.go:326: Total size: 3049.95 MiB
    --- PASS: TestBenchmarkSizeManyFilesRemotes (776.97s)

The other benchmarks end up about the same within the margin of
variability, with one possible exception being that RemoteNeed seems to
be a little slower on average:

                                          old files/s   new files/s
    Update/n=RemoteNeed/size=1000-8            5.051k        4.654k
    Update/n=RemoteNeed/size=2000-8            5.201k        4.384k
    Update/n=RemoteNeed/size=4000-8            4.943k        4.242k
    Update/n=RemoteNeed/size=8000-8            5.099k        3.527k
    Update/n=RemoteNeed/size=16000-8           3.686k        3.847k
    Update/n=RemoteNeed/size=30000-8           4.456k        3.482k

I'm not sure why, possibly that query can be optimised anyhow.

Signed-off-by: Jakob Borg <jakob@kastelo.net>
2025-09-12 09:27:41 +00:00
Jakob Borg
2306c6d989 chore(db): benchmark output, migration blocks/s output (#10320)
Just minor tweaks
2025-08-29 14:58:38 +00:00
Jakob Borg
c918299eab refactor(db): slightly improve insert performance (#10318)
This just removes an unnecessary foreign key constraint, where we
already do the garbage collection manually in the database service.
However, as part of getting here I tried a couple of other variants
along the way:

- Changing the order of the primary key from `(hash, blocklist_hash,
idx)` to `(blocklist_hash, idx, hash)` so that inserts would be
naturally ordered. However this requires a new index `on blocks (hash)`
so that we can still look up blocks by hash, and turns out to be
strictly worse than what we already have.
- Removing the primary key entirely and the `WITHOUT ROWID` to make it a
rowid table without any required order, and an index as above. This is
faster when the table is small, but becomes slower when it's large (due
to dual indexes I guess).

These are the benchmark results from current `main`, the second
alternative below ("Index(hash)") and this proposal that retains the
combined primary key ("combined"). Overall it ends up being about 65%
faster.

<img width="764" height="452" alt="Screenshot 2025-08-29 at 14 36 28"
src="https://github.com/user-attachments/assets/bff3f9d1-916a-485f-91b7-b54b477f5aac"
/>

Ref #10264
2025-08-29 15:26:23 +02:00
Simon Frei
e54f51c9c5 chore(db): cleanup DB in tests and remove OpenTemp (#10282)
Filled up my tmpfs with test DBs when running benchmarks :)
2025-08-24 09:58:56 +00:00
Simon Frei
a259a009c8 chore(db): adjust db bench name to improve benchstat grouping (#10283)
The benchstat tool allows custom grouping when comparing with what it
calls "sub-name configuration keys":

https://pkg.go.dev/golang.org/x/perf@v0.0.0-20250813145418-2f7363a06fe1/cmd/benchstat#hdr-Configuring_comparisons

That's quite useful for these benchmarks, as we basically have two
independent configs: The type of benchmark and the size. Real example
usage for the prepared named statements PR (results are rubbish for
unrelated reasons):

```
$ benchstat -row ".name /n" bench-main.out bench-prepared.out
goos: linux
goarch: amd64
pkg: github.com/syncthing/syncthing/internal/db/sqlite
cpu: Intel(R) Core(TM) i5-8250U CPU @ 1.60GHz
                            │ bench-main-20250823_014059.out │   bench-prepared-20250823_022849.out   │
                            │             sec/op             │     sec/op      vs base                │
Update Insert100Loc                           248.5m ±  8% ¹   157.7m ±  7% ¹  -36.54% (p=0.000 n=50)
Update RepBlocks100                           253.7m ±  4% ¹   163.6m ±  7% ¹  -35.49% (p=0.000 n=50)
Update RepSame100                            130.42m ±  3% ¹   60.26m ±  2% ¹  -53.80% (p=0.000 n=50)
Update Insert100Rem                           38.54m ±  5% ¹   21.94m ±  1% ¹  -43.07% (p=0.000 n=50)
Update GetGlobal100                          10.897m ±  4% ¹   4.231m ±  1% ¹  -61.17% (p=0.000 n=50)
Update LocalSequenced                         7.560m ±  5% ¹   3.124m ±  2% ¹  -58.68% (p=0.000 n=50)
Update GetDeviceSequenceLoc                  17.554µ ±  6% ¹   8.400µ ±  1% ¹  -52.15% (n=50)
Update GetDeviceSequenceRem                  17.727µ ±  4% ¹   8.237µ ±  2% ¹  -53.54% (p=0.000 n=50)
Update RemoteNeed                              4.147 ± 77% ¹    1.903 ± 78% ¹  -54.11% (p=0.000 n=50)
Update LocalNeed100Largest                   21.516m ± 22% ¹   9.312m ± 47% ¹  -56.72% (p=0.000 n=50)
geomean                                       15.35m           7.486m          -51.22%
¹ benchmarks vary in .fullname
```
2025-08-23 16:12:55 +02:00
Jakob Borg
025905fcdf chore: switch database engine to sqlite (fixes #9954) (#9965)
Switch the database from LevelDB to SQLite, for greater stability and
simpler code.

Co-authored-by: Tommy van der Vorst <tommy@pixelspark.nl>
Co-authored-by: bt90 <btom1990@googlemail.com>
2025-03-29 13:50:08 +01:00