Commit Graph

46 Commits

Author SHA1 Message Date
Benjamin Bouvier
7a762035f1 feat(sdk): store the thread subscription bumpstamp and implement the correct upsert semantics 2025-09-01 10:38:34 +02:00
Benjamin Bouvier
d66733052a feat(event cache): add indexes for finding related events 2025-08-26 16:25:56 +02:00
Damir Jelić
139673810f Remember the public Curve25519 key of the sender of the historic room key bundle 2025-08-08 09:19:19 +02:00
Benjamin Bouvier
1a5cb2beb8 feat(stores): allow saving thread subscriptions 2025-07-30 12:07:07 +02:00
Ivan Enderlin
6f42210d6a feat(sqlite): Improve throughput of load_all_chunks_metadata by 1140%.
This patch changes the query used by
`SqliteEventCacheStore::load_all_chunks_metadata`. It was the cause of
severe slowness. The new query improves the throughput by +1140% and the
time by -91.916%. The benchmark will follow in the next patch.

Metrics for 10'000 events (with 1 gap every 80 events).

- Before:
  - throughput: 20.686 Kelem/s,
  - time: 483.43 ms.
- After:
  - throughput: 253.52 Kelem/s,
  - time: 39.478 ms.

This query will visit all chunks of a linked chunk with ID
`hashed_linked_chunk_id`. For each chunk, it collects its ID
(`ChunkIdentifier`), previous chunk, next chunk, and number of
events (`num_events`). If it's a gap, `num_events` is equal to 0,
otherwise it counts the number of events in `event_chunks` where
`event_chunks.chunk_id = linked_chunks.id`.

Why not using a `(LEFT) JOIN` + `COUNT`? Because for gaps, the entire
`event_chunks` will be traversed every time. It's extremely inefficient.
To speed that up, we could use an `INDEX` but it will consume more
storage space. Finally, traversing an `INDEX` boils down to traverse a
B-tree, which is O(log n), whilst this `CASE` approach is O(1). This
solution is nice trade-off and offers great performance.
2025-07-21 10:31:44 +02:00
Benjamin Bouvier
ebcb74a86d refactor!(event cache): introduce LinkedChunkId in the backends (#5182)
In a "soon" future, threads have their own linked chunk. All our code
has been written with the fact that a linked chunk belong to *a room* in
mind, so it needs some biggish update. Fortunately, most of the changes
are mechanical, so they should be rather easy to review.

Part of #4869, namely #5122.
2025-06-09 13:26:46 +00:00
Richard van der Hoff
e89c45ba42 sqlite: store data on received room key bundles 2025-04-23 19:59:24 +01:00
Benjamin Bouvier
d30dc7177f refactor(sqlite): rename gaps to gap_chunks / events_chunks to event_chunks
And some comments have been tweaked too.
2025-04-02 13:26:15 +02:00
Benjamin Bouvier
45f1dca6a3 refactor(event cache): don't return a position in find_event
Getting the position when reading an event is no longer required:

- the only use case for reading the position out of the event cache was
when we wanted to replace a redacted item into the linked chunk; now
with save_event(), we can replace it without having to know its
position.

As an extra measure of caution, I've also included the room_id in the
`events` table, next to the event_id, so that looking for an event is
still restricted to a single room.
2025-04-02 13:26:15 +02:00
Benjamin Bouvier
03f5d0222e refactor(event cache): store the events' content independently of their position in a chunk 2025-04-02 13:26:15 +02:00
Ivan Enderlin
3d653d3fdc fix(sqlite): Design a new schema to get faster insertions.
This patch is twofold. First off, it provides a new schema allowing to
improve the performance of `SqliteEventCacheStore` for 100_000 events
from 6.7k events/sec to 284k events/sec on my machine.

Second, it now assumes that `EventCacheStore` does NOT store invalid
events. It was already the case, but the SQLite schema was not rejecting
invalid event in case some were handled. It's now explicitely forbidden.
2025-03-05 13:57:08 +01:00
Ivan Enderlin
6c57003d17 feat(sqlite) Add an index on events.event_id and .room_id.
This patch adds an index on `events.event_id` and on `events.room_id`
so that queries on this column are faster. It mostly happens for the
`Deduplicator`, which runs for every backwards pagination or sync.

This patch also updates the query in `filter_duplicated_events` to
sort event by their `chunk_id` and `position` so that the results are
constant, it helps when testing.
2025-02-19 11:50:23 +01:00
Kévin Commaille
df9c355aed Merge branch 'main' into media-retention-policy
Signed-off-by: Kévin Commaille <zecakeh@tedomum.fr>
2025-01-28 15:46:55 +01:00
Kévin Commaille
8ca5983093 feat(sqlite): Implement EventCacheStoreMedia for SqliteEventCacheStore
Signed-off-by: Kévin Commaille <zecakeh@tedomum.fr>
2025-01-28 15:38:18 +01:00
Benjamin Bouvier
3428494468 refactor: rename SyncTimelineEvent to TimelineEvent 2025-01-22 20:24:48 +01:00
Daniel Salinas
9641aa9082 feat(send queue): Add an enqueued time to to-be-sent events (#4385)
Add a new created_at to the send_queue_events and
dependent_send_queue_events stored records. This will allow clients to
understand how stale a pending message might be in the event that the
queue encounters and error and becomes wedged.

This change is exposed through the FFI on the `EventTimelineItem` struct
as a new optional field named `local_created_at`. It will be `None` for
any Remote event, and `Some` for Local events (except for those that
were enqueued before the migrations were run).

Signed-off-by: Daniel Salinas

---------

Signed-off-by: Daniel Salinas <zzorba@users.noreply.github.com>
Co-authored-by: Daniel Salinas <danielsalinas@daniels-mbp-2.myfiosgateway.com>
Co-authored-by: Benjamin Bouvier <benjamin@bouvier.cc>
Co-authored-by: Daniel Salinas <danielsalinas@Daniels-MBP-2.attlocal.net>
2025-01-13 16:41:05 +00:00
Benjamin Bouvier
9ed65bc321 task(event cache): address review points 2024-11-28 11:48:46 +01:00
Benjamin Bouvier
9bea0cff24 feat(event cache): implement the sqlite backend for events 2024-11-28 11:48:46 +01:00
Benjamin Bouvier
c02d8cee77 feat!(send queue): add a priority field to maintain ordering of sending
Prior to this patch, the send queue would not maintain the ordering of
sending a media *then* a text, because it would push back a dependent
request graduating into a queued request.

The solution implemented here consists in adding a new priority column
to the send queue, defaulting to 0 for existing events, and use higher
priorities for the media uploads, so they're considered before other
requests.

A high priority is also used for aggregation events that are sent late,
so they're sent as soon as possible, before other subsequent events.
2024-11-14 12:00:08 +01:00
Ivan Enderlin
37304c8cdc refactor: Implement try_take_leased_lock on SqliteEventCacheStore 2024-11-06 15:03:50 +01:00
Benjamin Bouvier
1f2e8c5007 refactor!(event cache store): store the serialized QueuedRequestKind, not a raw event
Changelog: The send queue will now store a serialized
 `QueuedRequestKind` instead of a raw event, which breaks the format.
 As a result, all send queues have been emptied.
2024-11-04 17:42:47 +01:00
Benjamin Bouvier
06e6cba156 chore(event cache store): Support multiple parent key types for dependent requests
This makes it possible to have different kinds of *parent key*, to
update a dependent request. A dependent request waits for the parent key
to be set, before it can be acted upon; before, it could only be an
event id, because a dependent request would only wait for an event to be
sent. In a soon future, we're going to support uploading medias as
requests, and some subsequent requests will depend on this, but won't be
able to rely on an event id (since an upload doesn't return an
event/event id).

Since this changes the format of `DependentQueuedRequest`, which is
directly serialized into the state stores, I've also cleared the table,
to not have to migrate the data in there. Dependent requests are
supposed to be transient anyways, so it would be a bug if they were many
of them in the queue.

Since a migration was needed anyways, I've also removed the `rename`
annotations (that supported a previous format) for the
`DependentQueuedRequestKind` enum.
2024-11-04 17:42:47 +01:00
Benjamin Bouvier
9c858c1208 refactor(base): rename all send-queue related "events" to "requests"
Changelog: Renamed all the send-queue related "events" to "requests", so
  as to generalize usage of the send queue to not-events (e.g. medias,
  redactions, etc.).
2024-10-29 18:15:10 +01:00
Valere
31e9600078 feat(send_queue): Persist failed to send errors (#4137)
Modify the SendQueue in order to persist the error that cause the event
to fail to send as a `QueueWedgeError`. The `QueueWedgeError` is not a
1:1 mapping for all kinds of errors, but holds variant and information
that the client can react to in order to propose "quick fixes"/solution
before retrying to send.

Fixes https://github.com/matrix-org/matrix-rust-sdk/issues/3973 
Also fixes https://github.com/element-hq/element-x-ios/issues/3287
because when a timeline reset occurs the fail to send reason is also
lost.

This PR starts with a refactoring commit
e7696003e8
to introduce the new `QueueWedgedError` and move the logic that was in
the ffi layer to convert api errors to SendState error variant. This
`QueueWedgedError` can be directly use in the `SendingFailed` variant
and expose to ffi.

Second commit
109c133746
adds the persistence, `QueuedEvent` now have an optional error field
instead of a `is_weged` boolean. Same for LocalEchoContent::Event.
Adds also Migration for sqlite and indexeddb

Co-authored-by: Benjamin Bouvier <benjamin@bouvier.cc>

Changelog:  We now persist the error that caused an event to fail to send. The error `QueueWedgeError` contains info that client can use to try to resolve the problem when the error is not automatically retry-able. Some breaking changes occurred in the FFI layer for `timeline::EventSendState`, `SendingFailed` now directly contains the wedge reason enum; use it in place of the removed variant of `EventSendState`.
2024-10-23 12:48:55 +02:00
Richard van der Hoff
12653fb2b6 sqlite: add new curve_key and sender_data_type columns 2024-09-02 18:07:38 +01:00
Kévin Commaille
01e2db1f52 base: Move media cache to new EventCacheStore trait (#3858)
Allows to save media in a different path than the state store.

This adds a "last_access" field to the SQLite implementation, to prepare
for future work on a media retention policy.

This removes the IndexedDB media cache implementation, because as far as
I know it is currently unused, and I have no idea how to implement
efficiently the planned media retention policy with a key-value store.

Closes #1810.

- [x] Public API changes documented in changelogs (optional)

---------

Signed-off-by: Kévin Commaille <zecakeh@tedomum.fr>
2024-08-22 10:36:43 +02:00
Benjamin Bouvier
2eb6930988 send queue: add an own transaction id for dependent events
The previously named `transaction_id` is also renamed to
`parent_transaction_id` to make it clearer.
2024-07-24 17:54:25 +02:00
Benjamin Bouvier
9a7f18c62c state store: add dependent queued events tables and operations 2024-07-24 17:54:25 +02:00
Benjamin Bouvier
62801f1a6c state store: change schema for send_queue_events table
It turns out that so as to be able to read the room ids, they need to be
*values*, not only *keys* (since keys are one-way hashed). This means we
need to duplicate the room_id field in indexeddb/sqlite, so each entry
contains both the room_id as a key (for queries) and as a value (to
return it).

Since there's no meaningful migration we can apply, the way to go is to
drop the pending events table and recreate it from the ground up. It is
assumed that no one has used the store on indexeddb; otherwise the
workaround would be to drop and recreate it.
2024-06-26 19:42:14 +02:00
Benjamin Bouvier
74b770f4d6 send queue: use state-store backed storage for remembering events to be sent 2024-06-24 13:56:10 +02:00
Damir Jelić
6ea0d886f7 Introduce a secret inbox
Up until now, users had to listen for to-device events to check for
secrets that were received as an `m.secret.send` event.

This has a bunch of shortcomings:
    1. Once the has been given to the consumer, it's gone and can't be
       retrieved anymore. Secrets may get lost if an app restart happens
       before the consumer decides what to do with it.
    2. The consumer can't be sure if the event was received in a
       secure manner.

This commit ads a inbox for our received secrets where we will long-term
store all secrets we receive until the user decides to delete them.

It's deemed fine to store all secrets, since we only accept secrets we
have requested and if they have been received from a verified device of
ours.
2023-07-19 14:27:18 +02:00
Benjamin Bouvier
73ad51879a feat: implement time lease based locks for the CryptoStore (#2140)
This implements a new time lease based lock for the `CryptoStore`, that doesn't require explicit unlocking, so that's more robust in the context of #1928, where any process may die because the device is running out of battery, or unexpected flows cause a lock to not be released properly in one or the other process.

```
//! This is a per-process lock that may be used only for very specific use
//! cases, where multiple processes might concurrently write to the same
//! database at the same time; this would invalidate crypto store caches, so
//! that should be done mindfully. Such a lock can be acquired multiple times by
//! the same process, and it remains active as long as there's at least one user
//! in a given process.
//!
//! The lock is implemented using time-based leases to values inserted in a
//! crypto store. The store maintains the lock identifier (key), who's the
//! current holder (value), and an expiration timestamp on the side; see also
//! `CryptoStore::try_take_leased_lock` for more details.
//!
//! The lock is initially acquired for a certain period of time (namely, the
//! duration of a lease, aka `LEASE_DURATION_MS`), and then a "heartbeat" task
//! renews the lease to extend its duration, every so often (namely, every
//! `EXTEND_LEASE_EVERY_MS`). Since the tokio scheduler might be busy, the
//! extension request should happen way more frequently than the duration of a
//! lease, in case a deadline is missed. The current values have been chosen to
//! reflect that, with a ratio of 1:10 as of 2023-06-23.
//!
//! Releasing the lock happens naturally, by not renewing a lease. It happens
//! automatically after the duration of the last lease, at most.
```

---

* feat: implement a time lease based lock for the crypto store
* feat: switch the crypto-store lock a time-leased based one
* chore: fix CI, don't use unixepoch in sqlite and do time math in rust
* chore: dummy implementation in indexeddb, don't run lease locks tests there
* feat: in NSE, wait the duration of a lease if first attempt to unlock failed
* feat: immediately release the lock when there are no more holders
* chore: clippy
* chore: add comment about atomic sanity
* chore: increase sleeps in timeline queue tests?
* feat: lower lease and renew durations
* feat: keep track of the extend-lease task
* fix: increment num_holders when acquiring the lock for the first time
* chore: reduce indent + abort prev renew task on non-wasm + add logs
2023-06-29 13:05:44 +00:00
Kévin Commaille
ea219d836e base: Do not separate stripped room info
Since stripped and non-stripped room infos use the same types,
the separation is not necessary anymore.

Signed-off-by: Kévin Commaille <zecakeh@tedomum.fr>
2023-06-14 10:45:37 +02:00
Kévin Commaille
6ca6a9a84a Implement SQLite state store
Co-authored-by: Jonas Platte <jplatte@matrix.org>
Signed-off-by: Kévin Commaille <zecakeh@tedomum.fr>
2023-04-26 17:51:11 +02:00
Jonas Platte
40272c5989 refactor: Move sqlite crypto store migrations in new subdirectory 2023-04-26 17:51:11 +02:00
Damir Jelić
cd865d21c3 Drop outbound group sessions in the SQLite store
The format of the outbound group session struct has changed. We nowadays correctly rotate the group session if we can't restore it, but it's still good to avoid logging the error in this case.
2023-04-13 14:48:14 +00:00
Damir Jelić
316b29c95f Merge branch 'main' into valere/msc_2399 2023-04-13 10:59:05 +02:00
Damir Jelić
c73aeef2ed Fix the serialization of outbound group sessions in the SQLite store
The SQLite crypto store uses rmp_serde to serialize all the data we're
going to store. This works nicely for most things, one exception to this
is the OutboundGroupSession type.

The OutboundGroupSession type stores to-device requests to ensure that
the session doesn't get used before it is shared with the whole group
and to ensure that the to-device requests get restored if the
session gets restored after an application restart.

The to-device requests type critically contain `Raw<AnyToDeviceEvent>`,
the `Raw` type here being the serde_json::Raw type. rmp_serde seems to
serialize this just fine, but later deserialization fails.

We're avoiding the issue by using serde_json to serialize the
OutboundGroupSession.
2023-04-11 10:08:06 +02:00
Damir Jelić
61ea15eb39 Merge branch 'main' into valere/msc_2399 2023-03-15 19:31:04 +01:00
Andy Uhnak
283137f190 Introduce RoomSettings 2023-03-14 10:27:11 +00:00
Andy Uhnak
7e7c91699f Generic key-value API 2023-03-14 10:27:11 +00:00
Andy Uhnak
b4b111f91f Store encryption settings 2023-03-14 10:27:11 +00:00
valere
0ec3aca727 refactor(withheld) store withheld as ShareInfo
and store no_olm sent in device
2023-03-07 18:46:52 +01:00
valere
3dbad4229a feat(withheld): sql store support 2023-02-21 15:05:14 +01:00
Jonas Platte
e12b9e3027 refactor(sqlite): Encode olm hashes with rmp instead of serde_json
… like everything else in the sqlite crypto store.
2023-02-08 15:44:17 +01:00
Anderas
e9cef35f99 Add matrix-sdk-sqlite with a CryptoStore implementation
Note about "Write-Ahead Log" (WAL) mode: The SQLite WAL mode has a
bunch of advantages that are quite nice to have:

1. WAL is significantly faster in most scenarios.
2. WAL provides more concurrency as readers do not block writers and a
   writer does not block readers. Reading and writing can proceed
   concurrently.
3. Disk I/O operations tends to be more sequential using WAL.
4. WAL uses many fewer fsync() operations and is thus less vulnerable
   to problems on systems where the fsync() system call is broken.

The downsides of WAL mode don't really affect us. So let's turn it on.

More info: https://www.sqlite.org/wal.html

Co-authored-by: Jonas Platte <jplatte@matrix.org>
Co-authored-by: Damir Jelić <poljar@termina.org.uk>
2023-02-01 15:06:59 +01:00