This patch updates `chunk_large_query_over`. The function works well when the data actually needs to be chunked over several queries. However, when the data doesn't need to be chunked, a quicker path is possible, one that involves neither two `Vec` allocations, nor a `split_off` of one `Vec`, nor a `Vec::extend` (which moves data in memory). This quicker path, which is also the most likely to be hit, simply does a single `Vec` allocation, and that's it.
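A minimal sketch of the idea, with illustrative names, a made-up parameter limit, and a placeholder query runner (not the actual implementation):
```
const MAX_PARAMS: usize = 999; // stand-in for the database's parameter limit

// Placeholder for executing the query over one set of parameters.
fn run_query(params: &[i64]) -> Vec<String> {
    params.iter().map(|p| p.to_string()).collect()
}

fn query_possibly_chunked(params: Vec<i64>) -> Vec<String> {
    if params.len() <= MAX_PARAMS {
        // Quick path: no `split_off`, no `extend`; the only `Vec`
        // allocation is the one holding the results.
        return run_query(&params);
    }

    // Slow path: run the query over chunks of parameters.
    params.chunks(MAX_PARAMS).flat_map(run_query).collect()
}
```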
Up until now, users had to listen for to-device events to check for secrets received as `m.secret.send` events.
This has a couple of shortcomings:
1. Once the secret has been given to the consumer, it's gone and can't be retrieved anymore. Secrets may get lost if the app restarts before the consumer decides what to do with them.
2. The consumer can't be sure that the event was received in a secure manner.
This commit adds an inbox for received secrets, where we store all secrets long-term until the user decides to delete them.
It's deemed fine to store all secrets, since we only accept secrets that we have requested and that have been received from a verified device of ours.
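A hypothetical sketch of the consumer-facing model; the types and method names here are illustrative assumptions, not the actual API:
```
// Illustrative only: a long-term inbox the consumer can re-read after
// an app restart and empty once a secret has been handled.
struct GossippedSecret {
    name: String,  // e.g. "m.megolm_backup.v1"
    value: String, // the secret itself, received via `m.secret.send`
}

struct SecretInbox {
    secrets: Vec<GossippedSecret>,
}

impl SecretInbox {
    /// All secrets received so far and not yet deleted; they survive
    /// restarts because they're persisted in the store.
    fn pending(&self) -> &[GossippedSecret] {
        &self.secrets
    }

    /// Called by the consumer once a secret has been durably imported.
    fn delete(&mut self, name: &str) {
        self.secrets.retain(|s| s.name != name);
    }
}
```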
This patch introduces a new `chunk_large_query_over` function. Imagine there is a _dynamic_ query that runs with a potentially large number of parameters, so large that the maximum number of parameters can be hit. Then this function is for you: it executes the query over chunks of parameters.
`chunk_large_query_over` uses `Vec::split_off` to avoid cloning `Key`s
as much as possible. It's difficult to use references here because of
the async nature of `SqliteObjectExt::prepare`.
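A sketch of the chunking loop, assuming a placeholder async query runner (names and the limit are illustrative):
```
const MAX_PARAMS: usize = 999; // stand-in for the maximum number of parameters

// Placeholder for preparing and executing the statement on one chunk.
async fn run_query(keys: &[String]) -> Vec<String> {
    keys.to_vec()
}

async fn chunk_large_query_over(mut keys: Vec<String>) -> Vec<String> {
    let mut results = Vec::new();

    while !keys.is_empty() {
        // `split_off` moves the tail out of `keys` without cloning any
        // key; `keys` keeps at most `MAX_PARAMS` elements for this run.
        let tail = if keys.len() > MAX_PARAMS {
            keys.split_off(MAX_PARAMS)
        } else {
            Vec::new()
        };

        results.extend(run_query(&keys).await);
        keys = tail;
    }

    results
}
```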
This patch updates `get_kv_blobs`, `get_room_infos`,
`get_maybe_stripped_state_events_for_keys`, `get_profiles`,
`get_user_ids`, and `get_display_names` to use this new function.
This patch finally adds the `repeat_vars` function, replacing the `vec!["?"; n].join(", ")` pattern with a more efficient version that does fewer memory allocations.
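A sketch of what such a helper can look like: a single pre-sized `String` instead of an intermediate `Vec<&str>` plus a `join` (illustrative, not the exact implementation):
```
/// Returns `count` comma-separated SQL placeholders, e.g. "?, ?, ?".
fn repeat_vars(count: usize) -> String {
    assert!(count > 0, "repeat_vars called with zero variables");
    // One allocation: "?" plus ", ?" for every following placeholder.
    let mut vars = String::with_capacity(1 + (count - 1) * 3);
    vars.push('?');
    for _ in 1..count {
        vars.push_str(", ?");
    }
    vars
}

fn main() {
    assert_eq!(repeat_vars(3), "?, ?, ?");
}
```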
This implements a new time-lease-based lock for the `CryptoStore` that doesn't require explicit unlocking, making it more robust in the context of #1928, where any process may die because the device is running out of battery, or where unexpected flows may cause a lock to never be released properly by one or the other process.
```
//! This is a per-process lock that may be used only for very specific use
//! cases, where multiple processes might concurrently write to the same
//! database at the same time; this would invalidate crypto store caches, so
//! that should be done mindfully. Such a lock can be acquired multiple times by
//! the same process, and it remains active as long as there's at least one user
//! in a given process.
//!
//! The lock is implemented using time-based leases to values inserted in a
//! crypto store. The store maintains the lock identifier (key), who the
//! current holder is (value), and an expiration timestamp on the side; see
//! also `CryptoStore::try_take_leased_lock` for more details.
//!
//! The lock is initially acquired for a certain period of time (namely, the
//! duration of a lease, aka `LEASE_DURATION_MS`), and then a "heartbeat" task
//! renews the lease to extend its duration, every so often (namely, every
//! `EXTEND_LEASE_EVERY_MS`). Since the tokio scheduler might be busy, the
//! extension request should happen way more frequently than the duration of a
//! lease, in case a deadline is missed. The current values have been chosen to
//! reflect that, with a ratio of 1:10 as of 2023-06-23.
//!
//! Releasing the lock happens naturally, by not renewing a lease. It happens
//! automatically after the duration of the last lease, at most.
```
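A rough sketch of the leasing semantics described above, with a simplified in-memory store standing in for the crypto store (the real `try_take_leased_lock` works against the database; the constants are just copied from the text):
```
use std::time::{Duration, Instant};

const LEASE_DURATION_MS: u64 = 500;
const EXTEND_LEASE_EVERY_MS: u64 = 50; // 1:10 ratio, as described above

// Simplified in-memory stand-in for the store side of the lock.
struct LeaseStore {
    holder: Option<(String, Instant)>,
}

impl LeaseStore {
    /// Takes or renews the lease if it's free, expired, or already ours.
    /// The heartbeat task calls this every `EXTEND_LEASE_EVERY_MS` ms.
    fn try_take_leased_lock(&mut self, holder: &str) -> bool {
        let now = Instant::now();
        match &self.holder {
            // Someone else holds a non-expired lease: we lose.
            Some((h, expires)) if h != holder && *expires > now => false,
            // Free, expired, or already ours: (re)take the lease.
            _ => {
                let expires = now + Duration::from_millis(LEASE_DURATION_MS);
                self.holder = Some((holder.to_owned(), expires));
                true
            }
        }
    }
}
```
Releasing the lock then happens by the heartbeat simply stopping: once the last lease expires, another process's `try_take_leased_lock` can succeed.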
---
* feat: implement a time lease based lock for the crypto store
* feat: switch the crypto-store lock to a time-lease based one
* chore: fix CI, don't use unixepoch in sqlite and do time math in rust
* chore: dummy implementation in indexeddb, don't run lease locks tests there
* feat: in NSE, wait the duration of a lease if first attempt to unlock failed
* feat: immediately release the lock when there are no more holders
* chore: clippy
* chore: add comment about atomic sanity
* chore: increase sleeps in timeline queue tests?
* feat: lower lease and renew durations
* feat: keep track of the extend-lease task
* fix: increment num_holders when acquiring the lock for the first time
* chore: reduce indent + abort prev renew task on non-wasm + add logs
This implements a value-based lock in the crypto stores. The intent is to let multiple processes write into the store concurrently, while still cooperating on who does so. In particular, we need this for #1928, since we may have up to two different processes trying to write into the crypto store at the same time.
## New methods in the `CryptoStore` trait
The idea is to introduce two new methods touching **custom values** in the crypto store:
- one to atomically insert a value, only if it was missing (so, not following the `upsert` semantics used in `set_custom_value`)
- one to atomically remove a custom value
Those two operations match the semantics we want:
- take the lock only if it ain't taken already == insert an entry only if it was missing
- release the lock = remove the entry
By looking at the number of rows affected by the query, we can infer whether the insert/remove happened, that is, whether we managed to take or release the lock.
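A sketch of those two operations with `rusqlite`, inferring success from the number of affected rows (the table and column names are illustrative):
```
use rusqlite::{params, Connection, Result};

/// Takes the lock: inserts the entry only if the key is missing.
/// Returns `true` iff the insert happened, i.e. we hold the lock.
fn try_lock(conn: &Connection, key: &str, holder: &str) -> Result<bool> {
    let changed = conn.execute(
        // `OR IGNORE` makes this a no-op when the key already exists,
        // unlike the upsert semantics of `set_custom_value`.
        "INSERT OR IGNORE INTO lock_values (key, holder) VALUES (?1, ?2)",
        params![key, holder],
    )?;
    Ok(changed == 1)
}

/// Releases the lock: removes the entry, but only if we're the holder.
fn unlock(conn: &Connection, key: &str, holder: &str) -> Result<bool> {
    let changed = conn.execute(
        "DELETE FROM lock_values WHERE key = ?1 AND holder = ?2",
        params![key, holder],
    )?;
    Ok(changed == 1)
}

fn main() -> Result<()> {
    let conn = Connection::open_in_memory()?;
    conn.execute_batch("CREATE TABLE lock_values (key TEXT PRIMARY KEY, holder TEXT);")?;

    assert!(try_lock(&conn, "cross-process", "proc-a")?);
    assert!(!try_lock(&conn, "cross-process", "proc-b")?); // already taken
    assert!(unlock(&conn, "cross-process", "proc-a")?);
    Ok(())
}
```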
## High-level APIs
I've also added a high-level API, `CryptoStoreLock`, that helps manage such a lock and adds some niceties on top of it:
- exponential backoff to retry attempts at acquiring the lock, when it was already taken
- attempt to gracefully recover when the lock has been taken by an app that's been killed by the environment
- full configuration of the key / value / backoff parameters
While it'd be nice to have something like a `CryptoStoreLockGuard`, it's hard to implement without being racy, because of the `async` calls that would have to happen in the `Drop` implementation (and async drop isn't stable yet).
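A sketch of the backoff loop, assuming tokio and made-up parameters (the real `CryptoStoreLock` is configurable, as noted above):
```
use std::time::Duration;

/// Illustrative retry loop: `try_lock` stands for the atomic insert
/// from the previous sketch; all numbers here are made up.
async fn lock_with_backoff(mut try_lock: impl FnMut() -> bool) -> Result<(), &'static str> {
    let mut wait = Duration::from_millis(50);
    for _ in 0..10 {
        if try_lock() {
            return Ok(());
        }
        // Exponential backoff, capped: the current holder may be
        // another process, possibly one killed by the environment.
        tokio::time::sleep(wait).await;
        wait = (wait * 2).min(Duration::from_secs(2));
    }
    Err("timed out waiting for the cross-process lock")
}
```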
## Test program
There's also a test program in which I shamelessly show off my rudimentary Unix skills; I've put it in the `labs/` directory, but it could as well be a large integration test. A parent program initially fills a custom crypto store, then creates a `pipe()` for one-way communication with a child created with `fork()`; the parent then sends commands to the child. These commands consist of reading and writing into the crypto store, using a lock. While the child attempts to perform these operations, the parent tries hard to grab the lock at the same time. This helped figure out a few issues and make sure that cross-process locking works as intended.
Since stripped and non-stripped room infos use the same types,
the separation is not necessary anymore.
Signed-off-by: Kévin Commaille <zecakeh@tedomum.fr>
The format of the outbound group session struct has changed. Nowadays we correctly rotate the group session if we can't restore it, but it's still good to avoid logging an error in this case.
The SQLite crypto store uses rmp_serde to serialize all the data we're going to store. This works nicely for most things; one exception is the `OutboundGroupSession` type.
The OutboundGroupSession type stores to-device requests to ensure that
the session doesn't get used before it is shared with the whole group
and to ensure that the to-device requests get restored if the
session gets restored after an application restart.
The to-device request types critically contain `Raw<AnyToDeviceEvent>`, the `Raw` type being a wrapper around serde_json's `RawValue` type. rmp_serde seems to serialize this just fine, but later deserialization fails.
We're avoiding the issue by using serde_json to serialize the
OutboundGroupSession.
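A minimal illustration of the workaround, with a made-up struct standing in for the session's pickled form: `Raw`-like fields are backed by `serde_json::value::RawValue`, which round-trips through serde_json but not through rmp_serde:
```
use serde::{Deserialize, Serialize};
use serde_json::value::RawValue;

// Illustrative stand-in for a struct containing raw JSON, as the
// to-device requests in `OutboundGroupSession` do.
#[derive(Serialize, Deserialize)]
struct PickledSession {
    to_device_event: Box<RawValue>,
}

fn main() -> serde_json::Result<()> {
    let session = PickledSession {
        to_device_event: RawValue::from_string(r#"{"type":"m.room.encrypted"}"#.to_owned())?,
    };

    // Serializing with serde_json instead of rmp_serde sidesteps the
    // issue: the raw JSON round-trips losslessly.
    let bytes = serde_json::to_vec(&session)?;
    let _restored: PickledSession = serde_json::from_slice(&bytes)?;
    Ok(())
}
```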