This patch fixes an issue where 0 was used as the initial value for
the `Skip` higher-order stream in the `TimelineSubscriber`. This is
wrong, as the `SkipCount` value may have been modified before the
`TimelineSubscriber` is created.
This patch provides a test to reproduce the problem.
This patch gathers the logic of the `Timeline::subscribe` into a single
type: `TimelineSubscriber`.
The `TimelineController::subscribe_batched_and_limited` method is
renamed `subscribe` to match `Timeline::subscribe`, which makes the API
easier to grasp.
The `TimelineSubscriber` type configures the subscriber/stream in a
single place. It takes an `&ObservableItems` and a `&SkipCount`, and
configures everything. It also provides a single place to document the
behaviour of the subscriber, with the `Skip` higher-order stream.
This patch updates the pagination mechanism of the `Timeline` to support
lazy backwards pagination.
`Timeline::paginate_backwards` already behaves differently depending on
whether the timeline focus is live or focused. When it's live, the
method will first try to paginate backwards lazily, by adjusting the
`count` value of the `Skip` stream used by the `Timeline::subscribe`
method. If there are not enough items to provide, the greedy pagination
will run.
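The lazy path can be illustrated with a small sketch. The type and method names below (`LazyTimeline`, `paginate_backwards_lazily`) are hypothetical, not the SDK's actual types; the point is only that the subscriber sees `items[skip_count..]`, so revealing older items is a matter of decreasing `skip_count`, with no pagination request needed until the count reaches 0.

```rust
// Hypothetical sketch of lazy backwards pagination: the subscriber only
// sees `items[skip_count..]`, so revealing older items just decreases
// `skip_count` instead of fetching anything.
struct LazyTimeline {
    items: Vec<String>,
    skip_count: usize,
}

impl LazyTimeline {
    /// Try to reveal `count` older items lazily. Returns `true` if the
    /// whole request was served from memory, `false` if a greedy
    /// pagination is still required for the remainder.
    fn paginate_backwards_lazily(&mut self, count: usize) -> bool {
        let revealable = self.skip_count.min(count);
        self.skip_count -= revealable;
        revealable == count
    }

    /// Items currently visible to the subscriber.
    fn visible(&self) -> &[String] {
        &self.items[self.skip_count..]
    }
}

fn main() {
    let mut timeline = LazyTimeline {
        items: (0..10).map(|i| format!("event {i}")).collect(),
        skip_count: 8, // only the 2 most recent items are visible
    };
    assert_eq!(timeline.visible().len(), 2);

    // 5 more items can be served lazily, from memory.
    assert!(timeline.paginate_backwards_lazily(5));
    assert_eq!(timeline.visible().len(), 7);

    // Only 3 skipped items remain: the lazy path is exhausted, and the
    // greedy pagination must run for the rest.
    assert!(!timeline.paginate_backwards_lazily(5));
    assert_eq!(timeline.visible().len(), 10);
}
```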
This patch updates `TimelineStateTransaction::commit` to adjust the
count value of the `Skip` higher-order stream.
This patch also creates `TimelineStateTransaction::new` to simplify the
creation of a state transaction.
This patch applies the `Skip` higher-order stream on the `Timeline`
subscriber. The method `TimelineController::subscribe_batched` is
renamed `subscribe_batched_and_limited` at the same time.
The `Skip` stream uses the `TimelineMetadata::subscriber_skip_count`
observable value as the `count_stream`. The initial count value is 0.
No test needs to be changed so far.
This patch adds the `subscriber_skip_count` field to `TimelineMetadata`.
It's going to be used to define the `count` value of the `Skip`
higher-order stream that is going to be applied to the `Timeline`
subscriber.
This patch adds functions like `compute_next`,
`compute_next_when_paginating_backwards` and
`compute_next_when_paginating_forwards`, which are necessary to
correctly compute the `count` value for the `Skip` higher-order stream.
This patch adds the associated test suite.
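As a rough illustration of what such functions compute, here is a hedged sketch. The signatures and exact semantics below are assumptions for illustration, not the SDK's real `compute_next*` functions: backwards pagination prepends items before the skipped region (growing the count) and then reveals some of them (shrinking it), while forwards pagination appends after the skipped region and leaves the count untouched.

```rust
// Hypothetical sketch of skip-count arithmetic; the real `compute_next*`
// functions are more involved.

/// Backwards pagination: `newly_prepended` items land before the skipped
/// region, then `requested` items are revealed to the subscriber. The
/// count can never go below 0.
fn compute_next_when_paginating_backwards(
    current_count: usize,
    newly_prepended: usize,
    requested: usize,
) -> usize {
    (current_count + newly_prepended).saturating_sub(requested)
}

/// Forwards pagination: items are appended after the skipped region, so
/// the count is unaffected.
fn compute_next_when_paginating_forwards(current_count: usize) -> usize {
    current_count
}

fn main() {
    // 10 items skipped, 20 freshly prepended, 25 requested: 5 remain skipped.
    assert_eq!(compute_next_when_paginating_backwards(10, 20, 25), 5);
    // Not enough items to serve the request: the count bottoms out at 0.
    assert_eq!(compute_next_when_paginating_backwards(10, 0, 25), 0);
    assert_eq!(compute_next_when_paginating_forwards(7), 7);
}
```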
This patch renames `TimelineStream` to `TimelineWithDropHandle`, as the
former name was too vague and did not clarify what the type was doing.
The new name makes it clear that it attaches a `TimelineDropHandle` to a
subscriber (since it is part of the `subscriber` module!).
This patch updates `TimelineController::subscribe` to use
`VectorSubscriber::into_values_and_batched_stream`. It returns
the cloned items along with the stream. It saves the need to call
`ObservableItems::clone_items`, thus saving the clone of all items.
tl;dr: `clone_items()` clones… items… and `subscribe()` also clones the
items. That's 2 clones. With `into_values_and_batched_stream()`, there
is only 1 clone.
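A toy sketch of the difference (these are not eyeball's real types, just stand-ins): the subscriber already captured a clone of the items when it was created, so consuming it can hand that same allocation back together with the stream, clone-free.

```rust
// Toy stand-ins, not eyeball's real types.
#[derive(Clone)]
struct Items(Vec<u32>);
struct Stream; // stand-in for the batched update stream

struct VectorSubscriber {
    items: Items,
}

impl VectorSubscriber {
    /// The one clone already happened in `subscribe`; consuming the
    /// subscriber moves that allocation out, with no further clone.
    fn into_values_and_batched_stream(self) -> (Items, Stream) {
        (self.items, Stream)
    }
}

fn subscribe(items: &Items) -> VectorSubscriber {
    VectorSubscriber { items: items.clone() } // the single clone
}

fn main() {
    let items = Items(vec![1, 2, 3]);
    let (values, _stream) = subscribe(&items).into_values_and_batched_stream();
    assert_eq!(values.0, vec![1, 2, 3]);
}
```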
This patch improves the performance of
`ReadReceiptTimelineUpdate::apply`, which does 2 things: it calls
`remove_old_receipt` and `add_new_receipt`. Both of them need a
timeline item position. Until this patch, `rfind_event_by_id` was used
and was the bottleneck. The improvement is twofold, as follows.
First off, when collecting data to create `ReadReceiptTimelineUpdate`,
the timeline item position can be known ahead of time by using
`EventMeta::timeline_item_index`. This position is not always available,
for example if the timeline item isn't created yet, but let's collect it
when it exists.
Second, inside `ReadReceiptTimelineUpdate::remove_old_receipt`, we use
the timeline item position collected from `EventMeta` if it exists.
Otherwise, let's fall back to a similar `rfind_event_by_id` pattern,
without using intermediate types. It's more straightforward here: we
don't need an `EventTimelineItemWithId`, we only need the position.
Once the position is known, it is stored in `Self` (!); this is the
biggest improvement here. Let's see why.
Finally, inside `ReadReceiptTimelineUpdate::add_new_receipt`, we use
the timeline item position collected from `EventMeta` if it exists,
similarly to what `remove_old_receipt` does. Otherwise, let's fall back
to an iterator to find the position. However, instead of iterating over
**all** items, we can skip the first ones, up to the position of the
timeline item holding the old receipt, so up to the position found by
`remove_old_receipt`.
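The second lookup can be sketched as follows. The function name and shape are illustrative, not the SDK's code; the idea is only that, since the new receipt always sits at or after the old one, the scan can start from the position `remove_old_receipt` found rather than from the beginning.

```rust
// Illustrative sketch: find the position of `target` among item IDs,
// starting the scan at `start_at` instead of 0.
fn find_position(items: &[&str], target: &str, start_at: usize) -> Option<usize> {
    items
        .iter()
        .enumerate()
        .skip(start_at) // skip items before the old receipt's position
        .find(|(_, id)| **id == target)
        .map(|(index, _)| index)
}

fn main() {
    let items = ["$a", "$b", "$c", "$d", "$e"];
    // The old receipt was on "$b" at index 1; the new receipt on "$d"
    // can only be at index >= 1, so the scan starts there.
    let old_position = find_position(&items, "$b", 0).unwrap();
    assert_eq!(old_position, 1);
    assert_eq!(find_position(&items, "$d", old_position), Some(3));
}
```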
I'm testing this patch with the `test_lazy_back_pagination` test in
https://github.com/matrix-org/matrix-rust-sdk/pull/4594. With 10_000
events in the sync, the `ReadReceipts::maybe_update_read_receipt` method
was taking 52% of the whole execution time. With this patch, it takes
8.1%.
This patch improves the performance of
`RoomEvents::maybe_apply_new_redaction`. This method deserialises
all the events it receives, entirely. If the event is not an
`m.room.redaction`, then the method returns early. Most of the time,
the event is deserialised for nothing because most events aren't of kind
`m.room.redaction`!
This patch first uses `Raw::get_field("type")` to detect the type of
the event. If it's a `m.room.redaction`, then the event is entirely
deserialized, otherwise the method returns.
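The shape of the fix can be sketched like this. The SDK uses ruma's `Raw::get_field("type")`; the naive extractor below is only a stand-in that works for these well-formed examples. The point is that only events whose type is `m.room.redaction` pay for full deserialisation.

```rust
// Stand-in for `Raw::get_field("type")`: pull the `"type"` field out of
// a JSON string without deserialising the rest of the payload. Naive;
// only good enough for the well-formed examples below.
fn event_type(raw_json: &str) -> Option<&str> {
    let key = "\"type\":\"";
    let start = raw_json.find(key)? + key.len();
    let end = start + raw_json[start..].find('"')?;
    Some(&raw_json[start..end])
}

fn maybe_apply_new_redaction(raw_json: &str) -> bool {
    if event_type(raw_json) != Some("m.room.redaction") {
        // Early return: the common case skips full deserialisation.
        return false;
    }
    // …only now would the event be entirely deserialised and applied…
    true
}

fn main() {
    let message = r#"{"type":"m.room.message","content":{"body":"hi"}}"#;
    let redaction = r#"{"type":"m.room.redaction","redacts":"$ev"}"#;
    assert!(!maybe_apply_new_redaction(message));
    assert!(maybe_apply_new_redaction(redaction));
}
```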
When running the `test_lazy_back_pagination` from
https://github.com/matrix-org/matrix-rust-sdk/pull/4594 with 10_000
events, prior to this patch, this method takes 11% of the execution
time. With this patch, this method takes 2.5%.
This patch fixes the performance of
`AllRemoteEvents::increment_all_timeline_item_index_after` and
`decrement_all_timeline_item_index_after`.
It appears that the code was previously iterating over all items, which
is a waste of time. This patch updates the code to iterate over the
items in reverse order:
- if `new_timeline_item_index` is 0, we need to shift all items anyway,
  so all items must be traversed and the iterator direction doesn't matter,
- otherwise, it's unlikely we want to traverse all items: the item has
  been either inserted or pushed back, so there is no need to traverse
  the first items; we can also break the iteration as soon as all
  timeline item indices after `new_timeline_item_index` have been updated.
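A hedged sketch of the reverse-iteration trick (the struct and function below are simplified stand-ins, not the SDK's actual types): items before `new_timeline_item_index` keep their index, so walking from the back and breaking at the first untouched item avoids traversing the whole vector.

```rust
// Simplified stand-in for a remote event carrying an optional timeline
// item index.
struct RemoteEvent {
    timeline_item_index: Option<usize>,
}

fn increment_all_timeline_item_index_after(
    events: &mut [RemoteEvent],
    new_timeline_item_index: usize,
) {
    for event in events.iter_mut().rev() {
        if let Some(index) = event.timeline_item_index.as_mut() {
            if *index >= new_timeline_item_index {
                *index += 1;
            } else {
                // Indices are ordered: everything further left is
                // smaller too, so the iteration can stop here.
                break;
            }
        }
    }
}

fn main() {
    let mut events: Vec<RemoteEvent> = (0..6)
        .map(|i| RemoteEvent { timeline_item_index: Some(i) })
        .collect();
    // Insert at index 4: only the last two indices need shifting.
    increment_all_timeline_item_index_after(&mut events, 4);
    let indices: Vec<_> =
        events.iter().map(|e| e.timeline_item_index.unwrap()).collect();
    assert_eq!(indices, vec![0, 1, 2, 3, 5, 6]);
}
```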
I'm testing this patch with the `test_lazy_back_pagination` test in
https://github.com/matrix-org/matrix-rust-sdk/pull/4594. With 10_000
events in the sync, the `ObservableItems::push_back` method (that uses
`AllRemoteEvents::increment_all_timeline_item_index_after`) was taking
7% of the whole execution time. With this patch, it takes 0.7%.
The WAL file can grow depending on the transactions that are run. A
critical case is VACUUM, which basically writes the content of the DB
file to the WAL file before writing it back to the DB file. SQLite
doesn't try to reduce the size of the file after that unless we set an
explicit limit, so we could end up taking twice the size of the database
on the filesystem.
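The explicit limit in question is SQLite's `journal_size_limit` pragma, which caps how large the WAL file may remain after a checkpoint. A sketch of the kind of setting involved (the 10 MiB value below is illustrative, not the store's actual limit):

```sql
-- Cap the size the WAL file is allowed to keep after checkpoints.
-- The 10 MiB value is illustrative only.
PRAGMA journal_size_limit = 10485760;
```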
Signed-off-by: Kévin Commaille <zecakeh@tedomum.fr>
It should have been done in the migration of version 7, to reduce the
size of the database on the filesystem after the media cache was moved
to the SqliteEventCacheStore. Better late than never.
Signed-off-by: Kévin Commaille <zecakeh@tedomum.fr>
Fixes #4517.
It turns out that the bugs found in that test were due to 2 causes:
- First commit: `TestRoomDataProvider` didn't use `initial_user_receipts` but returned hardcoded values.
- Second commit: Our own read receipts were ignored in `TimelineStateTransaction::load_read_receipts_for_event`, although we need to process all read receipts via `ReadReceipts::maybe_update_read_receipt`, because it knows how to filter out our own read receipts where needed.
This patch drastically improves the performance of
`TimelineEventHandler::deduplicate_local_timeline_item`.
Before this patch, `rfind_event_item` was used to iterate over all
timeline items: for each item in reverse order, if it was an event
timeline item, and if it was a local event timeline item, and if it was
matching the event ID or transaction ID, then a duplicate was found.
The problem is the following: all items are traversed.
However, local event timeline items are always at the back of the items.
Even virtual timeline items are before local event timeline items. Thus,
it is not necessary to traverse all items. It is possible to stop the
iteration as soon as (i) a non-event timeline item is met, or (ii) a
non-local event timeline item is met.
This patch updates
`TimelineEventHandler::deduplicate_local_timeline_item` to replace the
use of `rfind_event_item` with a custom iterator that stops as soon as a
non-event timeline item or a non-local event timeline item is met, or
(of course) when a local event timeline item is a duplicate.
To do so, [`Iterator::try_fold`] is probably the best companion.
[`Iterator::try_find`] would have been nice, but it is available on
nightlies, not on stable versions of Rust. However, many methods in
`Iterator` are using `try_fold`, like `find` or any other methods that
need to do a “short-circuit”. Anyway, `try_fold` works pretty nicely here,
and does exactly what we need.
Our use of `try_fold` is to return a `ControlFlow<Option<(usize,
TimelineItem)>, ()>`. After `try_fold`, we call
`ControlFlow::break_value`, which returns an `Option`. Hence the need
to call `Option::flatten` at the end to get a single `Option` instead of
having an `Option<Option<(usize, TimelineItem)>>`.
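The shape of the `try_fold` call can be sketched as follows. The item types are heavily simplified stand-ins (the real items are the SDK's timeline item types, and the real match also covers event IDs), but the control flow, the `ControlFlow::break_value` call, and the final `Option::flatten` mirror the description above.

```rust
use std::ops::ControlFlow;

// Heavily simplified stand-in for the SDK's timeline item types.
#[derive(Debug, Clone, PartialEq)]
enum TimelineItem {
    Remote { event_id: String },
    Local { transaction_id: String },
    Virtual,
}

/// Search for a duplicate local item from the back, short-circuiting at
/// the first non-local item since local echoes always sit at the back.
fn deduplicate_local(
    items: &[TimelineItem],
    transaction_id: &str,
) -> Option<(usize, TimelineItem)> {
    items
        .iter()
        .enumerate()
        .rev()
        .try_fold((), |(), (index, item)| match item {
            TimelineItem::Local { transaction_id: txn_id } => {
                if txn_id.as_str() == transaction_id {
                    // Duplicate found: break with it.
                    ControlFlow::Break(Some((index, item.clone())))
                } else {
                    ControlFlow::Continue(())
                }
            }
            // First non-local item: no duplicate can exist further left.
            _ => ControlFlow::Break(None),
        })
        .break_value() // Option<Option<(usize, TimelineItem)>>
        .flatten()
}

fn main() {
    let items = vec![
        TimelineItem::Remote { event_id: "$a".to_owned() },
        TimelineItem::Virtual,
        TimelineItem::Local { transaction_id: "txn1".to_owned() },
        TimelineItem::Local { transaction_id: "txn2".to_owned() },
    ];
    assert_eq!(deduplicate_local(&items, "txn2"), Some((3, items[3].clone())));
    assert_eq!(deduplicate_local(&items, "txn1"), Some((2, items[2].clone())));
    // The scan stops at the virtual item: the remote item is never visited.
    assert_eq!(deduplicate_local(&items, "$a"), None);
}
```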
I'm testing this patch with the `test_lazy_back_pagination` test in
https://github.com/matrix-org/matrix-rust-sdk/pull/4594. With 10_000
events in the sync, the test was taking 13s to run on my machine. With
this patch, it takes 10s to run. It's a 23% improvement. This
`deduplicate_local_timeline_item` method was taking a large part of the
computation time according to the profiler. With this patch, the method
is so cheap that it is barely visible in the profiler.
[`Iterator::try_fold`]: https://doc.rust-lang.org/std/iter/trait.Iterator.html#method.try_fold
[`Iterator::try_find`]: https://doc.rust-lang.org/std/iter/trait.Iterator.html#method.try_find