FreshRSS/app/Controllers/entryController.php
Alexandre Alapetite 1f466d7a2e Implement custom order-by (#7149)
Add option to sort results by received date (existing, default), publication date, title, URL (link), random.

fix https://github.com/FreshRSS/FreshRSS/issues/1771
fix https://github.com/FreshRSS/FreshRSS/issues/2083
fix https://github.com/FreshRSS/FreshRSS/issues/2119
fix https://github.com/FreshRSS/FreshRSS/issues/2596
fix https://github.com/FreshRSS/FreshRSS/issues/3204
fix https://github.com/FreshRSS/FreshRSS/issues/4405
fix https://github.com/FreshRSS/FreshRSS/issues/5529
fix https://github.com/FreshRSS/FreshRSS/issues/5864
fix https://github.com/FreshRSS/Extensions/issues/161

URL parameters:
* `&sort=id` (current behaviour, sorting according to newest received articles)
* `&sort=date` (publication date, which is not indicative of how new an article is)
* `&sort=title`
* `&sort=link`
* `&sort=rand` (random order - which disables infinite scrolling, at least for now)

combined with `&order=ASC` or `&order=DESC`
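
As a minimal sketch of how these two parameters could be validated against an allow-list, consider the helper below. The function name `normaliseSortParams` and the `DESC` default for `order` are assumptions made for illustration; this is not FreshRSS's actual request handling.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical helper: normalise the `sort` and `order` URL parameters.
 *
 * @param array<string,mixed> $query e.g. the parsed query string
 * @return array{sort:string,order:string}
 */
function normaliseSortParams(array $query): array {
	$allowedSorts = ['id', 'date', 'title', 'link', 'rand'];
	$sort = $query['sort'] ?? 'id';
	if (!in_array($sort, $allowedSorts, true)) {
		$sort = 'id';	// Fall back to received date, the default criterion
	}
	$order = $query['order'] ?? 'DESC';
	$order = is_string($order) ? strtoupper($order) : 'DESC';
	if ($order !== 'ASC' && $order !== 'DESC') {
		$order = 'DESC';	// Assumed default for this sketch
	}
	return ['sort' => $sort, 'order' => $order];
}

$params = normaliseSortParams(['sort' => 'title', 'order' => 'asc']);
```

Unknown or malformed values fall back to the default instead of reaching the SQL layer.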

![image](https://github.com/user-attachments/assets/2de5aef1-604e-4a73-a147-569f6f42a1be)

## Implementation notes

Sorting by *received date* (id), which is the default and was the only criterion before this PR, has the best sorting characteristics:
* *uniqueness*: no entries have the exact same received date
* *monotonicity*: new entries always have a higher received date
* *performance*: this field is efficiently indexed in the database, which makes it fast to use, including for paging (other fields could also be indexed, but with lower effective performance)

In contrast, sorting criteria such as *publication date*, *title*, or *link* are neither unique nor monotonic. In particular, multiple articles may share the same *publication date*, and we may receive articles with a *publication date* far in the future, and then later some new articles with a *publication date* far in the past.

To understand why sorting by *publication date* is problematic, it helps to think about sorting by *title* or by *link*, as sorting by *title* and by *publication date* share more or less the same characteristics.

### Problem 1: new articles

New articles may be received in the background after the current view was loaded, and before the next user action such as *mark all as read*. Due to the lack of *monotonicity* when sorting by e.g. *publication date* or *title*, users risk marking as read a batch of articles that contains some fresh articles they have never seen.

Mitigation: A parameter `idMax` tracks the maximum ID related to a batch of actions such as *mark all as read* to exclude articles received after those that are displayed.
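
The effect of `idMax` can be illustrated with a small in-memory sketch. The entry shape (plain arrays with integer IDs) and the function name `idsToMarkAsRead` are assumptions for the example. The real controller passes `idMax` as a numeric string down to the DAO layer.

```php
<?php
declare(strict_types=1);

/**
 * Illustrative only: select the entries that a "mark all as read" action may touch.
 *
 * @param list<array{id:int,title:string}> $entries
 * @return list<int> IDs that are safe to mark as read
 */
function idsToMarkAsRead(array $entries, int $idMax): array {
	$ids = [];
	foreach ($entries as $entry) {
		// Anything received after the view was loaded has id > idMax and is skipped
		if ($entry['id'] <= $idMax) {
			$ids[] = $entry['id'];
		}
	}
	return $ids;
}

// The view is sorted by title, so IDs do not appear in order on screen.
$idMax = 105;	// Highest ID among the entries displayed to the user
// Entry 110 arrived in the background and sorts between the two displayed entries.
$inDatabase = [
	['id' => 105, 'title' => 'Alpha'],
	['id' => 110, 'title' => 'Alpha 2'],
	['id' => 101, 'title' => 'Bravo'],
];
$ids = idsToMarkAsRead($inDatabase, $idMax);	// Entry 110 is left unread
```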

### Problem 2: paging / pagination

When navigating articles, only a few articles are displayed, and a new "page" of articles needs to be fetched from the database when scrolling down or when clicking the button to show more articles. When sorting by e.g. *publication date* or *title*, it is not trivial to show the next page without re-showing some of the same articles and without skipping any. Indeed, views often have additional criteria such as showing only unread articles, and users may mark some articles as read while viewing them, thereby removing some articles from the previous pages. And as with *Problem 1*, new articles may have been received in the background. Consequently, it is not possible to use `OFFSET` to implement pagination (which, in particular, is why the patches suggested by a few users were wrong).

Mitigation: `idMax` is also used (just like for *Problem 1*) and a *Keyset Pagination* approach is used, combining an unstable sorting criterion such as *publication date* or *title* with *id* to ensure stable sorting. (So, 2 sorting criteria + 1 filter criterion)

See e.g. https://www.alwaysdeveloping.net/dailydrop/2022/07/01-keyset-pagination/
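
The combination described above (two sorting criteria plus one filter criterion) can be sketched over an in-memory list, here in descending order by *publication date*. The entry shape, the function name `nextPage`, and its parameters are assumptions for illustration; the actual implementation runs in SQL.

```php
<?php
declare(strict_types=1);

/**
 * Illustrative keyset pagination, descending by (date, id).
 *
 * @param list<array{id:int,date:int}> $entries
 * @param array{id:int,date:int}|null $after Last entry of the previous page, or null for the first page
 * @return list<array{id:int,date:int}>
 */
function nextPage(array $entries, int $idMax, ?array $after, int $limit): array {
	// Filter criterion: ignore entries received after the first page was loaded
	$candidates = array_filter($entries, static fn (array $e): bool => $e['id'] <= $idMax);
	// Keyset condition: strictly after the last displayed entry in (date, id) order
	if ($after !== null) {
		$candidates = array_filter($candidates, static fn (array $e): bool =>
			$e['date'] < $after['date'] ||
			($e['date'] === $after['date'] && $e['id'] < $after['id']));
	}
	// Two sorting criteria: date first, id as the tie-breaker for equal dates
	usort($candidates, static fn (array $a, array $b): int =>
		[$b['date'], $b['id']] <=> [$a['date'], $a['id']]);
	return array_slice($candidates, 0, $limit);
}

$entries = [
	['id' => 1, 'date' => 100],
	['id' => 2, 'date' => 100],
	['id' => 3, 'date' => 90],
	['id' => 4, 'date' => 100],	// Received in the background, after idMax = 3 was fixed
];
$page1 = nextPage($entries, 3, null, 2);
$page2 = nextPage($entries, 3, $page1[array_key_last($page1)], 2);
```

Because the second page is defined by the last key seen rather than by a row count, it stays the same even if entries from the first page are marked as read and drop out of an unread-only view.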

### Problem 3: performance

Sorting by anything other than *received date* (id) is doomed to be slow(er) due to the combination of 3 criteria (see *Problem 2*). An `OFFSET` approach (which is not possible anyway, as explained) would be even slower. Furthermore, there are no SQL indexes for these other criteria at the moment, and such indexes would not necessarily help much, because the multiple sorting criteria involve some `OR` logic that is difficult for databases to optimise.

The nicest syntax would be using tuples and corresponding indexes, but that is poorly supported by MySQL: https://bugs.mysql.com/bug.php?id=104128

Mitigation: a compatibility SQL syntax is used to implement *Keyset Pagination*
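
To make the difference concrete, the sketch below builds the keyset condition for a descending page in both forms. The function names, the placeholder names, and the exact shape of the expanded condition are assumptions for illustration; the SQL actually generated by this PR may differ.

```php
<?php
declare(strict_types=1);

/**
 * Illustrative only: two ways to express "rows strictly after (key, id)".
 * $column must come from an allow-list, never directly from user input.
 */
function keysetTupleSql(string $column): string {
	// Row-value (tuple) comparison: concise, but poorly supported by MySQL
	return "({$column}, id) < (:key, :id)";
}

function keysetCompatSql(string $column): string {
	// Expanded form with OR logic. Placeholders are distinct because PDO does not
	// allow reusing a named placeholder unless emulated prepares are enabled.
	return "({$column} < :key1 OR ({$column} = :key2 AND id < :id))";
}

$tupleWhere = keysetTupleSql('date');
$compatWhere = keysetCompatSql('date');
```

Both conditions select the same rows; the expanded one trades readability for portability across database engines.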

### Problem 4: user confusion

Several users have shown that they do not fully understand the difference between *received date* and *publication date*, and particularly not the pitfalls of *publication date*.

Mitigation: the menus to mark-as-read *before 1 day* and *before 1 week* are disabled when sorting by anything other than *received date*. Likewise, the separation headers *Today*, *Yesterday*, and *Before yesterday* are only shown when sorting by *received date*.

Again here, to better understand why, it helps to think about sorting by *title* or by *link*, as sorting by *title* and by *publication date* share more or less the same characteristics.

* [ ] We should write a Q&A and/or documentation about the problems associated with *sorting by publication date*: risks of not noticing new publications, of inadvertently marking them as read, of having some articles with a date in the future hanging at the top of the views (vice versa when sorting in ascending order), performance, etc.

### Problem 5: APIs

Sorting by anything other than *received date* breaks the guarantees needed for a successful synchronisation via API.

Mitigation: sorting by *received date* is ensured for all API calls.
2025-01-06 16:00:00 +01:00


<?php
declare(strict_types=1);

/**
 * Controller to handle all entry actions.
 */
class FreshRSS_entry_Controller extends FreshRSS_ActionController {

	/**
	 * JavaScript request or not.
	 */
	private bool $ajax = false;

	/**
	 * This action is called before every other action in that class. It is
	 * the common boilerplate for every action. It is triggered by the
	 * underlying framework.
	 */
	#[\Override]
	public function firstAction(): void {
		if (!FreshRSS_Auth::hasAccess()) {
			Minz_Error::error(403);
		}

		// If ajax request, we do not print layout
		$this->ajax = Minz_Request::paramBoolean('ajax');
		if ($this->ajax) {
			$this->view->_layout(null);
			Minz_Request::_param('ajax');
		}
	}

	/**
	 * Mark one or several entries as read (or not!).
	 *
	 * If request concerns several entries, it MUST be a POST request.
	 * If request concerns several entries, only marking them as read is available.
	 *
	 * Parameters are:
	 *   - id (default: false)
	 *   - get (default: false) /(c_\d+|f_\d+|s|a)/
	 *   - nextGet (default: $get)
	 *   - idMax (default: 0)
	 *   - is_read (default: true)
	 */
	public function readAction(): void {
		$get = Minz_Request::paramString('get');
		$next_get = Minz_Request::paramString('nextGet') ?: $get;
		$id_max = Minz_Request::paramString('idMax');
		if (!ctype_digit($id_max)) {
			$id_max = '0';
		}
		$is_read = Minz_Request::paramTernary('is_read') ?? true;
		FreshRSS_Context::$search = new FreshRSS_BooleanSearch(Minz_Request::paramString('search'));

		FreshRSS_Context::$state = Minz_Request::paramInt('state');
		if (FreshRSS_Context::isStateEnabled(FreshRSS_Entry::STATE_FAVORITE)) {
			if (!FreshRSS_Context::isStateEnabled(FreshRSS_Entry::STATE_NOT_FAVORITE)) {
				FreshRSS_Context::$state = FreshRSS_Entry::STATE_FAVORITE;
			}
		} elseif (FreshRSS_Context::isStateEnabled(FreshRSS_Entry::STATE_NOT_FAVORITE)) {
			FreshRSS_Context::$state = FreshRSS_Entry::STATE_NOT_FAVORITE;
		} else {
			FreshRSS_Context::$state = 0;
		}

		$params = [];
		$this->view->tagsForEntries = [];

		$entryDAO = FreshRSS_Factory::createEntryDao();
		if (!Minz_Request::hasParam('id')) {
			// No id, then it MUST be a POST request
			if (!Minz_Request::isPost()) {
				Minz_Request::bad(_t('feedback.access.not_found'), ['c' => 'index', 'a' => 'index']);
				return;
			}
			if ($get === '') {
				// No get? Mark all entries as read (from $id_max)
				$entryDAO->markReadEntries($id_max, false, FreshRSS_Feed::PRIORITY_MAIN_STREAM, FreshRSS_Feed::PRIORITY_IMPORTANT, null, 0, $is_read);
			} else {
				$type_get = $get[0];
				$get = (int)substr($get, 2);
				switch ($type_get) {
					case 'c':
						$entryDAO->markReadCat($get, $id_max, FreshRSS_Context::$search, FreshRSS_Context::$state, $is_read);
						break;
					case 'f':
						$entryDAO->markReadFeed($get, $id_max, FreshRSS_Context::$search, FreshRSS_Context::$state, $is_read);
						break;
					case 's':
						$entryDAO->markReadEntries($id_max, true, null, FreshRSS_Feed::PRIORITY_IMPORTANT,
							FreshRSS_Context::$search, FreshRSS_Context::$state, $is_read);
						break;
					case 'a':
						$entryDAO->markReadEntries($id_max, false, FreshRSS_Feed::PRIORITY_MAIN_STREAM, FreshRSS_Feed::PRIORITY_IMPORTANT,
							FreshRSS_Context::$search, FreshRSS_Context::$state, $is_read);
						break;
					case 'A':
						$entryDAO->markReadEntries($id_max, false, FreshRSS_Feed::PRIORITY_CATEGORY, FreshRSS_Feed::PRIORITY_IMPORTANT,
							FreshRSS_Context::$search, FreshRSS_Context::$state, $is_read);
						break;
					case 'Z':
						$entryDAO->markReadEntries($id_max, false, FreshRSS_Feed::PRIORITY_ARCHIVED, FreshRSS_Feed::PRIORITY_IMPORTANT,
							FreshRSS_Context::$search, FreshRSS_Context::$state, $is_read);
						break;
					case 'i':
						$entryDAO->markReadEntries($id_max, false, FreshRSS_Feed::PRIORITY_IMPORTANT, null,
							FreshRSS_Context::$search, FreshRSS_Context::$state, $is_read);
						break;
					case 't':
						$entryDAO->markReadTag($get, $id_max, FreshRSS_Context::$search, FreshRSS_Context::$state, $is_read);
						// Marking all entries in a tag as read can result in other tags also having all entries marked as read,
						// so the next unread tag calculation is deferred by passing next_get = 'a' instead of the current get ID.
						if ($next_get === 'a' && $is_read) {
							$tagDAO = FreshRSS_Factory::createTagDao();
							$tagsList = $tagDAO->listTags() ?: [];
							$found_tag = false;
							foreach ($tagsList as $tag) {
								if ($found_tag) {
									// Found the tag matching our current ID already, so now we're just looking for the first unread
									if ($tag->nbUnread() > 0) {
										$next_get = 't_' . $tag->id();
										break;
									}
								} else {
									// Still looking for the tag ID matching our $get that was just marked as read
									if ($tag->id() === $get) {
										$found_tag = true;
									}
								}
							}
							// Didn't find any unread tags after the current one? Start over from the beginning.
							if ($next_get === 'a') {
								foreach ($tagsList as $tag) {
									// Check this first so we can return to the current tag if it's the only one that's unread
									if ($tag->nbUnread() > 0) {
										$next_get = 't_' . $tag->id();
										break;
									}
									// Give up if reached our first tag again
									if ($tag->id() === $get) {
										break;
									}
								}
							}
							// If we still haven't found any unread tags, fallback to the full tag list
							if ($next_get === 'a') {
								$next_get = 'T';
							}
						}
						break;
					case 'T':
						$entryDAO->markReadTag(0, $id_max, FreshRSS_Context::$search, FreshRSS_Context::$state, $is_read);
						break;
				}

				if ($next_get !== 'a') {
					// Redirect to the correct page (category, feed or starred)
					// Not "a" because it is the default value if nothing is given.
					$params['get'] = $next_get;
				}
			}
		} else {
			/** @var list<numeric-string> $idArray */
			$idArray = Minz_Request::paramArrayString('id');
			$idString = Minz_Request::paramString('id');
			if (count($idArray) > 0) {
				$ids = $idArray;
			} elseif (ctype_digit($idString)) {
				$ids = [$idString];
			} else {
				$ids = [];
			}
			$entryDAO->markRead($ids, $is_read);

			$tagDAO = FreshRSS_Factory::createTagDao();
			$tagsForEntries = $tagDAO->getTagsForEntries($ids) ?: [];
			$tags = [];
			foreach ($tagsForEntries as $line) {
				$tags['t_' . $line['id_tag']][] = (string)$line['id_entry'];
			}
			$this->view->tagsForEntries = $tags;
		}

		if (!$this->ajax) {
			Minz_Request::good(
				$is_read ? _t('feedback.sub.articles.marked_read') : _t('feedback.sub.articles.marked_unread'),
				[
					'c' => 'index',
					'a' => 'index',
					'params' => $params,
				]
			);
		}
	}

	/**
	 * This action marks an entry as favourite (bookmark) or not.
	 *
	 * Parameter is:
	 *   - id (default: false)
	 *   - is_favorite (default: true)
	 * If id is false, nothing happens.
	 */
	public function bookmarkAction(): void {
		$id = Minz_Request::paramString('id');
		$is_favourite = Minz_Request::paramTernary('is_favorite') ?? true;
		if ($id != '' && ctype_digit($id)) {
			$entryDAO = FreshRSS_Factory::createEntryDao();
			$entryDAO->markFavorite($id, $is_favourite);
		}

		if (!$this->ajax) {
			Minz_Request::forward([
				'c' => 'index',
				'a' => 'index',
			], true);
		}
	}

	/**
	 * This action optimizes database to reduce its size.
	 *
	 * This action should be reached by a POST request.
	 *
	 * @todo move this action in configure controller.
	 * @todo call this action through web-cron when available
	 */
	public function optimizeAction(): void {
		$url_redirect = [
			'c' => 'configure',
			'a' => 'archiving',
		];

		if (!Minz_Request::isPost()) {
			Minz_Request::forward($url_redirect, true);
		}

		if (function_exists('set_time_limit')) {
			@set_time_limit(300);
		}

		$databaseDAO = FreshRSS_Factory::createDatabaseDAO();
		$databaseDAO->optimize();

		$feedDAO = FreshRSS_Factory::createFeedDao();
		$feedDAO->updateCachedValues();

		invalidateHttpCache();
		Minz_Request::good(_t('feedback.admin.optimization_complete'), $url_redirect);
	}

	/**
	 * This action purges old entries from feeds.
	 *
	 * @todo should be a POST request
	 * @todo should be in feedController
	 */
	public function purgeAction(): void {
		if (function_exists('set_time_limit')) {
			@set_time_limit(300);
		}

		$feedDAO = FreshRSS_Factory::createFeedDao();
		$feeds = $feedDAO->listFeeds();
		$nb_total = 0;

		invalidateHttpCache();

		$feedDAO->beginTransaction();

		foreach ($feeds as $feed) {
			$nb_total += ($feed->cleanOldEntries() ?: 0);
		}

		$feedDAO->updateCachedValues();
		$feedDAO->commit();

		$databaseDAO = FreshRSS_Factory::createDatabaseDAO();
		$databaseDAO->minorDbMaintenance();

		invalidateHttpCache();
		Minz_Request::good(_t('feedback.sub.purge_completed', $nb_total), [
			'c' => 'configure',
			'a' => 'archiving',
		]);
	}
}