This bug only affects CJK languages. Apart from DB growth, the symptom is that word filtering in app lists doesn't find affected apps, because we look for a single whitespace between tokens.
This code comes from SearchManager, but making it available in the DB library makes sense, since the queries are specific to the DB implementation, such as the zero-whitespace hack.
This is a bit hacky, but there seems to be very little information about this specific bug, which affected several installs and either degraded search result quality or broke search completely.
In the absence of a better fix or even a way to reproduce the issue, we are resorting to this.
Added indexes based on slow query logs and associated `EXPLAIN QUERY
PLAN` output.
Note: There are some composite primary keys with `repoId` +
`packageName` + ..., and we still add an index on `packageName`.
This is because the order matters in composite keys: a key that
starts with `repoId` does not efficiently serve queries that filter
on `packageName` alone. It might
be possible to restructure the primary key to be `packageName` +
`repoId` + ..., however this requires removing the table and
recreating it, which is more complex than just adding an index on
`packageName` in addition to the primary key. There is also no
guarantee that things won't slow down when restructuring the primary key,
because there may be some cases where it is important that `repoId` is
first in that index.
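
To illustrate why the column order matters, here is a minimal sketch that models a composite key as a sorted list. The `Key` class, the `isContiguous` helper and the sample data are hypothetical, and this is only a plain Kotlin model of a sorted index, not the way the database stores it.

```kotlin
// Hypothetical model of a composite key ordered like (repoId, packageName).
data class Key(val repoId: Long, val packageName: String)

// Sort order of an index on (repoId, packageName).
val repoFirst: Comparator<Key> = compareBy<Key>({ it.repoId }, { it.packageName })

// Returns true if all matching entries sit next to each other in the sorted
// list, which is what allows a sorted index to answer a lookup with a range
// scan instead of visiting every entry.
fun isContiguous(sorted: List<Key>, matches: (Key) -> Boolean): Boolean {
    val first = sorted.indexOfFirst(matches)
    val last = sorted.indexOfLast(matches)
    if (first == -1) return true
    return sorted.subList(first, last + 1).all(matches)
}

val keys = listOf(
    Key(1L, "org.example.a"), Key(1L, "org.example.b"),
    Key(2L, "org.example.a"), Key(2L, "org.example.b"),
).sortedWith(repoFirst)
```

In this model, all entries for one `repoId` are adjacent, while entries for one `packageName` are spread across the list. That is the reason a separate index on `packageName` helps.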
Only enabled in debug mode. When auditing performance, make sure
to tune the parameters to the open helper. By default it will log
and explain queries that take more than 2s.
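
As a rough illustration of the threshold idea, the sketch below times a block and logs it only when it takes longer than a configurable limit. The `logIfSlow` helper and all its parameters are hypothetical and not the open helper's actual API. It only covers the timing part and does not run `EXPLAIN QUERY PLAN`.

```kotlin
// Hypothetical helper: runs a block and reports it if it exceeds the threshold.
// The clock is injectable so the behavior can be checked without sleeping.
fun <T> logIfSlow(
    label: String,
    thresholdMs: Long = 2_000, // mirrors the 2s default mentioned above
    clock: () -> Long = { System.nanoTime() },
    log: (String) -> Unit = { println(it) },
    block: () -> T,
): T {
    val start = clock()
    try {
        return block()
    } finally {
        val elapsedMs = (clock() - start) / 1_000_000
        if (elapsedMs > thresholdMs) log("Slow query ($elapsedMs ms): $label")
    }
}
```

When auditing, a lower `thresholdMs` would surface more queries at the cost of noisier logs.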
Sometimes the added and the lastUpdated timestamps are a few seconds apart, so we can't expect them to be equal for new apps. We simply treat everything that was added in the last 14 days as new.
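
A minimal sketch of such a check, assuming the timestamps are epoch milliseconds. The `isNew` function and the constant name are hypothetical.

```kotlin
import java.util.concurrent.TimeUnit

// Window in which an app still counts as new (14 days, as described above).
val NEW_APP_WINDOW_MS: Long = TimeUnit.DAYS.toMillis(14)

// Assumption: `addedMs` and `nowMs` are epoch milliseconds.
// Only the added timestamp is used; lastUpdated is not compared at all.
fun isNew(addedMs: Long, nowMs: Long = System.currentTimeMillis()): Boolean =
    nowMs - addedMs <= NEW_APP_WINDOW_MS
```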
because we'll download another file afterwards and progress reporting would jump to 100% two times. Also entry.jar is really small, so downloading that is fast and doesn't benefit from detailed progress reporting anyway.
The app may maintain PackageInfo for apps already, so there's no need to keep asking the system for it. We could instead just work with what we already have.
Also, we discovered that chunking the package names isn't needed for newer Android versions. This is only relevant for users who have more than 999 apps installed.
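
For context, chunking of this kind can be sketched as below. The `queryInChunks` helper and its `query` parameter are hypothetical stand-ins for a DAO call that takes a list of package names. The limit of 999 matches SQLite's default maximum number of bound variables per statement before version 3.32.0.

```kotlin
// Default chunk size, matching the 999 limit mentioned above.
val MAX_VARIABLES = 999

// Hypothetical helper: splits the package names into chunks small enough to
// bind in one statement, runs the query per chunk and joins the results.
fun <T> queryInChunks(
    packageNames: List<String>,
    chunkSize: Int = MAX_VARIABLES,
    query: (List<String>) -> List<T>,
): List<T> = packageNames.chunked(chunkSize).flatMap(query)
```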
This could happen when tapping a button twice in quick succession in the UI. While we had provisions to ensure proper states, these didn't work, because a parallel fetching job would mess with our state, allowing the adding code to run more than once.
This can cause other exceptions (e.g. in JSON parsing) which prevent the repo from getting added properly.
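
One common way to make sure an action runs at most once, even when triggered concurrently, is an atomic flag. The sketch below is a generic illustration with a hypothetical `AddOnce` class. It is not necessarily how this was fixed here.

```kotlin
import java.util.concurrent.atomic.AtomicBoolean

// Hypothetical guard: runs `add` only for the first caller, even if several
// callers race each other (e.g. two quick taps on the same button).
class AddOnce(private val add: () -> Unit) {
    private val started = AtomicBoolean(false)

    // Returns true if this call ran the action, false if it was ignored.
    fun run(): Boolean {
        if (!started.compareAndSet(false, true)) return false
        add()
        return true
    }
}
```

Unlike a plain boolean state check, `compareAndSet` reads and updates the flag in one atomic step, so a parallel job cannot slip in between.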
Also, when updating the archive for the first time, we can now re-use the downloaded index.
We now hang on to the index file while streaming it for repo preview purposes. This avoids having to re-download that file and we can properly add the repo right away.
This then allows us to bring the user to the list of apps in that repository without it being initially empty.
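
Keeping a copy of a stream while it is being consumed can be sketched with a small "tee" wrapper. The `TeeInputStream` class below is hypothetical and only illustrates the idea. The `copy` sink could be a `FileOutputStream` pointing at a temporary file.

```kotlin
import java.io.FilterInputStream
import java.io.InputStream
import java.io.OutputStream

// Hypothetical wrapper: every byte the consumer reads is also written to
// `copy`, so the data is available again later without a second download.
// Note: bytes passed over with skip() are not copied in this sketch.
class TeeInputStream(
    source: InputStream,
    private val copy: OutputStream,
) : FilterInputStream(source) {

    override fun read(): Int {
        val b = super.read()
        if (b != -1) copy.write(b)
        return b
    }

    override fun read(buffer: ByteArray, off: Int, len: Int): Int {
        val n = super.read(buffer, off, len)
        if (n > 0) copy.write(buffer, off, n)
        return n
    }
}
```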
This led to many hard-to-debug issues in the past. It is easier to always fetch fresh data and not cache it.
Previously, we needed the cache as a search index. Now, search uses all localizations, so the cache isn't needed anymore.
by inserting zero whitespace between their characters to help the existing SQLite FTS tokenizers to split them up.
We considered splitting them up only at word boundaries, but after consulting native speakers, we decided to split by character instead.
Doing this is a hack, but due to the limitations of the tokenizers currently available with SQLite, we saw no better solution. While the ICU tokenizer is available as well, it doesn't handle diacritics in other languages.
The zero whitespace is added to zh, ja and ko locales when saving their text to the database. It happens for app names, summaries and descriptions either when loading a full index or when applying diffs. Tests have been added for both cases.
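
The per-character splitting can be sketched as follows. This assumes that the "zero whitespace" is the zero-width space U+200B and that localized texts are stored in a map from locale tag to text. All names in the snippet are hypothetical.

```kotlin
// Assumption: the "zero whitespace" is the zero-width space U+200B.
val ZERO_WHITESPACE = "\u200B"

// Languages whose text gets split per character (zh, ja and ko, see above).
val CJK_LANGUAGES = setOf("zh", "ja", "ko")

// Inserts a zero whitespace between code points, keeping surrogate pairs intact.
fun String.withZeroWhitespace(): String {
    val sb = StringBuilder(length * 2)
    var i = 0
    while (i < length) {
        val codePoint = codePointAt(i)
        if (i > 0) sb.append(ZERO_WHITESPACE)
        sb.appendCodePoint(codePoint)
        i += Character.charCount(codePoint)
    }
    return sb.toString()
}

// Hypothetical helper: applies the hack only to CJK entries of a
// locale -> text map and leaves all other locales untouched.
fun Map<String, String>.withZeroWhitespaceForCjk(): Map<String, String> =
    mapValues { (locale, text) ->
        val language = locale.substringBefore('-').substringBefore('_')
        if (language in CJK_LANGUAGES) text.withZeroWhitespace() else text
    }
```

Iterating over code points instead of `Char` values avoids tearing apart characters outside the Basic Multilingual Plane, which includes some rarer CJK ideographs.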
This now also searches in descriptions, author and package names.
The search also considers all languages now and is insensitive to diacritics in most languages.
The AppMetadataFts needed to be re-created for this to work. A migration and a test for this have been added.