Compare commits

...

59 Commits

Author SHA1 Message Date
Veloman Yunkan
cd9785fe85 Enter uriEncode() 2023-01-25 23:45:18 +04:00
Veloman Yunkan
b7a019469c Unconditional URI-encoding in RequestContext::get_query<F>(F) 2023-01-25 23:41:52 +04:00
Matthieu Gautier
76dfc03751 Merge pull request #870 from kiwix/urlEncode_quickfix 2023-01-25 16:41:24 +01:00
Veloman Yunkan
ca079a72cc Some clean-up 2023-01-25 19:15:12 +04:00
Veloman Yunkan
471c5b89f4 Dropped the 2nd param of urlEncode()
`urlEncode(str)` is now equivalent to the previous `urlEncode(str, true)`.
2023-01-25 19:15:12 +04:00
Veloman Yunkan
3bf8211b70 Made 2nd param of urlEncode() mandatory
This is a precautionary step before dropping the said parameter.
2023-01-25 19:15:12 +04:00
Veloman Yunkan
ec81d5904d Proper URI-encoding in kiwix::getSearchUrl() 2023-01-25 19:15:12 +04:00
Veloman Yunkan
82dcba542a Demonstrating bugs of kiwix::getSearchUrl() 2023-01-25 19:15:12 +04:00
Veloman Yunkan
63e0d5c7c2 RequestContext::get_query() is fully URI-encoded 2023-01-25 19:15:12 +04:00
Veloman Yunkan
772243e832 Category name is fully URI-encoded 2023-01-25 19:15:12 +04:00
Veloman Yunkan
bad13d76b4 Removed unused code 2023-01-25 19:15:12 +04:00
Veloman Yunkan
0bde4d9412 Properly URI-encoded links in search results
Special URI symbols occurring in the item path part of the search result
link were NOT encoded, because that would also encode the path separator (/)
symbol. Now that `urlEncode()` never encodes the / symbol, it is safe to
encode all other URI-special symbols in the path.
2023-01-25 19:15:12 +04:00
Veloman Yunkan
239b108fa7 / is no longer a reserved char for urlEncode()
This change is a quick hack solving known issues with URI-encoding in
libkiwix.

This change removes the slash character from the list of URL separator
symbols in URL encoding/decoding utilities, and makes it a symbol that
is safe to leave unencoded.

Effects:

- `urlEncode()` never encodes the '/' symbol (even when it is requested
  to encode the URL separator symbols too).

- `urlDecode(str)`/`urlDecode(..., false)` will now decode %2F to '/';
  other encoded URL separator symbols are NOT decoded when the second
  argument of `urlDecode()` is set to false (which is the default).
2023-01-25 19:15:12 +04:00
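The first effect listed in this commit message can be sketched as follows. This is a minimal illustration, not the libkiwix implementation: `isLeftUnencoded()` and `encodeKeepingSlash()` are hypothetical names, and the set of characters left unencoded (apart from `/`) is assumed to match the unreserved set of JavaScript's `encodeURIComponent()`.

```cpp
#include <cassert>
#include <iomanip>
#include <sstream>
#include <string>

// Hypothetical helper: returns true for characters that are emitted as-is.
// The '/' case reflects the commit message; the rest of the list is an
// assumption modelled on encodeURIComponent().
bool isLeftUnencoded(char c)
{
  if ((c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z') || (c >= '0' && c <= '9'))
    return true;
  switch (c) {
    case '-': case '_': case '.': case '!': case '~':
    case '*': case '\'': case '(': case ')':
    case '/':  // the path separator is never encoded
      return true;
    default:
      return false;
  }
}

// Hypothetical stand-in for urlEncode(): percent-encodes everything that
// isLeftUnencoded() rejects.
std::string encodeKeepingSlash(const std::string& value)
{
  std::ostringstream os;
  os << std::hex << std::uppercase << std::setfill('0');
  for (const char c : value) {
    if (isLeftUnencoded(c)) {
      os << c;
    } else {
      // Go through unsigned char so that bytes >= 0x80 do not sign-extend.
      os << '%' << std::setw(2)
         << static_cast<unsigned int>(static_cast<unsigned char>(c));
    }
  }
  return os.str();
}
```

With this sketch, a path such as `dir/file name?x#y` keeps its slash while the space, `?` and `#` are percent-encoded.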
Veloman Yunkan
c5ccbd37e2 Extracted isHarmlessUriChar() 2023-01-25 19:15:12 +04:00
Veloman Yunkan
822fb3748a Added a unit-test for urlDecode() 2023-01-25 19:15:12 +04:00
Veloman Yunkan
aa2e443eb8 Fixed indentation
Replaced tabs with spaces.
2023-01-25 19:15:11 +04:00
Veloman Yunkan
82d477009d '#' is a URI delimiter symbol 2023-01-25 19:15:11 +04:00
Veloman Yunkan
e49081da80 Fixed urlEncode() for chars below 0x10 2023-01-25 19:15:11 +04:00
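The bug fixed by this commit can be reproduced in isolation. The sketch below assumes, based on the `std::setfill('0')` call that appears in the new `urlEncode()` further down, that the old code relied on `std::setw(2)` alone; `percentEncodeByte()` is a hypothetical helper, not a libkiwix function.

```cpp
#include <cassert>
#include <iomanip>
#include <sstream>
#include <string>

// Hypothetical helper: percent-encodes a single byte as two hex digits.
// With zeroPad == false the stream's default fill character (a space) is
// used, which is what breaks the encoding of bytes below 0x10.
std::string percentEncodeByte(unsigned char byte, bool zeroPad)
{
  std::ostringstream os;
  os << std::hex << std::uppercase;
  if (zeroPad) {
    os << std::setfill('0');
  }
  os << '%' << std::setw(2) << static_cast<unsigned int>(byte);
  return os.str();
}
```

For a tab character (0x09) the unpadded variant yields `% 9`, which is not a valid escape sequence, while the padded variant yields `%09`. Bytes at or above 0x10 are unaffected.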
Veloman Yunkan
07c7d3931d Added a unit-test of buggy urlEncode()
Added a unit-test for urlEncode() that passes for its current
implementation despite the two bugs that were revealed while creating
the unit-test.
2023-01-25 19:15:11 +04:00
Matthieu Gautier
cf59a93cf1 Merge pull request #869 from kiwix/userlang_cookie_fixes 2023-01-24 19:16:08 +01:00
Veloman Yunkan
e35e7585e0 Server sets userlang cookie as global and permanent
Without specifying the "Path" attribute of the cookie in the "Set-Cookie" header
we end up with multiple instances of the cookie for different URLs. We
want a single "global" cookie for kiwix-serve. Besides, we want it to be
"permanent" rather than a session cookie, hence the large (1-year-long)
TTL value for the "Max-Age" attribute.
2023-01-24 19:01:32 +01:00
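The cookie described above could be assembled roughly as follows. This is a sketch mirroring the string built in `Response::send()` later in this diff; `makeUserLangCookie()` is a hypothetical name, and the fallback to `/` for an empty root location follows `RequestContext::get_root_path()`.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper: builds the value of the Set-Cookie header for the
// userlang cookie.
std::string makeUserLangCookie(const std::string& lang,
                               const std::string& rootLocation)
{
  // "Path" makes the cookie global for the whole server instead of being
  // scoped to the directory of the URL that triggered it.
  const std::string path = rootLocation.empty() ? "/" : rootLocation;

  // "Max-Age" turns the session cookie into a long-lived one (one year).
  const long oneYearInSeconds = 365L * 24 * 60 * 60;  // 31536000

  return "userlang=" + lang
       + ";Path=" + path
       + ";Max-Age=" + std::to_string(oneYearInSeconds);
}
```

A server rooted at `/kiwix` would then send `userlang=en;Path=/kiwix;Max-Age=31536000` for an English-speaking client.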
Veloman Yunkan
fcb97c3c06 Sparing use of "Set-Cookie: userlang=..." header
Server adds the "Set-Cookie: userlang=..." header to the response only
if the "userlang" cookie is not already present with the same value.
2023-01-24 19:01:32 +01:00
Veloman Yunkan
0edee4d066 Improved ServerTest.UserLanguageControl unittest
- Description of a test point was not updated in an earlier commit
  that added proper handling of the Accept-Language header. Also
  after enhancing the limited implementation it made sense to
  add another test point demonstrating that the most suitable language
  (rather than just the first one in the list) is selected.

- Now failures of the test case because of a missing Set-Cookie header
  are more informative.
2023-01-24 19:01:32 +01:00
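Selecting the most suitable language, rather than the first one listed, can be sketched as below. The scoring rule (preference weight multiplied by the number of translated strings) follows `selectMostSuitableLanguage()` in this diff; the `LangPref` struct, the `pickLanguage()` name and the explicit count table are assumptions made to keep the example self-contained.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <vector>

// Hypothetical counterpart of LangPreference: a language and its q-value.
struct LangPref
{
  std::string lang;
  float q;
};

// Hypothetical helper: picks the language with the highest score, where
// score = q * (number of strings translated into that language).
// Falls back to "en" when nothing scores above zero.
std::string pickLanguage(const std::vector<LangPref>& prefs,
                         const std::map<std::string, int>& translatedStringCounts)
{
  std::string best = "en";
  float bestScore = 0;
  for (const auto& p : prefs) {
    const auto it = translatedStringCounts.find(p.lang);
    const int count = (it == translatedStringCounts.end()) ? 0 : it->second;
    const float score = p.q * count;
    if (score > bestScore) {
      bestScore = score;
      best = p.lang;
    }
  }
  return best;
}
```

With 30 English and 20 French strings available, a header equivalent to `fr;q=0.5, en;q=0.4` selects English (score 12 against 10), even though French is listed first and has the higher q-value.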
Kelson
b9937e6859 Merge pull request #868 from adamlamar/windows-git-clone
Fix git clone on Windows
2023-01-19 08:44:36 +01:00
Adam Lamar
59012c50b4 Fix git clone on Windows
The question mark (?) is not a valid filename character on Windows.
Changing it to the pound sign (#) so that this repository can still be
cloned on Windows.
2023-01-18 23:01:14 +01:00
Matthieu Gautier
7a98878273 Merge pull request #866 from kiwix/uri_encoded_redirections 2023-01-10 15:06:18 +01:00
Veloman Yunkan
8eb527389e URI-encoding of redirections to URLs with special symbols 2023-01-10 17:41:59 +04:00
Veloman Yunkan
78b2c1a273 Testing of redirection to URLs with special symbols 2023-01-10 17:41:59 +04:00
Veloman Yunkan
497c0700b5 Fixed metadata options in create_corner_cases_zim_file
Specifying the = symbol with single-character options makes that
character included in the option value (e.g. -l=en results in the
language of the ZIM file being set to =en).
2023-01-10 17:41:59 +04:00
Veloman Yunkan
bac12010aa Updated create_corner_cases_zim_file script
Updated the create_corner_cases_zim_file to work with the latest (v3.1.3)
release of zimwriterfs.
2023-01-10 17:41:59 +04:00
Veloman Yunkan
dad33a850c Merge pull request #857 from kiwix/translatewiki
Localisation updates from https://translatewiki.net.
2023-01-09 15:20:52 +04:00
Veloman Yunkan
0968fc98ee Added new translations to i18n_resources_list.txt 2023-01-09 15:04:51 +04:00
translatewiki.net
ff44d88f21 Localisation updates from https://translatewiki.net. 2023-01-05 13:10:37 +01:00
Matthieu Gautier
1e7baee9d7 Merge pull request #862 from kiwix/suggestion_link_fix 2023-01-03 11:07:14 +01:00
Veloman Yunkan
d9342acf5b Suggestion link points to /content endpoint
Directly pointing the suggestion link to a /content/... URL avoids
an unnecessary redirection by the server (and an associated bug
related to redirection of URLs with URI-encoded special symbols in
them that - in the current implementation - go into the target URL
in decoded form).
2023-01-03 10:57:59 +01:00
Kelson
b3f1ab6579 Merge pull request #863 from kiwix/update-workflows-new-default-branch
New git default branch is 'main'
2022-12-27 14:28:13 +01:00
Emmanuel Engelhart
f5c9b2404a New git default branch is 'main' 2022-12-27 14:27:43 +01:00
Kelson
8b1fe21e4e Delete move.yml 2022-12-27 14:25:28 +01:00
Kelson
815c59ff6d "main" is the new git default branch 2022-12-27 14:23:14 +01:00
Matthieu Gautier
90318dfb6b Merge pull request #860 from kiwix/handling_of_suggestion_links_with_single_quotes 2022-12-21 12:02:58 +01:00
Veloman Yunkan
f3d2f474a7 Handling of suggestions containing special symbols
This change fixes two issues:

1. Presence of URL-specific special symbols (such as ? or #) in the book
   and/or article name resulted in a wrong suggestion link. This is
   fixed by URI-encoding the book name and the path, too.

2. Presence of a single quote symbol in the book and/or article name
   resulted in invalid javascript code in the href attribute of the
   suggestion link.

   The single quote (') symbol is not URL-encoded (unlike its double quote
   counterpart). As a result, enclosing a URL-encoded string in single
   quotes may result in invalid javascript. Using double quotes instead is
   safe, since both double quote (") and backslash (\) symbols (which are
   the only special symbols for such quoting) undergo URL-encoding.
2022-12-17 18:39:17 +04:00
Veloman Yunkan
12140098e6 Extracted makeJSLink() 2022-12-15 18:53:32 +04:00
Veloman Yunkan
c7d8081e9a gotoUrl() takes URLs relative to root location 2022-12-15 18:21:22 +04:00
Matthieu Gautier
a10067e6b6 Merge pull request #849 from kiwix/backend_userlang_control 2022-12-14 15:39:31 +01:00
Veloman Yunkan
28e9fb48b6 Properly implemented parseUserLanguagePreferences() 2022-12-14 15:34:46 +01:00
Veloman Yunkan
634f3fcf14 Properly implemented selectMostSuitableLanguage() 2022-12-14 15:34:46 +01:00
Veloman Yunkan
88597e1834 Enter selectMostSuitableLanguage() 2022-12-14 15:34:46 +01:00
Veloman Yunkan
69b3e1f8a7 Moved user language preferences into i18n.{h,cpp} 2022-12-14 15:34:46 +01:00
Veloman Yunkan
669d8898ac Enter UserLangPreferences 2022-12-14 15:34:46 +01:00
Veloman Yunkan
14f0f79061 User language control via userlang cookie 2022-12-14 15:34:46 +01:00
Veloman Yunkan
600ff07986 Test descriptions in ServerTest.UserLanguageControl 2022-12-14 15:34:46 +01:00
Veloman Yunkan
1d74b5e311 Server sets the userlang cookie on every response 2022-12-14 15:34:46 +01:00
Veloman Yunkan
c0fe6f4aee Added cookies to ServerTest.UserLanguageControl 2022-12-14 15:34:46 +01:00
Matthieu Gautier
aa7053bbe8 Merge pull request #859 from kiwix/safe_href_in_suggestion_links 2022-12-14 15:31:56 +01:00
Veloman Yunkan
99f24eb598 Safe href in suggestion links 2022-12-12 17:15:46 +04:00
Kelson
6790a144a1 Merge pull request #856 from kiwix/compress-web-fonts
Gzip compress HTTP response for Web fonts
2022-12-08 14:36:32 +01:00
Emmanuel Engelhart
cd3d2110d9 Error if run_command() fails, remove meson warning 2022-12-08 13:03:33 +01:00
Emmanuel Engelhart
b404241d0b Fix font compression tests 2022-12-08 12:55:28 +01:00
Emmanuel Engelhart
2d42d6dc60 Gzip compress HTTP response for Web fonts 2022-12-07 19:21:27 +01:00
32 changed files with 719 additions and 149 deletions

27
.github/move.yml vendored

@@ -1,27 +0,0 @@
# Configuration for Move Issues - https://github.com/dessant/move-issues
# Delete the command comment when it contains no other content
deleteCommand: true
# Close the source issue after moving
closeSourceIssue: true
# Lock the source issue after moving
lockSourceIssue: false
# Mention issue and comment authors
mentionAuthors: true
# Preserve mentions in the issue content
keepContentMentions: true
# Move labels that also exist on the target repository
moveLabels: true
# Set custom aliases for targets
# aliases:
# r: repo
# or: owner/repo
# Repository to extend settings from
# _extends: repo


@@ -3,7 +3,7 @@ name: CI
on:
push:
branches:
- master
- main
pull_request:
jobs:


@@ -73,8 +73,8 @@ jobs:
- uses: legoktm/gh-action-dput@master
name: Upload dev package
# Only upload on pushes to master
if: github.event_name == 'push' && github.event.ref == 'refs/heads/master' && startswith(matrix.distro, 'ubuntu-')
# Only upload on pushes to git default branch
if: github.event_name == 'push' && github.event.ref == 'refs/heads/main' && startswith(matrix.distro, 'ubuntu-')
with:
gpg_key: ${{ secrets.LAUNCHPAD_GPG }}
repository: ppa:kiwixteam/dev


@@ -7,10 +7,10 @@ GNU/Linux, macOS, Android, iOS, ...).
[![Release](https://img.shields.io/github/v/tag/kiwix/libkiwix?label=release&sort=semver)](https://download.kiwix.org/release/libkiwix/)
[![Repositories](https://img.shields.io/repology/repositories/libkiwix?label=repositories)](https://github.com/kiwix/libkiwix/wiki/Repology)
[![Build Status](https://github.com/kiwix/libkiwix/workflows/CI/badge.svg?query=branch%3Amaster)](https://github.com/kiwix/libkiwix/actions?query=branch%3Amaster)
[![Build Status](https://github.com/kiwix/libkiwix/workflows/CI/badge.svg?query=branch%3Amain)](https://github.com/kiwix/libkiwix/actions?query=branch%3Amain)
[![Doc](https://readthedocs.org/projects/libkiwix/badge/?style=flat)](https://libkiwix.readthedocs.org/en/latest/?badge=latest)
[![CodeFactor](https://www.codefactor.io/repository/github/kiwix/libkiwix/badge)](https://www.codefactor.io/repository/github/kiwix/libkiwix)
[![Codecov](https://codecov.io/gh/kiwix/libkiwix/branch/master/graph/badge.svg)](https://codecov.io/gh/kiwix/libkiwix)
[![Codecov](https://codecov.io/gh/kiwix/libkiwix/branch/main/graph/badge.svg)](https://codecov.io/gh/kiwix/libkiwix)
[![License: GPL v3](https://img.shields.io/badge/License-GPLv3-blue.svg)](https://www.gnu.org/licenses/gpl-3.0)
Disclaimer


@@ -82,7 +82,7 @@ std::string fullEntryXML(const Book& book, const std::string& rootLocation, cons
{"title", book.getTitle()},
{"description", book.getDescription()},
{"language", book.getLanguage()},
{"content_id", urlEncode(contentId, true)},
{"content_id", urlEncode(contentId)},
{"updated", bookDate}, // XXX: this should be the entry update datetime
{"book_date", bookDate},
{"category", book.getCategory()},
@@ -216,7 +216,7 @@ string OPDSDumper::dumpOPDSFeedV2(const std::vector<std::string>& bookIds, const
{"endpoint_root", endpointRoot},
{"feed_id", gen_uuid(libraryId + endpoint + "?" + query)},
{"filter", onlyAsNonEmptyMustacheValue(query)},
{"query", query.empty() ? "" : "?" + urlEncode(query)},
{"query", query.empty() ? "" : "?" + query},
{"totalResults", to_string(m_totalResults)},
{"startIndex", to_string(m_startIndex)},
{"itemsPerPage", to_string(m_count)},


@@ -94,7 +94,7 @@ kainjow::mustache::data buildQueryData
kainjow::mustache::data query;
query.set("pattern", kiwix::encodeDiples(pattern));
std::ostringstream ss;
ss << searchProtocolPrefix << "?pattern=" << urlEncode(pattern, true);
ss << searchProtocolPrefix << "?pattern=" << urlEncode(pattern);
ss << "&" << bookQuery;
query.set("unpaginatedQuery", ss.str());
auto lang = extractValueFromQuery(bookQuery, "books.filter.lang");
@@ -171,9 +171,10 @@ std::string SearchRenderer::renderTemplate(const std::string& tmpl_str)
kainjow::mustache::data items{kainjow::mustache::data::type::list};
for (auto it = m_srs.begin(); it != m_srs.end(); it++) {
kainjow::mustache::data result;
std::string zim_id(it.getZimId());
const std::string zim_id(it.getZimId());
const auto path = mp_nameMapper->getNameForId(zim_id) + "/" + it.getPath();
result.set("title", it.getTitle());
result.set("absolutePath", absPathPrefix + urlEncode(mp_nameMapper->getNameForId(zim_id), true) + "/" + urlEncode(it.getPath()));
result.set("absolutePath", absPathPrefix + urlEncode(path));
result.set("snippet", it.getSnippet());
if (mp_library) {
result.set("bookTitle", mp_library->getBookById(zim_id).getTitle());


@@ -70,6 +70,14 @@ public: // functions
return s;
}
size_t getStringCount(const std::string& lang) const {
try {
return lang2TableMap.at(lang)->entryCount;
} catch(const std::out_of_range&) {
return 0;
}
}
private: // functions
const I18nStringTable* getStringsFor(const std::string& lang) const {
try {
@@ -84,13 +92,17 @@ private: // data
const I18nStringTable* enStrings;
};
const I18nStringDB& getStringDb()
{
static const I18nStringDB stringDb;
return stringDb;
}
} // unnamed namespace
std::string getTranslatedString(const std::string& lang, const std::string& key)
{
static const I18nStringDB stringDb;
return stringDb.get(lang, key);
return getStringDb().get(lang, key);
}
namespace i18n
@@ -111,4 +123,70 @@ std::string ParameterizedMessage::getText(const std::string& lang) const
return i18n::expandParameterizedString(lang, msgId, params);
}
namespace
{
LangPreference parseSingleLanguagePreference(const std::string& s)
{
const size_t langStart = s.find_first_not_of(" \t\n");
if ( langStart == std::string::npos ) {
return {"", 0};
}
const size_t langEnd = s.find(';', langStart);
if ( langEnd == std::string::npos ) {
return {s.substr(langStart), 1};
}
const std::string lang = s.substr(langStart, langEnd - langStart);
// We don't care about langEnd == langStart which will result in an empty
// language name - it will be dismissed by parseUserLanguagePreferences()
float q = 1.0;
int nCharsScanned;
if ( 1 == sscanf(s.c_str() + langEnd + 1, "q=%f%n", &q, &nCharsScanned)
&& langEnd + 1 + nCharsScanned == s.size() ) {
return {lang, q};
}
return {"", 0};
}
} // unnamed namespace
UserLangPreferences parseUserLanguagePreferences(const std::string& s)
{
UserLangPreferences result;
std::istringstream iss(s);
std::string singleLangPrefStr;
while ( std::getline(iss, singleLangPrefStr, ',') )
{
const auto langPref = parseSingleLanguagePreference(singleLangPrefStr);
if ( !langPref.lang.empty() && langPref.preference > 0 ) {
result.push_back(langPref);
}
}
return result;
}
std::string selectMostSuitableLanguage(const UserLangPreferences& prefs)
{
if ( prefs.empty() ) {
return "en";
}
std::string bestLangSoFar("en");
float bestScoreSoFar = 0;
const auto& stringDb = getStringDb();
for ( const auto& entry : prefs ) {
const float score = entry.preference * stringDb.getStringCount(entry.lang);
if ( score > bestScoreSoFar ) {
bestScoreSoFar = score;
bestLangSoFar = entry.lang;
}
}
return bestLangSoFar;
}
} // namespace kiwix


@@ -89,6 +89,18 @@ private: // data
const Parameters params;
};
struct LangPreference
{
const std::string lang;
const float preference;
};
typedef std::vector<LangPreference> UserLangPreferences;
UserLangPreferences parseUserLanguagePreferences(const std::string& s);
std::string selectMostSuitableLanguage(const UserLangPreferences& prefs);
} // namespace kiwix
#endif // KIWIX_SERVER_I18N


@@ -245,7 +245,7 @@ std::pair<std::string, Library::BookIdSet> InternalServer::selectBooks(const Req
auto bookName = request.get_argument("content");
try {
const auto bookIds = Library::BookIdSet{mp_nameMapper->getIdForName(bookName)};
const auto queryString = request.get_query([&](const std::string& key){return key == "content";}, true);
const auto queryString = request.get_query([&](const std::string& key){return key == "content";});
return {queryString, bookIds};
} catch (const std::out_of_range&) {
throw Error(noSuchBookErrorMsg(bookName));
@@ -270,7 +270,7 @@ std::pair<std::string, Library::BookIdSet> InternalServer::selectBooks(const Req
}
}
const auto bookIds = Library::BookIdSet(id_vec.begin(), id_vec.end());
const auto queryString = request.get_query([&](const std::string& key){return key == "books.id";}, true);
const auto queryString = request.get_query([&](const std::string& key){return key == "books.id";});
return {queryString, bookIds};
} catch(const std::out_of_range&) {}
@@ -288,7 +288,7 @@ std::pair<std::string, Library::BookIdSet> InternalServer::selectBooks(const Req
throw Error(noSuchBookErrorMsg(bookName));
}
}
const auto queryString = request.get_query([&](const std::string& key){return key == "books.name";}, true);
const auto queryString = request.get_query([&](const std::string& key){return key == "books.name";});
return {queryString, bookIds};
} catch(const std::out_of_range&) {}
@@ -299,7 +299,7 @@ std::pair<std::string, Library::BookIdSet> InternalServer::selectBooks(const Req
throw Error(nonParameterizedMessage("no-book-found"));
}
const auto bookIds = Library::BookIdSet(id_vec.begin(), id_vec.end());
const auto queryString = request.get_query([&](const std::string& key){return startsWith(key, "books.filter.");}, true);
const auto queryString = request.get_query([&](const std::string& key){return startsWith(key, "books.filter.");});
return {queryString, bookIds};
}
@@ -1055,7 +1055,7 @@ std::unique_ptr<Response> InternalServer::handle_content(const RequestContext& r
} catch (const std::out_of_range& e) {}
if (archive == nullptr) {
const std::string searchURL = m_root + "/search?pattern=" + kiwix::urlEncode(pattern, true);
const std::string searchURL = m_root + "/search?pattern=" + kiwix::urlEncode(pattern);
return HTTP404Response(*this, request)
+ urlNotFoundMsg
+ suggestSearchMsg(searchURL, kiwix::urlDecode(pattern));
@@ -1096,7 +1096,7 @@ std::unique_ptr<Response> InternalServer::handle_content(const RequestContext& r
if (m_verbose.load())
printf("Failed to find %s\n", urlStr.c_str());
std::string searchURL = m_root + "/search?content=" + bookName + "&pattern=" + kiwix::urlEncode(pattern, true);
std::string searchURL = m_root + "/search?content=" + bookName + "&pattern=" + kiwix::urlEncode(pattern);
return HTTP404Response(*this, request)
+ urlNotFoundMsg
+ suggestSearchMsg(searchURL, kiwix::urlDecode(pattern));


@@ -25,8 +25,10 @@
#include <sstream>
#include <cstdio>
#include <atomic>
#include <cctype>
#include "tools/stringTools.h"
#include "i18n.h"
namespace kiwix {
@@ -66,12 +68,13 @@ fullURL2LocalURL(const std::string& full_url, const std::string& rootLocation)
} // unnamed namespace
RequestContext::RequestContext(struct MHD_Connection* connection,
std::string rootLocation,
std::string _rootLocation,
const std::string& _url,
const std::string& _method,
const std::string& version) :
rootLocation(_rootLocation),
full_url(_url),
url(fullURL2LocalURL(_url, rootLocation)),
url(fullURL2LocalURL(_url, _rootLocation)),
method(str2RequestMethod(_method)),
version(version),
requestIndex(s_requestIndex++),
@@ -80,6 +83,7 @@ RequestContext::RequestContext(struct MHD_Connection* connection,
{
MHD_get_connection_values(connection, MHD_HEADER_KIND, &RequestContext::fill_header, this);
MHD_get_connection_values(connection, MHD_GET_ARGUMENT_KIND, &RequestContext::fill_argument, this);
MHD_get_connection_values(connection, MHD_COOKIE_KIND, &RequestContext::fill_cookie, this);
try {
acceptEncodingGzip =
@@ -89,6 +93,8 @@ RequestContext::RequestContext(struct MHD_Connection* connection,
try {
byteRange_ = ByteRange::parse(get_header(MHD_HTTP_HEADER_RANGE));
} catch (const std::out_of_range&) {}
userlang = determine_user_language();
}
RequestContext::~RequestContext()
@@ -110,14 +116,22 @@ MHD_Result RequestContext::fill_argument(void *__this, enum MHD_ValueKind kind,
if ( ! _this->queryString.empty() ) {
_this->queryString += "&";
}
_this->queryString += key;
_this->queryString += urlEncode(key);
if ( value ) {
_this->queryString += "=";
_this->queryString += value;
_this->queryString += urlEncode(value);
}
return MHD_YES;
}
MHD_Result RequestContext::fill_cookie(void *__this, enum MHD_ValueKind kind,
const char *key, const char* value)
{
RequestContext *_this = static_cast<RequestContext*>(__this);
_this->cookies[key] = value == nullptr ? "" : value;
return MHD_YES;
}
void RequestContext::print_debug_info() const {
printf("method : %s (%d)\n", method==RequestMethod::GET ? "GET" :
method==RequestMethod::POST ? "POST" :
@@ -180,6 +194,10 @@ std::string RequestContext::get_full_url() const {
return full_url;
}
std::string RequestContext::get_root_path() const {
return rootLocation.empty() ? "/" : rootLocation;
}
bool RequestContext::is_valid_url() const {
return !url.empty();
}
@@ -198,16 +216,33 @@ std::string RequestContext::get_header(const std::string& name) const {
}
std::string RequestContext::get_user_language() const
{
return userlang.lang;
}
bool RequestContext::user_language_comes_from_cookie() const
{
return userlang.selectedBy == UserLanguage::SelectorKind::COOKIE;
}
RequestContext::UserLanguage RequestContext::determine_user_language() const
{
try {
return get_argument("userlang");
return {UserLanguage::SelectorKind::QUERY_PARAM, get_argument("userlang")};
} catch(const std::out_of_range&) {}
try {
return get_header("Accept-Language");
return {UserLanguage::SelectorKind::COOKIE, cookies.at("userlang")};
} catch(const std::out_of_range&) {}
return "en";
try {
const std::string acceptLanguage = get_header("Accept-Language");
const auto userLangPrefs = parseUserLanguagePreferences(acceptLanguage);
const auto lang = selectMostSuitableLanguage(userLangPrefs);
return {UserLanguage::SelectorKind::ACCEPT_LANGUAGE_HEADER, lang};
} catch(const std::out_of_range&) {}
return {UserLanguage::SelectorKind::DEFAULT, "en"};
}
std::string RequestContext::get_requested_format() const


@@ -91,20 +91,20 @@ class RequestContext {
std::string get_url() const;
std::string get_url_part(int part) const;
std::string get_full_url() const;
std::string get_root_path() const;
std::string get_query() const { return queryString; }
template<class F>
std::string get_query(F filter, bool mustEncode) const {
std::string get_query(F filter) const {
std::string q;
const char* sep = "";
auto encode = [=](const std::string& value) { return mustEncode?urlEncode(value, true):value; };
for ( const auto& a : arguments ) {
if (!filter(a.first)) {
continue;
}
for (const auto& v: a.second) {
q += sep + encode(a.first) + '=' + encode(v);
q += sep + urlEncode(a.first) + '=' + urlEncode(v);
sep = "&";
}
}
@@ -118,7 +118,25 @@ class RequestContext {
std::string get_user_language() const;
std::string get_requested_format() const;
bool user_language_comes_from_cookie() const;
private: // types
struct UserLanguage
{
enum SelectorKind
{
QUERY_PARAM,
COOKIE,
ACCEPT_LANGUAGE_HEADER,
DEFAULT
};
SelectorKind selectedBy;
std::string lang;
};
private: // data
std::string rootLocation;
std::string full_url;
std::string url;
RequestMethod method;
@@ -130,10 +148,15 @@ class RequestContext {
ByteRange byteRange_;
std::map<std::string, std::string> headers;
std::map<std::string, std::vector<std::string>> arguments;
std::map<std::string, std::string> cookies;
std::string queryString;
UserLanguage userlang;
private: // functions
UserLanguage determine_user_language() const;
static MHD_Result fill_header(void *, enum MHD_ValueKind, const char*, const char*);
static MHD_Result fill_cookie(void *, enum MHD_ValueKind, const char*, const char*);
static MHD_Result fill_argument(void *, enum MHD_ValueKind, const char*, const char*);
};


@@ -64,7 +64,13 @@ bool is_compressible_mime_type(const std::string& mimeType)
|| mimeType.find("application/javascript") != std::string::npos
|| mimeType.find("application/atom") != std::string::npos
|| mimeType.find("application/opensearchdescription") != std::string::npos
|| mimeType.find("application/json") != std::string::npos;
|| mimeType.find("application/json") != std::string::npos
// Web fonts
|| mimeType.find("application/font-") != std::string::npos
|| mimeType.find("application/x-font-") != std::string::npos
|| mimeType.find("application/vnd.ms-fontobject") != std::string::npos
|| mimeType.find("font/") != std::string::npos;
}
bool compress(std::string &content) {
@@ -381,6 +387,13 @@ MHD_Result Response::send(const RequestContext& request, MHD_Connection* connect
MHD_add_response_header(response, p.first.c_str(), p.second.c_str());
}
if ( ! request.user_language_comes_from_cookie() ) {
const std::string cookie = "userlang=" + request.get_user_language()
+ ";Path=" + request.get_root_path()
+ ";Max-Age=31536000";
MHD_add_response_header(response, MHD_HTTP_HEADER_SET_COOKIE, cookie.c_str());
}
if (m_returnCode == MHD_HTTP_OK && m_byteRange.kind() == ByteRange::RESOLVED_PARTIAL_CONTENT)
m_returnCode = MHD_HTTP_PARTIAL_CONTENT;


@@ -322,9 +322,6 @@ kainjow::mustache::data kiwix::onlyAsNonEmptyMustacheValue(const std::string& s)
std::string kiwix::render_template(const std::string& template_str, kainjow::mustache::data data)
{
kainjow::mustache::mustache tmpl(template_str);
kainjow::mustache::data urlencode{kainjow::mustache::lambda2{
[](const std::string& str,const kainjow::mustache::renderer& r) { return urlEncode(r(str), true); }}};
data.set("urlencoded", urlencode);
std::stringstream ss;
tmpl.render(data, [&ss](const std::string& str) { ss << str; });
return ss.str();


@@ -161,15 +161,14 @@ std::string kiwix::encodeDiples(const std::string& str)
return result;
}
/* urlEncode() based on javascript encodeURI() &
encodeURIComponent(). Mostly code from rstudio/httpuv (GPLv3) */
namespace
{
bool isReservedUrlChar(char c)
{
switch (c) {
case ';':
case ',':
case '/':
case '?':
case ':':
case '@':
@@ -177,22 +176,22 @@ bool isReservedUrlChar(char c)
case '=':
case '+':
case '$':
case '#':
return true;
default:
return false;
}
}
bool needsEscape(char c, bool encodeReserved)
bool isHarmlessUriChar(char c)
{
if (c >= 'a' && c <= 'z')
return false;
return true;
if (c >= 'A' && c <= 'Z')
return false;
return true;
if (c >= '0' && c <= '9')
return false;
if (isReservedUrlChar(c))
return encodeReserved;
return true;
switch (c) {
case '-':
case '_':
@@ -203,8 +202,46 @@ bool needsEscape(char c, bool encodeReserved)
case '\'':
case '(':
case ')':
return false;
case '/':
return true;
}
return false;
}
bool mustBeUriEncodedFor(kiwix::URIComponentKind target, char c)
{
if (isHarmlessUriChar(c))
return false;
switch (c) {
case '/': // There is no reason to encode the path separator in the general
// case. It must be encoded only in a path component when its
// semantics of a path separator has to be suppressed.
return false;
case '@': // In a relative URL of the form abc@def/xyz (with no / in abc)
// a non-encoded @ will make "abc" and "def" to be interpreted as
// username and host components, respectively
return target == kiwix::URIComponentKind::PATH;
case ':': // In a relative URL of the form abc:def/xyz (with no / in abc)
// a non-encoded : will make "abc" and "def" to be interpreted as
// host and port components, respectively
return target == kiwix::URIComponentKind::PATH;
case '?': // A non-encoded '?' acts as a separator between the path
// and query components
return target == kiwix::URIComponentKind::PATH;
case '&': return target == kiwix::URIComponentKind::QUERY;
case '=': return target == kiwix::URIComponentKind::QUERY;
case '+': return target == kiwix::URIComponentKind::QUERY;
case '#': // A non-encoded '#' in either path or query-component
// would mark the beginning of the fragment component
return true;
}
return true;
}
@@ -230,23 +267,43 @@ int hexToInt(char c) {
}
}
std::string kiwix::urlEncode(const std::string& value, bool encodeReserved)
} // unnamed namespace
std::string kiwix::urlEncode(const std::string& value)
{
std::ostringstream os;
os << std::hex << std::uppercase;
for (std::string::const_iterator it = value.begin();
it != value.end();
it++) {
if (!needsEscape(*it, encodeReserved)) {
os << *it;
for (const char c : value) {
if (isHarmlessUriChar(c)) {
os << c;
} else {
os << '%' << std::setw(2) << static_cast<unsigned int>(static_cast<unsigned char>(*it));
const unsigned int charVal = static_cast<unsigned char>(c);
os << '%' << std::setw(2) << std::setfill('0') << charVal;
}
}
return os.str();
}
namespace kiwix
{
std::string uriEncode(URIComponentKind target, const std::string& value)
{
std::ostringstream os;
os << std::hex << std::uppercase;
for (const char c : value) {
if ( mustBeUriEncodedFor(target, c) ) {
const unsigned int charVal = static_cast<unsigned char>(c);
os << '%' << std::setw(2) << std::setfill('0') << charVal;
} else {
os << c;
}
}
return os.str();
}
} // namespace kiwix
std::string kiwix::urlDecode(const std::string& value, bool component)
{
std::ostringstream os;
@@ -267,15 +324,15 @@ std::string kiwix::urlDecode(const std::string& value, bool component)
int iHi = hexToInt(hi);
int iLo = hexToInt(lo);
if (iHi < 0 || iLo < 0) {
// Invalid escape sequence
os << '%' << hi << lo;
continue;
// Invalid escape sequence
os << '%' << hi << lo;
continue;
}
char c = (char)(iHi << 4 | iLo);
if (!component && isReservedUrlChar(c)) {
os << '%' << hi << lo;
os << '%' << hi << lo;
} else {
os << c;
os << c;
}
} else {
os << *it;


@@ -55,9 +55,22 @@ private:
};
std::string urlEncode(const std::string& value, bool encodeReserved = false);
/* urlEncode() is the equivalent of JS encodeURIComponent(), with the only
* difference that the slash (/) symbol is NOT encoded. */
std::string urlEncode(const std::string& value);
std::string urlDecode(const std::string& value, bool component = false);
// Only URI components that are of interest to libkiwix
// are included in the below enumeration type
enum class URIComponentKind
{
PATH,
QUERY
};
// Encode 'value' for usage in a URI component specified by 'target'
std::string uriEncode(URIComponentKind target, const std::string& value);
std::string join(const std::vector<std::string>& list, const std::string& sep);
std::string ucAll(const std::string& word);

33
static/i18n/ar.json Normal file

@@ -0,0 +1,33 @@
{
"@metadata": {
"authors": [
"Asma",
"Ravan",
"محمد أحمد عبد الفتاح"
]
},
"name": "الإنجليزية",
"no-such-book": "لا يوجد مثل هذا الكتاب: {{BOOK_NAME}}",
"too-many-books": "طلب العديد من الكتب {{NB_BOOKS}} حيث الحد {{LIMIT}}",
"no-book-found": "لا يوجد كتاب يطابق معايير الاختيار",
"url-not-found": "لم يتم العثور على عنوان URL المطلوب \"{{url}}\" على هذا الخادم.",
"suggest-search": "قم بإجراء بحث عن النص الكامل لـ <a href=\"{{{SEARCH_URL}}}\">{{PATTERN}}</a>",
"random-article-failure": "مع الأسف! فشل اختيار مقال عشوائي :(",
"invalid-raw-data-type": "{{DATATYPE}} ليس طلبًا صالحًا للمحتوى الأولي.",
"no-value-for-arg": "لم يتم تقديم قيمة للوسيطة {{ARGUMENT}}",
"no-query": "لم يتم تقديم ملخص.",
"raw-entry-not-found": "لا يمكن العثور على إدخال {{DATATYPE}} {{ENTRY}}",
"400-page-title": "طلب غير صالح",
"400-page-heading": "طلب غير صالح",
"404-page-title": "المحتوى غير موجود",
"404-page-heading": "لم يتم العثور عليه",
"500-page-title": "خطأ في الخادم الداخلي",
"500-page-heading": "خطأ في الخادم الداخلي",
"fulltext-search-unavailable": "البحث عن النص الكامل غير متاح",
"no-search-results": "محرك البحث عن النص الكامل غير متاح لهذا المحتوى.",
"library-button-text": "اذهب لصفحة الترحيب",
"home-button-text": "انتقل إلى الصفحة الرئيسية لـ \"{{BOOK_TITLE}}\"",
"random-page-button-text": "اذهب إلى صفحة عشوائية",
"searchbox-tooltip": "بحث \"{{BOOK_TITLE}}\"",
"confusion-of-tongues": "قد يشارك في البحث كتابان أو أكثر بلغات مختلفة، مما قد يؤدي إلى نتائج محيرة."
}

static/i18n/sl.json Normal file
View File

@@ -0,0 +1,32 @@
{
"@metadata": {
"authors": [
"Eleassar"
]
},
"name": "slovenščina",
"suggest-full-text-search": "vsebuje »{{{SEARCH_TERMS}}}« ...",
"no-such-book": "Ni take knjige: {{BOOK_NAME}}",
"too-many-books": "Preveč zahtevanih knjig ({{NB_BOOKS}}), omejitev je {{LIMIT}}",
"no-book-found": "Izbirnim merilom ne ustreza nobena knjiga",
"url-not-found": "Zahtevanega URL-ja »{{url}}« v tem strežniku ni bilo mogoče najti.",
"suggest-search": "Preiščite celotno besedilo za <a href=\"{{{SEARCH_URL}}}\">{{PATTERN}}</a>",
"random-article-failure": "Ups! Ni bilo mogoče izbrati naključnega članka :(",
"invalid-raw-data-type": "{{DATATYPE}} ni veljaven zahtevek za neobdelano vsebino.",
"no-value-for-arg": "Argument {{ARGUMENT}} nima določene nobene vrednosti",
"no-query": "Poizvedba ni podana.",
"raw-entry-not-found": "Ni mogoče najti vnosa {{ENTRY}} vrste {{DATATYPE}}",
"400-page-title": "Neveljaven zahtevek",
"400-page-heading": "Neveljaven zahtevek",
"404-page-title": "Vsebine ni mogoče najti",
"404-page-heading": "Ni najdeno",
"500-page-title": "Notranja napaka strežnika",
"500-page-heading": "Notranja napaka strežnika",
"fulltext-search-unavailable": "Iskanje po celotnem besedilu ni na voljo",
"no-search-results": "Iskalnik po celotnem besedilu za to vsebino ni na voljo.",
"library-button-text": "Pojdite na pozdravno stran",
"home-button-text": "Pojdite na glavno stran »{{BOOK_TITLE}}«",
"random-page-button-text": "Pojdite na naključno izbrano stran",
"searchbox-tooltip": "Poiščite »{{BOOK_TITLE}}«",
"confusion-of-tongues": "V iskanju bi bili uporabljeni dve ali več knjig v različnih jezikih, kar lahko pripelje do nejasnih zadetkov."
}

View File

@@ -28,5 +28,6 @@
"library-button-text": "前往歡迎首頁",
"home-button-text": "前往「{{BOOK_TITLE}}」的首頁",
"random-page-button-text": "前往隨機選取頁面",
"searchbox-tooltip": "在{{BOOK_TITLE}}搜尋"
"searchbox-tooltip": "在{{BOOK_TITLE}}搜尋",
"confusion-of-tongues": "搜索裡有加入兩本或更多不同語言的書籍,這可能會導致混淆結果。"
}

View File

@@ -1,3 +1,4 @@
i18n/ar.json
i18n/bn.json
i18n/cs.json
i18n/de.json
@@ -15,6 +16,7 @@ i18n/pl.json
i18n/ru.json
i18n/sc.json
i18n/sk.json
i18n/sl.json
i18n/sv.json
i18n/test.json
i18n/tr.json

View File

@@ -1,6 +1,7 @@
resource_files = run_command(res_manager,
'--list-all',
files('resources_list.txt')
files('resources_list.txt'),
check: true
).stdout().strip().split('\n')
preprocessed_resources = custom_target('preprocessed_resource_files',
@@ -33,7 +34,8 @@ lib_resources = custom_target('resources',
i18n_resource_files = run_command(find_program('python3'),
'-c',
'import sys; f=open(sys.argv[1]); print(f.read())',
files('i18n_resources_list.txt')
files('i18n_resources_list.txt'),
check: true
).stdout().strip().split('\n')
i18n_resources = custom_target('i18n_resources',

View File

@@ -43,17 +43,27 @@ function gotoMainPageOfCurrentBook() {
}
function gotoUrl(url) {
contentIframe.src = url;
contentIframe.src = root + url;
}
function gotoRandomPage() {
gotoUrl(`${root}/random?content=${currentBook}`);
gotoUrl(`/random?content=${currentBook}`);
}
function performSearch() {
const searchbox = document.getElementById('kiwixsearchbox');
const q = encodeURIComponent(searchbox.value);
gotoUrl(`${root}/search?books.name=${currentBook}&pattern=${q}`);
gotoUrl(`/search?books.name=${currentBook}&pattern=${q}`);
}
function makeJSLink(jsCodeString, linkText, linkAttr="") {
// Values of the href attribute are assumed by the browser to be
// fully URI-encoded (no matter what the scheme is). Therefore, in
// order to prevent the browser from decoding any URI-encoded parts
// in the JS code we have to URI-encode a second time.
// (see https://stackoverflow.com/questions/33721510)
const uriEncodedJSCode = encodeURIComponent(jsCodeString);
return `<a ${linkAttr} href="javascript:${uriEncodedJSCode}">${linkText}</a>`;
}
function suggestionsApiURL()
@@ -336,13 +346,21 @@ function setupSuggestions() {
},
resultItem: {
element: (item, data) => {
let searchLink;
const uriEncodedBookName = encodeURIComponent(currentBook);
let url;
if (data.value.kind == "path") {
searchLink = `${root}/${currentBook}/${htmlDecode(data.value.path)}`;
const path = encodeURIComponent(htmlDecode(data.value.path));
url = `/content/${uriEncodedBookName}/${path}`;
} else {
searchLink = `${root}/search?content=${encodeURIComponent(currentBook)}&pattern=${encodeURIComponent(htmlDecode(data.value.value))}`;
const pattern = encodeURIComponent(htmlDecode(data.value.value));
url = `/search?content=${uriEncodedBookName}&pattern=${pattern}`;
}
item.innerHTML = `<a class="suggest" href="javascript:gotoUrl('${searchLink}')">${htmlDecode(data.value.label)}</a>`;
// url can't contain any double quote and/or backslash symbols
// since they should have been URI-encoded. Therefore putting it
// inside double quotes should result in valid javascript.
const jsAction = `gotoUrl("${url}")`;
const linkText = htmlDecode(data.value.label);
item.innerHTML = makeJSLink(jsAction, linkText, 'class="suggest"');
},
highlight: "autoComplete_highlight",
selected: "autoComplete_selected"

View File

Binary file not shown.

test/data/corner_cases/c# Symbolic link
View File

@@ -0,0 +1 @@
c#.html

View File

@@ -0,0 +1,10 @@
<!DOCTYPE html>
<html>
<head>
<meta charset="utf-8" />
<title>C#</title>
</head>
<body>
<p>C# (pronounced see sharp) is a general-purpose, high-level multi-paradigm programming language. C# encompasses static typing, strong typing, lexically scoped, imperative, declarative, functional, generic, object-oriented (class-based), and component-oriented programming disciplines</p>
</body>
</html>

View File

@@ -0,0 +1 @@
c#.html

View File

@@ -2,13 +2,14 @@
cd "$(dirname "$0")"
rm -f corner_cases.zim
zimwriterfs -w empty.html \
-f empty.png \
-l=en \
-t="ZIM corner cases" \
-d="" \
-c="" \
-p="" \
zimwriterfs --withoutFTIndex --dont-check-arguments \
-w empty.html \
-I empty.png \
-l en \
-t "ZIM corner cases" \
-d "" \
-c "" \
-p "" \
corner_cases \
corner_cases.zim \
&& echo 'corner_cases.zim was successfully created' \

View File

@@ -193,7 +193,7 @@ TEST_F(LibraryServerTest, catalog_search_by_phrase)
EXPECT_EQ(maskVariableOPDSFeedData(r->body),
OPDS_FEED_TAG
" <id>12345678-90ab-cdef-1234-567890abcdef</id>\n"
" <title>Filtered zims (q=&quot;ray charles&quot;)</title>\n"
" <title>Filtered zims (q=%22ray%20charles%22)</title>\n"
" <updated>YYYY-MM-DDThh:mm:ssZ</updated>\n"
" <totalResults>2</totalResults>\n"
" <startIndex>0</startIndex>\n"
@@ -212,7 +212,7 @@ TEST_F(LibraryServerTest, catalog_search_by_words)
EXPECT_EQ(maskVariableOPDSFeedData(r->body),
OPDS_FEED_TAG
" <id>12345678-90ab-cdef-1234-567890abcdef</id>\n"
" <title>Filtered zims (q=ray charles)</title>\n"
" <title>Filtered zims (q=ray%20charles)</title>\n"
" <updated>YYYY-MM-DDThh:mm:ssZ</updated>\n"
" <totalResults>3</totalResults>\n"
" <startIndex>0</startIndex>\n"
@@ -233,7 +233,7 @@ TEST_F(LibraryServerTest, catalog_prefix_search)
EXPECT_EQ(maskVariableOPDSFeedData(r->body),
OPDS_FEED_TAG
" <id>12345678-90ab-cdef-1234-567890abcdef</id>\n"
" <title>Filtered zims (q=description:ray description:charles)</title>\n"
" <title>Filtered zims (q=description%3Aray%20description%3Acharles)</title>\n"
" <updated>YYYY-MM-DDThh:mm:ssZ</updated>\n"
" <totalResults>2</totalResults>\n"
" <startIndex>0</startIndex>\n"
@@ -250,7 +250,7 @@ TEST_F(LibraryServerTest, catalog_prefix_search)
EXPECT_EQ(maskVariableOPDSFeedData(r->body),
OPDS_FEED_TAG
" <id>12345678-90ab-cdef-1234-567890abcdef</id>\n"
" <title>Filtered zims (q=title:&quot;ray charles&quot;)</title>\n"
" <title>Filtered zims (q=title%3A%22ray%20charles%22)</title>\n"
" <updated>YYYY-MM-DDThh:mm:ssZ</updated>\n"
" <totalResults>1</totalResults>\n"
" <startIndex>0</startIndex>\n"
@@ -269,7 +269,7 @@ TEST_F(LibraryServerTest, catalog_search_with_word_exclusion)
EXPECT_EQ(maskVariableOPDSFeedData(r->body),
OPDS_FEED_TAG
" <id>12345678-90ab-cdef-1234-567890abcdef</id>\n"
" <title>Filtered zims (q=ray -uncategorized)</title>\n"
" <title>Filtered zims (q=ray%20-uncategorized)</title>\n"
" <updated>YYYY-MM-DDThh:mm:ssZ</updated>\n"
" <totalResults>2</totalResults>\n"
" <startIndex>0</startIndex>\n"
@@ -288,7 +288,7 @@ TEST_F(LibraryServerTest, catalog_search_by_tag)
EXPECT_EQ(maskVariableOPDSFeedData(r->body),
OPDS_FEED_TAG
" <id>12345678-90ab-cdef-1234-567890abcdef</id>\n"
" <title>Filtered zims (tag=_category:jazz)</title>\n"
" <title>Filtered zims (tag=_category%3Ajazz)</title>\n"
" <updated>YYYY-MM-DDThh:mm:ssZ</updated>\n"
" <totalResults>1</totalResults>\n"
" <startIndex>0</startIndex>\n"
@@ -342,7 +342,7 @@ TEST_F(LibraryServerTest, catalog_search_by_language)
EXPECT_EQ(maskVariableOPDSFeedData(r->body),
OPDS_FEED_TAG
" <id>12345678-90ab-cdef-1234-567890abcdef</id>\n"
" <title>Filtered zims (lang=eng,fra)</title>\n"
" <title>Filtered zims (lang=eng%2Cfra)</title>\n"
" <updated>YYYY-MM-DDThh:mm:ssZ</updated>\n"
" <totalResults>2</totalResults>\n"
" <startIndex>0</startIndex>\n"
@@ -694,7 +694,7 @@ TEST_F(LibraryServerTest, catalog_v2_entries_filtered_by_search_terms)
EXPECT_EQ(r->status, 200);
EXPECT_EQ(maskVariableOPDSFeedData(r->body),
CATALOG_V2_ENTRIES_PREAMBLE("?q=%22ray%20charles%22")
" <title>Filtered Entries (q=&quot;ray charles&quot;)</title>\n"
" <title>Filtered Entries (q=%22ray%20charles%22)</title>\n"
" <updated>YYYY-MM-DDThh:mm:ssZ</updated>\n"
" <totalResults>2</totalResults>\n"
" <startIndex>0</startIndex>\n"
@@ -726,8 +726,8 @@ TEST_F(LibraryServerTest, catalog_v2_entries_filtered_by_language)
const auto r = zfs1_->GET("/ROOT/catalog/v2/entries?lang=eng,fra");
EXPECT_EQ(r->status, 200);
EXPECT_EQ(maskVariableOPDSFeedData(r->body),
CATALOG_V2_ENTRIES_PREAMBLE("?lang=eng,fra")
" <title>Filtered Entries (lang=eng,fra)</title>\n"
CATALOG_V2_ENTRIES_PREAMBLE("?lang=eng%2Cfra")
" <title>Filtered Entries (lang=eng%2Cfra)</title>\n"
" <updated>YYYY-MM-DDThh:mm:ssZ</updated>\n"
" <totalResults>2</totalResults>\n"
" <startIndex>0</startIndex>\n"
@@ -865,7 +865,7 @@ TEST_F(LibraryServerTest, no_name_mapper_returned_catalog_use_uuid_in_link)
EXPECT_EQ(maskVariableOPDSFeedData(r->body),
OPDS_FEED_TAG
" <id>12345678-90ab-cdef-1234-567890abcdef</id>\n"
" <title>Filtered zims (tag=_category:jazz)</title>\n"
" <title>Filtered zims (tag=_category%3Ajazz)</title>\n"
" <updated>YYYY-MM-DDThh:mm:ssZ</updated>\n"
" <totalResults>1</totalResults>\n"
" <startIndex>0</startIndex>\n"

View File

@@ -37,34 +37,34 @@ TEST(OpdsCatalog, getSearchUrl)
}
{
Filter f;
f.query("abc def");
EXPECT_SEARCH_URL("/catalog/v2/entries?q=abc%20def");
f.query("abc def#xyz");
EXPECT_SEARCH_URL("/catalog/v2/entries?q=abc%20def%23xyz");
}
{
Filter f;
f.category("ted");
EXPECT_SEARCH_URL("/catalog/v2/entries?category=ted");
f.category("ted&bob");
EXPECT_SEARCH_URL("/catalog/v2/entries?category=ted%26bob");
}
{
Filter f;
f.lang("eng");
EXPECT_SEARCH_URL("/catalog/v2/entries?lang=eng");
f.lang("eng,fra");
EXPECT_SEARCH_URL("/catalog/v2/entries?lang=eng%2Cfra");
}
{
Filter f;
f.name("second");
EXPECT_SEARCH_URL("/catalog/v2/entries?name=second");
f.name("second?");
EXPECT_SEARCH_URL("/catalog/v2/entries?name=second%3F");
}
{
Filter f;
f.acceptTags({"paper", "plastic"});
EXPECT_SEARCH_URL("/catalog/v2/entries?tag=paper;plastic");
f.acceptTags({"#paper", "#plastic"});
EXPECT_SEARCH_URL("/catalog/v2/entries?tag=%23paper%3B%23plastic");
}
{
Filter f;
f.query("abc");
f.category("ted");
EXPECT_SEARCH_URL("/catalog/v2/entries?q=abc&category=ted");
f.query("abc=123");
f.category("@ted");
EXPECT_SEARCH_URL("/catalog/v2/entries?q=abc%3D123&category=%40ted");
}
{
Filter f;
@@ -79,7 +79,7 @@ TEST(OpdsCatalog, getSearchUrl)
f.lang("html");
f.name("edsonarantesdonascimento");
f.acceptTags({"body", "script"});
EXPECT_SEARCH_URL("/catalog/v2/entries?q=peru&category=scifi&lang=html&name=edsonarantesdonascimento&tag=body;script");
EXPECT_SEARCH_URL("/catalog/v2/entries?q=peru&category=scifi&lang=html&name=edsonarantesdonascimento&tag=body%3Bscript");
}
#undef EXPECT_SEARCH_URL
}

View File

@@ -20,6 +20,7 @@
#include "gtest/gtest.h"
#include "../src/tools/otherTools.h"
#include "zim/suggestion_iterator.h"
#include "../src/server/i18n.h"
#include <regex>
@@ -172,3 +173,63 @@ R"EXPECTEDJSON([
)EXPECTEDJSON"
);
}
std::string toString(const kiwix::LangPreference& x)
{
std::ostringstream oss;
oss << "{" << x.lang << ", " << x.preference << "}";
return oss.str();
}
std::string toString(const kiwix::UserLangPreferences& prefs) {
std::ostringstream oss;
for ( const auto& x : prefs )
oss << toString(x);
return oss.str();
}
TEST(I18n, parseUserLanguagePreferences)
{
EXPECT_EQ(toString(kiwix::parseUserLanguagePreferences("")),
""
);
EXPECT_EQ(toString(kiwix::parseUserLanguagePreferences("*")),
"{*, 1}"
);
EXPECT_EQ(toString(kiwix::parseUserLanguagePreferences("fr")),
"{fr, 1}"
);
EXPECT_EQ(toString(kiwix::parseUserLanguagePreferences("fr-CH")),
"{fr-CH, 1}"
);
EXPECT_EQ(toString(kiwix::parseUserLanguagePreferences("fr, en-US")),
"{fr, 1}{en-US, 1}"
);
EXPECT_EQ(toString(kiwix::parseUserLanguagePreferences("ru;q=0.5")),
"{ru, 0.5}"
);
EXPECT_EQ(toString(kiwix::parseUserLanguagePreferences("fr-CH,ru;q=0.5")),
"{fr-CH, 1}{ru, 0.5}"
);
EXPECT_EQ(toString(kiwix::parseUserLanguagePreferences("ru;q=0.5, *;q=0.1")),
"{ru, 0.5}{*, 0.1}"
);
// rejected input
EXPECT_EQ(toString(kiwix::parseUserLanguagePreferences("ru;")),
""
);
EXPECT_EQ(toString(kiwix::parseUserLanguagePreferences("ru;q")),
""
);
EXPECT_EQ(toString(kiwix::parseUserLanguagePreferences("ru;q=")),
""
);
EXPECT_EQ(toString(kiwix::parseUserLanguagePreferences("ru;0.8")),
""
);
EXPECT_EQ(toString(kiwix::parseUserLanguagePreferences("fr,ru;0.8,en;q=0.5")),
"{fr, 1}{en, 0.5}"
);
}
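
The accepted and rejected inputs listed in the test above suggest a simple item-by-item parser. The following is a rough sketch under that assumption, not the actual `kiwix::parseUserLanguagePreferences()`: the type `LangPref` and the functions `trim`, `parseItem`, `sketchParseAcceptLanguage` and `sketchSelectLanguage` are hypothetical names. The selection step (highest weight among available translations, otherwise a fallback) is likewise an assumption that merely agrees with the `UserLanguageControl` expectations.

```cpp
#include <algorithm>
#include <cassert>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical type and function names, for illustration only.
struct LangPref { std::string lang; float weight; };

std::string trim(const std::string& s)
{
  const auto b = s.find_first_not_of(" \t");
  if (b == std::string::npos) return "";
  const auto e = s.find_last_not_of(" \t");
  return s.substr(b, e - b + 1);
}

// Parses one "lang" or "lang;q=weight" item; returns false if malformed.
bool parseItem(const std::string& item, LangPref& out)
{
  const auto semi = item.find(';');
  out.lang = trim(item.substr(0, semi));
  out.weight = 1.0f;                       // default weight when q is absent
  if (out.lang.empty()) return false;
  if (semi == std::string::npos) return true;

  const std::string param = trim(item.substr(semi + 1));
  if (param.compare(0, 2, "q=") != 0) return false;
  std::istringstream iss(param.substr(2));
  float w;
  if (!(iss >> w)) return false;           // missing or non-numeric weight
  if (iss.peek() != std::istringstream::traits_type::eof())
    return false;                          // trailing garbage after the number
  out.weight = w;
  return true;
}

// Malformed items are skipped; well-formed ones keep their original order.
std::vector<LangPref> sketchParseAcceptLanguage(const std::string& header)
{
  std::vector<LangPref> result;
  std::istringstream iss(header);
  std::string item;
  while (std::getline(iss, item, ',')) {
    LangPref p;
    if (parseItem(item, p)) result.push_back(p);
  }
  return result;
}

// Assumed selection rule: the available language with the highest weight
// wins; if none of the requested languages is available, use the fallback.
std::string sketchSelectLanguage(const std::vector<LangPref>& prefs,
                                 const std::vector<std::string>& available,
                                 const std::string& fallback)
{
  std::string best = fallback;
  float bestWeight = 0.0f;
  for (const auto& p : prefs) {
    const bool known =
      std::find(available.begin(), available.end(), p.lang) != available.end();
    if (known && p.weight > bestWeight) {
      best = p.lang;
      bestWeight = p.weight;
    }
  }
  return best;
}
```

The sketch does not validate that weights lie in the range 0 to 1, and it treats `*` as just another language tag that matches no translation, which leads to the fallback language.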

View File

@@ -69,12 +69,15 @@ const ResourceCollection resources200Compressible{
{ DYNAMIC_CONTENT, "/ROOT/skin/taskbar.css" },
{ STATIC_CONTENT, "/ROOT/skin/taskbar.css?cacheid=216d6b5d" },
{ DYNAMIC_CONTENT, "/ROOT/skin/viewer.js" },
{ STATIC_CONTENT, "/ROOT/skin/viewer.js?cacheid=51e745c2" },
{ STATIC_CONTENT, "/ROOT/skin/viewer.js?cacheid=ab5374c5" },
{ DYNAMIC_CONTENT, "/ROOT/skin/fonts/Poppins.ttf" },
{ STATIC_CONTENT, "/ROOT/skin/fonts/Poppins.ttf?cacheid=af705837" },
{ DYNAMIC_CONTENT, "/ROOT/skin/fonts/Roboto.ttf" },
{ STATIC_CONTENT, "/ROOT/skin/fonts/Roboto.ttf?cacheid=84d10248" },
{ DYNAMIC_CONTENT, "/ROOT/catalog/search" },
{ DYNAMIC_CONTENT, "/ROOT/catalog/v2/root.xml" },
{ DYNAMIC_CONTENT, "/ROOT/catalog/v2/languages" },
{ DYNAMIC_CONTENT, "/ROOT/catalog/v2/entries" },
{ DYNAMIC_CONTENT, "/ROOT/catalog/v2/partial_entries" },
@@ -124,10 +127,6 @@ const ResourceCollection resources200Uncompressible{
{ STATIC_CONTENT, "/ROOT/skin/favicon/safari-pinned-tab.svg?cacheid=8d487e95" },
{ DYNAMIC_CONTENT, "/ROOT/skin/favicon/site.webmanifest" },
{ STATIC_CONTENT, "/ROOT/skin/favicon/site.webmanifest?cacheid=bc396efb" },
{ DYNAMIC_CONTENT, "/ROOT/skin/fonts/Poppins.ttf" },
{ STATIC_CONTENT, "/ROOT/skin/fonts/Poppins.ttf?cacheid=af705837" },
{ DYNAMIC_CONTENT, "/ROOT/skin/fonts/Roboto.ttf" },
{ STATIC_CONTENT, "/ROOT/skin/fonts/Roboto.ttf?cacheid=84d10248" },
{ DYNAMIC_CONTENT, "/ROOT/skin/hash.png" },
{ STATIC_CONTENT, "/ROOT/skin/hash.png?cacheid=f836e872" },
{ DYNAMIC_CONTENT, "/ROOT/skin/magnet.png" },
@@ -150,6 +149,7 @@ const ResourceCollection resources200Uncompressible{
{ DYNAMIC_CONTENT, "/ROOT/catalog/searchdescription.xml" },
{ DYNAMIC_CONTENT, "/ROOT/catalog/v2/categories" },
{ DYNAMIC_CONTENT, "/ROOT/catalog/v2/languages" },
{ DYNAMIC_CONTENT, "/ROOT/catalog/v2/searchdescription.xml" },
{ DYNAMIC_CONTENT, "/ROOT/catalog/v2/illustration/6f1d19d0-633f-087b-fb55-7ac324ff9baf?size=48" },
@@ -157,9 +157,9 @@ const ResourceCollection resources200Uncompressible{
{ ZIM_CONTENT, "/ROOT/content/zimfile/I/m/Ray_Charles_classic_piano_pose.jpg" },
{ ZIM_CONTENT, "/ROOT/content/corner_cases/A/empty.html" },
{ ZIM_CONTENT, "/ROOT/content/corner_cases/-/empty.css" },
{ ZIM_CONTENT, "/ROOT/content/corner_cases/-/empty.js" },
{ ZIM_CONTENT, "/ROOT/content/corner_cases/empty.html" },
{ ZIM_CONTENT, "/ROOT/content/corner_cases/empty.css" },
{ ZIM_CONTENT, "/ROOT/content/corner_cases/empty.js" },
// The responses to the following URLs are too small to be compressed
@@ -291,7 +291,7 @@ R"EXPECTEDRESULT( <img src="../skin/download.png?
/* url */ "/ROOT/viewer",
R"EXPECTEDRESULT( <link type="text/css" href="./skin/taskbar.css?cacheid=216d6b5d" rel="Stylesheet" />
<link type="text/css" href="./skin/css/autoComplete.css?cacheid=08951e06" rel="Stylesheet" />
<script type="text/javascript" src="./skin/viewer.js?cacheid=51e745c2" defer></script>
<script type="text/javascript" src="./skin/viewer.js?cacheid=ab5374c5" defer></script>
<script type="text/javascript" src="./skin/autoComplete.min.js?cacheid=1191aaaf"></script>
const blankPageUrl = root + "/skin/blank.html?cacheid=6b1fa032";
<label for="kiwix_button_show_toggle"><img src="./skin/caret.png?cacheid=22b942b4" alt=""></label>
@@ -827,7 +827,7 @@ TEST_F(ServerTest, Http400HtmlError)
expected_body==R"(
<h1>Invalid request</h1>
<p>
The requested URL "/ROOT/search?content=non-existing-book&pattern=a"&lt;script foo&gt;" is not a valid request.
The requested URL "/ROOT/search?content=non-existing-book&pattern=a%22%3Cscript%20foo%3E" is not a valid request.
</p>
<p>
No such book: non-existing-book
@@ -910,7 +910,7 @@ TEST_F(ServerTest, HttpXmlError)
/* HTTP status code */ 400,
/* expected response XML */ R"(
<error>Invalid request</error>
<detail>The requested URL "/ROOT/search?format=xml&content=non-existing-book&pattern=a"&lt;script foo&gt;" is not a valid request.</detail>
<detail>The requested URL "/ROOT/search?format=xml&content=non-existing-book&pattern=a%22%3Cscript%20foo%3E" is not a valid request.</detail>
<detail>No such book: non-existing-book</detail>
)" },
// There is a flaw in the way we handle the query string: we cannot differentiate
@@ -976,57 +976,154 @@ TEST_F(ServerTest, UserLanguageControl)
{
struct TestData
{
const std::string description;
const std::string url;
const std::string acceptLanguageHeader;
const char* const requestCookie; // Cookie: header of the request
const char* const responseSetCookie; // Set-Cookie: header of the response
const std::string expectedH1;
operator TestContext() const
{
return TestContext{
TestContext ctx{
{"description", description},
{"url", url},
{"acceptLanguageHeader", acceptLanguageHeader},
};
if ( requestCookie ) {
ctx.push_back({"requestCookie", requestCookie});
}
return ctx;
}
};
const char* const NO_COOKIE = nullptr;
const TestData testData[] = {
{
"Default user language is English",
/*url*/ "/ROOT/content/zimfile/invalid-article",
/*Accept-Language:*/ "",
/*Request Cookie:*/ NO_COOKIE,
/*Response Set-Cookie:*/ "userlang=en;Path=/ROOT;Max-Age=31536000",
/* expected <h1> */ "Not Found"
},
{
"userlang URL query parameter is respected",
/*url*/ "/ROOT/content/zimfile/invalid-article?userlang=en",
/*Accept-Language:*/ "",
/*Request Cookie:*/ NO_COOKIE,
/*Response Set-Cookie:*/ "userlang=en;Path=/ROOT;Max-Age=31536000",
/* expected <h1> */ "Not Found"
},
{
"userlang URL query parameter is respected",
/*url*/ "/ROOT/content/zimfile/invalid-article?userlang=test",
/*Accept-Language:*/ "",
/*Request Cookie:*/ NO_COOKIE,
/*Response Set-Cookie:*/ "userlang=test;Path=/ROOT;Max-Age=31536000",
/* expected <h1> */ "[I18N TESTING] Content not found, but at least the server is alive"
},
{
"'Accept-Language: *' is handled",
/*url*/ "/ROOT/content/zimfile/invalid-article",
/*Accept-Language:*/ "*",
/*Request Cookie:*/ NO_COOKIE,
/*Response Set-Cookie:*/ "userlang=en;Path=/ROOT;Max-Age=31536000",
/* expected <h1> */ "Not Found"
},
{
"Accept-Language: header is respected",
/*url*/ "/ROOT/content/zimfile/invalid-article",
/*Accept-Language:*/ "test",
/*Request Cookie:*/ NO_COOKIE,
/*Response Set-Cookie:*/ "userlang=test;Path=/ROOT;Max-Age=31536000",
/* expected <h1> */ "[I18N TESTING] Content not found, but at least the server is alive"
},
{
// userlang query parameter takes precedence over Accept-Language
"userlang cookie is respected",
/*url*/ "/ROOT/content/zimfile/invalid-article",
/*Accept-Language:*/ "",
/*Request Cookie:*/ "userlang=test",
/*Response Set-Cookie:*/ NO_COOKIE,
/* expected <h1> */ "[I18N TESTING] Content not found, but at least the server is alive"
},
{
"userlang cookie is correctly parsed",
/*url*/ "/ROOT/content/zimfile/invalid-article",
/*Accept-Language:*/ "",
/*Request Cookie:*/ "anothercookie=123; userlang=test",
/*Response Set-Cookie:*/ NO_COOKIE,
/* expected <h1> */ "[I18N TESTING] Content not found, but at least the server is alive"
},
{
"userlang cookie is correctly parsed",
/*url*/ "/ROOT/content/zimfile/invalid-article",
/*Accept-Language:*/ "",
/*Request Cookie:*/ "userlang=test; anothercookie=abc",
/*Response Set-Cookie:*/ NO_COOKIE,
/* expected <h1> */ "[I18N TESTING] Content not found, but at least the server is alive"
},
{
"userlang cookie is correctly parsed",
/*url*/ "/ROOT/content/zimfile/invalid-article",
/*Accept-Language:*/ "",
/*Request Cookie:*/ "cookie1=abc; userlang=test; cookie2=xyz",
/*Response Set-Cookie:*/ NO_COOKIE,
/* expected <h1> */ "[I18N TESTING] Content not found, but at least the server is alive"
},
{
"Multiple userlang cookies are not a problem",
/*url*/ "/ROOT/content/zimfile/invalid-article",
/*Accept-Language:*/ "",
/*Request Cookie:*/ "cookie1=abc; userlang=en; userlang=test; cookie2=xyz",
/*Response Set-Cookie:*/ NO_COOKIE,
/* expected <h1> */ "[I18N TESTING] Content not found, but at least the server is alive"
},
{
"userlang query parameter takes precedence over Accept-Language",
/*url*/ "/ROOT/content/zimfile/invalid-article?userlang=en",
/*Accept-Language:*/ "test",
/*Request Cookie:*/ NO_COOKIE,
/*Response Set-Cookie:*/ "userlang=en;Path=/ROOT;Max-Age=31536000",
/* expected <h1> */ "Not Found"
},
{
// The value of the Accept-Language header is not currently parsed.
"userlang query parameter takes precedence over its cookie counterpart",
/*url*/ "/ROOT/content/zimfile/invalid-article?userlang=en",
/*Accept-Language:*/ "",
/*Request Cookie:*/ "userlang=test",
/*Response Set-Cookie:*/ "userlang=en;Path=/ROOT;Max-Age=31536000",
/* expected <h1> */ "Not Found"
},
{
"userlang in cookies takes precedence over Accept-Language",
/*url*/ "/ROOT/content/zimfile/invalid-article",
/*Accept-Language:*/ "test",
/*Request Cookie:*/ "userlang=en",
/*Response Set-Cookie:*/ NO_COOKIE,
/* expected <h1> */ "Not Found"
},
{
"Most suitable language is selected from the Accept-Language header",
// In case of a comma separated list of languages (optionally weighted
// with quality values) the default (en) language is used instead.
// with quality values) the most suitable language is selected.
/*url*/ "/ROOT/content/zimfile/invalid-article",
/*Accept-Language:*/ "test;q=0.9, en;q=0.2",
/*Request Cookie:*/ NO_COOKIE,
/*Response Set-Cookie:*/ "userlang=test;Path=/ROOT;Max-Age=31536000",
/* expected <h1> */ "[I18N TESTING] Content not found, but at least the server is alive"
},
{
"Most suitable language is selected from the Accept-Language header",
// In case of a comma separated list of languages (optionally weighted
// with quality values) the most suitable language is selected.
/*url*/ "/ROOT/content/zimfile/invalid-article",
/*Accept-Language:*/ "test;q=0.2, en;q=0.9",
/*Request Cookie:*/ NO_COOKIE,
/*Response Set-Cookie:*/ "userlang=en;Path=/ROOT;Max-Age=31536000",
/* expected <h1> */ "Not Found"
},
};
@@ -1038,7 +1135,16 @@ TEST_F(ServerTest, UserLanguageControl)
if ( !t.acceptLanguageHeader.empty() ) {
headers.insert({"Accept-Language", t.acceptLanguageHeader});
}
if ( t.requestCookie ) {
headers.insert({"Cookie", t.requestCookie});
}
const auto r = zfs1_->GET(t.url.c_str(), headers);
if ( t.responseSetCookie ) {
ASSERT_TRUE(r->has_header("Set-Cookie")) << t;
EXPECT_EQ(t.responseSetCookie, getHeaderValue(r->headers, "Set-Cookie")) << t;
} else {
EXPECT_FALSE(r->has_header("Set-Cookie"));
}
std::regex_search(r->body, h1Match, h1Regex);
const std::string h1(h1Match[1]);
EXPECT_EQ(h1, t.expectedH1) << t;
@@ -1101,6 +1207,17 @@ TEST_F(ServerTest, NonEndpointUrlsAreRedirectedToContentUrls)
}
}
TEST_F(ServerTest, RedirectionsToURLsWithSpecialSymbols)
{
auto g = zfs1_->GET("/ROOT/content/corner_cases/c_sharp.html");
ASSERT_EQ(302, g->status);
ASSERT_TRUE(g->has_header("Location"));
ASSERT_EQ(g->get_header_value("Location"), "/ROOT/content/corner_cases/c%23.html");
ASSERT_EQ(getCacheControlHeader(*g), "max-age=0, must-revalidate");
ASSERT_FALSE(g->has_header("ETag"));
}
TEST_F(ServerTest, BookMainPageIsRedirectedToArticleIndex)
{
{
@@ -1412,7 +1529,7 @@ TEST_F(ServerTest, InvalidAndMultiRangeByteRangeRequestsResultIn416Responses)
TEST_F(ServerTest, ValidByteRangeRequestsOfZeroSizedEntriesResultIn416Responses)
{
const char url[] = "/ROOT/content/corner_cases/-/empty.js";
const char url[] = "/ROOT/content/corner_cases/empty.js";
const char* ranges[] = {
"bytes=0-",

View File

@@ -196,7 +196,7 @@ struct SearchResult
const std::vector<SearchResult> LARGE_SEARCH_RESULTS = {
SEARCH_RESULT(
/*link*/ "/ROOT/content/zimfile/A/Genius_+_Soul_=_Jazz",
/*link*/ "/ROOT/content/zimfile/A/Genius_%2B_Soul_%3D_Jazz",
/*title*/ "Genius + Soul = Jazz",
/*snippet*/ R"SNIPPET(...Grammy Hall of Fame in 2011. It was re-issued in the UK, first in 1989 on the Castle Communications "Essential Records" label, and by Rhino Records in 1997 on a single CD together with Charles' 1970 My Kind of <b>Jazz</b>. In 2010, Concord Records released a deluxe edition comprising digitally remastered versions of Genius + Soul = <b>Jazz</b>, My Kind of <b>Jazz</b>, <b>Jazz</b> Number II, and My Kind of <b>Jazz</b> Part 3. Professional ratings Review scores Source Rating Allmusic link Warr.org link Encyclopedia of Popular Music...)SNIPPET",
/*bookTitle*/ "Ray Charles",
@@ -236,7 +236,7 @@ const std::vector<SearchResult> LARGE_SEARCH_RESULTS = {
),
SEARCH_RESULT(
/*link*/ "/ROOT/content/zimfile/A/Catchin'_Some_Rays:_The_Music_of_Ray_Charles",
/*link*/ "/ROOT/content/zimfile/A/Catchin'_Some_Rays%3A_The_Music_of_Ray_Charles",
/*title*/ "Catchin&apos; Some Rays: The Music of Ray Charles",
/*snippet*/ R"SNIPPET(...<b>jazz</b> singer Roseanna Vitro, released in August 1997 on the Telarc <b>Jazz</b> label. Catchin' Some Rays: The Music of Ray Charles Studio album by Roseanna Vitro Released August 1997 Recorded March 26, 1997 at Sound on Sound, NYC April 4,1997 at Quad Recording Studios, NYC Genre Vocal <b>jazz</b> Length 61:00 Label Telarc <b>Jazz</b> CD-83419 Producer Paul Wickliffe Roseanna Vitro chronology Passion Dance (1996) Catchin' Some Rays: The Music of Ray Charles (1997) The Time of My Life: Roseanna Vitro Sings the Songs of......)SNIPPET",
/*bookTitle*/ "Ray Charles",
@@ -244,7 +244,7 @@ const std::vector<SearchResult> LARGE_SEARCH_RESULTS = {
),
SEARCH_RESULT(
/*link*/ "/ROOT/content/zimfile/A/That's_What_I_Say:_John_Scofield_Plays_the_Music_of_Ray_Charles",
/*link*/ "/ROOT/content/zimfile/A/That's_What_I_Say%3A_John_Scofield_Plays_the_Music_of_Ray_Charles",
/*title*/ "That&apos;s What I Say: John Scofield Plays the Music of Ray Charles",
/*snippet*/ R"SNIPPET(That's What I Say: John Scofield Plays the Music of Ray Charles Studio album by John Scofield Released June 7, 2005 (2005-06-07) Recorded December 2004 Studio Avatar Studios, New York City Genre <b>Jazz</b> Length 65:21 Label Verve Producer Steve Jordan John Scofield chronology EnRoute: John Scofield Trio LIVE (2004) That's What I Say: John Scofield Plays the Music of Ray Charles (2005) Out Louder (2006) Professional ratings Review scores Source Rating Allmusic All About <b>Jazz</b> All About <b>Jazz</b>...)SNIPPET",
/*bookTitle*/ "Ray Charles",
@@ -284,7 +284,7 @@ const std::vector<SearchResult> LARGE_SEARCH_RESULTS = {
),
SEARCH_RESULT(
/*link*/ "/ROOT/content/zimfile/A/Here_We_Go_Again:_Celebrating_the_Genius_of_Ray_Charles",
/*link*/ "/ROOT/content/zimfile/A/Here_We_Go_Again%3A_Celebrating_the_Genius_of_Ray_Charles",
/*title*/ "Here We Go Again: Celebrating the Genius of Ray Charles",
/*snippet*/ R"SNIPPET(...and <b>jazz</b> trumpeter Wynton Marsalis. It was recorded during concerts at the Rose Theater in New York City, on February 9 and 10, 2009. The album received mixed reviews, in which the instrumentation of Marsalis' orchestra was praised by the critics. Here We Go Again: Celebrating the Genius of Ray Charles Live album by Willie Nelson and Wynton Marsalis Released March 29, 2011 (2011-03-29) Recorded February 9 10 2009 Venue Rose Theater, New York Genre <b>Jazz</b>, country Length 61:49 Label Blue Note......)SNIPPET",
/*bookTitle*/ "Ray Charles",
@@ -356,7 +356,7 @@ const std::vector<SearchResult> LARGE_SEARCH_RESULTS = {
),
SEARCH_RESULT(
/*link*/ "/ROOT/content/zimfile/A/Ray_Sings,_Basie_Swings",
/*link*/ "/ROOT/content/zimfile/A/Ray_Sings%2C_Basie_Swings",
/*title*/ "Ray Sings, Basie Swings",
/*snippet*/ R"SNIPPET(...from 1973 with newly recorded instrumental tracks by the contemporary Count Basie Orchestra. Professional ratings Review scores Source Rating AllMusic Ray Sings, Basie Swings Compilation album by Ray Charles, Count Basie Orchestra Released October 3, 2006 (2006-10-03) Recorded Mid-1970s, February - May 2006 Studio Los Angeles Genre Soul, <b>jazz</b>, Swing Label Concord/Hear Music Producer Gregg Field Ray Charles chronology Genius &amp; Friends (2005) Ray Sings, Basie Swings (2006) Rare Genius: The Undiscovered Masters (2010)...)SNIPPET",
/*bookTitle*/ "Ray Charles",

View File

@@ -105,4 +105,93 @@ TEST(stringTools, extractFromString)
ASSERT_THROW(extractFromString<float>("3.14.5"), std::invalid_argument);
}
};
namespace URLEncoding
{
const char letters[] = "abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ";
const char digits[] = "0123456789";
const char nonEncodableSymbols[] = ".-_~()*!/";
const char uriDelimSymbols[] = ":@?=+&#$;,";
const char otherSymbols[] = R"(`%^[]{}\|"<>)";
const char whitespace[] = " \n\t\r";
const char someNonASCIIChars[] = "Σ♂♀ツ";
}
TEST(stringTools, urlEncode)
{
using namespace URLEncoding;
EXPECT_EQ(urlEncode(letters), letters);
EXPECT_EQ(urlEncode(digits), digits);
EXPECT_EQ(urlEncode(nonEncodableSymbols), nonEncodableSymbols);
EXPECT_EQ(urlEncode(uriDelimSymbols), "%3A%40%3F%3D%2B%26%23%24%3B%2C");
EXPECT_EQ(urlEncode(otherSymbols), "%60%25%5E%5B%5D%7B%7D%5C%7C%22%3C%3E");
EXPECT_EQ(urlEncode(whitespace), "%20%0A%09%0D");
EXPECT_EQ(urlEncode(someNonASCIIChars), "%CE%A3%E2%99%82%E2%99%80%E3%83%84");
}
TEST(stringTools, urlDecode)
{
using namespace URLEncoding;
const std::string allTestChars = std::string(letters)
+ digits
+ nonEncodableSymbols
+ uriDelimSymbols
+ otherSymbols
+ whitespace
+ someNonASCIIChars;
for ( const char c : allTestChars ) {
const std::string str(1, c);
EXPECT_EQ(urlDecode(urlEncode(str), true), str);
}
EXPECT_EQ(urlDecode(urlEncode(allTestChars), true), allTestChars);
const std::string encodedUriDelimSymbols = urlEncode(uriDelimSymbols);
EXPECT_EQ(urlDecode(encodedUriDelimSymbols, false), encodedUriDelimSymbols);
}
TEST(stringTools, uriEncode)
{
const char letters[] = "abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ";
EXPECT_EQ(uriEncode(URIComponentKind::PATH, letters), letters);
EXPECT_EQ(uriEncode(URIComponentKind::QUERY, letters), letters);
const char digits[] = "0123456789";
EXPECT_EQ(uriEncode(URIComponentKind::PATH, digits), digits);
EXPECT_EQ(uriEncode(URIComponentKind::QUERY, digits), digits);
const char nonEncodableSymbols[] = ".-_~()*!/";
EXPECT_EQ(uriEncode(URIComponentKind::PATH, nonEncodableSymbols), nonEncodableSymbols);
EXPECT_EQ(uriEncode(URIComponentKind::QUERY, nonEncodableSymbols), nonEncodableSymbols);
const char uriDelimSymbols[] = ":@?=+&#$;,";
EXPECT_EQ(uriEncode(URIComponentKind::PATH, uriDelimSymbols), "%3A%40%3F=+&%23%24%3B%2C");
EXPECT_EQ(uriEncode(URIComponentKind::QUERY, uriDelimSymbols), ":@?%3D%2B%26%23%24%3B%2C");
const char otherSymbols[] = R"(`%^[]{}\|"<>)";
EXPECT_EQ(uriEncode(URIComponentKind::PATH, otherSymbols), "%60%25%5E%5B%5D%7B%7D%5C%7C%22%3C%3E");
EXPECT_EQ(uriEncode(URIComponentKind::PATH, otherSymbols), uriEncode(URIComponentKind::QUERY, otherSymbols));
const char whitespace[] = " \n\t\r";
EXPECT_EQ(uriEncode(URIComponentKind::PATH, whitespace), "%20%0A%09%0D");
EXPECT_EQ(uriEncode(URIComponentKind::PATH, whitespace), uriEncode(URIComponentKind::QUERY, whitespace));
const char someNonASCIIChars[] = "Σ♂♀ツ";
EXPECT_EQ(uriEncode(URIComponentKind::PATH, someNonASCIIChars), "%CE%A3%E2%99%82%E2%99%80%E3%83%84");
EXPECT_EQ(uriEncode(URIComponentKind::PATH, someNonASCIIChars), uriEncode(URIComponentKind::QUERY, someNonASCIIChars));
}
} // unnamed namespace
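
The `urlDecode` test above relies on the `component` flag: with `component == false`, escape sequences that stand for URI delimiters stay encoded. A minimal sketch of that behaviour follows. The names `hexDigitValue`, `isDelimiter` and `sketchUrlDecode` are hypothetical, and the delimiter set is an assumption taken from the `uriDelimSymbols` test string, not from the real `isReservedUrlChar()`.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical helpers; an illustration only, not the libkiwix source.
int hexDigitValue(char c)
{
  if (c >= '0' && c <= '9') return c - '0';
  if (c >= 'a' && c <= 'f') return c - 'a' + 10;
  if (c >= 'A' && c <= 'F') return c - 'A' + 10;
  return -1;
}

// Assumed delimiter set, copied from the uriDelimSymbols test string.
bool isDelimiter(char c)
{
  return std::string(":@?=+&#$;,").find(c) != std::string::npos;
}

std::string sketchUrlDecode(const std::string& value, bool component)
{
  std::string out;
  for (std::size_t i = 0; i < value.size(); ++i) {
    // Anything that is not the start of a full "%XY" triplet is copied as-is
    if (value[i] != '%' || i + 2 >= value.size()) {
      out += value[i];
      continue;
    }
    const int hi = hexDigitValue(value[i + 1]);
    const int lo = hexDigitValue(value[i + 2]);
    if (hi < 0 || lo < 0) {
      // Invalid escape sequence: emit the three characters unchanged
      out.append(value, i, 3);
      i += 2;
      continue;
    }
    const char c = static_cast<char>((hi << 4) | lo);
    if (!component && isDelimiter(c)) {
      out.append(value, i, 3);   // keep delimiters encoded in a full URL
    } else {
      out += c;
    }
    i += 2;
  }
  return out;
}
```

Keeping delimiters encoded when decoding a whole URL avoids changing its structure: a decoded `%23` would otherwise start a fragment, and a decoded `%26` would split a query parameter.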