Compare commits

..

979 Commits

Author SHA1 Message Date
Nikhil Tanwar
fc87def18b Add documentation for widget
Added documentation for current widget usage, currently supported arguments, using custom CSS/JS
2022-08-15 18:22:23 +05:30
Nikhil Tanwar
09dc6f90fd Add custom CSS and JS support
Added an event listener for message event.
The idea is the website which embeds the widget will send a message using postMessage().
The expected message is:
{
css: // custom CSS code
js: // custom JS code
}
2022-08-15 18:22:22 +05:30
Nikhil Tanwar
58c04b3f77 Basic widget handling
Adds handling for parameters:
disablefilter - disable search filters
disableclick - disable book click action
disabledownload - disable download button
disabledesc - disable description
2022-08-15 18:21:41 +05:30
Nikhil Tanwar
d27220f65d Give name to IIFE in index.js and expose updateBookCount
Gives a name to the IIFE wrapping code in index.js - kiwixServe
and exposes updateBookCount through it
2022-08-11 21:18:21 +05:30
Nikhil Tanwar
efe42c9bbe Add widget endpoint
Adds an endpoint /widget to provide kiwix serve widget.
2022-08-11 21:18:21 +05:30
Nikhil Tanwar
489dfc1123 Give a name to function for updating book count
Extracts function updateBookCount() from the unnamed function in resize event.
2022-08-11 21:10:47 +05:30
Matthieu Gautier
9f545718c2 Merge pull request #806 from kiwix/content_endpoint
/content endpoint
2022-08-11 17:04:02 +02:00
Veloman Yunkan
e323dcf6c9 Redirecting /nonendpoint URLs to /content/nonendpoint 2022-08-11 18:04:05 +04:00
Veloman Yunkan
3b98987cb3 More robust handling of endpoint URLs
The next goal is to redirect old-style /book/path/to/entry URLs to
/content/book/path/to/entry, which seemed pretty trivial.

However, given the current handling of some endpoint URLs, more work was
required to ensure that invalid endpoint URLs (e.g.  "/random/number" or
"/suggest/fr") are not interpreted as content URLs. Previously, that was
not a user-observable issue, since the result would be an immediate 404
error (except in certain edge cases, like handling the request for
"/random/number" when there is a book with name "random" containing an
article at path "/number"). With redirection of URLs that were assumed
to refer to content a 404 error would be issued for the
transformed URL ("/content/random/number") which may be confusing.

Therefore this change is to ensure the correct routing of endpoint URL
handling.
2022-08-11 18:04:05 +04:00
Veloman Yunkan
fd36d11ccf Search results now use the /content URL scheme 2022-08-11 18:04:05 +04:00
Veloman Yunkan
dc56f82c29 Using /content/... URLs in OPDS output 2022-08-11 18:04:05 +04:00
Veloman Yunkan
1b1c1e352e Introduced /content endpoint
Book content is now served under /content/book/...

The old access to book content via a top-level URL /book/... is so far
preserved for backward compatibility.

Redirects were changed to use the new URL scheme. Links in the search results
still use the old scheme.
2022-08-11 18:04:05 +04:00
Veloman Yunkan
a4b18893aa Moved handling of the "/" URL 2022-08-11 18:04:05 +04:00
Matthieu Gautier
d737db666a Merge pull request #802 from kiwix/include_tags_in_free_text_library_search
Included tags in free text catalog search
2022-08-10 16:43:48 +02:00
Veloman Yunkan
cff143b4ec Included tags in free text catalog search 2022-08-06 07:39:45 +02:00
Matthieu Gautier
8e6d893f7f Merge pull request #804 from kiwix/illustration_url_uses_the_book_uuid
Illustration URL uses the book UUID
2022-08-04 15:43:22 +02:00
Veloman Yunkan
111aab0c23 Illustration URL uses the book UUID
If the server is initialized with a library.xml file, then the id
specified in the XML file is used (rather than the UUID recorded in the
ZIM file).

Note that in test/data/library.xml the book ids are fake and
different from the real ZIM IDs; that file was created for testing
of the /catalog endpoint which doesn't access ZIM content, so the
the same ZIM file zimfile.zim was added to library.xml three times as
three different books (with unique human-friendly ids). This explains
the diff in test/library_server.cpp.
2022-08-03 16:13:21 +02:00
Kelson
dd90ca1018 Merge pull request #805 from kiwix/fixFavicon
Add favicons (for different devices) to kiwix-serve welcome page
2022-08-03 16:11:54 +02:00
Nikhil Tanwar
3facd594f6 Add favicon for different devices.
Added favicon files for a number of devices.
All files and html code is generated by: https://realfavicongenerator.net/
The file used to generate favicons can be found at: https://upload.wikimedia.org/wikipedia/commons/b/b0/Kiwix_logo_v3.svg
2022-08-03 18:52:13 +05:30
Kelson
4cd52b0809 Merge pull request #796 from kiwix/noJquery
No jquery
2022-08-01 15:15:02 +02:00
Nikhil Tanwar
baf22c2516 So long, jQuery
Now after porting index.js and taskbar.js to vanilla JS, it is time to remove files.
Deleted static/skin/jquery-ui
Updated customIndexPage template in README.md.
Thank you for your service, jQuery :)
2022-07-31 19:16:46 +05:30
Nikhil Tanwar
f8a530100f Implement taskbar scroll actions in vanilla JS
Completes the porting of remaining jQuery code in taskbar.js - scroll function, blur and focus events and the cybook hack.
2022-07-31 19:16:02 +05:30
Nikhil Tanwar
a0db199388 Turn suggestions into hyperlinks
The suggestions are now clickable hyperlinks.
2022-07-31 19:11:46 +05:30
Matthieu Gautier
f0f473b829 Show suggestions using autoComplete.js
This change only shows suggestions. Clicking them does nothing.
2022-07-31 17:15:08 +05:30
Nikhil Tanwar
1e247d75bb Welcome, autoComplete.js
Added autoComplete.css and .js files.
Linked files in head_taskbar.html
2022-07-31 16:16:07 +05:30
Matthieu Gautier
4f3ec817db Update index.js to not use jquery anymore. 2022-07-31 01:06:27 +05:30
Kelson
98bcf8acd6 Merge pull request #791 from kiwix/ci_pull_request
CI triggered on pull_request event
2022-07-27 21:28:30 +02:00
Emmanuel Engelhart
b69bf4d062 Simplify branch retrieval 2022-07-20 21:21:31 +02:00
Emmanuel Engelhart
6891ce3b57 Use actions/checkout@v2 2022-07-20 21:21:31 +02:00
Emmanuel Engelhart
16197afc95 CI triggered on pull_request event 2022-07-20 21:21:31 +02:00
Kelson
abccd9d706 Merge pull request #800 from kiwix/escClose
Exit download modal on pressing escape key.
2022-07-20 21:19:01 +02:00
Nikhil Tanwar
d0adb4e722 Exit download modal on pressing escape key.
Adds an event listener to call closeModal() when Escape key is pressed.
2022-07-21 00:39:26 +05:30
Kelson
88c25b3a6c Merge pull request #786 from kiwix/optimizedWelcome
More tiles on kiwix-serve welcome page (optimised)
2022-07-20 19:41:50 +02:00
Emmanuel Engelhart
5aa74c62d6 Better align kiwix-serve welcome page filters 2022-07-20 19:18:18 +02:00
Nikhil Tanwar
2b6da38c46 Center tiles on welcome page
This change centers tiles on welcome page to give a more consistent whitespace look on both sides.
For this, the layout in Isotope JS is changed to masonry.
2022-07-20 19:18:18 +02:00
Matthieu Gautier
dfc6cad9c2 Merge pull request #795 from kiwix/icu_data_check 2022-07-19 11:23:11 +02:00
Veloman Yunkan
28f8dbcf20 New unit-test stringTools.ICULanguageInfo 2022-07-07 16:13:49 +04:00
Matthieu Gautier
81865c0f0e Merge pull request #794 from kiwix/fixHeader 2022-07-06 17:05:40 +02:00
Nikhil Tanwar
538a46f262 Include iostream header in include/version.h
This is needed for kiwix-tools compilation.
2022-07-05 20:47:14 +05:30
Kelson
e1d1d202bd Merge pull request #789 from kiwix/remove_wrappers
Remove libzim's wrapper.
2022-07-03 19:34:58 +02:00
Matthieu Gautier
71e2df7406 Explicit std
Removed headers were `using namespace std`.
So we have to be explicit everywhere.
2022-07-02 16:33:32 +02:00
Matthieu Gautier
69931fb347 Remove libzim's wrapper.
It is time to remove them. They are deprecated since 10.0.0
2022-07-02 16:33:32 +02:00
Veloman Yunkan
12e0fb6934 Merge pull request #711 from kiwix/tagFilter
Add tag filtering in kiwix-serve
2022-06-25 18:22:06 +04:00
Nikhil Tanwar
43ab6dfb6a Add ability to filter by tags in kiwix serve
This change introduces filtering by tags.
To filter, the user can click on the tag name and it will filter it.
A label is added (clickable) to show the tag filter, it can be clicked to remove the filter
2022-06-25 18:10:01 +04:00
Nikhil Tanwar
93f2686a94 Refactoring kiwixButton
Move hover behaviour as a different class - kiwixButtonHover
Add cursor:pointer to kiwixButton
2022-06-25 18:10:01 +04:00
Nikhil Tanwar
19a9c84e13 Change class name "searchButton" to "kiwixButton"
This is done to retain the button design in more button designs (ex: tags)
2022-06-25 18:10:01 +04:00
Nikhil Tanwar
f034018b5c Extract setNoResultsContent() from checkAndInjectEmptyMessage()
Extracted the code from the un-named function in setTimeout for easier understanding.
2022-06-25 18:10:01 +04:00
Nikhil Tanwar
596b223a9d Drop onclick handler for reset-filter link
This removes the onclick handler around the reset-filter link which redirected to '/?lang='
Everything under the handler was already done on window.onload
2022-06-25 18:10:01 +04:00
Nikhil Tanwar
0c549af307 Add check to not add same link in session history
Previously, if the following steps were executed:
1. Click a book tile/visit an unrelated link from the address bar
2. Press back button
Then forward history was discarded (forward button gets disabled).
This happened because of the window.history.pushState on every window.onload event. This led to the same link being added in history and thus discarding the previous "forward-history"
This change adds a condition to only push the current state if the queries are not same.
2022-06-25 18:10:01 +04:00
Nikhil Tanwar
37b39430d1 Use shortened URL in pushState
Earlier we were using the full URL, now only query string is passed in pushState, much cleaner!
2022-06-25 18:10:01 +04:00
Nikhil Tanwar
947744caea Introduce updateVisibleParams()
Adds a function to wrap logic to update select boxes on history change
2022-06-25 18:10:01 +04:00
Kelson
b9e03d2772 Merge pull request #790 from kiwix/docTypo 2022-06-25 07:38:28 +02:00
Nikhil Tanwar
e9b7eeb3c9 Fix documentation typos
Replace wrong  mentions of libzim with libkiwix
Remove libkiwix deprecated functions mention in usage.rst - they are removed now
2022-06-25 09:44:51 +05:30
Matthieu Gautier
15cb9025bb Merge pull request #779 from kiwix/serving_customized_resources 2022-06-22 15:31:20 +02:00
Veloman Yunkan
1139f2cb4c Testing of static front-end resource customization
One important missing test is that the content of the customized
resource is read from storage every time rather than once. Testing
that requirement would involve creating temporary files which is a
little more work.
2022-06-22 17:11:08 +04:00
Veloman Yunkan
0086049d4f Extracted LibraryServerTest into a file of its own 2022-06-22 15:22:12 +04:00
Veloman Yunkan
e3e4bfa533 Support for serving customized resources
During work on the kiwix-serve front-end, the edit-save-test cycle is
a multistep procedure:

1. build and install libkiwix
2. build kiwix-tools
3. run kiwix-serve
4. reload the web-page in the browser

When making changes in static resources that are served by kiwix-serve
unmodified, the steps 1-3 can be eliminated if kiwix-serve is capable of
serving resources from the file-system. This commit adds such a
functionality to kiwix-serve. Now, if during startup of kiwix-serve the
environment variable `KIWIX_SERVE_CUSTOMIZED_RESOURCES` is defined it is
assumed to point to a file where every line has the following format:

URL MIMETYPE RESOURCE_FILE_PATH

When a request is received by kiwix-serve and its URL matches any of the
URLs read from the customized resource file, then the resource data is
read from the respective file RESOURCE_FILE_PATH and served with
mime-type MIMETYPE.

Though this feature was introduced in order to facilitate the
development of the iframe-based content viewer, it can also be useful to
users who would like to customize the kiwix-serve front-end on their own
(without re-building all of kiwix-serve).

There is some overlap with a feature of the kiwix-compile-resources
script that also allows to override resources. The differences are:

1. The new way of customizing front-end resources has all such resources
   listed in a text file and there is a single environment variable
   from which the path of that file is read. kiwix-compile-resources
   associates a separate environment variable with each resource.

2. The new way uses regular paths to identify a resource. The
   kiwix-compile-resources method encodes the resource path by replacing
   any non-alphanumeric characters (including the path separator) with
   underscores (so that the resulting resource identifier can be used
   to construct the name of the environment variable controlling that
   resource).

3. The new method allows adding new front-end resources. The old method
   only allows to modify existing resources.

4. The new method allows (actually requires) to specify the URL at which
   the overriden resource should be served (similarly, the MIME-type can/must
   be specified, too). The old method only allows to override the contents of
   a resource.

5. The new method only allows to override front-end resources that are
   served without any preprocessing by kiwix-serve at runtime. The old
   method allows to override template resources as well (note that
   internationalization/translation resources cannot be overriden using the
   old method, either).
2022-06-22 10:59:41 +02:00
Matthieu Gautier
6938253e59 Merge pull request #784 from kiwix/version_11.0.0 2022-06-15 14:42:08 +02:00
Matthieu Gautier
c5d3ffe0d7 Fix CI for android 2022-06-15 14:36:02 +02:00
Matthieu Gautier
3938d791e7 New version 11.0.0 2022-06-15 11:49:03 +02:00
Matthieu Gautier
1572956da0 Merge pull request #771 from kiwix/translatewiki 2022-06-13 19:12:23 +02:00
Matthieu Gautier
84b1321545 Update i18n_resources_list.txt 2022-06-13 18:46:29 +02:00
translatewiki.net
0f3bca442a Localisation updates from https://translatewiki.net. 2022-06-13 13:08:42 +02:00
Matthieu Gautier
83a9e54399 Merge pull request #780 from kiwix/deduping_searchResults_unittests 2022-06-10 15:47:50 +02:00
Veloman Yunkan
baa97cadf0 Dropped test/server_xml_search.cpp 2022-06-10 15:34:18 +02:00
Veloman Yunkan
06d7a2320f test/server_search.cpp covers XML search too 2022-06-10 15:34:18 +02:00
Veloman Yunkan
f279769435 Deduplicated the snippet regex 2022-06-10 15:34:18 +02:00
Veloman Yunkan
7d4867194a Moved a function 2022-06-10 15:34:18 +02:00
Veloman Yunkan
0340a49780 git mv test/server_{html_,}search.cpp 2022-06-10 15:34:18 +02:00
Veloman Yunkan
ed6aa5a89a Renamed some functions and variables
Included the word "Html" in the names of those functions and variables
which will get Xml siblings soon.
2022-06-10 15:34:18 +02:00
Veloman Yunkan
75796ed6a5 Introduced struct SearchResult 2022-06-10 15:34:18 +02:00
Veloman Yunkan
ddd639eaa1 Moved TestData out of ServerTest.searchResults
Now that ServerTest.searchResults is in a separate cpp file, there are
no reasons for hiding its test data definition inside the unit test
function.

The diff is much-much simpler if whitespace changes are ignored.
2022-06-10 15:34:18 +02:00
Veloman Yunkan
1c98b00128 Got rid of TaskbarlessServerTest
Now ServerTest provides an optional taskbarless kiwix::Server.
2022-06-10 15:34:18 +02:00
Veloman Yunkan
600acb76c7 XML responses should be taskbarless by default
Unit-tests of search results in XML format should work the same way with
a server that would inject a taskbar into HTML responses. This small
change actually validates that taskbar injection is disabled for XML
responses.
2022-06-10 15:34:18 +02:00
Matthieu Gautier
3bbbd1b15d Merge pull request #783 from kiwix/windows_fix 2022-06-10 15:34:05 +02:00
Matthieu Gautier
ae47e5ee4e uint is not defined on Windows 2022-06-10 11:21:35 +02:00
Matthieu Gautier
b442e2371e Do not use deprecated constructor for Reader.
We have a specific private non deprecated constructor especially for that,
let's use it.
2022-06-10 10:41:31 +02:00
Matthieu Gautier
69c5c88c30 Merge pull request #782 from kiwix/windows_fix 2022-06-09 16:21:00 +02:00
Matthieu Gautier
70382d15e2 Windows compiler complains about the implicit cast from double to size_t. 2022-06-09 15:21:06 +02:00
Matthieu Gautier
62306373be Merge pull request #781 from kiwix/no_wrapper 2022-06-09 14:23:04 +02:00
Matthieu Gautier
01c384bb64 Remove the java wrapper.
- The meson's `wrapper` option is removed.
- New meson's option `static-linkage` is added to tell meson to link
  with static library.
2022-06-09 10:23:02 +02:00
Veloman Yunkan
56167dc23e Merge pull request #731 from kiwix/opensearch
Render xml result - opensearch
2022-06-04 00:51:56 +04:00
Veloman Yunkan
cc8ad9ebf2 Testing of HTTP errors in XML format 2022-06-03 15:46:41 +02:00
Matthieu Gautier
bfcf317f09 Properly set "language" parameter in opensearch::Query tag. 2022-06-03 15:46:41 +02:00
Matthieu Gautier
ee01859984 Adapt test/server_xml_search.cpp to xml search.
This is the real change.
2022-06-03 15:46:41 +02:00
Matthieu Gautier
9ec8593f8c Add a testing dummy version of xml search results.
`test/server_xml_search.cpp` is a plain copy of
`test/server_html_search.cpp`
2022-06-03 15:46:41 +02:00
Matthieu Gautier
7cb98f7f4e Make opensearch start parameter 1 indexed. 2022-06-03 15:46:41 +02:00
Matthieu Gautier
8100977cda Use a macro to create the SearchResult.
It avoid some duplication around the actual data to test.
2022-06-03 15:46:41 +02:00
Matthieu Gautier
a0cf91157a Split test/server.cpp
The file starts now to be too long.

- Move testing of the search html result in `test/server_html_search.cpp`
- Move common code used to launch server and so
  in `test/server_testing_tools.h'

This is mainly code move with a small change:
Instead of setting the default PORT (8001) as a const int in the
`ServerTest` class, we now use SERVER_PORT.
SERVER_PORT must be defined before include `server_testing_tools.h`.
This allow several test to be run in parallele without trying to open
the same port.
2022-06-03 15:46:41 +02:00
Matthieu Gautier
cadd2a5cbb Make the HTTPErrorHtmlResponse not Html only. 2022-06-03 15:46:41 +02:00
Matthieu Gautier
e51a5b9ebc Introduce get_requested_format helper 2022-06-03 15:46:41 +02:00
Matthieu Gautier
5d6b0ea96a Add searchdescription.xml endpoint 2022-06-03 15:46:41 +02:00
Matthieu Gautier
e5df5e936f Render the search result using (opensearch/atom) xml format. 2022-06-03 15:46:41 +02:00
Matthieu Gautier
c4f706863c Merge pull request #778 from kiwix/small_fixes 2022-06-02 17:20:06 +02:00
Matthieu Gautier
fbc7656b3f Use proper argument order when building the SearchRenderer from a Searcher 2022-06-02 17:08:50 +02:00
Matthieu Gautier
d196496802 Make the Searcher owning the stored Reader
If we keep a reference to a `Reader` it is better to (share) owning
the reference. Else the reader may be deleted after we create the searcher.

This is especially the case now we are creating the `Reader` at demand
and we don't store it in the library's cache.
2022-06-02 17:08:17 +02:00
Matthieu Gautier
3704d8ab87 Merge pull request #729 from kiwix/multizimsearch 2022-06-02 12:49:57 +02:00
Matthieu Gautier
a7651d0e9b Check early that provided bookIds are valid 2022-06-02 12:37:52 +02:00
Matthieu Gautier
3bca43344f Correctly url encode querystring
Fix tests with querystring needed url encoding
(pattern=jazz&books.query.title=Ray%20Charles)
2022-06-02 12:37:52 +02:00
Matthieu Gautier
b857293cfd Build the bookSelection query string when we parse the query.
We have to reuse the query the user give us to generate the
pagination links.
At search result rendering step we don't have access to the query object.
The best place to know which arguments are used to select books
(and so which arguments to keep in the pagination links) is when we
parse the query to select books.

Fix tests (pagination links) with book selector other than "books.id="
(pattern=jazz&books.query.lang=eng)
2022-06-02 12:37:52 +02:00
Matthieu Gautier
b483a8e4e4 Make the request_context be able to generate a querystring for a subset.
The request_context can now take a filter to select arguments to
keep in the query string.
2022-06-02 12:37:52 +02:00
Matthieu Gautier
e2ab7fd62e Add some more testing.
Note that some tests are failing and will be fixed in next commits.
2022-06-02 12:37:52 +02:00
Veloman Yunkan
f45962c697 First test case for multizim search 2022-06-02 12:37:52 +02:00
Veloman Yunkan
3b3d7ad9c4 Preparing to enhance the search results testsuite
Providing the core part of the query explicitly in the search results
testsuite test data.
2022-06-02 12:37:52 +02:00
Matthieu Gautier
1514661c26 Protect search from multi threading race condition.
libzim's search is not thread safe (mainly because xapian is not).
So we must protect our search objects from multi thread calls.

The best way to do this is to associate a mutex to the `zim::Searcher`
and lock the searcher each time we access object derivated from the
searcher (search, results, iterator, ...)
2022-06-02 12:37:52 +02:00
Matthieu Gautier
e5ea210d2c Add a template specialization for ConcurrentCache storing shared_ptr
When ConcurrentCache store a shared_ptr we may have shared_ptr in used
while the ConcurrentCache has drop it.
When we "recreate" a value to put in the cache, we don't want to recreate
it, but copying the shared_ptr in use.

To do so we use a (unlimited) store of weak_ptr (aka `WeakStore`)
Every created shared_ptr added to the cache has a weak_ptr ref also stored
in the WeakStore, and we check the WeakStore before creating the value.
2022-06-02 12:37:52 +02:00
Matthieu Gautier
2b38d2cf1b Copy the lrucache test from libzim.
- Adapt lrucache.cpp for rigth include path
  and use `kiwix::lru_cache` instead of `zim::lru_cache`.
- Add missing `#include <set>` in lrucache.h
2022-06-02 12:37:52 +02:00
Matthieu Gautier
0081b4d8e7 Make the limit of zim files per search configurable.
The default value is 0, which means no limit.
2022-06-02 12:37:52 +02:00
Matthieu Gautier
b74910b2af Limit the number of zim in multizim fulltext search.
We are currently limiting to 5 but it will be changed in next commit.
2022-06-02 12:37:50 +02:00
Matthieu Gautier
cf30233358 Prefix env variable name with KIWIX_ 2022-06-02 12:23:43 +02:00
Matthieu Gautier
f0065fdd6f Introduce Error exception to do i18n 2022-06-02 12:23:42 +02:00
Matthieu Gautier
c72132054d Move i18n helper functions 2022-06-02 12:22:28 +02:00
Matthieu Gautier
077ceac5a5 Make the search_rendered handle multizim search.
This introduce a intermediate mustache object to store information
about the request made by the user.
2022-06-02 12:22:28 +02:00
Matthieu Gautier
39d0a56be8 Use selectBooks in handle_search 2022-06-02 12:22:28 +02:00
Matthieu Gautier
76d5fafb72 Introduce selectBooks
`selectBooks` allow us to parse a query in a "standard" way to get
the book(s) on which the user want to work.
2022-06-02 12:22:28 +02:00
Matthieu Gautier
4438106c2f Add a prefix in get_search_filter
The prefix will be used to parse a "query to select book" in different context.
For now we have only one context : selecting books for the catalog search.
But we will want to select books to do fulltext search on them
(will be done in later commit)
2022-06-02 12:22:28 +02:00
Matthieu Gautier
76ebfd7ea4 Move get_search_filter and subrange. 2022-06-02 12:22:27 +02:00
Matthieu Gautier
22996e4a6b Allow user to select multiple books when doing search. 2022-06-02 12:22:27 +02:00
Matthieu Gautier
98c54b2279 Handle multiple arguments in RequestContext. 2022-06-02 12:22:27 +02:00
Matthieu Gautier
854623618c Use the newly introduced searcherCache for multizim searcher. 2022-06-02 12:22:25 +02:00
Matthieu Gautier
fd0edbba80 Use a set of id as key for a the searcher Cache.
It will allow use to cache seacher for multiple zim files.
2022-05-24 14:55:48 +02:00
Matthieu Gautier
f5af0633ec Move the searcher cache into the Library 2022-05-24 14:55:48 +02:00
Matthieu Gautier
740581c55c Link the cache size to the book count.
Unless explicitly set via user env variable.
2022-05-24 14:55:48 +02:00
Matthieu Gautier
582e3ec46d Use a concurrent cache to store Archive cache. 2022-05-24 14:55:48 +02:00
Matthieu Gautier
28fb76bbc2 Remove m_readers in Library::impl
It is a deprecated interface and it is a simple wrapper on Archive.
2022-05-24 14:55:48 +02:00
Matthieu Gautier
7c688a4acc Move getCacheLength to a generic helper function getEnvVar 2022-05-24 14:55:48 +02:00
Kelson
d4da05e591 Merge pull request #764 from kiwix/pre_multisearch
Preparatory work on multizim
2022-05-23 19:29:08 +02:00
Matthieu Gautier
66b2449800 Remove unnecessary catch
Catch of std::exception is already made in `handle_request`
2022-05-23 19:17:28 +02:00
Matthieu Gautier
aad95e3413 Introduce a results intermediate object in the template rendering.
Url in href must not be html encoded. As we already url encode the path, it
is ok to have `'` in the url.
2022-05-23 19:16:14 +02:00
Matthieu Gautier
f0dd34b6db Introduce buildQueryData helper in SearchRenderer 2022-05-23 19:13:25 +02:00
Matthieu Gautier
bbdde93f49 Introduce a pagination object to render search result. 2022-05-23 19:12:17 +02:00
Matthieu Gautier
cb62da65c3 Raise a exception if something went wrong in the template rendering. 2022-05-23 10:56:39 +02:00
Matthieu Gautier
288b4ae7df Fix count of remote books in Library::Impl::getBookCount 2022-05-23 10:56:39 +02:00
Matthieu Gautier
52c12b0c2f Introduce Library::Impl::getBookCount
We simply introduce a `getBookCount` which is not protected by a lock.
2022-05-23 10:56:39 +02:00
Matthieu Gautier
4695f47dd2 Introduce operator+= to simplify response creation. 2022-05-23 10:56:39 +02:00
Matthieu Gautier
f42f6a60df Use extractFromString to parse request argument.
On top of reusing code, it throw a exception if we cannot convert given
argument in the type we want.
2022-05-23 10:56:39 +02:00
Matthieu Gautier
717c39f2ef Better ExtractFromString
- Throw a exception if we cannot extract from string.
  (We throw the same exception as `std::sto*`)
- Add a specialization to extract string from string
- Add some unit test
2022-05-23 10:56:39 +02:00
Matthieu Gautier
aa1f73472d Remove unecessary BookDB helper class.
It was needed to not expose Xapian in public header.
Now we can remove it and directly use a Xapian db.
2022-05-23 10:56:39 +02:00
Matthieu Gautier
090c2fd31a Move LibraryBase out of public API.
We use composition instead of inheritance to implement Library.
2022-05-23 10:56:39 +02:00
Matthieu Gautier
ff2c7b1fb2 Merge pull request #765 from kiwix/unittests_for_search_results_page 2022-05-23 10:55:28 +02:00
Veloman Yunkan
963362e1ea One more test-point for search result pagination 2022-05-18 13:30:42 +04:00
Veloman Yunkan
1a8d874a2c Testing the request for an out-of-bounds page 2022-05-18 13:30:42 +04:00
Veloman Yunkan
8e7658bb10 Almost full coverage of search result pagination
The snippets in the test data had to be updated to account for
pagination-dependent snippet variability of pre-7.2.2 libzim.
2022-05-18 13:28:52 +04:00
Veloman Yunkan
8f2f93371b Changed a test in order to avoid a bug in Xapian
Xapian version 1.4.18 contains a bug in snippet generation caused by
incorrect handling of stemming.

The test-point with a search pattern "beatles" produced snippets with no
highlights of the search term. Debugging showed that the search pattern
"beatles" was transformed to a search term "beatl" which then didn't
match the word "beatles" in the text from which a snippet had to be
extracted.

The test case passed on my development machine as well as for most CI
configurations. However the "Packages / build-deb (ubuntu-bionic)"
variant failed because of a slightly different handling of punctuation
at the snippet boundaries:

Test context:
  url: /ROOT/search?pattern=beatles&content=zimfile
  actual snippet:   ...side "Yellow Submarine" ...........
  expected snippet: ...-side "Yellow Submarine" ...........

Above mismatch resulted in a looser comparison of the snippet contents
and failed the requirement that the snippet MUST contain highlights
(this is how the said bug in Xapian was discovered).

An attempt to change the search pattern to "field" didn't eliminate the
problem. Despite the search pattern itself being in singular form (i.e.
identical to its stemmed version) the plural form "fields" in the
snippet was still not highlighted.

Using for a search pattern an adjective instead of a noun achieved the
desired outcome.
2022-05-18 13:28:52 +04:00
Veloman Yunkan
eeca88573b Validation of snippets in search results
The "expected" snippets in the test data must be a union of all possible
snippets produced at runtime for a given (document, search terms) pair
on all platforms of interest:

- Overlapping snippets must be properly merged

- Non-overlapping snippets can be joined with a " ... " in between.
2022-05-18 13:20:27 +04:00
Veloman Yunkan
4521249452 Excluded snippets from search results validation 2022-05-18 13:05:29 +04:00
Veloman Yunkan
21e183c2e4 First test for a non-first page of search results 2022-05-18 12:45:47 +04:00
Veloman Yunkan
d56ccbd019 First search results test-point with pagination 2022-05-18 12:45:47 +04:00
Veloman Yunkan
825cf1c948 Added a test-point for a large unpaginated search 2022-05-18 12:45:47 +04:00
Veloman Yunkan
57c31a43a4 Another simple test-point for /search endpoint 2022-05-18 12:45:47 +04:00
Veloman Yunkan
84c68d4d7b Search results pagination bugfix
Search results pagination is disabled for a single page outcome too.
2022-05-18 12:45:47 +04:00
Veloman Yunkan
f2cf42427a New unit-test TaskbarlessServerTest.searchResults
This is a preliminary implementation checking only the following
cases:

- no search results
- all search results fitting on a single page

The second test-case fails because of a bug in search renderer (leading
to the pagination footer being pointlessly enabled). Will fix it in the
next commit.
2022-05-18 12:45:47 +04:00
Veloman Yunkan
612ecc975d Support for testing a server without a taskbar
Taskbar injected by a server adds distraction to unit-tests focusing
on the HTML contents of the returned pages. The new test-suite
TaskbarlessServerTest will have taskbar disabled.
2022-05-18 12:45:47 +04:00
Veloman Yunkan
ae56d399b7 Explained why search_result.html needs inline CSS
In #727 inline CSS [was extracted](e4a4b2f961)
from `static/templates/no_search_result.html` into a separate stylesheet
resource. The purpose was to later

1. get rid of the custom `static/templates/no_search_result.html` error
   template and use a general purpose error template instead (this was
   accomplished by PR #744).

2. deduplicate the CSS code between `static/templates/no_search_result.html` and
   `static/templates/search_result.html` by making the latter to also refer to
   an internal CSS resource rather than containing inline stylesheet code.

While preparing to implement the 2nd point, I figured out that
`kiwix::SearchRenderer` is used as a component in `kiwix-desktop` too,
which probably would be upset by a link to a libkiwix's internal CSS resource.

This commit documents that finding.
2022-05-18 12:45:47 +04:00
Kelson
eaa8c3c91c Merge pull request #776 from kiwix/fix_i18n_windows
Specify utf8 encoding when opening i18n resource file.
2022-05-17 22:50:20 +02:00
Matthieu Gautier
26c06d8c2a Specify utf8 encoding when opening i18n resource file.
Else, on windows, we will try to open files with "local" encoding (cp1252)
2022-05-17 18:36:35 +02:00
Matthieu Gautier
eee6803328 Merge pull request #774 from kiwix/manually_generate_i18n_resource_list 2022-05-17 14:57:51 +02:00
Matthieu Gautier
d19ae1b054 Update i18n_resources_list.txt using generate_i18n_resources_list.py 2022-05-16 14:27:48 +02:00
Matthieu Gautier
abe2fa0179 Add a script to generate the i18n resource list automatically. 2022-05-16 14:27:48 +02:00
Matthieu Gautier
6e93bad565 Do not auto discover i18n files.
Revert to the plain old 'i18n_resources_list.txt' file.

Auto discovering of i18n file has a main flaw (and a small bug):
- The main flaw is that rerun the configure will not detect new
  translation files. It means that if we use cache in our CI,
  new translation will not be included.
- The bug is that on Windows, meson fails with a error about a non existent
  `` (empty) file name. I suppose it is because python replace
  `\n` by `\r\n` on Windows, and the the `.strip().split('\n')` keeps empty
  lines.

The small bug could be fixed, but the main flaw make the whole better if
we use a script to generate the listing.

This commit is somehow a half revert of 2eff5b55a6
2022-05-16 14:27:37 +02:00
Kelson
5fb919e73e Merge pull request #772 from kiwix/roundHomepage 2022-05-15 10:02:05 +02:00
Nikhil Tanwar
2771a95d40 Floor the value returned by viewPortToCount()
Previously, the value returned by viewPortToCount() could be a decimal number, this floors its value.
Helps in clean requests and caching.
Fix #766
2022-05-15 08:02:32 +05:30
Kelson
8dbf015689 Merge pull request #770 from kiwix/magnetLink
Use real magnet link in download modal
2022-05-14 17:05:05 +02:00
Nikhil Tanwar
6cdc47eb62 Use real magnet link in download modal
Previously, on clicking Magnet, we were redirecting to a different site:
https://download.kiwix.org/zim/other/xyzBookWithDate.zim.magnet

This had the real magnet link as page content
Now we use the real magnet link in the href, thus not redirecting and starting the download right away.
Fix #767
2022-05-14 17:00:14 +02:00
Matthieu Gautier
cbd37073e8 Merge pull request #761 from kiwix/translatewiki 2022-05-11 17:04:33 +02:00
translatewiki.net
d131b732d8 Localisation updates from https://translatewiki.net. 2022-05-11 16:11:17 +02:00
Matthieu Gautier
17c1b3b82f Merge pull request #759 from kiwix/diacritics_insensitive_suggestions 2022-05-10 15:51:18 +02:00
Veloman Yunkan
744dd87fb0 Testing that /suggest is diacritics insensitive 2022-05-10 15:15:19 +02:00
Matthieu Gautier
d469e2aed8 Merge pull request #768 from kiwix/update_ci 2022-05-10 15:13:42 +02:00
Matthieu Gautier
73d2d47ca7 Run the CI on Ubuntu Bionic and Fedora 35
Xenial and f31 are eol
2022-05-10 14:58:56 +02:00
Matthieu Gautier
55149407d2 Merge pull request #763 from kiwix/i18n_resource_discovery 2022-05-09 15:11:02 +02:00
Veloman Yunkan
2eff5b55a6 Automatic discovery of i18n resources
Excluding qqq.json any .json file under static/i18n is now considered to
be a i18n resource. This eliminates the need to update the
i18n_resources_list.txt file every time a new language json file is
added. Thus Translatewiki PRs will not require extra work.
2022-05-09 15:12:16 +04:00
Kelson
26eccb5a5f Merge pull request #712 from kiwix/static_resource_versioning
Static resource versioning
2022-05-02 23:49:55 +02:00
Veloman Yunkan
1b81ccc5e5 Using a regular expression with named groups 2022-05-02 20:48:05 +04:00
Veloman Yunkan
091786c7d8 A slight simplification of resource preprocessing
Now the whole content of a resource is preprocessed with a single
invocation of `re.sub()` rather than line-by-line.

Also, the function `get_preprocessed_resource()` returns a single value
rather than a (preprocessed_content, modification_count) pair; the
situation when the preprocessed resource is identical to the source
version is signalled by a return value of None.
2022-05-02 20:38:08 +04:00
Veloman Yunkan
c0b9e2a466 Cache-id of resources with account for dependency
The cache-id of resources now includes dependency information. This commit
illustrates that property with the changed cache-id of skin/index.js which
depends on skin/{download,hash,magnet,bittorent}.png.

The implementation is not fool-proof - cyclic dependency between
resources is not detected and will lead to infinite recursion.
2022-05-02 20:37:22 +04:00
Veloman Yunkan
03ab2f67dd Using global variables for base & output directories 2022-05-02 20:37:22 +04:00
Veloman Yunkan
157f01e951 Preparing to handle inter-resource dependency
The current implementation of resource preprocessing contains a bug
(with respect to the problem that it tries to solve): it doesn't take
into account the dependence of static resources on each other. If
resource A refers to B and B refers to C, then a change in C would
result in its cache id being updated in the preprocessed version of B.
However the cache id of B won't change since the cache id is derived
from the source rather than from the preprocessed output.

This commit is the first step towards addressing the described issue.

Now cache-id of a resource is computed on demand rather than precomputed
for all resources. The only thing remaining is to compute the cache-id
from the preprocessed content.
2022-05-02 20:37:22 +04:00
Veloman Yunkan
42fd6e8926 Made kiwix-resources work with python 3.5-
Formatted string literals appeared in Python 3.6. Some CI platforms
still use older versions of Python.
2022-05-02 20:37:22 +04:00
Veloman Yunkan
707df3d10b Removing the old preprocessed resource, if any
If during an earlier build a resource was symlinked in the build
directory (because it wasn't modified by preprocessing) and later
changes are made to the resource that result in its preprocessing no
longer being a no-op, then the preprocessing is performed (in place) on
the original resource directly (via the symlink). Therefore any symlinks
must be removed before preprocessing a resource.
2022-05-02 20:37:22 +04:00
Veloman Yunkan
c016dfd2ce Resource preprocessing handles relative links
... but only if they contain "/skin/" as a substring.
2022-05-02 20:37:22 +04:00
Veloman Yunkan
150851b33d kiwix-resources preprocesses all resources
kiwix-resources preprocesses all resources rather than only templates. At
this point this doesn't change anything since only (some) template resources
contain KIWIXCACHEID placeholders. But this enhancement opens the door
to the preprocessing of static/skin/index.js (after preprocessing is
able to handle relative links, which comes in the next commit).
2022-05-02 20:37:22 +04:00
Veloman Yunkan
3b9f28b2b5 Applied cache-id to search_results.css
The story of search_results.css

static/skin/search_results.css was extracted from
static/templates/no_search_result.html before the latter was dropped.

static/templates/no_search_result.html in turn seems to be a copied and
edited version of static/templates/search_result.html.

In the context of exploratory work on the internationalization of
kiwix-serve (PR #679) I noticed duplication of inline CSS across those
two templates and intended to eliminated it. That goal was not fully
accomplished (static/templates/search_result.html remained untouched)
because by that time PR #679 grew too big and the efforts were diverted
into splitting it into smaller ones. Thus search_results.css slipped
into one of those small PRs, without making much sense because nothing
really justifies preserving custom CSS in the "Fulltext search unavailable"
error page.

At the same time, it served as the only case where a link to a cacheable
resource is generated in C++ code (rather than found in a template).
This poses certain problems to the handling of cache-ids. A workaround
is to expel the URL into a template so that it is processed by
`kiwix-resources`. This commit merely demonstrates that solution. But
whether it should be preserved (or rather the "Fulltext search
unavailable" page should be deprived of CSS) is questionable.
2022-05-02 20:37:22 +04:00
Veloman Yunkan
fc85215ea0 Preprocessing of template resources
In template resources (found under static/templates), strings of the
form "PATH/TO/STATIC/RESOURCE?KIWIXCACHEID" are expanded into
"PATH/TO/STATIC/RESOURCE?cacheid=CACHEIDVAL" where CACHEIDVAL is a
8-digit hexadecimal hash digest of the file at
static/PATH/TO/STATIC/RESOURCE.
2022-05-02 20:37:22 +04:00
Veloman Yunkan
acdc1dfb27 New unit-test ServerTest.CacheIdsOfStaticResources
Introduced a new unit-test which will ensure that static resources of
kiwix-serve have the cache ids applied to them in the links embedded into
the HTML code.

At this point there are no cache ids. The new unit-test will help to
visualize how they come into existence.
2022-05-02 20:37:22 +04:00
Matthieu Gautier
f90cc39a52 Merge pull request #757 from kiwix/gzip_compression 2022-04-28 14:36:51 +02:00
Matthieu Gautier
fba0f09f4f Do not compress content smaller than 1400 Bytes 2022-04-27 18:23:39 +02:00
Matthieu Gautier
0d294c50a5 [SERVER] Support gzip encoding instead of deflate.
The `compress` function is copied from httplib
2022-04-27 18:23:38 +02:00
Kelson
dc42f831c0 Merge pull request #756 from kiwix/doc-badge
Add documentation badge in README
2022-04-23 11:20:42 +02:00
Emmanuel Engelhart
1757f7f168 Add documentation badge in README 2022-04-23 10:38:15 +02:00
Matthieu Gautier
c43c637bea Merge pull request #679 from kiwix/kiwix-serve-i18n 2022-04-14 15:21:47 +02:00
Veloman Yunkan
927c12574a Preliminary support for Accept-Language: header
In the absence of the "userlang" query parameter in the URL, the value
of the "Accept-Language" header is used. However, it is assumed that
"Accept-Language" specifies a single language (rather than a comma
separated list of languages possibly weighted with quality values).

Example:

Accept-Language: fr
// should work

Accept-Language: fr-CH, fr;q=0.9, en;q=0.8, de;q=0.7, *;q=0.5
// The requested language will be considered to be
// "fr-CH, fr;q=0.9, en;q=0.8, de;q=0.7, *;q=0.5".
// The i18n code will fail to find resources for such a language
// and will use the default "en" instead.
2022-04-13 16:40:20 +02:00
Veloman Yunkan
9987fbd488 Fixed CI build failure under android_arm* 2022-04-13 16:40:20 +02:00
Veloman Yunkan
a0d9a824e1 Internationalized searchbox tooltip 2022-04-13 16:40:20 +02:00
Veloman Yunkan
5052d4018c hy translation of the suggest-search message 2022-04-13 16:40:20 +02:00
Veloman Yunkan
11be821c46 Internationalized "Go to a randomly selected page"
At this point a potential issue has been revealed. Now we produce
the final HTML via 2-level template expansion

1. Render parameterized messages
2. Render the HTML template

In which templates we should use double mustache "{{}}" (HTML-escaping)
tags and where we may use triple mustache "{{{}}}" (non-escaping) tags?
2022-04-13 16:40:20 +02:00
Veloman Yunkan
527a606281 Testing the translation of "Go to random page"
The new test fails since the "Go to random page" button is not yet
internationalized.
2022-04-13 16:40:20 +02:00
Veloman Yunkan
3da81a3d0f Internationalized "Go to the main page" button 2022-04-13 16:40:20 +02:00
Veloman Yunkan
ed7717c1e7 Testing the translation of "Go to the main page"
The new test fails since the "Go to the main page" button is not yet
internationalized.
2022-04-13 16:40:20 +02:00
Veloman Yunkan
f73be3cde7 Initializing mustache data via initializer list 2022-04-13 16:40:20 +02:00
Veloman Yunkan
c2bfeb4030 "Go to welcome page" is internationalized 2022-04-13 16:40:20 +02:00
Veloman Yunkan
901664b097 "Go to welcome page" in taskbar isn't translated
The (failing) tests now demonstrate that some text in the taskbar is not
translated. Will fix in the next commit.
2022-04-13 16:40:20 +02:00
Veloman Yunkan
6f3db20078 Internationalized "Fulltext search unavailable" page 2022-04-13 16:40:20 +02:00
Veloman Yunkan
fbd23a8329 Fully internationalized 400, 404 & 500 error pages 2022-04-13 16:40:20 +02:00
Veloman Yunkan
d2c864b010 Internationalized raw-entry-not-found message 2022-04-13 16:40:20 +02:00
Veloman Yunkan
779382642b Internationalized bad raw access datatype message 2022-04-13 16:40:20 +02:00
Veloman Yunkan
ca7e0fb4a0 Internationalized random article failure message 2022-04-13 16:40:20 +02:00
Veloman Yunkan
52d4f73e89 RIP searchSuggestionHTML() & English-only message 2022-04-13 16:40:20 +02:00
Veloman Yunkan
1ace16229d Internationalized search suggestion message 2022-04-13 16:40:20 +02:00
Veloman Yunkan
cb5ae01fd8 Localized "No such book" 404 message for /random
However the title and the heading of the 404 page are not localized yet.
2022-04-13 16:40:20 +02:00
Veloman Yunkan
b2526c7a98 Translation of the url-not-found message 2022-04-13 16:40:20 +02:00
Veloman Yunkan
387f977d6c Enter ParameterizedMessage 2022-04-13 16:40:20 +02:00
Veloman Yunkan
202ec81d8b URL-not-found message went into i18n JSON resource
Yet, the URL-not-found message is not yet fully internationalized
since its usage is hardcoded to English.
2022-04-13 16:40:20 +02:00
Veloman Yunkan
577b6e29f9 kiwix::i18n::expandParameterizedString() 2022-04-13 16:40:20 +02:00
Veloman Yunkan
e4a0a029ff User language control via userlang query param
This is a draft commit enabling the testing of the support for
kiwix-serve internationalization.
2022-04-13 16:40:20 +02:00
Veloman Yunkan
507e111f34 i18n data is kept in and generated from JSON files
Introduced a new resource compiler script kiwix-compile-i18n that
processes i18n string data stored in JSON files and generates sorted C++
tables of string keys and values for all languages.
2022-04-13 16:40:20 +02:00
Veloman Yunkan
d029c2b8d5 Enter I18nStringDB 2022-04-13 16:40:20 +02:00
Veloman Yunkan
c574735f51 makeFulltextSearchSuggestion() works via mustache 2022-04-13 16:40:20 +02:00
Veloman Yunkan
a18dd82d82 Introduced makeFulltextSearchSuggestion() helper 2022-04-13 16:40:20 +02:00
Matthieu Gautier
e22e073d43 Merge pull request #747 from kiwix/version_10.1.1 2022-04-12 11:32:06 +02:00
Matthieu Gautier
6dcf4ee034 New version 10.1.1 2022-04-11 17:13:58 +02:00
Kelson
61ccbc65fb Merge pull request #743 from kiwix/fix_article_count
Correctly detect the number of article for zim version <= 6
2022-04-06 17:28:51 +02:00
Matthieu Gautier
85a9d35488 Correctly detect the number of article for zim version <= 6 2022-04-06 17:21:14 +02:00
Matthieu Gautier
a17258fcc9 Merge pull request #744 from kiwix/fullsearch_text_unavailable_error 2022-04-06 15:14:18 +02:00
Veloman Yunkan
ae1bf39023 Got rid of static/templates/no_search_result.html
The "Fulltext search unavailable" error page is now generated using the
static/templates/error.html template. Also added two test cases checking
that error page.
2022-04-06 14:42:29 +02:00
Veloman Yunkan
dbcbdff275 Added an optional CSS link to error.html 2022-04-05 20:49:09 +04:00
Matthieu Gautier
c1823b8ee4 Merge pull request #738 from kiwix/HTTPErrorHtmlResponse 2022-04-04 18:47:12 +02:00
Veloman Yunkan
3f41ce8337 Unit test for HTTP 500 Internal Server Error
Currently an internal server error can be triggered by accessing an
entry belonging to a redirect loop. The ZIM file (test/data/poor.zim)
containing such a loop was copied from openzim/zim-tools repository
(test/data/zimfiles/poor.zim).
2022-04-04 18:35:20 +02:00
Veloman Yunkan
2a20e87341 Got rid of Response::build_500()
This change is not tested (mostly due to the difficulties of triggering
an internal server error).
2022-04-04 18:35:20 +02:00
Veloman Yunkan
2028bf3a98 Fixed the CI build failure under android_arm* 2022-04-04 18:35:20 +02:00
Veloman Yunkan
545d409150 Reused HTTPErrorHtmlResponse in HTTP400HtmlResponse 2022-04-04 18:35:20 +02:00
Veloman Yunkan
89dc9afc28 Renamed 404.html to error.html
404.html no longer contains anything specific to the 404 error and will
henceforth serve (with some enhancements) as a general purpose error
page template.
2022-04-04 18:35:20 +02:00
Veloman Yunkan
647118dd5e Enter HTTPErrorHtmlResponse
In addition to serving as a base class for `HTTP404HtmlResponse`,
`HTTPErrorHtmlResponse` is going to be used for a couple of other error
pages.
2022-04-04 18:35:20 +02:00
Veloman Yunkan
d8a60db739 Preparing for a single error page template 2022-04-04 18:35:20 +02:00
Veloman Yunkan
f4059f3faf Got rid of withTaskbarInfo() 2022-04-04 18:35:20 +02:00
Veloman Yunkan
800cc5b68a Got rid of Response::build_404() 2022-04-04 18:35:19 +02:00
Kelson
b1f03385e4 Merge pull request #739 from kiwix/fix_windows_extern 2022-04-03 13:36:21 +02:00
Matthieu Gautier
feb30d08aa Correctly define the variable urlNotFoundMsg and invalidUrlMsg.
As we must declare the two variables as `extern` in response.h,
we must define it somewhere (and `response.cpp` is a good place).
2022-04-01 11:58:57 +02:00
Matthieu Gautier
95d4dd63ac Merge pull request #724 from kiwix/search_improvement 2022-03-29 14:42:24 +02:00
Matthieu Gautier
311f783ea9 Always use the search pattern when searching in the server.
There is no reason to not use the pattern if there is a geo_query.
If both the pattern and the qeo_query are provided, we must use both.
2022-03-29 14:06:19 +02:00
Matthieu Gautier
f2a1c0f106 Add braces around for loop's body. 2022-03-29 14:05:45 +02:00
Matthieu Gautier
2cc4befb12 Correctly display searchpattern in search result page.
The `searchPattern` is already "diples encoded".
So we can simply using it without protecting us from script injection.

Fix #723
2022-03-29 14:05:45 +02:00
Matthieu Gautier
3641dbf14d Handle book without xapian index. 2022-03-29 14:05:45 +02:00
Matthieu Gautier
1962262f94 Correctly handle invalid book.
If user request for a non existent book, we must return a 400 page.
(This is done by throwing a `std::invalid_argument` and let the catch
handle it)
2022-03-29 14:05:45 +02:00
Matthieu Gautier
7407f30790 Better cache usage.
It is better to directly try to get the `Search` from the cache instead
of getting the `Searcher` first which could be useless in Search already
exist.
2022-03-29 14:05:45 +02:00
Matthieu Gautier
d740ffe465 Introduce SearchInfo.
SearchInfo is a small helper structure to store information about the
queried search. It regroup already existing information (`patternString`,
geo query, ...) in one structure.
It is also used as key in the cache instead of using a generated string.
2022-03-29 14:05:39 +02:00
Matthieu Gautier
e7293346be Return http 400 error response when needed. 2022-03-28 17:37:41 +02:00
Matthieu Gautier
b1643e422e Introduce HTTP400HtmlResponse.
HTTP400HtmlResponse is build on the same design than HTTP404HtmlResponse.
2022-03-28 17:35:15 +02:00
Kelson
574c1ad690 Merge pull request #736 from kiwix/pin_jinja2_doc
Remove pinning of Sphinx<4
2022-03-28 15:50:17 +02:00
Matthieu Gautier
59364a737a [WIP] Remove pinning of Sphinx<4
It seems we add this pinning to fix a dependencies issue.
Let's remove it.
2022-03-28 15:37:05 +02:00
Kelson
49f24d18df Merge pull request #732 from kiwix/HTTP404HtmlResponse
New way of building 404 error HTML responses
2022-03-28 15:27:46 +02:00
Veloman Yunkan
ec2e10b40e Moved taskbarInfo into ContentResponseBlueprint 2022-03-28 14:56:40 +02:00
Veloman Yunkan
2da8ea1650 Moved function definition to cpp 2022-03-28 14:56:40 +02:00
Veloman Yunkan
0eb8f09f79 One more victory of HTTP404HtmlResponse
One more instance of `Response::build_404()` & `withTaskbarInfo()`
was taken over by `HTTP404HtmlResponse`.
2022-03-28 14:56:40 +02:00
Veloman Yunkan
0ecbdbcf63 Enter TaskbarInfo
After this change it's time to say thank you and good-bye to
`withTaskbarInfo()`. But it will take a while.
2022-03-28 14:56:40 +02:00
Veloman Yunkan
9bc09a815c noSuchBookErrorMsg() 2022-03-28 14:56:40 +02:00
Veloman Yunkan
48d377ca44 HTTP404HtmlResponse::operator+(const std::string&) 2022-03-28 14:56:40 +02:00
Veloman Yunkan
d5ae92e4e2 More uses of HTTP404HtmlResponse 2022-03-28 14:56:40 +02:00
Veloman Yunkan
1a5e2eda0f HTTP404HtmlResponse::operator+(UrlNotFoundMsg) 2022-03-28 14:56:40 +02:00
Veloman Yunkan
89785a259a Enter HTTP404HtmlResponse 2022-03-28 14:56:40 +02:00
Veloman Yunkan
668063205c Enter UrlNotFoundMsg iomanipulator-like class 2022-03-28 14:56:40 +02:00
Veloman Yunkan
df98c58d07 Enter ContentResponseBlueprint 2022-03-28 14:56:40 +02:00
Veloman Yunkan
ff8da65c68 Separated make404ResponseData() 2022-03-28 14:56:40 +02:00
Veloman Yunkan
ae60ba806b Made 404.html error template a little more generic
The fact that an info message was moved into C++ code is temporary
since it will be moved to a message resource file soon.
2022-03-28 14:56:40 +02:00
Veloman Yunkan
8cfcf2ea86 A new overload of Response::build_404() 2022-03-28 14:56:40 +02:00
Veloman Yunkan
26c16bb1b2 Renamed a variable 2022-03-28 14:56:40 +02:00
Veloman Yunkan
ca965d448f Got rid of 2 parameters in Response::build_404()
Instead of passing the `bookName` and `bookTitle` parameters to
`Response::build_404()`, `withTaskbarInfo()` is applied to its result
when needed. Note, that in `InternalServer::handle_raw()`
`withTaskbarInfo()` was not utilized since the results of the `/raw`
endpoint are not supposed to be decorated with a taskbar.
2022-03-28 14:56:40 +02:00
Veloman Yunkan
6d16d7386d Changed the signature of ContentResponse::set_taskbar() 2022-03-28 14:56:40 +02:00
Veloman Yunkan
40e9a19c48 Introduced withTaskbarInfo() helper function
This was done in preparation for removing the `bookName` and `bookTitle`
parameters from `Response::build_404()`, but since the new function
could already be put to some use in this commit that was done too.
2022-03-28 14:56:40 +02:00
Veloman Yunkan
d487c78ea4 Changed the return type of Response::build_404() 2022-03-28 14:56:40 +02:00
Veloman Yunkan
96cbd2bf26 kiwix::onlyAsNonEmptyMustacheValue() 2022-03-28 14:56:40 +02:00
Matthieu Gautier
941c3b5df3 Merge pull request #734 from kiwix/br_10.1.0 2022-03-24 18:55:38 +01:00
Matthieu Gautier
b9e40def88 New version 10.1.0 2022-03-24 18:26:35 +01:00
Kelson
116ecd1c78 Merge pull request #733 from kiwix/kelson42-patch-1
Add release badge
2022-03-24 17:45:54 +01:00
Kelson
8f2faf37dc Add release badge 2022-03-24 17:45:03 +01:00
Matthieu Gautier
ddc4c3ec2c Merge pull request #727 from kiwix/testing_of_ft_search_unavailable_page 2022-03-23 15:06:47 +01:00
Veloman Yunkan
511261cc81 Testing of "Fulltext search unavailable" page 2022-03-18 15:57:11 +04:00
Veloman Yunkan
aaf232bee4 Support for CSS URL in HTML response tests 2022-03-18 15:56:19 +04:00
Veloman Yunkan
a3460f6f48 Supporting varying page title in HTML response tests 2022-03-18 15:50:25 +04:00
Veloman Yunkan
e4a4b2f961 Extracted CSS out of no_search_results.html 2022-03-18 15:46:54 +04:00
Veloman Yunkan
389d29c92e Searching in a non-existent book is a 404 case 2022-03-18 15:46:41 +04:00
Veloman Yunkan
c64fce52e7 Made 404 HTML template consistent with the rest 2022-03-18 15:46:01 +04:00
Kelson
a5baafd09f Merge pull request #725 from kiwix/safer_testing_of_html_responses
Safer testing of HTML responses
2022-03-18 07:03:02 +01:00
Veloman Yunkan
ed46541b6f Clean-up promised in the previous commit 2022-03-11 22:53:46 +04:00
Veloman Yunkan
e93ccd18d4 Robust test data in ServerTest.404WithBodyTesting
Before this change the meaning of test data in the ServerTest.404WithBodyTesting
unit test entirely depended on the number of entries:

- 2 entries: url, expected body
- 4 entries: url, book name, book title and expected body

This was fragile and non scalable (if other combinations of expected
response data are needed).

This commit defines a mini-DSL taking advantage of operator overloading
that allows to define test data in a robust way with the help of the compiler.

Some code in `TestContentIn404HtmlResponse` is obsoleted by this change
however it is not removed in this commit so that the change is easier to
understand. That will be done next.
2022-03-11 22:53:38 +04:00
Kelson
f893777dc0 Merge pull request #721 from kiwix/xssVul
Use encoded URLs for searchSuggestionHtml
2022-03-09 14:33:11 +01:00
Nikhil Tanwar
04d682486a Add some tests to emulate XSS attack 2022-03-09 06:31:24 +01:00
Nikhil Tanwar
8136138492 use encoded URLs for searchSuggestionHtml
Previously, the seachURL was not encoded.
This resulted in an XSS vulnerability, a concept of proof is:

start kiwix-serve
visit - http://192.168.18.1:8081/"><svg onload="alert(1)">
This would display an alert message.

This encodes the searchURL before passing it to searchSuggestionHtml
2022-03-09 06:31:24 +01:00
Matthieu Gautier
e48b550b68 Merge pull request #620 from kiwix/search_caching 2022-03-08 18:12:33 +01:00
Maneesh P M
6523d9f563 Retrieve SuggestionSearcher from LRU Cache
We create a cache for SuggestionSearcher very similar to that of FT
searcher. User can specify a custom cache size using the environment
variable SUGGESTION_SEARCHER_CACHE_SIZE. It has a default value of 10%
of the number of books in the library.
2022-03-08 17:35:39 +01:00
Maneesh P M
7cb4c1361f Retrieve Searcher and Search from LRU Cache
We use the new cache template to implement two kind of cache.
1: The Searcher cache is more general in terms of its usage. A Searcher
   can be used for multiple searches without much change to itself. We
   try to retrieve the searcher and perform searches using it whenever
   possible, and if not we put a searcher into the cache. User can
   specify a custom cache length by manipulating the environment
   variable SEARCHER_CACHE_SIZE. It's default value is 10% of all the
   books available.
2: The search cache is much more restricted in terms of usage. It's main
   purpose is to avoid re-searching on the searcher during page changes
   to generate SearchResultSet of various ranges. User can specify a
   custom cache length using the environment variable SEARCH_CACHE_SIZE
   with a default value of 2;
2022-03-08 17:35:39 +01:00
Maneesh P M
a51f8d66a7 Introduce a LRU Cache and concurrent cache
The cache is copied from libzim project : https://github.com/openzim/libzim
The exact file as been copied from commit 27f5e70
2022-03-08 17:34:27 +01:00
Kelson
833bbc89ba Merge pull request #701 from kiwix/ladakhISO
Add mappings for languages not given by libicu
2022-03-05 17:07:48 +01:00
Emmanuel Engelhart
4bd02f07eb Beautify slightly the code 2022-03-05 16:59:15 +01:00
Nikhil Tanwar
9488842416 Add dagbani language in language map
Adds dagbani (dag) language in iso639_3 language map
2022-03-05 16:51:59 +01:00
Nikhil Tanwar
34b50ba30e Add mappings for languages not given by libicu
Adds a std::map<std::string, std::string> with display names for language codes not given by libicu
Fault language codes are taken from library.kiwix.org
2022-03-05 16:51:59 +01:00
Kelson
cfab560d74 Merge pull request #718 from kiwix/fix_booktitle_renderer
Fix search rendered for book title
2022-03-04 21:30:37 +01:00
Matthieu Gautier
422f4c7dd7 Reuse constructor when creating the SearchRenderer with basic constructor. 2022-03-04 17:08:59 +01:00
Matthieu Gautier
cc3545ac3b fixup! Readd a SearchRenderer constructor without Library argument. 2022-03-04 17:07:41 +01:00
Matthieu Gautier
609bc24cbe Small cleanups.
- Remove unused `archive`
- Replace tab by spaces
2022-02-25 15:46:13 +01:00
Matthieu Gautier
d9124ed40b Set the book title only if we have a library. 2022-02-25 15:46:13 +01:00
Matthieu Gautier
921671eb4d Do not use ostringstream to convert the uuid into string.
`zim::Uuid` already have a string convertion operator. Let's use it.
2022-02-25 15:46:13 +01:00
Matthieu Gautier
ec18eb40ea Readd a SearchRenderer constructor without Library argument.
Adding the library argument breaks the API. It is better to add
another constructor to not have to create another major version.
2022-02-25 15:46:13 +01:00
Matthieu Gautier
a11abcf480 Merge pull request #715 from kiwix/opds_dc_issued 2022-02-23 14:39:14 +01:00
Veloman Yunkan
ae2d7d20dc Handling of <dc:issued> in OPDS import 2022-02-23 14:20:49 +01:00
Veloman Yunkan
42ee14c8f5 New unit-test LibraryOpdsImportTest.allInOne 2022-02-21 23:20:35 +04:00
Veloman Yunkan
afb556bf64 Added <dc:issued> field to OPDS entries 2022-02-19 11:35:44 +04:00
Kelson
5c38300504 Merge pull request #713 from kiwix/use-local-include
Fix library.h include
2022-02-17 13:08:56 +01:00
Emmanuel Engelhart
cb2226c11f Fix library.h include 2022-02-17 10:56:46 +01:00
Matthieu Gautier
4cce4dce0b Merge pull request #710 from kiwix/unittests_for_404_html_page 2022-02-16 14:38:23 +01:00
Veloman Yunkan
34d069e61f Two more 404 error tests for the /raw endpoint 2022-02-16 14:25:43 +01:00
Veloman Yunkan
7a6562395a Testing of /ROOT/zimfile/invalid-article 2022-02-16 14:23:11 +01:00
Veloman Yunkan
92f9ee9280 Preparing to test archive dependent 404 responses 2022-02-16 14:23:11 +01:00
Veloman Yunkan
ae2d9b234f More test points in ServerTest.404WithBodyTesting 2022-02-16 14:23:11 +01:00
Veloman Yunkan
0ba452aece New unit-test ServerTest.404WithBodyTesting
The `ServerTest.RandomOnNonExistentBook` unit test was replaced with a
more general one testing multiple 404 scenarios where the content of the
body is checked too.
2022-02-16 14:23:11 +01:00
Veloman Yunkan
5f4256b900 Enter helper function makeExpected404Response() 2022-02-16 14:23:11 +01:00
Veloman Yunkan
a34dc725f9 ServerTest.RandomOnNonExistentBook
This test was introduced with the purpose of testing the error message
in the 404 page returned by /random for a non-existent book. The actual
expected output currently present in this new unit-test is too much for
that purpose and may become a maintenance burden if more tests of that
kind are added.
2022-02-16 14:23:11 +01:00
Kelson
892db07a2d Merge pull request #705 from thavelick/book_title_in_search_results
Add book titles to search results
2022-02-16 12:59:38 +01:00
Tristan Havelick
58be502f3f add book titles to search results 2022-02-16 12:50:18 +01:00
Matthieu Gautier
62ba2f4861 Merge pull request #709 from kiwix/unittest_for_suggestions 2022-02-16 11:30:26 +01:00
Veloman Yunkan
c782cc718a Workaround for Packages/build-deb CI failures
Packages/build-deb CI flows failed on ubuntu-bionic and ubuntu-focal
with the following mismatch in the ServerTest.suggestions unit-test:

```
[ RUN      ] ServerTest.suggestions
../test/server.cpp:715: Failure
Expected equality of these values:
  r->body
    Which is: ...
  removeEOLWhitespaceMarkers(expectedResponse)
    Which is: ...
With diff:
@@ -2,5 +2,5 @@
   {
     \"value\" : \"Ray (movie)\",
-    \"label\" : \"Ray (&lt;b&gt;movie&lt;/b&gt;...\",
+    \"label\" : \"Ray (&lt;b&gt;movie&lt;/b&gt;)\",
     \"kind\" : \"path\"
       , \"path\" : \"A/Ray_(movie)\"

Test context:
	url: /ROOT/suggest?content=zimfile&term=movie
```

For some reason (probably, a bug), the implementation of
`Xapian::MSet::snippet()` on those platforms decided that a single closing
parenthesis is more than is appropriate for inclusion in the snippet and
replaced it with a (longer) ellipsis.

Taking advantage of the necessity to work around that bug, the
ServerTest.suggestions's functional coverage was enhanced - the
problematic test point was replaced with a new one using a phrase
instead of a single term.
2022-02-14 18:17:48 +04:00
Veloman Yunkan
9a6aef4dba Moved/renamed LibraryServerTest.suggestions_in_range 2022-02-11 16:06:52 +04:00
Veloman Yunkan
943cbbf6ce New unit test ServerTest.suggestions 2022-02-11 16:06:52 +04:00
Kelson
ec94d9bfd9 Merge pull request #703 from kiwix/fix-mipsel-cross-compilation
Use "host_machine", not "target_machine" for cross-compilation
2022-02-09 07:09:56 +01:00
Emmanuel Engelhart
f2088d7fe0 Use "host_machine", not "target_machine" for cross-compilation
Read https://mesonbuild.com/Cross-compilation.html for all the rationals.
2022-02-09 07:02:14 +01:00
Kelson
19dd068e5a Merge pull request #706 from kiwix/langFix
Only add value in language object if value entry node is 'language'
2022-02-08 21:31:58 +01:00
Nikhil Tanwar
d56e56293b Only add value in language object if value entry node is 'language' 2022-02-08 19:03:35 +05:30
Kelson
dc4f9a4939 Merge pull request #700 from kiwix/ipLimit
Add method to change MHD_OPTION_PER_IP_CONNECTION_LIMIT
2022-02-06 15:12:32 +01:00
Nikhil Tanwar
261adf0ef9 Add method to change MHD_OPTION_PER_IP_CONNECTION_LIMIT
Adds new method setIpConnectionLimit() to server.
Default is 0 (infinite)
2022-02-05 18:31:42 +05:30
Matthieu Gautier
ce24b1fa5f Merge pull request #699 from kiwix/br_10.0.1 2022-02-02 16:06:18 +01:00
Matthieu Gautier
9193719c8f New version 10.0.1 2022-02-02 15:55:40 +01:00
Kelson
d0d253beed Merge pull request #698 from kiwix/legoktm-patch-1
PPA: Remove Ubuntu Hirsute, EOL
2022-02-01 08:27:50 +01:00
Kunal Mehta
cf95d513d6 PPA: Remove Ubuntu Hirsute, EOL 2022-01-31 23:13:00 -08:00
Kelson
e72c0b75f6 Merge pull request #697 from kiwix/titleWithLanguage
Add title with full language name
2022-01-29 10:01:20 +01:00
Nikhil Tanwar
4d996584fa Add title with full language name
This adds title and aria-label attributes with value as the language of book
Provides tooltip on language badges
2022-01-28 22:53:38 +05:30
Kelson
dd3338c2d0 Merge pull request #694 from kiwix/kelson42-patch-1
Fix macOS CI
2022-01-28 10:45:42 +01:00
Kelson
b19eb1ea61 Stop publishing code coverage for macOS
No reason to upload code coverage twice
2022-01-27 21:23:16 +01:00
Kelson
6d14639f77 Use Python 3.10 to fix macOS CI 2022-01-27 21:22:19 +01:00
Kelson
89e3a57a05 Merge pull request #693 from kiwix/non-minified-isotope
Add non-minified version of isotope.pkgd.js
2022-01-27 20:02:28 +01:00
Kunal Mehta
b94e4b7e3b Add non-minified version of isotope.pkgd.js
Debian wants to have the source files for minified scripts. Otherwise
same rationale as #368 which was for jquery-ui.

I downloaded this from <https://unpkg.com/isotope-layout@3.0.6/dist/isotope.pkgd.js>.
2022-01-27 00:11:04 -08:00
Kelson
68465079f0 Merge pull request #691 from kiwix/downloadLinkFix
Add span in querySelector
2022-01-24 15:43:10 +01:00
Nikhil Tanwar
f6309bb4c8 Add span in querySelector
Earlier querySelector for download button was returning a div, on which we called the getAttribute function hence returning null
This now returns a <span> element which returns the correct link with getAttribute
2022-01-24 19:23:26 +05:30
Kelson
45e9b76b19 Merge pull request #689 from kiwix/slightly-better-fix-685
Fix title='_' case too #685
2022-01-24 09:05:35 +01:00
Emmanuel Engelhart
5a9dbf85ec Fix title='_' case too #685 2022-01-24 08:35:36 +01:00
Kelson
cd412867d9 Merge pull request #688 from kiwix/legoktm-patch-1 2022-01-24 06:42:52 +01:00
Kunal Mehta
01edd830bc PPA: Fix libzim-dev dependency
Our libzim packages are "7.2.0~focal" but the ~ means that "7.2.0" is greater than
"7.2.0~focal" so the dependency can't be satisfied. Depending on "7.2.0~" will
allow "7.2.0~focal" to satisfy the dependency.
2022-01-23 20:30:23 -08:00
Kelson
ceb46f1069 Merge pull request #687 from kiwix/langCatBoxFill
Add undefined check for humanFriendlyTitle
2022-01-23 20:39:48 +01:00
Nikhil Tanwar
270773d6ba Add undefined check for humanFriendlyTitle
humanFriendlyTitle() now returns an empty string if title is undefined
which is handled in loadAndDisplayOptions()
2022-01-23 20:33:46 +01:00
Kelson
234606b170 Merge pull request #686 from kiwix/catalog_search_with_zero_count 2022-01-22 08:47:06 +01:00
Veloman Yunkan
b8328a78f6 /catalog/search?count=0 returns all entries 2022-01-21 19:31:46 +04:00
Veloman Yunkan
08c3a9d8b2 Testing of /catalog/search?count=0 2022-01-21 19:28:16 +04:00
Matthieu Gautier
744065276b Merge pull request #682 from kiwix/fix_test_compilation_windows 2022-01-19 16:22:16 +01:00
Matthieu Gautier
e3f8046915 Do not include posix header on Windows
They are included by 3dbcbe5 which need them to test permission rights on
posix only.
2022-01-19 16:15:59 +01:00
Matthieu Gautier
3198a849e2 Merge pull request #681 from kiwix/version_10.0.0 2022-01-19 16:08:08 +01:00
Matthieu Gautier
fcae21c55b Update Changelog 2022-01-19 16:02:39 +01:00
Matthieu Gautier
7da81acaa3 Update libzim dependency version to 7.2.0 2022-01-19 16:02:39 +01:00
Matthieu Gautier
a7d19b170a Update .gitignore 2022-01-19 12:47:40 +01:00
Matthieu Gautier
000029166b Fix the copyright in the doc configuration 2022-01-19 12:47:00 +01:00
Kelson
2ccea8e370 Merge pull request #680 from kiwix/fix_windows_compilation
Add a new private constructor not deprecated for Reader.
2022-01-18 16:31:32 +01:00
Matthieu Gautier
84587e7f03 Add a new private constructor not deprecated for Reader.
As we still create a `Reader` in the deprecated code of `Library`,
we need a way to create a reader without raising a deprecated warning.

So we create a another constructor with a dummy argument and we use it.
2022-01-18 12:22:11 +01:00
Kelson
38fc187303 Merge pull request #678 from kiwix/depends_libzim_7.1.0
Needs libzim 7.1.0 getMetadataItem()
2022-01-16 09:59:54 +01:00
Emmanuel Engelhart
2dc8fc9ab3 Needs libzim 7.1.0 getMetadataItem() 2022-01-15 15:57:04 +01:00
Kelson
60b4be2286 Merge pull request #676 from kiwix/customURLfix
Remove {root} from book item's link in library
2022-01-14 16:31:16 +01:00
Nikhil Tanwar
a8a68c54a2 Remove {root} from book item's link in library
The OPDS stream now provides full URL, including the custom root
Appending {root} duplicates the custom root, when given resulting in a broken link
2022-01-14 16:23:52 +01:00
Kelson
367e5d2636 Merge pull request #675 from kiwix/fix_android
Revert removing of deprecated methods used by android wrapper.
2022-01-14 16:19:14 +01:00
Matthieu Gautier
fcd865bb81 Revert removing of deprecated methods used by android wrapper. 2022-01-14 12:28:50 +01:00
Matthieu Gautier
715151d725 Merge pull request #674 from kiwix/deprecated 2022-01-13 14:32:28 +01:00
Matthieu Gautier
e5eeb08206 Remove old deprecated methods. 2022-01-13 14:23:29 +01:00
Matthieu Gautier
ac6f91798f Assume SuggestionItem has a public constructor.
`SuggestionItem` is somehow a simple container.
2022-01-13 14:23:29 +01:00
Matthieu Gautier
e5d26a4699 Deprecate SearchRenderer creation from a Searcher. 2022-01-13 14:23:29 +01:00
Matthieu Gautier
3d64a9d9a9 Deprecate Searcher creation.
As the `Searcher` is now deprecated, we also remove the unit tests on it.
`Searcher` is now untested, and so it reduces the code coverage.
2022-01-13 14:23:29 +01:00
Matthieu Gautier
fb7d9f02c8 Deprecate Reader creation.
As we `Reader` is now deprecated, we also remove the unit tests on it.
`Reader` is now untested, and so it reduces the code coverage.
2022-01-13 14:23:29 +01:00
Matthieu Gautier
96e0d15ab4 Deprecate Entry creation.
As the `Entry` is still created by `Reader` we need a way to create a
entry without raising a deprecated warning.
To do so we create a second constructor with a dummy argument.
This second constructor is private and is not marked as deprecated so we
can use it.
2022-01-13 14:23:29 +01:00
Matthieu Gautier
39732e2bcf Deprecate methods on Book.
- `update(const Reader& reader)` is replaced by
  `update(const zim::Archive& archive)`
- `getFavicon*()` is replaced by `getIllustration(48)->*`
2022-01-12 18:07:46 +01:00
Kelson
7dfafe0196 Merge pull request #673 from kiwix/urlEncode 2022-01-12 13:19:54 +01:00
Matthieu Gautier
3052d0787a UrlEncode the content_id.
The HumanReadableId can contains special char (`&`/`=`/...)
As it is used as to create a url in the opds template,
we must url encode it.

- We don't need to encode the book id as it is a uuid, it never contains
special char.
- We don't need to encode the book url as it is read from the library and
the url must already be correctly encoded in the library.xml.
(tests modified accordingly)
2022-01-11 17:53:29 +01:00
Kelson
d4db090bb9 Merge pull request #672 from kiwix/LadakhiISO
Add new method for language code
2022-01-11 15:39:49 +01:00
Nikhil Tanwar
0633f68f80 Add new values from wikipedia
Added new values. This now covers all iso 639-1 codes from https://en.wikipedia.org/wiki/List_of_ISO_639-1_codes
2022-01-11 15:32:06 +01:00
Nikhil Tanwar
2a5db3e7ab Use iso6391to3.js for language tag value
Improves upon the previous method of truncating language to first 2 values which was showing wrong values
2022-01-11 15:31:55 +01:00
Matthieu Gautier
468a080b09 Merge pull request #669 from kiwix/remove_meta_endpoints 2022-01-10 13:44:23 +01:00
Matthieu Gautier
1705f938b5 Extend unittest to check 404 error on wrong raw endpoints.
Check that `/raw` endpoint behaves correctly with wrong book name or
wrong kind.
2022-01-10 13:13:27 +01:00
Matthieu Gautier
0112e6102d Remove the meta endpoint in the server.
Now we have `/raw` and `catalog/v2/illustration` endpoints we don't need
to keep the meta endpoint.
2022-01-10 13:13:27 +01:00
Kelson
e4d99f0374 Merge pull request #668 from kiwix/fileExists
Introduce kiwix::fileReadable
2022-01-09 20:16:56 +01:00
Emmanuel Engelhart
aa50845e22 Fix small typos in comments 2022-01-09 20:05:30 +01:00
Nikhil Tanwar
3dbcbe542b Add tests for kiwix::fileExists and kiwix::fileReadable 2022-01-10 00:18:44 +05:30
Nikhil Tanwar
854058f842 Introduce kiwix::fileReadable
kiwix::fileExists only checks for file existence now
kiwix::fileReadable will check if the file is readable (implicitly checking for file existence also)
2022-01-05 20:16:38 +05:30
Matthieu Gautier
c9eb3196f7 Merge pull request #646 from kiwix/raw_endpoints 2022-01-05 15:31:35 +01:00
Matthieu Gautier
dc15a9a824 Add raw endpoint.
As the name suggests it, this endpoint is not smart :
It returns the content as it is and only if it is present
(no compatibility or whatever).

The only "smart" thing is to return a redirect if the entry is a redirect.
2022-01-05 15:12:41 +01:00
Matthieu Gautier
160a74f5f8 Extend ItemResponse and ContentResponse to return raw content. 2022-01-05 15:12:41 +01:00
Matthieu Gautier
78c10346f2 Merge pull request #645 from kiwix/illustration_api 2022-01-04 14:34:00 +01:00
Matthieu Gautier
6f1799db9f Use the new endpoint in the OPDS stream. 2022-01-04 14:16:46 +01:00
Matthieu Gautier
e108fb0e47 Add /catalog/v2/illustration endpoint 2022-01-04 14:16:46 +01:00
Matthieu Gautier
9482bfb95b Add a method to get the a book illustration for a specific size. 2022-01-04 14:16:46 +01:00
Matthieu Gautier
66c40817ee Fix the OPDS stream to handle custom ROOT prefix
As we render the entry's xml in a separated steps, we need to pass the
rootLocation to all the internal rendering.

Testing with and without root is not so easy.
I've simply made all server tests using a ROOT prefix.
We can assume that if the ROOT is present everywhere we need it, it will not
when we don't need. (As long as we don't hardcode "ROOT" in the server.)
2022-01-04 11:15:18 +01:00
Matthieu Gautier
22e5327dcf Do not create a dummy illustration if library.xml doesn't contain one.
Fix #644
2022-01-04 11:12:32 +01:00
Kelson
d29611ed75 Merge pull request #667 from kiwix/issue603
Safer scroll bound detection
2022-01-03 19:16:27 +01:00
renaud gaudin
b1bc883bf5 Fixed #603: safer scroll bound detection
At least on Retina Macbook Pros but most likely on other configurations,
the viewport's sizes is not exactly consistent to integer.
For instance, on a maximized Firefox, document.body.offsetHeight is 1,600.
When looking at the <html> on the inspector, I'd get 1,599.6, so **roughly** the same
but not exactly. Those inconsistencies are present on every level so being too strict
about those is probably not adequate.

This fixes #603 but allowing a 2% margin on the scroll position
to match the _end of screen_ and thus trigger the loading of additional cards.
This means that for the example above, it triggers at 1,568 instead of never reaching 1,600.

2% might be too large but it seems safe considering the potential of various resolutions
we may encounter and I don't see any side effect.
2022-01-03 14:41:19 +00:00
Kelson
a8577d7a01 Merge pull request #666 from kiwix/aria2Security
Make aria2 secret a random value
2022-01-03 09:46:11 +01:00
Nikhil Tanwar
8bdcb90818 Make aria2 secret a random value
Apps using this service will not have a default aria secret (previously 'kiwixariarpc')
2022-01-03 09:35:04 +01:00
Kelson
91a4491b74 Merge pull request #665 from kiwix/dependences_version
Dependences version
2022-01-02 12:31:25 +01:00
Emmanuel Engelhart
f36d8e9851 New kiwix::getVersions() and printVersions() 2022-01-02 12:22:11 +01:00
Kelson
5e1988640c Merge pull request #664 from kiwix/missingMetadataFix 2021-12-28 20:35:16 +01:00
Nikhil Tanwar
aa3bd8560b Add fallback image if thumbnail metadata is not available in zim 2021-12-29 00:15:34 +05:30
Nikhil Tanwar
9c047844c0 Add fallback if metadata (title, description, language, tags)
This provides a workaround the crash happening because of missing metadata.
Language div is set to be hidden if no language data is found
2021-12-29 00:11:57 +05:30
Matthieu Gautier
843d315b93 Merge pull request #663 from kiwix/fix_win32 2021-12-23 18:38:25 +01:00
Matthieu Gautier
f1035fa472 Fix win32 compilation.
WSASocket return a `INVALID_SOCKET` if something goes wrong,
not SOCKET_ERROR.
2021-12-23 18:32:43 +01:00
Kelson
6c95458f5e Merge pull request #622 from juuz0/issue613
Provide HTTP URL for the server
2021-12-22 18:35:34 +01:00
Nikhil Tanwar
9554ab5db0 Make getNetworkInterfaces() and getBestPublicIp() available via tools.h
Remove HTTP URL helper line - should be done in kiwix-serve
Add getters at server level - getAddress and getPort
2021-12-22 22:38:16 +05:30
Nikhil Tanwar
4b563e567e Provide HTTP URL for the server
Added a line to display the IP (use best if nothing is provided) along with port.
2021-12-22 22:08:25 +05:30
Matthieu Gautier
a77fae0993 Merge pull request #651 from kiwix/less_confusing_404_errors 2021-12-22 17:30:29 +01:00
Veloman Yunkan
ed2f914e10 Minor cleanup
The code for obtaining the archive now looks the same for the /meta,
/suggest, /search and /random endpoints.
2021-12-22 17:12:34 +01:00
Veloman Yunkan
872ddd9cb3 Cleaned up InternalServer::handle_suggest()
As a result of this clean-up the /suggest endpoint too stopped
generating confusing 404 Not Found errors (which, like in /meta's case
is not that important). Another functional change is that the "term"
parameter became optional.
2021-12-22 17:12:34 +01:00
Veloman Yunkan
20b5a2b971 Less confusing 404 errors from /meta endpoint
Before this fix the /meta endpoint could return a 404 Not Found page
saying

  The requested URL "/meta" was not found on this server.

Error cases producing such a result were:

- `/meta?content=NON-EXISTING-BOOK&name=metaname`

- `/meta?content=book&name=BAD-META-NAME`

Now a proper message is shown for each of those cases.

This fix is being done just for consistency (the /meta endpoint is not
a user-facing one and the scripts don't bother about error texts).
2021-12-22 17:12:34 +01:00
Veloman Yunkan
d8c525289b Changed the signature of Response::build_404()
Now Response::build_404() takes the URL instead of the entire
RequestContext object. An empty url suppresses the

 The requested URL "url" was not found on this server.

part of the error text.
2021-12-22 17:12:34 +01:00
Veloman Yunkan
f7b853373c Less confusing 404 errors from /random endpoint
Before this fix the /random endpoint could return a 404 Not Found page
saying

  The requested URL "/random" was not found on this server.

Error cases producing such a result were:

- `/random?content=NON-EXISTING-BOOK` (can happen when a server is
restarted or the library is reloaded and the current book is no longer
available).

- Failure of the libkiwix routine for picking a random article.

Now a proper message is shown for each of those cases.
2021-12-22 17:12:34 +01:00
Kelson
d21879ebda Merge pull request #662 from kiwix/legoktm-patch-1
PPA: Add Ubuntu Jammy
2021-12-20 07:57:34 +01:00
Kunal Mehta
35dc59d4bb PPA: Add Ubuntu Jammy 2021-12-19 16:45:16 -08:00
Kelson
a89924a23f Merge pull request #661 from kiwix/javascript_quotting_fix
Fix javascript/DOM quotting bug
2021-12-19 11:33:43 +01:00
Emmanuel Engelhart
7f6a6055a9 Fix javascript/DOM quotting bug 2021-12-19 11:27:45 +01:00
Matthieu Gautier
90910c5cab Merge pull request #648 from kiwix/unique_readers_in_searcher 2021-12-16 17:17:46 +01:00
Veloman Yunkan
250f46c7f9 fixup! Searcher::add_reader() rejects duplicate readers 2021-12-16 16:51:03 +01:00
Veloman Yunkan
0be00b791f Searcher::add_reader() rejects duplicate readers
A O(N) linear search was added to `Searcher::add_reader()` deliberately.
This doesn't seem to be an operation that may lead to performance
problems.
2021-12-16 16:51:03 +01:00
Matthieu Gautier
ff050dc811 Merge pull request #659 from kiwix/better-kiwix-version 2021-12-14 10:49:52 +01:00
Emmanuel Engelhart
9f3459f3f3 Better libkiwix version variable name 2021-12-13 18:22:40 +01:00
Kelson
7ea12697f4 Merge pull request #649 from kiwix/improve-kiwix-serve-topbar-accessibility
Improve kiwix-serve topbar accessibility
2021-12-11 17:27:43 +01:00
Emmanuel Engelhart
ab53e0cff1 Improve kiwix-serve topbar accessibility 2021-12-11 14:56:51 +01:00
Matthieu Gautier
27414f7731 Merge pull request #642 from kiwix/deadlock_in_Library_writeBookmarksToFile 2021-12-06 11:37:02 +01:00
Veloman Yunkan
e1db9164c8 Fixed deadlock in Library::writeBookmarksToFile() 2021-12-05 20:31:21 +04:00
Kelson
fa8d0f8e07 Merge pull request #640 from kiwix/docs
Add basic documentation for libkiwix.
2021-12-03 16:48:14 +01:00
Matthieu Gautier
3589a51fff Add basic documentation for libkiwix.
It follows the same logic of libzim documentation.
2021-12-01 16:06:36 +01:00
Matthieu Gautier
01ac0b2fe1 Merge pull request #636 from kiwix/library_reloading 2021-11-30 18:08:46 +01:00
Veloman Yunkan
405ea29900 Added Library::addOrUpdateBook() alias 2021-11-30 18:20:27 +04:00
Veloman Yunkan
7161db8e2a Manager::reload() also removes books from Library 2021-11-30 18:20:27 +04:00
Veloman Yunkan
262e13845c Enter Library::removeBooksNotUpdatedSince() 2021-11-30 18:20:27 +04:00
Veloman Yunkan
1d5383435d Noted a potential bug in Library::addBook() 2021-11-30 18:20:27 +04:00
Veloman Yunkan
ad2eb52553 Thread safe dumping of the OPDS feed 2021-11-30 18:20:27 +04:00
Veloman Yunkan
473d2d2a69 Introduced Library::getBookByIdThreadSafe() 2021-11-30 18:20:27 +04:00
Veloman Yunkan
02b9e32d18 Library became almost thread-safe
Library became thread-safe with the exception of `getBookById()`
and `getBookByPath()` methods - thread safety in those accessors is
rendered meaningless by their return type (they return a reference
to a book which can be removed any time later by another thread).
2021-11-30 18:20:27 +04:00
Veloman Yunkan
c2927ce6f7 Library got a yet unused mutex
Introducing a mutex in `Library` necessitates manually implementing the
move constructor and assignment operator. It's better to still delegate
that work to the compiler to eliminate any possibility of bugs when new
data members are added to `Library`. The trick is to move the data into
an auxiliary class `LibraryBase` and derive `Library` from it.
2021-11-30 18:20:27 +04:00
Veloman Yunkan
b712c732f2 Dropped Library::getBookBy*() non-const functions 2021-11-30 18:20:27 +04:00
Veloman Yunkan
298247ca9b Renamed NameMapperProxy -> UpdatableNameMapper 2021-11-30 18:20:27 +04:00
Veloman Yunkan
3aeeeeee76 Manager::reload() 2021-11-30 18:20:27 +04:00
Veloman Yunkan
226dac2604 LibraryManipulator is now merely a notifier
Originally `LibraryManipulator` was an abstract class completely decoupled
from `Library`. Its `addBookToLibrary()` and `addBookmarkToLibrary()`
methods could be defined in an arbitrary way. Now `LibraryManipulator` has to be
bound to a library object, those methods are no longer virtual, they always
update the library and allow for some additional actions via virtual
functions `bookWasAddedToLibrary()` and `bookmarkWasAddedToLibrary()`.
2021-11-30 18:20:27 +04:00
Veloman Yunkan
76a5e3a877 Library::addBook() updates the reader cache 2021-11-30 18:20:27 +04:00
Veloman Yunkan
2d6a7fe88d Testing of NameMapperProxy 2021-11-30 18:20:23 +04:00
Veloman Yunkan
6199c11505 NameMapperProxy respects the withAlias flag 2021-11-30 18:18:16 +04:00
Veloman Yunkan
8fffa59974 Added NameMapperProxy from kiwix/kiwix-desktop#714
The right place for NameMapperProxy introduced by kiwix/kiwix-desktop#714 is in
libkiwix (so that it can be reused in kiwix-serve).
2021-11-30 18:18:16 +04:00
Veloman Yunkan
4ccbdcb740 Code deduplication in NameMapperTest 2021-11-30 18:17:43 +04:00
Veloman Yunkan
5f3c34ed93 NameMapper's API is now const 2021-11-22 21:06:27 +04:00
Veloman Yunkan
d62c4fd521 Testing of HumanReadableNameMapper 2021-11-22 20:54:44 +04:00
Veloman Yunkan
339f845fb0 Bugfix in Book::getHumanReadableIdFromPath() 2021-11-22 20:54:44 +04:00
Veloman Yunkan
3296a020a1 Testing of Book::getHumanReadableIdFromPath()
New unit test BookTest.getHumanReadableIdFromPath revealed a bug in
`Book::getHumanReadableIdFromPath()`.
2021-11-22 20:54:44 +04:00
Veloman Yunkan
571e417d1e Manager is now safe to copy 2021-11-20 20:38:39 +04:00
Veloman Yunkan
913a368a12 Made Manager's ctors explicit 2021-11-20 20:34:13 +04:00
Veloman Yunkan
0e48baf9f9 Simplified Library::getReaderById()
Reused `Library::getArchiveById()` in `Library::getReaderById()`.
2021-11-19 20:17:12 +04:00
Matthieu Gautier
c7b88398bd Merge pull request #630 from kiwix/multi_illustration_books 2021-11-19 14:08:21 +01:00
Veloman Yunkan
4a01081e83 Thread-safe Book::Illustration::getData() 2021-11-19 16:44:25 +04:00
Veloman Yunkan
eb6a0d6456 Enter Book::getIllustrations() 2021-11-18 14:39:00 +04:00
Veloman Yunkan
e2544799a1 Shorter Book::update() 2021-11-18 14:39:00 +04:00
Veloman Yunkan
9f42884507 Book's illustrations are now immutable 2021-11-18 14:39:00 +04:00
Veloman Yunkan
8a6adddc16 Non-throwing Book::getDefaultIllustration() 2021-11-18 14:39:00 +04:00
Veloman Yunkan
c8da5eea2b Dropped Book::getMutableDefaultIllustration()
Now a Book is created without a default illustration.
2021-11-18 14:38:00 +04:00
Veloman Yunkan
bd29c4c7ef Book::updateFromOpds() resets Book::m_illustrations 2021-11-18 14:37:12 +04:00
Veloman Yunkan
e52a4a646b Book::updateFromXml() resets Book::m_illustrations 2021-11-18 14:36:42 +04:00
Veloman Yunkan
537ba7e6b9 Book::update() reads illustrations from ZIM file 2021-11-18 14:35:49 +04:00
Veloman Yunkan
f4bc3c8ced Book::Illustration got dimensions 2021-11-18 14:34:51 +04:00
Veloman Yunkan
5263f6880c Internally Book supports multiple illustrations 2021-11-18 14:34:51 +04:00
Veloman Yunkan
c129952605 Added a couple of notes on data consistency 2021-11-18 14:34:48 +04:00
Veloman Yunkan
9f0db6b7fa Book::Illustration::getData() 2021-11-18 14:33:50 +04:00
Veloman Yunkan
7d8a83cc97 Encapsulated access to Book::m_illustration 2021-11-18 14:32:52 +04:00
Veloman Yunkan
ec5a423924 Enter Book::Illustration
`Book::m_favicon` and its 2 friends are replaced with a single
`Book::m_illustration` data member.
2021-11-18 13:31:08 +04:00
Veloman Yunkan
811b73a4f1 Moved 2 small method definitions to cpp 2021-11-18 13:27:27 +04:00
Veloman Yunkan
3e5372ef29 RIP Book::{setFavicon(),setFaviconMimeType()} 2021-11-18 13:25:34 +04:00
Veloman Yunkan
abfd9d88d8 Book.updateTest creates the source book from XML
... thus eliminating the need for the Book::setFavicon*() methods.
2021-11-18 12:57:21 +04:00
Veloman Yunkan
4f65811011 Moved Book.updateTest below Book.updateFromXMLTest
Book.updateTest is going to be modified so that it relies on
functionality tested by Book.updateFromXMLTest. Hence the order of the
tests better reflect that dependency.
2021-11-18 12:57:21 +04:00
Veloman Yunkan
59e5c7ff4e Enhanced Book.updateFromXMLTest with favicon 2021-11-18 12:57:21 +04:00
Matthieu Gautier
ef1ad4bf47 Merge pull request #628 from kiwix/clickable-kiwix-serve-tile 2021-11-16 14:07:33 +01:00
renaud gaudin
8d50d5e293 added border-radius to download button 2021-11-16 12:33:05 +00:00
renaud gaudin
d76e670d5b CSS codefactor fix 2021-11-16 12:10:15 +00:00
Emmanuel Engelhart
8f5ffc5ef5 Duplicated ids are not allowed in a HTML doc 2021-11-16 12:10:15 +00:00
Emmanuel Engelhart
98f9c57e12 Better use double-quote for HTML attributes 2021-11-16 12:10:15 +00:00
Emmanuel Engelhart
db06b6b797 Renave static/home.css to static/index.css 2021-11-16 12:10:15 +00:00
Emmanuel Engelhart
513c547d99 Remove shadow around kiwix-serve home language in tile 2021-11-16 12:10:15 +00:00
Emmanuel Engelhart
c3d2e01157 Move kiwix-serve home download button at top of the tile 2021-11-16 12:10:15 +00:00
Emmanuel Engelhart
4adad9b281 Make kiwix-serve home tiles clickable 2021-11-16 12:10:15 +00:00
Matthieu Gautier
251f3a01ed Merge pull request #635 from kiwix/fix_httplib 2021-11-16 09:45:44 +01:00
Matthieu Gautier
0fcf166111 Fix warning in httplib.
The bug has been fixed upstream at yhirose/cpp-httplib#1091
This commit is a backport of the fix.
2021-11-16 09:39:56 +01:00
Matthieu Gautier
eea6f9fe27 Revert "Fix maybe initialized warning in httplib."
This reverts commit 1f6fb238ba.
2021-11-16 09:36:21 +01:00
Matthieu Gautier
0b9ff92e42 Merge pull request #634 from kiwix/fix_httplib 2021-11-10 14:35:11 +01:00
Matthieu Gautier
1f6fb238ba Fix maybe initialized warning in httplib.
Patch has been send upstream :
https://github.com/yhirose/cpp-httplib/pull/1085
2021-11-08 16:58:11 +01:00
Kelson
9479c0685d Merge pull request #623 from kiwix/update-ci
Re-introduce Ubuntu Impish in CI
2021-10-21 12:41:23 +02:00
Emmanuel Engelhart
09a55d71d6 Re-introduce Ubuntu Impish in CI 2021-10-21 12:36:05 +02:00
Matthieu Gautier
503eb5c4ce Merge pull request #621 from kiwix/fix_ci_docker_version 2021-10-19 11:49:20 +02:00
Matthieu Gautier
f714ff8d3e New docker image version is 31. 2021-10-18 18:09:36 +02:00
Matthieu Gautier
08e3d52957 Merge pull request #607 from kiwix/issue/571 2021-10-12 17:40:27 +02:00
Manan Jethwani
30e4c549e4 exposed fileExist, getMimeTypeForFile and getFileCoontent functions 2021-10-12 19:44:38 +05:30
Manan Jethwani
b7b385d87b added custom index template 2021-10-12 19:44:05 +05:30
Matthieu Gautier
e46b0c07b5 Merge pull request #617 from kiwix/adapt_new_libzim_api 2021-09-30 14:52:17 +02:00
Matthieu Gautier
cd9fb541fc Fix method call for new libzim API.
`add_archive` is now `addArchive`.
2021-09-29 11:55:22 +02:00
Matthieu Gautier
3b942bb745 Merge pull request #602 from kiwix/partial_opds_entries
OPDS feed with partial entries
2021-09-09 12:07:02 +02:00
Veloman Yunkan
c0bda426b4 Removed duplication across two mustache templates
Deduplicated the mustache templates static/templates/catalog_v2_entries.xml
and static/templates/catalog_v2_complete_entry.xml (the latter was
renamed to static/templates/catalog_v2_entry.xml).
2021-09-09 12:19:22 +04:00
Veloman Yunkan
b3f7556096 Added partial entries feed to the OPDS root feed 2021-09-09 12:19:22 +04:00
Veloman Yunkan
4c657c082e /catalog/v2/partial_entries OPDS API endpoint 2021-09-09 12:19:22 +04:00
Veloman Yunkan
e773a29f29 Rearranged elements in OPDS entry XML 2021-09-09 12:19:22 +04:00
Veloman Yunkan
e15a0f4338 /catalog/v2/entry/<entry_id> OPDS API endpoint 2021-09-09 12:19:22 +04:00
Veloman Yunkan
12d9b69806 OPDSDumper::dumpOPDSCompleteEntry() 2021-09-09 12:19:22 +04:00
Veloman Yunkan
027854e4f4 Extracted getSingleBookData() in opds_dumper.cpp 2021-09-09 12:19:22 +04:00
Matthieu Gautier
417e7471ac Merge pull request #614 from kiwix/disable-impish 2021-09-09 10:10:50 +02:00
Matthieu Gautier
51ac1240f8 PPA: Temporarily disable Impish builds 2021-09-09 10:05:18 +02:00
Kelson
ea6413ff88 Merge pull request #591 from kiwix/suggestion_range
Allow kiwix-serve to get suggestions of custom range
2021-08-20 08:09:35 +02:00
Maneesh P M
61209ea0d7 Allow kiwix-serve to get suggestions of custom range
This will allow handle_suggest API to accept two arguments `start` and
`suggestionLength` that will allow handle_suggest to retrieve
suggestions in the given range rather than the default 0-10 range.
2021-08-19 21:05:39 +05:30
Kelson
e9eaadde9e Merge pull request #567 from kiwix/suggestion_api_fix 2021-08-14 19:21:29 +02:00
Maneesh P M
8a4080baba Update libkiwix with new libzim api 2021-08-14 22:26:39 +05:30
Kelson
ba05999cba Merge pull request #604 from kiwix/issue/603 2021-08-11 06:50:30 +02:00
Manan Jethwani
a4c3cad018 fixed books availablity on larger screens and added zoom level support 2021-08-10 21:45:10 +05:30
Kelson
83e757a530 Merge pull request #600 from kiwix/dynamic_select_box_value
Use OPDS API to populate categories/languages select boxes on Kiwix Serve welcome page
2021-08-07 15:33:52 +02:00
Manan Jethwani
5e8f3a5505 added use of lang and category api for select boxes on welcome page 2021-08-07 02:39:50 +05:30
Manan Jethwani
fe93035a4c updated welcome page to support OPDS multiple Icon 2021-08-07 02:35:29 +05:30
Kelson
6e26c5aa75 Merge pull request #577 from kiwix/opds_multiple_icons
Support for multiple illustrations in OPDS entry
2021-08-05 23:22:07 +02:00
Veloman Yunkan
452283cfe6 Handling of /meta?name=Illustration_WxH@1 requests 2021-08-05 22:28:09 +04:00
Veloman Yunkan
e5168d8b3d Support for multiple illustrations in OPDS entry 2021-08-05 22:21:13 +04:00
Matthieu Gautier
b8aee8a42c Merge pull request #597 from kiwix/fix_get_results 2021-08-04 15:57:58 +02:00
Maneesh P M
9addd82d2d Fix usage of zim::Searcher::getResults() in libkiwix
The correct usage does not require the user to calculate an `end` using
the `pageLength`. We can directly use getResults(start, pageLength)
2021-08-04 19:20:50 +05:30
Maneesh P M
e74e7f5623 Add unit test for incremental searching
With this, we eventually want to see the usage of getResults giving
a FAILING TEST. This happens because the second argument to
getResults is NOT `end` of the range, but `maxResultCount` to retrieve.
This will be fixed in the next commit.
2021-08-04 19:20:05 +05:30
Matthieu Gautier
a032d65eb8 Merge pull request #576 from kiwix/extend_libkiwix_structures_to_use_libzim 2021-08-03 11:50:24 +02:00
Maneesh P M
19afe9442f Remove OriginId functions since they are not useful right now 2021-08-03 11:42:58 +02:00
Maneesh P M
a3ba7619df Update Manager to use Archive instead of Reader
kiwix::Manager uses Reader to import a zim file, it should be using
zim::Archive directly.
2021-08-03 11:42:58 +02:00
Maneesh P M
8b12434ff2 Update kiwix::book to use libzim structure
Some methods in kiwix::Book uses wrapper structure reader. This usage should
be extended from the native libzim structure zim::Archive
2021-08-03 11:42:58 +02:00
Matthieu Gautier
b4f7dfa5a2 Merge pull request #553 from kiwix/catalog_languages_endpoint 2021-08-03 11:41:31 +02:00
Veloman Yunkan
ab3095745e Languages OPDS feed includes book counts 2021-08-03 11:32:38 +02:00
Veloman Yunkan
45adda44b3 Got rid of <content> node in languages OPDS entry 2021-08-03 11:32:38 +02:00
Veloman Yunkan
96cf7e78a5 OPDSDumper::categoriesOPDSFeed() with no args 2021-08-03 11:32:38 +02:00
Veloman Yunkan
dd118df612 Got rid of langMap in opds_dumper.cpp
Language code to human friendly name translation is now done with the
help of the ICU library. It works if the line

```
-include $(LANGSRCDIR)/resfiles.mk
```

in the file `source/data/Makefile.in` of the icu4c dependency is not
commented out. Currently, the said line is commented out (along with
some other include's) by the `icu4c_custom_data.patch` patch of the
`kiwix-build` tool.
2021-08-03 11:32:38 +02:00
Veloman Yunkan
8a4248e48e Language code in /catalog/v2/languages entries 2021-08-03 11:32:38 +02:00
Veloman Yunkan
5f90f5ee2a Preliminary version of /catalog/v2/languages 2021-08-03 11:32:38 +02:00
Veloman Yunkan
64b55dbdc7 Made test library.xml a multi-language library 2021-08-03 11:32:38 +02:00
Veloman Yunkan
18871b4b15 Helper function Library::getBookPropValueSet()
Introduced a helper function `Library::getBookPropValueSet()` and
deduplicated Library::getBooks{Languages,Creators,Publishers}() methods.
2021-08-03 11:32:38 +02:00
Veloman Yunkan
b2027b397c List of languages entry in /catalog/v2/root.xml
Added a new entry in /catalog/v2/root.xml that points to a
not-yet-existing list of languages navigation feed.
2021-08-03 11:32:38 +02:00
Kelson
49322f5961 Merge pull request #596 from kiwix/better_filter
improved browser lang filter working
2021-07-31 23:24:14 +02:00
Manan Jethwani
0466b9759c improved browser lang filter working 2021-07-30 12:57:59 +05:30
Kelson
20cdefcdb8 Merge pull request #593 from kiwix/remove_groovy_package
Remove groovy deb package
2021-07-28 21:40:02 +02:00
Emmanuel Engelhart
6ea40f57da Remove groovy deb package 2021-07-28 21:34:24 +02:00
Kelson
a312d2218d Merge pull request #590 from kiwix/root_prefix_addition
corrected relative links in preview and icon url
2021-07-25 08:59:24 +02:00
Manan Jethwani
15839df594 corrected relative links in preview and icon url 2021-07-24 19:26:22 +05:30
Kelson
03a929e88e Merge pull request #583 from kiwix/download-modal
Modal download  box on Kiwix Serve welcome page
2021-07-13 16:47:58 +02:00
Manan Jethwani
646502f9cf changed font style for modal 2021-07-13 20:00:43 +05:30
Manan Jethwani
a8a96a99f4 corrected working of magnet link 2021-07-13 00:23:38 +05:30
Manan Jethwani
a517d3b529 added modal for downloading zim file on welcome page 2021-07-12 17:59:26 +05:30
Kelson
60f0f81286 Merge pull request #559 from kiwix/Css_revamp
Revamped Kiwix Serve Welcome page layout
2021-07-08 12:36:19 +02:00
Manan Jethwani
2ed9a50eca fixed button allignment 2021-07-08 12:33:28 +02:00
Manan Jethwani
bce922ab89 bug fix for loader 2021-07-08 12:33:28 +02:00
Manan Jethwani
ad7a63a471 minor change in UI 2021-07-08 12:33:28 +02:00
Manan Jethwani
6e8200637e corrected search button in mobile view 2021-07-08 12:33:28 +02:00
Manan Jethwani
cc45c840d1 fixed minor codefactor issue 2021-07-08 12:33:28 +02:00
Manan Jethwani
0590f27fa1 corrected select box and search bar design 2021-07-08 12:33:28 +02:00
Manan Jethwani
dd27c3a873 changed tile background color 2021-07-08 12:33:28 +02:00
Manan Jethwani
736841818d fixed font and other minor issues in title cards 2021-07-08 12:33:28 +02:00
Manan Jethwani
c1868e22f4 minor codefactor fix 2021-07-08 12:33:28 +02:00
Manan Jethwani
aabfc1d82e fixed card design 2021-07-08 12:33:28 +02:00
Manan Jethwani
2effb3490e minoor changes in responsive behaviour 2021-07-08 12:33:28 +02:00
Manan Jethwani
55672b0288 revamped basic layout and cards 2021-07-08 12:33:28 +02:00
Kelson
0abbeabfe2 Merge pull request #568 from kiwix/ppa-impish
PPA: Build for Ubuntu Impish
2021-07-08 10:24:52 +02:00
Kunal Mehta
1bf52e8ebe PPA: Build for Ubuntu Impish 2021-07-08 09:50:55 +02:00
Kelson
e2db1b3688 Merge pull request #574 from kiwix/remove_mustache_public_header
Fix public headers inclusion (+ small other fixes)
2021-07-07 18:00:17 +02:00
Matthieu Gautier
0b6b6716de Rename split argument from trimEmpty to dropEmpty. 2021-07-07 14:43:13 +02:00
Matthieu Gautier
18b6433322 Correct method declaration in SuggestionItem 2021-07-07 14:43:13 +02:00
Matthieu Gautier
b70c92cade Move back used helper functions to the public API.
- Add docstring
- Move the declaration in kiwix namespace.
- Adapt our include to include the right headers.
2021-07-07 14:43:13 +02:00
Matthieu Gautier
09d843da3a Add a (empty) include/tools.h header.
This header will contain our public tool functions.
2021-07-07 14:43:13 +02:00
Matthieu Gautier
fa83a61a54 Move all public *Tools.h in src.
This by definition remove all the tool functions from the public API.
2021-07-07 14:43:13 +02:00
Matthieu Gautier
967eb10cbf Merge pull request #578 from kiwix/fix_ci_deps
Use correct deps archive in the CI.
2021-07-07 10:59:53 +02:00
Matthieu Gautier
feeee25eac Use correct deps archive in the CI.
Now that project is named libkiwix, the dependencies archive is also
renamed.
2021-07-07 10:53:07 +02:00
Matthieu Gautier
1c0b4502cd Merge pull request #536 from kiwix/internally_drop_reader_searcher 2021-07-06 16:18:10 +02:00
Maneesh P M
6f639144ab Add unit tests for Searcher and Reader
Even though we will be removing the wrappers soon, the test coverage
should be complete and we could simply remove these files later.
2021-07-03 14:07:14 +05:30
Maneesh P M
a94a03cd22 Remove unwanted reader functions
Removing the functions in InternalServer that are no longer needed.
2021-07-03 14:07:14 +05:30
Maneesh P M
bc821638da Drop wrapper structures from handle_search
Since we now have SearcherRenderer that can work with native libzim
structure, we will drop the wrapper and use them instead.
2021-07-03 14:07:12 +05:30
Maneesh P M
bcece66960 Add SearchRenderer handles for libzim structures
Introduces a new member mp_search that houses the zim::Search object,
adds a new constructor for this purpose. This commit also add an
overload for getHtml that takes start and end integers as arguments
since they are not part of the search object we include.
2021-07-03 14:05:50 +05:30
Maneesh P M
c046f64d83 Drop Reader and Entry wrappers from handle_content 2021-07-03 14:05:50 +05:30
Maneesh P M
75b4d311d7 Drop Reader from InternalServer::handle_random 2021-07-03 14:04:04 +05:30
Maneesh P M
a236751c74 Drop usage of Reader from InternalServer::handle_suggest 2021-07-03 14:04:04 +05:30
Maneesh P M
7d68926539 Drop usage of Reader from InternalServer::handle_meta
This is essentially a code move of meta handlers from using Reader
functions to directly using Archive.
2021-07-03 14:04:02 +05:30
Maneesh P M
940368b8ac Add m_archives and getArchiveById to Library
These members will mirror the functionality offered by equivalent usage
of Reader class.
2021-07-03 14:02:31 +05:30
Kelson
0594e60df3 Merge pull request #527 from kiwix/catalog_search_url_generation
OpdsCatalog::getSearchUrl()
2021-06-30 21:35:43 +02:00
Veloman Yunkan
b5c1b26761 OpdsCatalog::getSearchUrl() 2021-06-30 18:27:00 +02:00
Kelson
4124ad30d5 Merge pull request #561 from kiwix/issue/557
Keep Kiwix Serve filter values over time
2021-06-25 08:05:21 +02:00
Manan Jethwani
3c5d73027d created separate variable for time delta 2021-06-24 15:41:28 +05:30
Manan Jethwani
d88bdd3ebf corrected filter on no results 2021-06-23 14:15:22 +05:30
Manan Jethwani
5cfe34a5c2 corrected filter working 2021-06-22 19:36:22 +05:30
Manan Jethwani
ad133bc9a3 added cookies for filter effect 2021-06-21 19:57:52 +05:30
Kelson
8eabae6286 Merge pull request #560 from kiwix/fix_compilation_illustration 2021-06-19 13:15:44 +02:00
Maneesh P M
f3c96b23fd Use getIllustrationItem instead of getFaviconEntry method
With openzim/libzim#540 we now have a new function to get
illustration(previously favicon in 48x48 size and unity scale) in
multiple sizes. We need to replace getFaviconEntry with this new
getIllustrationItem method.
2021-06-19 10:23:24 +05:30
Kelson
a92e9d8756 Merge pull request #490 from soumyankar/404ContentHomeButton
404 content home button
2021-06-17 09:59:45 +02:00
Vertigo
8d39b2c4c1 Added content ZIM home button on 404 2021-06-17 12:51:27 +05:30
Kelson
6d237ff1d5 Merge pull request #492 from kiwix/opds_categories_feed 2021-06-08 20:01:37 +02:00
Veloman Yunkan
78083f1f4a Moved OPDS templates under static/templates 2021-06-08 20:37:00 +04:00
Veloman Yunkan
dd60235010 Fixed the self link in the output of /catalog/v2/entries 2021-06-08 20:37:00 +04:00
Veloman Yunkan
e799f2ff1e OPDSDumper::dumpOPDSFeed() works via mustache
This changes the output of `/catalog/search` as follows:

- Entire search query (rather than only the value of the `q` parameter)
  is put in the <title> node.

- Search performed with an empty query presents itself as "All zims".

- The feed id remains stable for identical searches on the same
  library.
2021-06-08 20:37:00 +04:00
Veloman Yunkan
312f2cb560 Moved handle_catalog_v2*() methods into a new file 2021-06-08 20:37:00 +04:00
Veloman Yunkan
fa42cbc48f Pagination info in /catalog/v2/entries 2021-06-08 20:37:00 +04:00
Veloman Yunkan
f1797993af Reused InternalServer::search_catalog() 2021-06-08 20:37:00 +04:00
Veloman Yunkan
f886c8c07b Root url is normalized once in the constructor 2021-06-08 20:37:00 +04:00
Veloman Yunkan
9ca6bd006f /catalog/v2/categories goes through OPDSDumper too 2021-06-08 20:37:00 +04:00
Veloman Yunkan
cdacc0caf1 /catalog/v2/entries going through OPDSDumper
OPDSDumper sensed threats to its job security, so it lobbied to be
involved in handling the /catalog/v2 endpoints, too.
2021-06-08 20:37:00 +04:00
Veloman Yunkan
dfad1c3815 /catalog/v2/searchdescription.xml 2021-06-08 20:37:00 +04:00
Veloman Yunkan
07252a127a /catalog/v2/entries is also a search endpoint 2021-06-08 20:37:00 +04:00
Veloman Yunkan
b60e3ffb26 RequestContext::get_optional_param() 2021-06-08 20:37:00 +04:00
Veloman Yunkan
70d42aec98 A small simplification 2021-06-08 20:37:00 +04:00
Veloman Yunkan
4aa3c792aa Extracted get_search_filter() 2021-06-08 20:37:00 +04:00
Veloman Yunkan
208dece7e3 Reordered several statements
Reordered several statements so that the next couple of commits are a
little simpler.
2021-06-08 20:37:00 +04:00
Veloman Yunkan
19b59fd72f Serving /catalog/v2/entries
/catalog/v2/entries is intended to play the combined role of
/catalog/root.xml and /catalog/search of the old OPDS API. Currently,
the latter role is not yet implemented.

Implementation note: instead of tweaking and reusing
`OPDSDumper::dumpOPDSFeed()`, the generation of the OPDS feed is done via `mustache`
and a new template `static/catalog_v2_entries.xml`.
2021-06-08 20:37:00 +04:00
Veloman Yunkan
92c2de8d46 Enter InternalServer::m_library_id
The new field is intended to serve as a seed for generating semi-stable
OPDS feed ids that only need to change when the library is updated.
2021-06-08 20:37:00 +04:00
Veloman Yunkan
feeb9f206e /catalog/v2/* XMLs are OPDS 1.2 2021-06-08 20:37:00 +04:00
Veloman Yunkan
a1520ce7f1 Fixing the xenial build
Under Ubuntu 16.04/xenial, ccache seems to have issues with multiline
raw string literals used inside macros.
2021-06-08 20:37:00 +04:00
Veloman Yunkan
2e53b51696 Serving /catalog/v2/categories 2021-06-08 20:37:00 +04:00
Veloman Yunkan
b259afa408 Library::getBooksCategories()
Note: no unit test added
2021-06-08 20:37:00 +04:00
Veloman Yunkan
3c3cf08a1a Serving /catalog/v2/root.xml
Note: This commit somewhat relaxes validation of non variable
`<updated>` elements in the OPDS feed - the contents of any `<updated>`
element is replaced with the YYYY-MM-DDThh:mm:ssZ placeholder.
2021-06-08 16:03:43 +04:00
Veloman Yunkan
54b78eaf56 Moved gen_date_str() to tools/otherTools.cpp 2021-06-08 16:03:43 +04:00
Veloman Yunkan
1e0ff1fbb0 Fixed the double colon in OPDS date string 2021-06-08 16:03:43 +04:00
Veloman Yunkan
5b272ac49c Fixed handling of /catalogBLABLA/root.xml & alike
Also removed an unneeded namespace qualifier.
2021-06-08 16:03:43 +04:00
Veloman Yunkan
0a3d293ae0 Broke Server.404 with /catalogBLABLABLA/root.xml
The new negative test-point demonstrates that Kiwix server doesn't
distinguish /catalogBLABLABLA from /catalog.
2021-06-08 16:03:43 +04:00
Kelson
86ef2e2199 Merge pull request #550 from kiwix/remove-bintray
Remove Bintray badge
2021-06-07 16:01:37 +02:00
Emmanuel Engelhart
a0332e7599 Remove Bintray badge 2021-06-07 15:55:19 +02:00
Kelson
2ef488816c Merge pull request #534 from kiwix/filter_library
Add filters to kiwix-serve welcome page
2021-06-07 15:46:37 +02:00
Manan Jethwani
1ccafe2d97 minor changes in fadeout effect 2021-06-07 15:38:31 +02:00
Manan Jethwani
d6c62b3cd3 corrected spinner and fadeout effect 2021-06-07 15:37:20 +02:00
Manan Jethwani
f39c558d2a added fade out 2021-06-07 15:37:20 +02:00
Manan Jethwani
5b46ad5934 added spinned 2021-06-07 15:37:20 +02:00
Manan Jethwani
49dbd0aa52 fixed reset filters link 2021-06-07 15:37:20 +02:00
Manan Jethwani
179f0faeb1 added minor features 2021-06-07 15:37:20 +02:00
Manan Jethwani
bb92f26b60 added filter functionality 2021-06-07 15:37:20 +02:00
Matthieu Gautier
3a4e8303a0 Merge pull request #541 from kiwix/adding_dynamic_and_subset_loading
Dynamic and subset loading of catalogue in kiwix-serve
2021-06-07 15:35:13 +02:00
Manan Jethwani
063bb8cd65 added dynamic and subset loading of zim-files in kiwix-serve 2021-06-01 19:33:42 +05:30
Kelson
b54e5ab969 Merge pull request #543 from kiwix/add-libmicrohttpd-compilation-hint
Add libmicrohttpd compilation hint
2021-05-30 15:51:40 +02:00
Emmanuel Engelhart
2632a21d24 Move Repology to wiki 2021-05-30 15:46:45 +02:00
Emmanuel Engelhart
5c97b1fff9 gtest is need for testing 2021-05-30 15:43:21 +02:00
Emmanuel Engelhart
4f7175ad59 Libkiwix, not Kiwix library 2021-05-30 15:42:28 +02:00
Emmanuel Engelhart
f4b8d0c303 Add libmicrohttpd compilation hint 2021-05-30 15:35:40 +02:00
Matthieu Gautier
188694f2a1 Merge pull request #510 from kiwix/add_function_zimId 2021-05-26 15:15:13 +02:00
Maneesh P M
e2f6d91d51 Remove get_readerIndex in favor of get_zimId
The function get_readerIndex was used to get the zimId using an ordered
vector of readers. Now we can use get_zimId directly.
2021-05-26 14:45:25 +02:00
Maneesh P M
c35f6f9142 Add get_zimId method to Result
get_zimId method allows the user to get the uuid of the archive from
which a result is retrieved directly from the search result itself.
2021-05-26 14:45:25 +02:00
Matthieu Gautier
7f0d3004c9 Merge pull request #505 from kiwix/suggestion_snippets 2021-05-26 11:04:52 +02:00
Maneesh P M
5567d8ca49 Replace std::vector<std::string> with SuggestionItem
Each sugestions used to be stored as vector of strings to hold various values
such as title, path etc inside them. With this commit, we use the new
dedicated class `SuggestionItem` to do the same.
2021-05-26 10:53:39 +02:00
Maneesh P M
5315034afe Introduce SuggestionItem class
This is a helper class that allows to create and manage individual
suggestion item and their data.
2021-05-26 10:53:00 +02:00
Maneesh P M
3288cd80e5 Render suggestion snippet properly
To render the snippets properly, we need to use the _renderItem property
of the autocomple ui.
2021-05-26 10:53:00 +02:00
Maneesh P M
56434de79e Set label to title snippet if present
With openzim/libzim#545 we now support snippet generation of titles
which can be used as the display label on the ui for highlighted titles
via the "label" field.
The old version used plain title which is still available in the value
field.
2021-05-26 10:52:58 +02:00
Matthieu Gautier
e9ba151e6f Merge pull request #539 from kiwix/fix_windows_build
Avoid windows header to define min/max macros.
2021-05-26 09:52:43 +02:00
Matthieu Gautier
5f83944699 Avoid windows header to define min/max macros.
PR #507 use std::min.
But on windows, the header define min and max macros and so the
compilation is broken.
Add `-DNOMINMAX` define to avoid that.
2021-05-26 09:20:17 +02:00
Kelson
9c0ae835e2 Merge pull request #537 from kiwix/search_iterator_api_rename
Update libkiwix with search iterator rename in libzim
2021-05-26 08:49:26 +02:00
Maneesh P M
e5fac30cee Update libkiwix with search iterator rename in libzim
Search iterator API in libzim has been shifted to use camel case naming.
This has to be accomodated in libkiwix as well.
2021-05-26 08:39:13 +02:00
Matthieu Gautier
7ef08b670b Merge pull request #538 from kiwix/revert-530-adding_dynamic_and_subset_loading
Revert "Kiwix Serve welcome page dynamic and subset loading (OPDS based)"
2021-05-25 17:33:23 +02:00
Matthieu Gautier
2736a46cfe Revert "Kiwix Serve welcome page dynamic and subset loading (OPDS based)" 2021-05-25 17:30:05 +02:00
Kelson
672b4fc907 Merge pull request #530 from kiwix/adding_dynamic_and_subset_loading
Kiwix Serve welcome page dynamic and subset loading (OPDS based)
2021-05-25 16:22:54 +02:00
Manan Jethwani
012973d14a added dynamic and subset loading of zim-files in kiwix-serve 2021-05-25 02:41:12 +05:30
Kelson
67984cca5b Merge pull request #535 from kiwix/further-kiwix-lib-renaming
Rename kiwix-lib in libkiwix
2021-05-23 21:57:03 +02:00
Emmanuel Engelhart
d4e35c7067 Rename kiwix-lib in libkiwix 2021-05-23 21:46:52 +02:00
Matthieu Gautier
6e37cabaea Merge pull request #529 from kiwix/libkiwix-github-url-fix
Fix Libkiwix Github repository URLS
2021-05-21 16:09:15 +02:00
Emmanuel Engelhart
c8b7f8772a Fix Libkiwix Github repository URLS 2021-05-20 08:56:44 +02:00
Veloman Yunkan
fc7484ac86 Merge pull request #528 from kiwix/fix_suggestion_search_result_page
Check if bookName is available in url parameters
2021-05-19 00:36:15 +04:00
Maneesh P M
c236f3a32b Check if bookName is available in url parameters
In certain pages like the search result page, bookName is not of the
form `/bookName/endpoint?parameters`. Rather it is available as a query
parameter. From these pages bookName should be assigned from parameters.
2021-05-19 01:12:29 +05:30
Matthieu Gautier
3c7faddb6e Merge pull request #526 from kiwix/lizim_search_api_change
Fixed the libkiwix build broken by the changed libzim search API
2021-05-17 15:06:15 +02:00
Veloman Yunkan
cd02b4de3b Dummy application of new libzim search API
Didn't take any advantage of the new libzim search API. Just fixed the
libkiwix build in the most straightforward way.
2021-05-15 23:34:51 +04:00
Kelson
5188355878 Merge pull request #525 from kiwix/html-only-root-link
Insert root link only if html content
2021-05-14 16:51:33 +02:00
Emmanuel Engelhart
05cc3d015f Insert root link only if html content 2021-05-14 14:49:28 +02:00
Kelson
39b62c6108 Merge pull request #522 from kiwix/libkiwix_ci_fix
Fixed CI after repository rename
2021-05-13 11:09:57 +02:00
Veloman Yunkan
9c43353b72 Fixed CI after repository rename 2021-05-13 12:49:44 +04:00
Matthieu Gautier
b82fff9855 Merge pull request #507 from kiwix/fix_for_issue_504
/catalog/search handles out-of-bounds pagination
2021-05-10 11:47:49 +02:00
Veloman Yunkan
68189de162 /catalog/search handles out-of-bounds pagination 2021-05-10 11:25:06 +02:00
Matthieu Gautier
6aab9b6981 Merge pull request #506 from kiwix/library_filtering_by_empty_query
Empty query acts as a match-all query
2021-05-10 10:27:37 +02:00
Veloman Yunkan
41276341d0 Empty query acts as a match-all query
After switching to Xapian-based search in the library/catalog, an empty
query stopped acting as a match-all query. This commit restores the old
behaviour in that regard.
2021-05-09 15:14:43 +02:00
Kelson
02c3dff142 Merge pull request #508 from kiwix/remove_204_status_code
Revert "added 204 code for empty return of search"
2021-05-09 07:49:43 +02:00
Maneesh P M
be6b58c6ad Revert "added 204 code for empty return of search"
Returning status code 204 in case of an empty results doesn't show the
empty results page as described in #466. Reverting the changes in #396
fixes the issue.
2021-05-09 10:47:18 +05:30
Matthieu Gautier
ab0ffb55bc Merge pull request #502 from kiwix/no-meta4-file
No metalink file on fs
2021-05-04 13:52:54 +02:00
Emmanuel Engelhart
950e742116 No metalink file on fs 2021-05-04 13:15:43 +02:00
Matthieu Gautier
69257610e8 Merge pull request #495 from kiwix/replace_buggy_zim
Replace buggy example.zim with a correct one
2021-04-28 16:28:37 +02:00
maneeshpm
700d4becb9 Replace buggy example.zim with a correct one
The existing example.zim has an improper main page entry hence an
unusable index as reported in openzim/libzim#521
To replace the buggy zim, a new zim has been generated using the latest
zimwriterfs with latest libzim.

-------------------------------------------------------------------------
The directory used to create zim is given below and are two pages from
wikibooks site:

htmlContent
├── favicon.png
├── FreedomBox for Communities_Offline Wikipedia - Wikibooks, open books for an open world_files
│   ├── index.php
│   ├── load.php
│   ├── poweredby_mediawiki_88x31.png
│   └── wikimedia-button.png
├── FreedomBox for Communities_Offline Wikipedia - Wikibooks, open books for an open world.html
├── Wikibooks_files
│   ├── 234px-Megakaryocyte1.svg.png
│   ├── 287px-ChewyGingerCookies.jpg
│   ├── 36px-Commons-logo.svg.png
│   ├── 40px-Wikiquote-logo.svg.png
│   ├── 41px-Wikispecies-logo.svg.png
│   ├── 46px-Wikisource-logo.svg.png
│   ├── 48px-MediaWiki-2020-icon.svg.png
│   ├── 48px-Phacility_phabricator_logo.png
│   ├── 48px-Wikimedia_Cloud_Services_logo.svg.png
│   ├── 48px-Wikimedia_Community_Logo.svg.png
│   ├── 48px-Wikivoyage-Logo-v3-icon.svg.png
│   ├── 51px-Wiktionary-logo.svg.png
│   ├── 53px-Wikipedia-logo-v2.svg.png
│   ├── 59px-Wikiversity_logo_2017.svg.png
│   ├── 86px-Wikidata-logo.svg.png
│   ├── 88px-Wikinews-logo.svg.png
│   ├── Haskell-logo.png
│   ├── index.php
│   ├── load.php
│   ├── poweredby_mediawiki_88x31.png
│   └── wikimedia-button.png
└── Wikibooks.html

The command for writing the zim:
$ zimwriterfs --welcome=Wikibooks.html --favicon=favicon.png --language=en --title=Wikibooks --description=testZim --creator=test --publisher=test --verbose ./htmlContent  ./out/example.zim
2021-04-28 15:24:50 +02:00
Matthieu Gautier
dd795bd56d Merge pull request #498 from kiwix/issue/446
fixed suggestion system
2021-04-28 14:40:06 +02:00
Manan Jethwani
5fdc51b23e fixed suggestion system 2021-04-28 14:34:24 +02:00
Matthieu Gautier
32cc6b0dcb Merge pull request #499 from kiwix/const_correct_library
const-correct kiwix::Library
2021-04-28 14:31:28 +02:00
Veloman Yunkan
3879b82112 const-correct kiwix::Library
- Made most methods of kiwix::Library const.
- Also added const versions of getBookById() and getBookByPath()
  methods.
2021-04-28 11:42:55 +04:00
Matthieu Gautier
7336dcab1d Merge pull request #488 from kiwix/fully_xapian_powered_catalog_search 2021-04-27 15:10:40 +02:00
Veloman Yunkan
63e9a09259 Cleaned up/beautified Library::updateBookDB() 2021-04-27 16:59:21 +04:00
Veloman Yunkan
4178c169dd Xapian documents in book DB store only the book id 2021-04-27 16:59:21 +04:00
Veloman Yunkan
59e9a0cd77 Merged XmlLibraryTest with LibraryTest
The library set up by LibraryTest now contains two valid books
initialized via XML. Therefore XmlLibraryTest is not needed as a
separate test suite.
2021-04-27 16:59:21 +04:00
Veloman Yunkan
f751aff2fb Full case/diacritics insensitivity in catalog filtering
Catalog filtering should now be case/diacritics insensitive for all
fields. However it is not validated for language, name and category
fields, and is validated for tags, creator & publisher only for text
supplied in the filter (but not for values read from the book).
2021-04-27 16:59:21 +04:00
Veloman Yunkan
87dc9d2723 Made catalog filtering by query diacritics insensitive
Catalog filtering by titles/description was sensitive to diacritics
present in the query string. Fixed that.

Also enhanced the unit test to validate the insensitivity to diacritics
present in either the title/description or the query string.
2021-04-27 16:59:21 +04:00
Veloman Yunkan
9c7366890d Catalog filtering by tags works via Xapian 2021-04-27 16:59:21 +04:00
Veloman Yunkan
19e195cb7d Filter::Tags typedef 2021-04-27 16:59:21 +04:00
Veloman Yunkan
3d5fd8f585 Catalog filtering by creator works via Xapian 2021-04-27 16:59:21 +04:00
Veloman Yunkan
d3d5abe14d Handling of non-words in publisher query
This change fixes the failure of the LibraryTest.filterByPublisher
unit-test broken by the previous commit.

The previous approach used in `publisherQuery()` for building a phrase
query enforcing the specified prefix for all terms fails if

1. the input phrase contains a non-word term that Xapian's query parser
   doesn't like (e.g. a standalone ampersand character, 1/2, a#1, etc);
2. the input phrase contains at least three terms that Xapian's query
   parser has no issue with.

Using the `quest` tool (coming with xapian-tools under Ubuntu) the
issue can be demonstrated as follows:

```
$ quest -o phrase -d some_xapian_db "Energy & security"
Parsed Query: Query((energy@1 PHRASE 11 Zsecur@2))
Exactly 0 matches
MSet:

$ quest -o phrase -d some_xapian_db "Energy & security act"
UnimplementedError: OP_NEAR and OP_PHRASE only currently support leaf subqueries

$ quest -o phrase -d some_xapian_db 'Energy 1/2 security act'
UnimplementedError: OP_NEAR and OP_PHRASE only currently support leaf subqueries

$ quest -o phrase -d some_xapian_db "Energy a#1 security act"
UnimplementedError: OP_NEAR and OP_PHRASE only currently support leaf subqueries
```

The problem comes from parsing the query with the default operation set
to `OP_PHRASE` (exemplified by the `-o phrase` option in above
invocations of `quest`). A workaround is to parse the phrase with a
default operation of `OP_OR` and then combine all the terms with
`OP_PHRASE`.

Besides stemming should be disabled in order to target an exact phrase
match (save for the non-word terms, if any, that are ignored by the
query parser).
2021-04-27 16:59:21 +04:00
Veloman Yunkan
e805f68994 Enhanced & broke LibraryTest.filterByPublisher 2021-04-27 16:59:21 +04:00
Veloman Yunkan
a759ab989f Catalog filtering by publisher works via Xapian 2021-04-27 16:59:21 +04:00
Veloman Yunkan
7ccd9ffcce Catalog filtering by language works via Xapian 2021-04-27 16:59:21 +04:00
Veloman Yunkan
0c0a37073b Catalog filtering by category works via Xapian 2021-04-27 16:59:21 +04:00
Veloman Yunkan
415c65cf03 Catalog filtering by book name works via Xapian 2021-04-27 16:59:21 +04:00
Veloman Yunkan
8287f351e7 Final logic of Library::filterViaBookDB()
Moved the `filter.hasQuery()` check inside `buildXapianQuery()`.
`Library::filterViaBookDB()` only cares if the query that is going to be
run on the book DB would match all documents. The rest of changes
related to enhancing the usage of Xapian for the catalog search will
happen inside `buildXapianQuery()` and `updateBookDB()`.
2021-04-27 16:59:21 +04:00
Veloman Yunkan
ea779ac200 Extracted buildXapianQuery() 2021-04-27 16:59:21 +04:00
Veloman Yunkan
80cd1fc989 Renamed 2 functions in Filter and Library 2021-04-27 16:59:21 +04:00
Veloman Yunkan
2d76f8395e Dropped unused functions from Filter's private API
This should have been done back in PR #460
2021-04-27 16:59:21 +04:00
Veloman Yunkan
29a6a34ecf Delimited kiwix::Filter's public and private APIs 2021-04-27 16:59:21 +04:00
Veloman Yunkan
2f3f1a4859 Improved LibraryTest.filterByMultipleCriteria 2021-04-27 16:59:21 +04:00
Veloman Yunkan
b9be742085 LibraryTest.filterByMaxSize 2021-04-27 16:59:21 +04:00
Veloman Yunkan
95c354a5fa LibraryTest.filterByCategory 2021-04-27 16:59:21 +04:00
Veloman Yunkan
cdd272fc5a LibraryTest.filterByName 2021-04-27 16:59:21 +04:00
Veloman Yunkan
ef962a9174 LibraryTest.filterByPublisher 2021-04-27 16:59:21 +04:00
Veloman Yunkan
f063d350c6 LibraryTest.{filterLocal,filterRemote} 2021-04-27 16:59:21 +04:00
Veloman Yunkan
d8fe593f59 Extended the unit-test library with 2 XML entries 2021-04-27 16:59:21 +04:00
Veloman Yunkan
22b8625033 Enter EXPECT_FILTER_RESULTS()
This diff is easier to view if whitespace change is ignored.
2021-04-27 16:59:21 +04:00
Veloman Yunkan
0f277ffa34 Enhanced the LibraryTest.filterByTags unit-test 2021-04-27 16:59:21 +04:00
Veloman Yunkan
068f7e5e95 New unit-test LibraryTest.filterByCreator 2021-04-27 16:59:21 +04:00
Veloman Yunkan
8c810d2d2f Enhanced the LibraryTest.filterByQuery unit-test 2021-04-27 16:59:21 +04:00
Veloman Yunkan
8c18a37961 Split LibraryTest.filterCheck into several tests 2021-04-27 16:59:21 +04:00
Veloman Yunkan
db3e0d7f72 Enhanced the LibraryTest.filterCheck unit-test
Now the LibraryTest.filterCheck unit-test validates the actual entries
returned by `Library::filter` (previously only the count of the results
was checked).
2021-04-27 16:59:21 +04:00
Matthieu Gautier
d134ad417f Merge pull request #497 from MananJethwani/issue/481
removed redirect to articles in search
2021-04-20 17:13:01 +02:00
Manan Jethwani
965b9622c2 removed redirect to articles in search 2021-04-20 20:23:42 +05:30
Kelson
11db5dec4e Merge pull request #494 from kiwix/ripple_effect_of_libzim_pr524
get_url() was renamed in zim::search_iterator
2021-04-16 12:38:15 +02:00
Veloman Yunkan
9d4370403b get_url() was renamed in zim::search_iterator 2021-04-16 13:30:36 +04:00
Kelson
cb57178c23 Merge pull request #491 from kiwix/fix_macos_ci
Update brew before installing packages
2021-04-12 18:41:00 +02:00
Matthieu Gautier
9ba5ab4678 Update brew before installing packages
brew changed his backend repository, we must update brew itself first.
2021-04-12 18:31:29 +02:00
Matthieu Gautier
a597870025 Merge pull request #465 from soumyankar/master 2021-04-12 18:17:57 +02:00
Vertigo
611146aa37 Added Search Link for bad bookName/articleName on 404 2021-04-12 21:31:47 +05:30
Matthieu Gautier
6d2f227c42 Merge pull request #486 from kiwix/fix_for_issue462 2021-04-12 15:18:57 +02:00
Veloman Yunkan
0c7d19ab45 Testing of Manager.readXml() 2021-04-12 15:14:12 +02:00
Veloman Yunkan
b54215f146 Manager::readOpds() doesn't modify its input 2021-04-12 15:14:12 +02:00
Veloman Yunkan
9033f2f28e Manager::readXml() doesn't modify its input 2021-04-12 15:14:12 +02:00
Matthieu Gautier
5c289abd0e Merge pull request #485 from kiwix/fix_for_issue478 2021-04-12 15:05:13 +02:00
Veloman Yunkan
ec9186b174 Library::removeBookById() updates the search DB
This fix makes the `XmlLibraryTest.removeBookByIdUpdatesTheSearchDB`
unit-test pass.
2021-04-09 17:06:45 +04:00
Veloman Yunkan
aaaa5a637e Library::filter() doesn't create empty books
This changes how the `XmlLibraryTest.removeBookByIdUpdatesTheSearchDB`
unit-test fails.
2021-04-09 17:06:45 +04:00
Veloman Yunkan
49940a30d0 XmlLibraryTest.removeBookByIdUpdatesTheSearchDB
The new unit-test fails with a reason not expected before it was
written. The `Library::filter()` operation returns a correct result
after the call to `removeBookById()` (this was a surprise!) but it has
a side-effect of re-adding an empty book with the id still surviving
in the search DB (the emptiness of this re-created book doesn't allow
it to pass the other filtering criteria, which explains why the result
of `Library::filter()` is correct). Had to add a special check
to the new unit-test against that hidden side-effect of
`Library::removeBookById()` + `Library::filter()` combination.
2021-04-09 17:06:45 +04:00
Veloman Yunkan
24ed96a38c Library.removeBookById() drops the reader too
This fix makes the `XmlLibraryTest.removeBookByIdDropsTheReader`
unit-test pass.
2021-04-09 17:05:56 +04:00
Veloman Yunkan
ccdc316217 Two unit-tests for Library::removeBookById
The `XmlLibraryTest.removeBookByIdDropsTheReader` unit-test fails,
demonstrating a bug in `kiwix::Library::removeBookById()`.
2021-04-09 16:59:55 +04:00
Matthieu Gautier
ba44033273 Merge pull request #464 from MananJethwani/issue/kiwix-tools/205
adding kind and path attributes to suggest response object and using it in autocomplete
2021-04-07 18:08:29 +02:00
Manan Jethwani
5cb276a933 adding kind and path attributes to suggest response object and using it in autocomplete 2021-04-07 21:04:33 +05:30
Matthieu Gautier
e4be97a032 Merge pull request #470 from kiwix/xapian_should_not_be_exposed 2021-04-07 16:55:24 +02:00
Veloman Yunkan
aa2a031ba4 Xapian headers are not exposed through libkiwix 2021-04-07 18:24:33 +04:00
Kelson
803cb1c2c5 Merge pull request #476 from MananJethwani/correcting#474
changed method of injecting root link
2021-03-24 10:20:54 +01:00
Manan Jethwani
7872734f44 changed method of injecting root link 2021-03-24 14:17:58 +05:30
Kelson
d061697de7 Merge pull request #474 from MananJethwani/issue/473
injecting root link directly and renamed head_part to head_taskbar
2021-03-24 09:24:42 +01:00
Manan Jethwani
c557bb271b injecting root link directly and renamed head_part to head_taskbar 2021-03-24 02:10:16 +05:30
Kelson
1f45c42c32 Merge pull request #472 from MananJethwani/issue/407
added root functionality for block external link feature
2021-03-23 10:54:22 +01:00
Manan Jethwani
93264f7409 added root functionality for block external link feature 2021-03-23 03:17:14 +05:30
Matthieu Gautier
baed447dd3 Merge pull request #460 from kiwix/xapian_based_catalog_search 2021-03-17 14:45:56 +01:00
Veloman Yunkan
20b487da8d Added Xapian as direct dependency 2021-03-17 14:32:03 +01:00
Veloman Yunkan
e214efecd4 Language code conversion via ICU
Language code is converted from ISO 639-3 to ISO 639 (which is
understood by Xapian) via ICU. The previous approach via an explicit
map had its advantages since Xapian has more than one stemmer
implementations for some languages (selectable via Xapian-specific
identifiers). This commit relies on the defaults associated with the
ISO 639 language codes.
2021-03-17 14:32:03 +01:00
Veloman Yunkan
09233bf4f3 Support for partial queries in catalog search
The search text in the catalog query is interpreted as partial by
default, but partial query mode can be disabled in C++. The latter
possibility is not exposed via the /catalog/search kiwix-serve endpoint,
though.
2021-03-17 14:32:03 +01:00
Veloman Yunkan
47c67a4202 LibraryServerTest.catalog_search_with_word_exclusion 2021-03-17 14:32:03 +01:00
Veloman Yunkan
6b600a18eb LibraryServerTest.catalog_prefix_search 2021-03-17 14:32:03 +01:00
Veloman Yunkan
9e887cadf1 Added some diversity to test/data/library.xml 2021-03-17 14:32:03 +01:00
Veloman Yunkan
a599fb3892 Initial version of Xapian-based catalog search 2021-03-17 14:32:03 +01:00
Veloman Yunkan
a17fc0ef2d Library::getBooksByTitleOrDescription() 2021-03-17 14:32:03 +01:00
Veloman Yunkan
db06b2c7ca Library::BookIdCollection typedef 2021-03-17 14:32:03 +01:00
Veloman Yunkan
a20f9e2ce1 Library::filter() works in two stages
1. Get the subset of books matching the q (title/description) parameter
   of the search

2. Filter out books not matching the other parameters of the search.

Stage 1. currently works in the old way, but will be replaced by Xapian
based search in subsequent commits.
2021-03-17 14:32:03 +01:00
Kelson
f7c867f8a7 Merge pull request #459 from kiwix/opds_category_support
Support for book categories in OPDS feed
2021-03-17 12:37:16 +01:00
Veloman Yunkan
b7b0bdbdd8 Both Book::update() methods update the category 2021-03-17 14:10:57 +04:00
Veloman Yunkan
a870e05621 Slight enhancement of BookTest.updateTest 2021-03-17 14:10:57 +04:00
Veloman Yunkan
4abc4f8518 Support for book category attribute in library.xml 2021-03-17 14:10:57 +04:00
Veloman Yunkan
6b2067c236 Reading category element from OPDS stream 2021-03-17 14:10:57 +04:00
Veloman Yunkan
e55bf514e8 Dedicated 'category' parameter in catalog search 2021-03-17 14:10:57 +04:00
Veloman Yunkan
80d4f7e349 Extracted InternalServer::search_catalog() 2021-03-17 14:10:57 +04:00
Veloman Yunkan
f270724b1f Testing of a library entry without a category 2021-03-17 14:10:44 +04:00
Veloman Yunkan
58186ffb26 kiwix::Book::getCategory() 2021-03-17 14:09:48 +04:00
Veloman Yunkan
6d43fd065f Less boilerplate in LibraryServerTest unit-tests 2021-03-17 14:02:27 +04:00
Veloman Yunkan
071d2bedd3 LibraryServerTest.catalog_search_by_text 2021-03-17 14:02:27 +04:00
Veloman Yunkan
0b1740e6c5 LibraryServerTest.catalog_search_by_tag 2021-03-17 14:02:27 +04:00
Veloman Yunkan
9913f748e2 LibraryServerTest.catalog_searchdescription_xml 2021-03-17 14:02:27 +04:00
Veloman Yunkan
c5c40cb189 New unit-test LibraryServerTest.catalog_root_xml 2021-03-17 14:02:27 +04:00
Veloman Yunkan
ae32ff40c0 Dropped an extra colon from book <updated> dates 2021-03-17 14:02:27 +04:00
Veloman Yunkan
26331b401e Fixed the month in OPDS feed <updated> date
`tm::tm_mon` varies in the [0, 11] range.
2021-03-17 14:02:27 +04:00
Matthieu Gautier
0f368791a2 Merge pull request #461 from MananJethwani/issue/444 2021-03-15 16:57:10 +01:00
Manan Jethwani
fb26f6b9c5 moved autocomplete from head_part.html to taskbar.js 2021-03-15 18:10:10 +05:30
Matthieu Gautier
32643fbd94 Merge pull request #463 from soumyankar/master
Change Mustache-3.0 to Mustache-4.1
2021-03-15 11:38:57 +01:00
Vertigo
faa9e1f8b5 Change Mustache-3.0 to Mustache-4.1 2021-03-14 11:43:48 +05:30
Kelson
bd781f8e8b Merge pull request #458 from kiwix/html_decoded_suggestions
Correctly encoding/decoding HTML entities in search suggestions
2021-03-04 14:34:47 +01:00
Veloman Yunkan
c7d77395e7 label field of suggestions is also HTML escaped
Without this if the suggestion text contains a double quote, the
response stops being a valid json.
2021-03-04 14:18:58 +01:00
Veloman Yunkan
a7fea462b0 HTML decoding of suggestions in the frontend
Since the `value` field of the search suggestion results is HTML
escaped/encoded in the backend (see static/templates/suggestion.json) it
must be decoded in the frontend.
2021-03-04 14:18:58 +01:00
Matthieu Gautier
89d7e68a39 Merge pull request #446 from kiwix/libzim_random
Use the new libzim's getRandomEntry instead of implementing it ourselves.
2021-03-03 16:04:00 +01:00
Matthieu Gautier
67caae6c32 Use the new libzim's getRandomEntry instead of implementing it ourselves. 2021-03-02 14:16:09 +01:00
Kelson
d3f2e08b35 Merge pull request #429 from kiwix/open_zimfile_by_fd
JNI interface to opening ZIM archives (including embedded ones) by fd
2021-02-26 09:20:58 +01:00
Veloman Yunkan
839fc10a4f Fixed the Windows build
Opening ZIM archives by file descriptor (as well as embedded
ZIM archives) is not supported under Windows.
2021-02-10 14:19:47 +01:00
Veloman Yunkan
5a8b825c70 Testing of JNIKiwixReader.getDirectAccessInformation() 2021-02-10 14:19:47 +01:00
Veloman Yunkan
7a465e66d7 Renamed org.kiwix.kiwixlib.{Pair->DirectAccessInfo} 2021-02-10 14:19:47 +01:00
Veloman Yunkan
5a99634dfd Java wrapper test checks favicon.png too 2021-02-10 14:19:47 +01:00
Veloman Yunkan
e028bcbb04 Android's java.io.FileDescriptor is different 2021-02-10 14:19:47 +01:00
Veloman Yunkan
9cdf7a44c0 JNIKiwixReader can open an embedded ZIM archive 2021-02-10 14:19:47 +01:00
Veloman Yunkan
4d23e44de7 JNIKiwixReader ctor taking a file descriptor
... and a corresponding unit test
2021-02-10 14:19:47 +01:00
Veloman Yunkan
98d69ef59b Added testReader unit-test for the java wrapper 2021-02-10 14:19:47 +01:00
Veloman Yunkan
e40827fbac Renamed the java wrapper unit test runner script 2021-02-10 14:19:47 +01:00
Veloman Yunkan
a798e0c0a1 Made the java wrapper unit test run & pass
The kiwixlib java wrapper unit test can be run manually via the
src/wrapper/java/org/kiwix/testing/compile_test.sh script.

The test ZIM files in src/wrapper/java/org/kiwix/testing were created
using the create_test_zimfiles. They must be updated/re-generated and
committed in git whenever their source data or the create_test_zimfiles
script changes. Note: small.zim.embedded is not used at this point, it
was created for testing the enhancement coming in a few commits.
2021-02-10 14:19:47 +01:00
Matthieu Gautier
971374e049 Merge pull request #455 from kiwix/api-break-bumpup-major-version
Bump-up version to 10.0.0
2021-02-10 10:47:06 +01:00
Emmanuel Engelhart
f7608c378e Bump-up version to 10.0.0 2021-02-10 10:31:01 +01:00
Kelson
c13a43010a Merge pull request #453 from kiwix/fixed-typo
Typo fix in error message
2021-02-09 09:41:00 +01:00
Emmanuel Engelhart
eea10ec3f5 Typo fix in error message 2021-02-09 09:38:18 +01:00
Kelson
c9bc2b48b0 Merge pull request #452 from kiwix/version-update
Bump-up version (and libzim required version)
2021-02-08 21:30:11 +01:00
Veloman Yunkan
b1689e0d3c Bump-up version (and libzim required version) 2021-02-07 19:53:46 +04:00
Kelson
c70c370ae0 Merge pull request #435 from kiwix/ppa-soname-bump
PPA: Update for soname bump
2021-01-27 09:25:39 +01:00
Kunal Mehta
c0ec5eeffd PPA: Update for soname bump 2021-01-27 09:14:36 +01:00
Matthieu Gautier
c81c2a4630 Merge pull request #445 from kiwix/no_pthread 2021-01-26 18:06:16 +01:00
Matthieu Gautier
24b2e6e585 Remove unnecessary include. 2021-01-26 17:53:25 +01:00
Matthieu Gautier
3fd1310008 Use c++11 std::thread instead of pthread. 2021-01-26 17:53:25 +01:00
Matthieu Gautier
17bc1a3e1a [TEST] Do not try to use the bookId if it is wrong. 2021-01-26 17:53:25 +01:00
Matthieu Gautier
f5d9a3714d Merge pull request #449 from kiwix/handle_no_counter
Do not crash if zim file has no `Counter` metadata.
2021-01-26 17:52:33 +01:00
Matthieu Gautier
4749656828 Do not crash if zim file has no Counter metadata. 2021-01-26 15:15:27 +01:00
Kelson
e0bcafd89a Merge pull request #442 from kiwix/better-taskbar-insertion
Better taskbar insertion
2021-01-18 13:32:11 +01:00
Emmanuel Engelhart
84895c4036 Better </head> detection regex 2021-01-18 13:16:56 +01:00
Emmanuel Engelhart
a8bf9dd5b4 Better Kiwix Serve Taskbar insertion (after charset definition) 2021-01-18 11:18:53 +01:00
Emmanuel Engelhart
a61c94ef10 Add GPLv3 header 2021-01-18 10:54:33 +01:00
Kelson
a8a864f70e Merge pull request #434 from kiwix/mandatory_xapian
Build kiwix-lib only if meson is compiled with xapian.
2021-01-15 10:03:57 +01:00
Matthieu Gautier
a231dfd8e4 Build kiwix-lib only if meson is compiled with xapian. 2021-01-14 12:07:17 +01:00
Kelson
321d08e3d5 Merge pull request #440 from kiwix/better-taskbar-introduction
Fix taskbar insertion in case of '<head>' attributes
2021-01-12 11:48:57 +01:00
Emmanuel Engelhart
8c43fd8d36 Fix taskbar insertion in case of '<head>' attributes 2021-01-11 14:37:19 +01:00
Kelson
9e032b6eea Merge pull request #439 from kiwix/mediacount-video-audio
Support 'video/*' * 'audio/*' mimetypes in getMediaCount()
2021-01-08 13:01:42 +01:00
Emmanuel Engelhart
3e2810dff4 Support 'video/*' * 'audio/*' mimetypes in getMediaCount() 2021-01-07 12:32:32 +01:00
Kelson
18afa97674 Merge pull request #437 from kiwix/count-webp-media
More robust getMediaCount()
2021-01-03 21:15:55 +01:00
Emmanuel Engelhart
44c4aa931a Better use kiwix::startsWith() 2021-01-03 15:17:03 +01:00
Emmanuel Engelhart
95b32b168d More robust getMediaCount() 2021-01-01 17:05:32 +01:00
Kelson
2659f323cd Merge pull request #433 from kiwix/small_fixes
Fixes related to libzim next.
2020-12-11 06:46:40 +01:00
Matthieu Gautier
1002c15e0d Remove unnecessary checks.
`Reader` cannot be created with a null `zimArchive`.
We don't have to check for zimArchive being not null.
2020-12-09 14:25:02 +01:00
Matthieu Gautier
d51000c4a9 Use new libzim method hasFulltextIndex to check for fulltext index. 2020-12-09 14:25:02 +01:00
Matthieu Gautier
ba302bed33 Use new libzim method getFaviconEntry to get the favicon. 2020-12-09 14:25:02 +01:00
Kelson
9941d245e1 Merge pull request #432 from swills/fbsd-build-fix1
fix build on FreeBSD
2020-12-09 14:24:04 +01:00
Steve Wills
6900b4e506 fix build on FreeBSD
With this header, sockaddr_in and INADDR_ANY are not defined
2020-12-07 09:38:46 -05:00
Kelson
6fa20f6dcf Merge pull request #431 from kiwix/legoktm-patch-1
PPA: Build for Ubuntu Hirsute
2020-12-05 09:29:47 +01:00
Kunal Mehta
8eacb0d635 PPA: Build for Ubuntu Hirsute 2020-12-03 22:35:46 -08:00
Kelson
84c320d5d3 Merge pull request #428 from kiwix/libzim_next
Adapt kiwx-lib to the new api of libzim (libzim_next).
2020-12-03 09:11:28 +01:00
Matthieu Gautier
1a5a2e7a8e Adapt kiwix-lib to the new libzim api. 2020-12-02 12:16:48 +01:00
Matthieu Gautier
d87079ec13 Remove deprecated method in the reader. 2020-11-24 19:00:52 +01:00
Matthieu Gautier
fbd4332c87 New version 9.4.1 2020-11-17 15:42:13 +01:00
Matthieu Gautier
7c4517ca3c Merge pull request #427 from kiwix/fix_deps_zlib 2020-11-17 12:05:50 +01:00
Matthieu Gautier
bc4b6846ef Add missing dependency declaration in the README. 2020-11-17 11:54:36 +01:00
Matthieu Gautier
f408fecdd0 Add missing zlib dependency.
libzim were dependent of zlib and we were (wrongly) using its
dependency declaration to link with zlib.

Now that libzim doesn't depends on zlib, we need to fix our build system
and explicitly depend on it.
2020-11-17 11:54:10 +01:00
Kelson
038e86e0d8 Merge pull request #419 from kiwix/legoktm-patch-1
debian: Have libkiwix-dev depend on libmicrohttpd-dev
2020-11-04 15:20:36 +01:00
Kunal Mehta
41cee1cfe0 debian: Have libkiwix-dev depend on libmicrohttpd-dev
As of 9c925f6778, it's now required in the pkg-config file.
2020-11-04 15:14:09 +01:00
Kelson
7e2d174cfb Merge pull request #422 from kiwix/rgaudin/taskbar-height
Adjust body padding-top for taskbar
2020-11-04 15:13:18 +01:00
renaud gaudin
52da58a294 Adjust body padding-top for taskbar
taskbar is placed *above* content using a `padding-top: 3em;` rule
Currently, in regular case, padding-top is too large and leaves ~4/5px between the
taskbar and the content.
This fixes it by using a `calc()` rule to eliminate this extra space
2020-11-04 11:53:59 +00:00
Matthieu Gautier
0f8caba3a5 Merge pull request #418 from kiwix/fix_counter_parsing 2020-10-30 13:42:52 +01:00
Veloman Yunkan
0f8fe1f63f Alternative implementation of parseMimetypeCounter() 2020-10-29 14:11:27 +04:00
Matthieu Gautier
ed32e16db2 Use a reference in test/server.cpp loop.
This is mainly to make the macos CI pass.
2020-10-28 16:08:37 +01:00
Matthieu Gautier
08464f23bc Better parsing of M/Counter
Mimetype may contain a parameters.
Then, the mimetype would be something like "text/html;foo=bar;foz=baz"

It will contains a `;` and `=` and it conflicts with the same operators
we use to separate the items in our list.

We have to use a more advanced algorithm which takes the context into
account.

Fix #416
2020-10-28 16:03:18 +01:00
Matthieu Gautier
ef42abea4b Add some tests of parseMimetypeCounter 2020-10-28 14:44:23 +01:00
Matthieu Gautier
4407dd12bd Move mimetypeCounter parsing in its own function. 2020-10-28 14:08:06 +01:00
Matthieu Gautier
d546ae38c4 Merge pull request #417 from kiwix/fix_macos_ci_install
[CI] Fix macos packages installation
2020-10-27 15:18:23 +01:00
Matthieu Gautier
7a11ec6ea4 [CI] Fix macos packages installation 2020-10-27 15:13:19 +01:00
Matthieu Gautier
095c86cf90 Merge pull request #415 from kiwix/stop_server 2020-10-16 14:12:47 +02:00
Matthieu Gautier
632583ede2 Add missing include 2020-10-07 18:43:57 +02:00
Matthieu Gautier
9c925f6778 kiwix-lib pc need libmicrohttpd 2020-10-07 17:48:30 +02:00
Matthieu Gautier
61f9d4ab3a Stop the internal server only if it exists. 2020-10-07 14:36:45 +02:00
Kelson
de6b8ba4de Merge pull request #413 from hashworks/archCommunityPkg
Add repology package badge
2020-08-30 22:05:37 +02:00
hashworks
ab7349dbc8 Switch to repology package badge 2020-08-30 20:59:14 +02:00
hashworks
292004703e Add shield and link to official Arch Linux package
I recently became an Arch Linux Trusted User and moved kiwix-lib
to the official community repository.
2020-08-30 13:22:13 +02:00
Kelson
c3e1e46d58 No AUR libkiwix package anymore 2020-08-30 09:56:38 +02:00
Matthieu Gautier
aba2f35092 New version 9.4.0 2020-08-28 16:04:40 +02:00
Matthieu Gautier
b3a3dfb79f Merge pull request #411 from kiwix/fix_subprocess_destructor 2020-08-28 16:02:25 +02:00
Matthieu Gautier
470bfc3f1f Better variable name for outStream. 2020-08-28 15:27:03 +02:00
Matthieu Gautier
ea3180cb8c Better error printing. 2020-08-28 15:27:03 +02:00
Matthieu Gautier
72d3f8f8e2 Fix segmentation fault with curl requests.
Use a heap allocated buffer (with lifetime of Aria2 class) instead of
a stack allocated one.

Original fix made by @ZaWertun. Kudos to him.

Fix #kiwix/kiwix-desktop#123, kiwix/kiwix-desktop#513
and kiwix/kiwix-desktop#423
2020-08-26 12:42:16 +02:00
Matthieu Gautier
af9e03904c Use std::mutex and std::unique_lock instead of pthread mutex/lock.
It simplify a bit the code and ensure that mutex is correctly unlock
even in case of exception.
2020-08-26 12:30:56 +02:00
Matthieu Gautier
39611cbd60 Wait for waitingThread to exit before destroying the subprocess memory.
WaitingThread read some shared memory with the SubProcess
(`mutex`, `m_running`).
When we destroy the SubProcess, we must be sure that WaitingThread has
correctly finished else we may have invalid read/write on freed memory.
2020-08-26 12:26:04 +02:00
Kelson
4e98a76c19 Merge pull request #408 from kiwix/ppa-bionic
PPA: Build and publish packages for Ubuntu BIonic
2020-08-21 19:17:22 +02:00
Kunal Mehta
14767e6edb debian: Support already gzipped manpages
meson stopped automatically compressing in 0.49.0, but bionic uses 0.45.0.
2020-08-21 19:12:20 +02:00
Kunal Mehta
39bc8828cd PPA: Build packages for Ubuntu Bionic 2020-08-21 19:12:20 +02:00
Kelson
c183f57670 Merge pull request #401 from kiwix/rgaudin/external-block-warn
Added comment marking dependency of a JS variable with warc2zim
2020-08-19 18:32:22 +02:00
renaud gaudin
009eb7f905 Added comment marking dependency of a JS variable with warc2zim
warc2zim's service worker captures all requests and then decides what to do based on
availability of the URL in the ZIM or not.
To allow the external URL blocking mechanism, it needs to known whether this was
enabled or not (as those in-iframe links won't be caught).
It detects this by looking for the `window.block_path` variable that is set in the
`block_external.js` script.

As this is fragile, we're adding a comment so that a future maintainer knows that
a third party tools relies on it.
2020-08-19 18:31:46 +02:00
Matthieu Gautier
eb7a8beb77 Merge pull request #397 from kiwix/refactor_response 2020-08-13 11:22:06 +02:00
Matthieu Gautier
6f0d3003ac Remove m_compress member. 2020-08-13 11:16:41 +02:00
Matthieu Gautier
ee17b0739a Fix compilation on CI native dyn.
On the CI, the native_dyn docker image is setup with a packaged version
on libmicrohttpd for which `MHD_HTTP_RANGE_NOT_SATISFIABLE` is not
defined.

When the CI will be fixed, we can revert this commit.
2020-08-13 11:16:41 +02:00
Matthieu Gautier
47436f7bdd Move some header setting in response's constructors.
It make easier to understand what is somehow constant and what depends
of the context.
2020-08-13 11:16:41 +02:00
Matthieu Gautier
3352c95314 Remove the RedirectResponse and use a basic Response with header. 2020-08-13 11:16:41 +02:00
Matthieu Gautier
77123ac74c Move the adding of 304 headers in 304 factory.
This avoid us to create a ContentResponse just to have some correct
headers.
2020-08-13 11:16:41 +02:00
Matthieu Gautier
9078f0ac6e Remove ResponseMode. 2020-08-13 11:16:41 +02:00
Matthieu Gautier
8d6567d067 Create a utility builder for 416 response.
Also add a map in the response to store specific headers.
2020-08-13 11:16:41 +02:00
Matthieu Gautier
6d5cddca12 Fix android compilation
Android clang complains about the fact it cannot move the
`std::unique_ptr<ContentResponse>` into a `std::unique_ptr<Response>&&`
(for the implicit `std::unique_ptr<Response>` constructor).
Let's help him a bit.
2020-08-13 11:16:41 +02:00
Matthieu Gautier
a3939e9a05 Move all the content code in the ContentResponse. 2020-08-13 11:16:41 +02:00
Matthieu Gautier
eee621d15b Move small utilities method to create response in Response class. 2020-08-13 11:16:41 +02:00
Matthieu Gautier
7b2ee37437 Move the entry response to its own class. 2020-08-13 11:16:41 +02:00
Matthieu Gautier
f014fb2895 Introduce a ContentResponse.
This is only an "interface" for now as other type of response (entry) may
be "transformed" to a ContentResponse.
We cannot move all the code in the class.
2020-08-13 11:16:41 +02:00
Matthieu Gautier
1011d1ff0b Move the redirection response in its own class.
The redirection is the easiest to move, let's start with this one.
2020-08-13 11:16:41 +02:00
Matthieu Gautier
9e351b279e Remove get_default_response in favor of a static Response method.
We want to build different kind of response depending of the context.
2020-08-13 11:16:41 +02:00
Matthieu Gautier
a0bdc0821c Move internalServer code into its own source files. 2020-08-13 11:16:41 +02:00
Matthieu Gautier
a819d9e3e0 Make the server handle pointer to response instead of plain response.
This is a preparatory work.
We will specialize the response and so we need a pointer to response
instead of plain response.
2020-08-13 11:16:41 +02:00
rgaudin
49f3c56680 Merge pull request #400 from kiwix/rgaudin/home-favicon-size
Set fixed size for favicon in home page listing
2020-08-13 09:59:25 +02:00
renaud gaudin
1657b1744c Set fixed size for favicon in home page listing
While [spec](https://wiki.openzim.org/wiki/Metadata#Favicon) says that the favicon
should be a 48x48 image, ZIM creators might not respect it.

If a ZIM contains a larger favicon, the UI is broken. This fixes it ans ensures all
favicon have equal sizes, removing the unpleasing lack of harmony that we can see sometimes.

Note that ZIM will smaller size favicon would get blurry as those would be upscaled.
2020-08-12 14:47:37 +02:00
Matthieu Gautier
a126482d69 Merge pull request #399 from MananJethwani/search_pagination 2020-08-12 11:11:40 +02:00
manan jethwani
c74b935a9b added pageLength for search_pagination 2020-08-12 02:08:02 +05:30
Matthieu Gautier
015444cfd5 Merge pull request #406 from kiwix/fix_taskbar_impl 2020-08-11 18:37:42 +02:00
Matthieu Gautier
a55d504017 Fix getArticleCount.
With #403, the article mimetype may be different than "text/html".
It can also be "text/html; raw=true".
(And in fact it already could have any kind of optional argument).
2020-08-11 18:27:54 +02:00
Matthieu Gautier
87b5adcaf4 Make the response responsible to detect if we must introduce taskbar.
The response detect if taskbar must be added depending of the mimetype.

Now, `set_taskbar` can be call unconditionally
(no need to check for the mimetype)

And we don't need to call set_taskbar if we have no information to set.
2020-08-11 18:27:54 +02:00
Matthieu Gautier
c14c148af7 Merge pull request #405 from kiwix/support_for_meson_0_45 2020-08-11 18:27:37 +02:00
Veloman Yunkan
56e46c43f8 Upped meson version in the README 2020-08-11 18:17:18 +02:00
Veloman Yunkan
c4e6313c90 x in a --> a.contains(x) in meson.build files 2020-08-11 18:17:18 +02:00
Veloman Yunkan
18e46969b7 Workaround for configure_file(copy:true)
The `copy` keyword argument of `configure_file()` first appears in
meson 0.47. Now performing file copy via python.
2020-08-11 18:17:18 +02:00
Veloman Yunkan
e8cc6e4205 Copying data files via a loop 2020-08-11 18:17:18 +02:00
Veloman Yunkan
5f0bcd2bfa Renamed wikipedia_en_ray_charles_mini_2020-03.zim 2020-08-11 18:17:18 +02:00
Veloman Yunkan
a3bef76083 example.zim is copied only if gtest is available 2020-08-11 18:17:18 +02:00
Kelson
66563d93cc Merge pull request #403 from kiwix/rgaudin/no-taskbar
Prevent taskbar and blocker at article level
2020-08-07 11:55:09 +02:00
renaud gaudin
3f25a3d005 Fixed #391: prevent taskbar and blocker at article level
Some HTML articles are meant to be displayed through a viewer. In this case,
we know we don't want the server to inject the taskbar nor the link blocker
because the content is not a user-ready web page but a partial element of it.

Such articles still need to be `text/html` to be parsed properly by browsers.

This changes the way we decide to display the tasbar or not.
Previously, we were adding it to every article with a MIME __starting with__ `text/html`.
Now, we're additionally preventing it on `text/html` MIME if there is a `;raw=true` string inside.

This leaves articles with MIME `text/html;raw=true` (warc2zim convention) outside
of the taskbar target.

For similar reasons, the external-link blocker is set to apply to the same set of articles.
Previously, it was applied to all articles which was an (unoticable) mistake.
2020-08-07 09:26:24 +02:00
Kelson
98a10d2ca1 Merge pull request #402 from kiwix/legoktm-aria2
Move aria2 from a build dependency to runtime
2020-08-05 23:08:21 +02:00
Kunal Mehta
fdc59b1ec9 debian: Move aria2 from build dep to runtime dependency
Fixes https://github.com/kiwix/kiwix-desktop/issues/505.
2020-08-05 14:02:03 -07:00
Kunal Mehta
1b399fc0a2 README: Clarify aria2 is only a runtime dependency 2020-08-05 13:58:53 -07:00
Kelson
5f8c099829 Merge pull request #396 from MananJethwani/add_204_status_code
Added 204 code for empty return of search.
2020-08-01 10:28:56 +02:00
MananJethwani
599aaa4c1b added code for status code 204 for empty return of search. 2020-08-01 01:45:42 +05:30
Matthieu Gautier
1884081ebe Merge pull request #388 from kiwix/case_insensitive_request_headers
Request header case is ignored
2020-07-30 16:27:14 +02:00
Veloman Yunkan
3d425f44de Request header case is ignored
Originally reported against case sensitivity of the Range header
(see issue #387), this fix applies to all request headers (since
according to RFC 7230 all header fields are case-insensitive, see
https://tools.ietf.org/html/rfc7230#section-3.2). However, a
corresponding unit-test was added only for the Range header.
2020-07-30 16:01:51 +02:00
Kelson
d8991c5459 Merge pull request #389 from kiwix/ppa-sync
PPA: Drop eoan and sync with proper Debian package
2020-07-24 11:21:41 +02:00
Kunal Mehta
c7cb87dd57 debian: Sync with proper Debian package
* Switch to debhelper compat 13
* Increase test timeout via `meson test`
* Depend on same version of libzim as meson.build specifies
* Restore mistakenly deleted libkiwix-dev.manpages
  * We still need this file to instruct Debian to associate it with the
    libkiwix-dev package.
2020-07-24 02:01:34 -07:00
Kunal Mehta
1145af43e7 PPA: Stop building for Ubuntu eoan, its EOL
Launchpad no longer accepts uploads for eoan.
2020-07-24 01:58:30 -07:00
Kelson
d5b2742fbb Merge pull request #384 from kiwix/legoktm-fail-fast
PPA: Set fail-fast: false
2020-07-24 08:45:04 +02:00
Kunal Mehta
a0eb972595 PPA: Set fail-fast: false
See discussion/rationale at <https://github.com/openzim/libzim/pull/370>.
2020-07-16 06:02:41 -07:00
Matthieu Gautier
3bf70f4315 New version 9.3.1 2020-07-15 16:42:16 +02:00
Matthieu Gautier
a8130bd4f2 Fix licensing in meson.build. 2020-07-15 16:41:57 +02:00
Matthieu Gautier
a35d95207e Merge pull request #381 from kiwix/manpage
Have meson manage/install the kiwix-compile-resources.1 man page
2020-07-15 14:39:02 +02:00
Kunal Mehta
b18c8e079e Have meson manage/install the kiwix-compile-resources.1 man page 2020-07-15 14:23:05 +02:00
Matthieu Gautier
4e9f563b45 Merge pull request #383 from kiwix/fix_path_windows 2020-07-15 12:00:58 +02:00
Matthieu Gautier
7ece383004 Add support for samba path on windows.
Fix kiwix/kiwix-desktop#429
2020-07-15 11:40:02 +02:00
Matthieu Gautier
4ca9558e30 Fix library test on windows. 2020-07-15 11:40:02 +02:00
Kelson
89582d526e Merge pull request #382 from kiwix/legoktm-sh4-atomic
Pass -latomic on sh4 architecture too
2020-07-15 06:39:40 +02:00
Kunal Mehta
5913f7efab Pass -latomic on sh4 architecture too
Same as #372.

I originally missed this because meson didn't know about sh4 until
recently (c.f. https://github.com/mesonbuild/meson/pull/7359), but
this should work fine on older meson versions too.
2020-07-14 17:22:55 -07:00
Matthieu Gautier
1367c5630f Merge pull request #368 from kiwix/jquery-ui-source
Add non-minified version of jquery-ui.js
2020-07-13 17:30:43 +02:00
Kunal Mehta
0ca27d8edf Add non-minified version of jquery-ui.js
Debian wants to have the source files for minified scripts. Also the
jqueryui.com downloader is broken, so if we ever needed to modify/fix the
library, it would be good to have a non-minified version on hand.

The non-minified version comes from
<https://github.com/components/jqueryui/blob/1.11.0/jquery-ui.js>, which
was the source for the now-deprecated bower package manager.
2020-07-13 16:45:24 +02:00
Kelson
60c6cc35d5 Merge pull request #379 from kiwix/ppa
Automatically build Debian packages and publish to PPA
2020-07-10 15:21:27 +02:00
Kunal Mehta
6b783b3998 Automatically build and publish packages via Github Actions
We can currently only build for Ubuntu versions because we're not yet
publishing libzim for Debian.

Development builds (on commits to master) will build against master libzim
while release builds (on tag pushes) will build against the most recent
release of libzim.
2020-07-10 02:48:23 -07:00
Kunal Mehta
55515f2fc6 Add Debian packaging
It may make sense to move kiwix-compile-resources.1 into the main source
and have meson manage it in the future.
2020-07-10 02:14:38 -07:00
Kelson
7fe07c65fd Merge pull request #378 from kiwix/increase-test-timeout
Increase test timeout to 160s
2020-07-10 10:16:22 +02:00
Kelson
ee204a9b5e Increase test timeout to 160s 2020-07-09 10:02:36 +02:00
Kelson
e743e04b94 Merge pull request #377 from kiwix/libmicrohttpd-compilation-fix
Fix compilation with libmicrohttpd v0.97.1
2020-07-08 14:56:13 +02:00
Kelson
cf8e8b94eb Fix compilation with libmicrohttpd v0.97.1 2020-07-08 14:42:46 +02:00
Matthieu Gautier
d9557da813 Merge pull request #376 from kiwix/dont_include_config
Do not include `kiwix_config.h` in public header.
2020-07-06 16:55:20 +02:00
Matthieu Gautier
c19b983914 Do not include kiwix_config.h in public header.
This define `VERSION` and may conflict with dependent projects.

If some want to get the version of kiwix-lib they can include
`kiwix_config.h` directly.
2020-07-06 16:03:06 +02:00
Matthieu Gautier
f997fdb232 Release 9.3.0 2020-07-02 15:17:46 +02:00
Matthieu Gautier
f0b037f37f Merge pull request #374 from kiwix/new_api_multithread_suggestion
Add new thread safe suggestion API.
2020-07-02 14:12:12 +02:00
Matthieu Gautier
4d307e18eb Add new thread safe suggestion API.
Previous API were using an internal vector to store the suggestions search
results.

The new API takes a vector as out argument. So user can call the functions
without having to protect the search.

We should change the android API to reflect the change but it is a bit
more complex to do at JNI level. As android do not call it multithreaded
we are safe for now. And we need the new API asap for kiwix-desktop.

So we keep the same API on android for now, the new api will be made
in next version.
2020-07-01 17:16:13 +02:00
Matthieu Gautier
e05bd8efd6 Release 9.2.3 2020-07-01 11:33:30 +02:00
Matthieu Gautier
71462696bd Merge pull request #372 from kiwix/link-atomic
Pass -latomic for architectures that need it
2020-06-29 14:43:16 +02:00
Kunal Mehta
fb79cde729 Pass -latomic for architectures that need it
Some architectures, specifically armel, mipsel, m68k & powerpc in
Debian, need to explicitly link to atomic.

Use meson to see if the target's CPU family is one of those, and if so,
pass -latomic to the linker.

Tested on armel and mipsel machines to verify passing -latomic works, and
on armhf and amd64 to ensure normal builds aren't broken.

Fixes #371.
2020-06-29 00:18:13 -07:00
Matthieu Gautier
c986290d83 Merge pull request #359 from kiwix/packaged-mustache
Support building against packaged libkainjow-mustache
2020-06-12 11:13:05 +02:00
Kunal Mehta
af9afab821 Support building against packaged libkainjow-mustache
The Debian/Ubuntu package for mustache.hpp installs it to
/usr/include/kainjow/mustache.hpp. Have meson look for it in that include
directory as well before erroring out.

Fixes #318.
2020-06-12 11:09:34 +02:00
Matthieu Gautier
14af7b756e Merge pull request #366 from kiwix/fix_build_windows
Include missing `algorithm` header.
2020-06-10 16:00:36 +02:00
Matthieu Gautier
ff605873ed Include missing algorithm header.
`min` and `max` functions are defined here.
2020-06-10 15:27:51 +02:00
Matthieu Gautier
6c49c7ee0a Merge pull request #362 from kiwix/build_ci_bionic 2020-06-09 12:13:34 +02:00
Matthieu Gautier
157d1664cf Fix test compilation on bionic 2020-06-09 12:10:05 +02:00
Matthieu Gautier
6f92b7e120 Build the CI also on bionic.
Bionic has a more recent compiler who will catch more issue with the
code.
2020-06-09 12:10:05 +02:00
Matthieu Gautier
fd62acd232 Merge pull request #365 from kiwix/server_corner_cases_unit_test 2020-06-08 15:28:59 +02:00
Veloman Yunkan
1cdf830217 Testing of byte-range requests of 0-sized entries 2020-06-03 14:18:22 +04:00
Veloman Yunkan
0b48ab20bb Enhanced the server unit-test with corner cases 2020-06-03 13:45:31 +04:00
Matthieu Gautier
081a2b2fa6 New version 9.2.2 2020-06-03 10:47:39 +02:00
Matthieu Gautier
bf93d10cde Merge pull request #364 from kiwix/issue363
Fix for the failing assertion in the ByteRange constructor
2020-06-03 10:43:16 +02:00
Veloman Yunkan
05ef5d5f51 Assertion in ByteRange allows 0-sized content
The assertion in the ByteRange constructor was written under the assumption that the content must have non-zero size. Now it allows that corner case.
2020-06-02 21:53:47 +04:00
Matthieu Gautier
4cdae3ca98 New version 9.2.1 2020-06-02 10:18:12 +02:00
Matthieu Gautier
7dcaeed33a Merge pull request #360 from kiwix/http_byte_range 2020-06-01 17:41:59 +02:00
Veloman Yunkan
f52b220d01 Dropped RequestContext::has_range() 2020-05-26 14:10:26 +04:00
Veloman Yunkan
50a850f3a9 Fixed a comment 2020-05-26 14:04:18 +04:00
Veloman Yunkan
886ae17274 Fixed a CodeFactor issue 2020-05-26 13:59:47 +04:00
Veloman Yunkan
a9b6d481cc ServerTest.RangeHasPrecedenceOverCompression 2020-05-26 13:58:20 +04:00
Veloman Yunkan
85d6daabac Rolled back minor unneeded changes 2020-05-26 13:10:50 +04:00
Veloman Yunkan
5f1918d005 Split a long line 2020-05-26 13:04:03 +04:00
Veloman Yunkan
16bd79fa1b Final clean-up of byte_range.{h,cpp} 2020-05-26 12:50:08 +04:00
Veloman Yunkan
c2ebdefe8d Handling of unsatisfiable ranges 2020-05-26 02:11:26 +04:00
Veloman Yunkan
37032892a4 Fixed compilation error under win32_*
ERROR is a macro under Windows
2020-05-26 01:58:17 +04:00
Veloman Yunkan
6b43438b74 Fixed compilation error under native_dyn
MHD_HTTP_RANGE_NOT_SATISFIABLE is not defined in the older version of
libmicrohttpd (that is used under CI/Linux native_dyn).
2020-05-26 01:54:36 +04:00
Veloman Yunkan
7301bf89bb Some refactoring of byte-range parsing 2020-05-26 01:50:29 +04:00
Veloman Yunkan
ff23b28e7c Removed unnecessary qualifier 2020-05-26 01:41:37 +04:00
Veloman Yunkan
931e95f391 Invalid byte ranges result in 416 responses 2020-05-26 01:40:07 +04:00
Veloman Yunkan
f7571b5b69 Content-Range header is set only for partial content 2020-05-25 17:42:18 +04:00
Veloman Yunkan
801ad18a89 ByteRange::resolve() 2020-05-25 17:27:35 +04:00
Veloman Yunkan
67a347c0c4 Moved byte-range parsing to byte_range.cpp 2020-05-25 17:21:10 +04:00
Veloman Yunkan
693905eb68 Default constructed ByteRange is a full range 2020-05-25 17:17:56 +04:00
Veloman Yunkan
f3e79c6b4c Introduced src/server/byte_range.cpp 2020-05-25 16:43:44 +04:00
Veloman Yunkan
52f207eaa6 Support for single-ended byte ranges 2020-05-25 16:37:01 +04:00
Veloman Yunkan
67294217a8 ByteRange::Kind 2020-05-25 16:23:44 +04:00
Veloman Yunkan
d111a40ce8 Response::m_byteRange 2020-05-23 20:35:22 +04:00
Veloman Yunkan
0c5bb3fcfe Moved ByteRange to a header file of its own 2020-05-23 20:08:53 +04:00
Veloman Yunkan
3fba8c20a0 Converted RequestContext::ByteRange to a class
Also renamed the `range_pair` data member of `RequestContext` to `byteRange_`
2020-05-23 19:59:47 +04:00
Veloman Yunkan
54db6049b7 Byte-range parsing not exposed in the header file 2020-05-23 18:58:19 +04:00
Veloman Yunkan
81c38d6b2b parse_byte_range() without side-effects 2020-05-23 18:53:16 +04:00
Veloman Yunkan
e6a86c02ae Got rid of RequestContext::accept_range 2020-05-23 17:15:42 +04:00
Veloman Yunkan
a0f7f32570 Re-ordered function definitions 2020-05-23 17:11:26 +04:00
Veloman Yunkan
c39fce8839 RequestContext::parse_byte_range() 2020-05-23 17:09:51 +04:00
Veloman Yunkan
bd2d0bc489 Unit-test for valid single range requests 2020-05-22 17:39:00 +04:00
Veloman Yunkan
de37489c53 Range header starts with a unit spec
After this commit valid ranges of the form "bytes=firstbyte-lastbyte" should
be handled correctly.
2020-05-22 17:17:31 +04:00
Veloman Yunkan
2a35a86de6 Fixed the size value used creating a response
In case of a partial response the size of the response is different
from the served entry size.
2020-05-22 16:49:35 +04:00
Veloman Yunkan
0a30a77c08 Handling of out of bound byte ranges 2020-05-22 16:46:38 +04:00
Veloman Yunkan
1a99bacfe3 Byte ranges are inclusive
The second component of a byte range, if present, designates the
index of the last byte to be included in the partial response.
2020-05-22 16:30:43 +04:00
257 changed files with 18787 additions and 17086 deletions

View File

@@ -1,27 +1,31 @@
name: CI
on: [push]
on:
push:
branches:
- master
pull_request:
jobs:
Macos:
runs-on: macos-latest
steps:
- name: Checkout code
uses: actions/checkout@v1
- name: Setup python 3.5
uses: actions/setup-python@v1
uses: actions/checkout@v2
- name: Setup python 3.10
uses: actions/setup-python@v2
with:
python-version: '3.5'
python-version: '3.10'
- name: Install packages
uses: mstksg/get-package@v1
with:
brew: gcovr pkg-config ninja
run: |
brew update
brew install gcovr pkg-config ninja
- name: Install python modules
run: pip3 install meson==0.49.2 pytest
- name: Install deps
shell: bash
run: |
ARCHIVE_NAME=deps2_osx_native_dyn_kiwix-lib.tar.xz
ARCHIVE_NAME=deps2_osx_native_dyn_libkiwix.tar.xz
wget -O- http://tmp.kiwix.org/ci/${ARCHIVE_NAME} | tar -xJ -C $HOME
- name: Compile
shell: bash
@@ -37,23 +41,14 @@ jobs:
export LD_LIBRARY_PATH=$HOME/BUILD_native_dyn/INSTALL/lib:$HOME/BUILD_native_dyn/INSTALL/lib64
cd build
meson test --verbose
ninja coverage
env:
SKIP_BIG_MEMORY_TEST: 1
- name: Publish coverage
shell: bash
run: |
curl https://codecov.io/bash -o codecov.sh
bash codecov.sh -n osx_native_dyn -Z
rm codecov.sh
env:
CODECOV_TOKEN: ${{ secrets.CODECOV_TOKEN }}
Linux:
strategy:
fail-fast: false
matrix:
target:
name:
- native_static
- native_dyn
- android_arm
@@ -61,34 +56,36 @@ jobs:
- win32_static
- win32_dyn
include:
- target: native_static
image_variant: xenial
- name: native_static
target: native_static
image_variant: bionic
lib_postfix: '/x86_64-linux-gnu'
- target: native_dyn
image_variant: xenial
- name: native_dyn
target: native_dyn
image_variant: bionic
lib_postfix: '/x86_64-linux-gnu'
- target: android_arm
image_variant: xenial
lib_postfix: '/x86_64-linux-gnu'
- target: android_arm64
image_variant: xenial
lib_postfix: '/x86_64-linux-gnu'
- target: win32_static
image_variant: f31
- name: android_arm
target: android_arm
image_variant: bionic
lib_postfix: '/arm-linux-androideabi'
- name: android_arm64
target: android_arm64
image_variant: bionic
lib_postfix: '/aarch64-linux-android'
- name: win32_static
target: win32_static
image_variant: f35
lib_postfix: '64'
- target: win32_dyn
image_variant: f31
- name: win32_dyn
target: win32_dyn
image_variant: f35
lib_postfix: '64'
env:
HOME: /home/runner
runs-on: ubuntu-latest
container:
image: "kiwix/kiwix-build_ci:${{matrix.image_variant}}-26"
image: "kiwix/kiwix-build_ci:${{matrix.image_variant}}-31"
steps:
- name: Extract branch name
shell: bash
run: echo "##[set-output name=branch;]$(echo ${GITHUB_REF#refs/heads/})"
id: extract_branch
- name: Checkout code
shell: python
run: |
@@ -98,13 +95,13 @@ jobs:
'git', 'clone',
'https://github.com/${{github.repository}}',
'--depth=1',
'--branch', '${{steps.extract_branch.outputs.branch}}'
'--branch', '${{ github.head_ref || github.ref_name }}'
]
check_call(command, cwd=environ['HOME'])
- name: Install deps
shell: bash
run: |
ARCHIVE_NAME=deps2_${OS_NAME}_${{matrix.target}}_kiwix-lib.tar.xz
ARCHIVE_NAME=deps2_${OS_NAME}_${{matrix.target}}_libkiwix.tar.xz
wget -O- http://tmp.kiwix.org/ci/${ARCHIVE_NAME} | tar -xJ -C /home/runner
- name: Compile
shell: bash
@@ -121,9 +118,9 @@ jobs:
MESON_OPTION="$MESON_OPTION --cross-file $HOME/BUILD_${{matrix.target}}/meson_cross_file.txt"
fi
if [[ "${{matrix.target}}" =~ android_.* ]]; then
MESON_OPTION="$MESON_OPTION -Dandroid=true"
MESON_OPTION="$MESON_OPTION -Dstatic-linkage=true"
fi
cd $HOME/kiwix-lib
cd $HOME/libkiwix
meson . build ${MESON_OPTION}
cd build
ninja
@@ -134,7 +131,7 @@ jobs:
if: startsWith(matrix.target, 'native_')
shell: bash
run: |
cd $HOME/kiwix-lib/build
cd $HOME/libkiwix/build
meson test --verbose
ninja coverage
env:
@@ -143,7 +140,7 @@ jobs:
- name: Publish coverage
shell: bash
run: |
cd $HOME/kiwix-lib
cd $HOME/libkiwix
curl https://codecov.io/bash -o codecov.sh
bash codecov.sh -n "${OS_NAME}_${{matrix.target}}" -Z
rm codecov.sh

91
.github/workflows/package.yml vendored Normal file
View File

@@ -0,0 +1,91 @@
name: Packages
on: [push, pull_request]
jobs:
build-deb:
runs-on: ubuntu-latest
strategy:
fail-fast: false
matrix:
distro:
- ubuntu-jammy
- ubuntu-impish
- ubuntu-focal
- ubuntu-bionic
steps:
- uses: actions/checkout@v2
# Determine which PPA we should upload to
- name: PPA
id: ppa
run: |
if [[ $REF == refs/tags* ]]
then
echo "::set-output name=ppa::kiwixteam/release"
else
echo "::set-output name=ppa::kiwixteam/dev"
fi
env:
REF: ${{ github.ref }}
- uses: legoktm/gh-action-auto-dch@master
with:
fullname: Kiwix builder
email: release+launchpad@kiwix.org
distro: ${{ matrix.distro }}
- uses: legoktm/gh-action-build-deb@ubuntu-jammy
if: matrix.distro == 'ubuntu-jammy'
name: Build package for ubuntu-jammy
id: build-ubuntu-jammy
with:
args: --no-sign
ppa: ${{ steps.ppa.outputs.ppa }}
- uses: legoktm/gh-action-build-deb@ubuntu-impish
if: matrix.distro == 'ubuntu-impish'
name: Build package for ubuntu-impish
id: build-ubuntu-impish
with:
args: --no-sign
ppa: ${{ steps.ppa.outputs.ppa }}
- uses: legoktm/gh-action-build-deb@ubuntu-focal
if: matrix.distro == 'ubuntu-focal'
name: Build package for ubuntu-focal
id: build-ubuntu-focal
with:
args: --no-sign
ppa: ${{ steps.ppa.outputs.ppa }}
- uses: legoktm/gh-action-build-deb@ubuntu-bionic
if: matrix.distro == 'ubuntu-bionic'
name: Build package for ubuntu-bionic
id: build-ubuntu-bionic
with:
args: --no-sign
ppa: ${{ steps.ppa.outputs.ppa }}
- uses: actions/upload-artifact@v2
with:
name: Packages for ${{ matrix.distro }}
path: output
- uses: legoktm/gh-action-dput@master
name: Upload dev package
# Only upload on pushes to master
if: github.event_name == 'push' && github.event.ref == 'refs/heads/master' && startswith(matrix.distro, 'ubuntu-')
with:
gpg_key: ${{ secrets.LAUNCHPAD_GPG }}
repository: ppa:kiwixteam/dev
packages: output/*_source.changes
- uses: legoktm/gh-action-dput@master
name: Upload release package
# Only upload on pushes to master or tag
if: github.event_name == 'push' && startsWith(github.event.ref, 'refs/tags') && startswith(matrix.distro, 'ubuntu-')
with:
gpg_key: ${{ secrets.LAUNCHPAD_GPG }}
repository: ppa:kiwixteam/release
packages: output/*_source.changes

5
.gitignore vendored
View File

@@ -2,3 +2,8 @@
*.swp
subprojects/googletest-release*
*.class
build/
.vscode/
builddir/
.cache/
.clangd/

186
ChangeLog
View File

@@ -1,3 +1,189 @@
libkiwix 11.0.0
===============
* [server] Add support for internationalization (@veloman-yunkan #679)
* [server] Use gzip compression instead of deflat (mgautierfr #757)
* [server] Version the static resources. This allow better invalidating
browser cache when resources are changed (@veloman-yunkan #712)
* [server|front] Use integer to query the host for page length (@juuz #772)
* [server] Improve multizim search API:
- Improvement of the cache system
- Better API to select on which books to search in.
- SysAdmin is now able to limit the number of book we search in for a multizim search
* [server] Introduce a opensearch API for multizim fulltext search
* [wrapper] Remove java wrapper
* Testing:
- Testing of search result pages content (@veloman-yunkan #765)
- Better testing structure of xml search result (@veloman-yunkan #780)
libkiwix 10.1.1
===============
* Correctly detect the number of article for older zims (<=6) (@mgautier #743)
* [server] Fix fulltext search (@mgautierfr #724)
* [server][internal] New way to build Error message (@veloman-yunkan #732 #738 #744)
* Fix CI (@mgautierfr #736)
libkiwix 10.1.0
===============
This release is an important one as it fixes a Xss vulnerability introduced
in libkiwix 10.0.0
* [SECURITY] Fix a Xss attack vulnerability (introduced in 10.0.0) (@juuz0 #721)
* [server] Add a option to set a limit on the number of connexion per IP (@kelson42 #700)
* [server] Do not display a lang tag in the UI if the book has no language (@juuz0 #706)
* [server] Add the book title associated to a search results (@thavelick #705, @mgautierfr #718)
* Add `dc:issued` to opds output stream (@veloman-yunkan #715)
* Add handling of several languages not provided by ICU (@juuz0 #701)
* [server] Add a caching system for search and suggestion (@maneeshpm #620)
* Fix cross-compilation (@kelson42 #703)
* Add unit-testing of suggestions and error pages (@veloman-yunkan #709 #710 #727)
* Better testing system of html response (@veloman-yunkan #725)
libkiwix 10.0.1
===============
* [server] The catalog search interpret `count=0` as no limit.
This was the case for a long time. This was changed unintentionally
(@veloman-yunkan #686)
* [server] Correctly generere a human friendly title in the server frontend.
(@juuz0 #687, @kelson42 #689)
* [server] Fix download button if there is no url do download from.
(@juuz0 #691)
* Add non-minified isotope.pkdg.js
Needed for debian packaging as we need the source and minified version is
not the source (@legoktm #693)
* [server] Add a tooltip with the full language for the lang tag.
* CI fixes (@kelson42 @legoktm)
libkiwix 10.0.0
===============
This release is huge release.
The project has been renamed to libkiwix, it is more coherent with the library name.
* Server front page :
- Use js in the front page to display the available book,
using the OPDS stream as source. The front page is now populated only with
the visible books and user can search for books. (@MananJethwany #530, #541, #534)
(@kelson42 #628)
- Revamp css (@MananJethwany #559)
- Correctly Convert 3iso language code to 2iso (@juuz0 #672)
* Server suggestions search :
- Add pagination for suggestion search (@maneeshpm #591)
- Fix suggestion system (@MananJethwany #498)
- Provide the kind and path (when adapted) to the suggestion answer (@MananJethwany #464)
- The displayed suggestion have now highligth on the searched terms (@maneeshpm #505)
- Properly handle html encoding of suggestions (@veloman-yunkan #458)
* Server improvements :
- Remove meta endpoints (@mgautier #669)
- Add raw endpoints to get the raw content of a zim (@mgautierfr #646)
- Add details on 404 error pages (@soumyankar #490)
- Fix headbar insertion when `<head>` tag has attributes (@kelson42 #440)
- Better headbar insertion (after charset definition) (@kelson42 #442)
* New OPDS Stream v2 :
- Add a list of categories (@veloman-yunkan)
- Support for partial entries (@veloman-yunkan #602)
- Support multiple icons size in the OPDS stream (@veloman-yunkan #577 #630)
- Add language endpoint to catalog (@veloman-yunkan #553)
- Add illustration API to get the illustration of a book (@mgautierfr #645)
- OPDS search can now filter books by category (@veloman-yunkan #459)
* Library improvements :
- Allow the libray to be live reloaded when the library.xml changes (@veloman-yunkan #636)
- Properly handle removing of book from the library (@veloman-yunkan #485)
- Use xapian to search for books in the library (@veloman-yunkan #460, #488)
* Added methods/functions :
- Fix `fileExist` and introduce `fileReadable` (@juuz0 #668)
- Add `getVersions` and `printVersions` functions (@kelson42 #665)
- Add `getNetworkInterfaces()` and `getBestPublicIP()` functions (@juuz0 #622)
- Add `get_zimid()` method to the search result (@maneeshpm #510)
* Various improvements :
- Better secret value for aria2c rpc (@juuz0 #666)
- Avoid duplicated Archive/Reader in the Searcher (@veloman-yunkan #648)
- Add basic documentation (@mgautierfr #640)
- Do not use Reader internally (@maneeshpm #536 #576)
- Remove dependency headers from our public headers (@mgautierfr #574)
- Downloader now don't write metalink on the filesystem (@kelson42 #502)
- Support opening a zim file using a fd (@veloman-yukan #429)
- Use C++11 std::thread instead of pthread (@mgautierfr #445)
- [READER] Do not crash if zim file has no `Counter` metadata (@mgautierfr #449)
- Ensure libzim dependency is compiled with xapian (@mgautierfr #434)
- Support video and audio mimetype in `getMediaCount` (@kelson42 #439)
- Better parsing of the counterMap (@kelson42 #437)
- Adapt libkiwix to libzim 7.0.0 (@mgautierfr #428)
- Remove deprecated methods (@mgautierfr)
- CI: Build package for Ubuntu Hirsute, Impish and Jammy (@legoktm #431 #568) and remove Groovy
- Fix compilation for FreeBSD (@swills g#432)
- Many fixes and improvement (@MananJethwany, @maneeshpm, @veloman-yunkan, @mgautierfr)
kiwix-lib 9.4.1
===============
* Fix `M/Counter` parsing.
* [SERVER] Adjust body padding-top for taskbar
* Fix potential crash when stoping a server not started.
* Various fix in build system and the CI.
kiwix-lib 9.4.0
===============
* [SERVER] Make the headers handling case insensitive.
* [SERVER] Make server answer 204 http status code for empty search
* [PACKAGING] Made CI build deb packages.
* [SERVER] Add a way to prevent taskbar and external link bloquer at article
level.
* Fix meson file to be compatible with meson 0.45
* [SERVER] Update search requests to use pageStart/pageLength instead of
pageStart/pageEnd arguments.
* [SERVER] Set a fixed favicon size in the main page.
* [SERVER] Refactor the response system code to better handling future new
libzim api.
* Fix segmentation fault around exchange with aria2 process making
kiwix-desktop crash at exit.
kiwix-lib 9.3.1
===============
* Fix handling of samba path on windows.
* Do not include `kiwix_config.h` in public header.
* Fix compilation with libmicrohttpd v0.97.1
* Increase default test timeout to 160seconds/test.
* Add automatic debian packaging.
* Use non-minified version of jquery-ui.js
* Pass `-latomic` compile option for sh4 architecture.
* Make mesion install `kiwix-compile-resources` man page.
kiwix-lib 9.3.0
===============
* Add a thread safe method to search suggestions.
Old methods are now deprecated.
kiwix-lib 9.2.3
===============
* Add test on byte-range
* Fix compilation on bionic and windows.
* Allow building using debian packaged kainjow-mustache
* Pass `-latomic` compile option for architectures that need it
kiwix-lib 9.2.2
===============
* Fix handling on empty content in byte range management (wrong assert)
kiwix-lib 9.2.1
===============
* Fix support of byte range request.
kiwix-lib 9.2
=============

113
README.md
View File

@@ -1,15 +1,16 @@
Kiwix library
=============
Libkiwix
========
The Kiwix library provides the [Kiwix](https://kiwix.org) software
suite core. It contains the code shared by all Kiwix ports (Windows,
The Libkiwix provides the [Kiwix](https://kiwix.org) software suite
core. It contains the code shared by all Kiwix ports (Windows,
GNU/Linux, macOS, Android, iOS, ...).
[![Download](https://api.bintray.com/packages/kiwix/kiwix/kiwixlib/images/download.svg)](https://bintray.com/kiwix/kiwix/kiwixlib/_latestVersion)
[![AUR version](https://img.shields.io/aur/version/kiwix-lib)](https://aur.archlinux.org/packages/kiwix-lib/)
[![Build Status](https://github.com/kiwix/kiwix-lib/workflows/CI/badge.svg?query=branch%3Amaster)](https://github.com/kiwix/kiwix-lib/actions?query=branch%3Amaster)
[![CodeFactor](https://www.codefactor.io/repository/github/kiwix/kiwix-lib/badge)](https://www.codefactor.io/repository/github/kiwix/kiwix-lib)
[![Codecov](https://codecov.io/gh/kiwix/kiwix-lib/branch/master/graph/badge.svg)](https://codecov.io/gh/kiwix/kiwix-lib)
[![Release](https://img.shields.io/github/v/tag/kiwix/libkiwix?label=release&sort=semver)](https://download.kiwix.org/release/libkiwix/)
[![Repositories](https://img.shields.io/repology/repositories/libkiwix?label=repositories)](https://github.com/kiwix/libkiwix/wiki/Repology)
[![Build Status](https://github.com/kiwix/libkiwix/workflows/CI/badge.svg?query=branch%3Amaster)](https://github.com/kiwix/libkiwix/actions?query=branch%3Amaster)
[![Doc](https://readthedocs.org/projects/libkiwix/badge/?style=flat)](https://libkiwix.readthedocs.org/en/latest/?badge=latest)
[![CodeFactor](https://www.codefactor.io/repository/github/kiwix/libkiwix/badge)](https://www.codefactor.io/repository/github/kiwix/libkiwix)
[![Codecov](https://codecov.io/gh/kiwix/libkiwix/branch/master/graph/badge.svg)](https://codecov.io/gh/kiwix/libkiwix)
[![License: GPL v3](https://img.shields.io/badge/License-GPLv3-blue.svg)](https://www.gnu.org/licenses/gpl-3.0)
Disclaimer
@@ -17,31 +18,38 @@ Disclaimer
This document assumes you have a little knowledge about software
compilation. If you experience difficulties with the dependencies or
with the Kiwix libary compilation itself, we recommend to have a look
to [kiwix-build](https://github.com/kiwix/kiwix-build).
with the Libkiwix compilation itself, we recommend to have a look to
[kiwix-build](https://github.com/kiwix/kiwix-build).
Preamble
--------
Although the Kiwix library can be (cross-)compiled on/for many
sytems, the following documentation explains how to do it on POSIX
ones. It is primarly thought for GNU/Linux systems and has been tested
on recent releases of Ubuntu and Fedora.
Although the Libkiwix can be (cross-)compiled on/for many sytems, the
following documentation explains how to do it on POSIX ones. It is
primarly thought for GNU/Linux systems and has been tested on recent
releases of Ubuntu and Fedora.
Dependencies
------------
The Kiwix library relies on many third parts software libraries. They
are prerequisites to the Kiwix library compilation. Following
libraries need to be available:
The Libkiwix relies on many third party software libraries. They are
prerequisites to the Libkiwix compilation. Following libraries need to
be available:
* [ICU](https://site.icu-project.org/) (package `libicu-dev` on Ubuntu)
* [ZIM](https://openzim.org/) (package `libzim-dev` on Ubuntu)
* [Pugixml](https://pugixml.org/) (package `libpugixml-dev` on Ubuntu)
* [Aria2](https://aria2.github.io/) (package `aria2` on Ubuntu)
* [Mustache](https://github.com/kainjow/Mustache) (Just copy the
header `mustache.hpp` somewhere it can be found by the compiler and/or
set CPPFLAGS with correct `-I` option). Use Mustache version 3 only.
set CPPFLAGS with correct `-I` option). Use Mustache version 4.1 or above.
* [Libcurl](https://curl.se/libcurl) (`libcurl4-gnutls-dev`, `libcurl4-nss-dev` or `libcurl4-openssl-dev` on Ubuntu)
* [Microhttpd](https://www.gnu.org/software/libmicrohttpd) (package `libmicrohttpd-dev` on Ubuntu)
* [Zlib](https://zlib.net/) (package `zlib1g-dev` on Ubuntu)
To test the code:
* [Google Test](https://github.com/google/googletest) (package `googletest` on Ubuntu)
The following dependency needs to be available at runtime:
* [Aria2](https://aria2.github.io/) (package `aria2` on Ubuntu)
These dependencies may or may not be packaged by your operating
system. They may also be packaged but only in an older version. The
@@ -50,13 +58,13 @@ In the worse case, you will have to download and compile bleeding edge
version by hand.
If you want to install these dependencies locally, then use the
`kiwix-lib` directory as install prefix.
`libkiwix` directory as install prefix.
Environment
-------------
The Kiwix library builds using [Meson](https://mesonbuild.com/) version
0.43 or higher. Meson relies itself on Ninja, pkg-config and few other
The Libkiwix builds using [Meson](https://mesonbuild.com/) version
0.45 or higher. Meson relies itself on Ninja, pkg-config and few other
compilation tools.
Install first the few common compilation tools:
@@ -71,7 +79,7 @@ section.
Compilation
-----------
Once all dependencies are installed, you can compile the Kiwix library
Once all dependencies are installed, you can compile the Libkiwix
with:
```bash
meson . build
@@ -79,12 +87,20 @@ ninja -C build
```
By default, it will compile dynamic linked libraries. All binary files
will be created in the "build" directory created automatically by
will be created in the `build` directory created automatically by
Meson. If you want statically linked libraries, you can add
`--default-library=static` option to the Meson command.
Depending of you system, `ninja` may be called `ninja-build`.
The android wrapper uses deprecated methods of libkiwix so it cannot be compiled
with `werror=true` (the default). So you must pass `-Dwerror=false` to meson:
```bash
meson . build -Dwrapper=android -Dwerror=false
ninja -C build
```
Testing
-------
@@ -97,7 +113,7 @@ meson test
Installation
------------
If you want to install the Kiwix library and the headers you just have
If you want to install the Libkiwix and the headers you just have
compiled on your system, here we go:
```bash
ninja -C build install
@@ -140,6 +156,49 @@ cp ninja ../bin
cd ..
```
Custom Index Page
-----------------
to use custom welcome page mention `customIndexPage` argument in `kiwix::internalServer()` or use `kiwix::server->setCustomIndexTemplate()`.
(note - while using custom html file please mention all external links as absolute path.)
to create a HTML template with custom JS you need to have a look at various OPDS based endpoints as mentioned [here](https://wiki.kiwix.org/wiki/OPDS) to load books.
To use JS provided by kiwix-serve you can use the following template to start with ->
```
<!DOCTYPE html>
<html xmlns="http://www.w3.org/1999/xhtml">
<head>
<meta charset="UTF-8" />
<meta name="viewport" content="width=device-width,initial-scale=1" />
<title><-- Custom Tittle --></title>
<script src="{{root}}/skin/isotope.pkgd.min.js" defer></script>
<script src="{{root}}/skin/iso6391To3.js"></script>
<script type="text/javascript" src="{{root}}/skin/index.js" defer></script>
</head>
<body>
</body>
</html>
```
- To get books listed using `index.js` add - `<div class="book__list"></div>` under body tag.
- To get number of books listed add - `<h3 class="kiwixHomeBody__results"></h3>` under body tag.
- To add language select box add - `<select id="languageFilter"></select>` under body tag.
- To add language select box add - `<select id="categoryFilter"></select>` under body tag.
- To add search box for books use following form -
```
<form id='kiwixSearchForm'>
<input type="text" name="q" placeholder="Search" id="searchFilter" class='kiwixSearch filter'>
<input type="submit" class="kiwixButton" value="Search"/>
</form>
```
If you compile manually Libmicrohttpd, you might need to compile it
without GNU TLS, a bug here will empeach further compilation
otherwise.
If the compilation still fails, you might need to get a more recent
version of a dependency than the one packaged by your Linux
distribution. Try then with a source tarball distributed by the

View File

@@ -26,10 +26,10 @@ task writePom {
project {
groupId 'org.kiwix.kiwixlib'
artifactId 'kiwixlib'
version '9.2' + (System.env.KIWIXLIB_BUILDVERSION == null ? '' : '-'+System.env.KIWIXLIB_BUILDVERSION)
version '10.1.1' + (System.env.KIWIXLIB_BUILDVERSION == null ? '' : '-'+System.env.KIWIXLIB_BUILDVERSION)
packaging 'aar'
name 'kiwixlib'
url 'https://github.com/kiwix/kiwix-lib'
url 'https://github.com/kiwix/libkiwix'
licenses {
license {
name 'GPLv3'
@@ -44,9 +44,9 @@ task writePom {
}
}
scm {
connection 'https://github.com/kiwix/kiwix-lib.git'
developerConnection 'https://github.com/kiwix/kiwix-lib.git'
url 'https://github.com/kiwix/kiwix-lib'
connection 'https://github.com/kiwix/libkiwix.git'
developerConnection 'https://github.com/kiwix/libkiwix.git'
url 'https://github.com/kiwix/libkiwix'
}
}
}.withXml {

5
debian/changelog vendored Normal file
View File

@@ -0,0 +1,5 @@
libkiwix (0.0.0) unstable; urgency=medium
* Initial release
-- Kunal Mehta <legoktm@debian.org> Wed, 08 Jul 2020 18:12:32 -0700

46
debian/control vendored Normal file
View File

@@ -0,0 +1,46 @@
Source: libkiwix
Priority: optional
Maintainer: Kiwix team <kiwix@kiwix.org>
Build-Depends: debhelper-compat (= 13),
meson,
pkg-config,
libzim-dev (>= 7.2.0~),
libcurl4-gnutls-dev,
libicu-dev,
libgtest-dev,
libkainjow-mustache-dev,
liblzma-dev,
libmicrohttpd-dev,
libpugixml-dev,
zlib1g-dev
Standards-Version: 4.5.0
Section: libs
Homepage: https://github.com/kiwix/libkiwix
Rules-Requires-Root: no
Package: libkiwix-dev
Section: libdevel
Architecture: any
Multi-Arch: same
Depends: libkiwix10 (= ${binary:Version}), ${misc:Depends}, python3,
libzim-dev (>= 7.2.0~),
libicu-dev,
libpugixml-dev,
libcurl4-gnutls-dev,
libmicrohttpd-dev
Description: library of common code for Kiwix (development)
Kiwix is an offline Wikipedia reader. libkiwix provides the
software core for Kiwix, and contains the code shared by all
Kiwix ports (Windows, Linux, OSX, Android, etc.).
.
This package contains development files.
Package: libkiwix10
Architecture: any
Multi-Arch: same
Depends: ${shlibs:Depends}, ${misc:Depends}, aria2
Conflicts: libkiwix0, libkiwix3, libkiwix9
Description: library of common code for Kiwix
Kiwix is an offline Wikipedia reader. libkiwix provides the
software core for Kiwix, and contains the code shared by all
Kiwix ports (Windows, Linux, OSX, Android, etc.).

1
debian/copyright vendored Normal file
View File

@@ -0,0 +1 @@
See COPYING in the repository root.

4
debian/libkiwix-dev.install vendored Normal file
View File

@@ -0,0 +1,4 @@
usr/include
usr/lib/*/libkiwix.so
usr/lib/*/pkgconfig
usr/bin

2
debian/libkiwix-dev.manpages vendored Normal file
View File

@@ -0,0 +1,2 @@
usr/share/man/man1/kiwix-compile-resources.1*
usr/share/man/man1/kiwix-compile-i18n.1*

1
debian/libkiwix10.install vendored Normal file
View File

@@ -0,0 +1 @@
usr/lib/*/libkiwix.so.*

8
debian/rules vendored Executable file
View File

@@ -0,0 +1,8 @@
#!/usr/bin/make -f
export DEB_BUILD_MAINT_OPTIONS = hardening=+all
%:
dh $@ --buildsystem=meson
override_dh_auto_test:
dh_auto_test -- -t 3

1
debian/source/format vendored Normal file
View File

@@ -0,0 +1 @@
3.0 (native)

2
docs/.gitignore vendored Normal file
View File

@@ -0,0 +1,2 @@
api
xml

72
docs/conf.py Normal file
View File

@@ -0,0 +1,72 @@
# Configuration file for the Sphinx documentation builder.
#
# This file only contains a selection of the most common options. For a full
# list see the documentation:
# https://www.sphinx-doc.org/en/master/usage/configuration.html
# -- Path setup --------------------------------------------------------------
# If extensions (or modules to document with autodoc) are in another directory,
# add these directories to sys.path here. If the directory is relative to the
# documentation root, use os.path.abspath to make it absolute, like shown here.
#
import os
# import sys
# sys.path.insert(0, os.path.abspath('.'))
# -- Project information -----------------------------------------------------
project = 'libkiwix'
copyright = '2022, libkiwix-team'
author = 'libkiwix-team'
# -- General configuration ---------------------------------------------------
on_rtd = os.environ.get('READTHEDOCS', None) == 'True'
# Add any Sphinx extension module names here, as strings. They can be
# extensions coming with Sphinx (named 'sphinx.ext.*') or your custom
# ones.
extensions = [
'breathe',
'exhale'
]
# Add any paths that contain templates here, relative to this directory.
templates_path = ['_templates']
# List of patterns, relative to source directory, that match files and
# directories to ignore when looking for source files.
# This pattern also affects html_static_path and html_extra_path.
exclude_patterns = ['_build', 'Thumbs.db', '.DS_Store']
if not on_rtd:
html_theme = 'sphinx_rtd_theme'
# Add any paths that contain custom static files (such as style sheets) here,
# relative to this directory. They are copied after the builtin static files,
# so a file named "default.css" will overwrite the builtin "default.css".
html_static_path = ['_static']
breathe_projects = {
"libkiwix": "./xml"
}
breathe_default_project = 'libkiwix'
exhale_args = {
"containmentFolder": "./api",
"rootFileName": "ref_api.rst",
"rootFileTitle": "Reference API",
"doxygenStripFromPath":"..",
"treeViewIsBootstrap": True,
"createTreeView" : True,
"exhaleExecutesDoxygen": True,
"exhaleDoxygenStdin": "INPUT = ../include"
}
primary_domain = 'cpp'
highlight_language = 'cpp'

15
docs/index.rst Normal file
View File

@@ -0,0 +1,15 @@
.. libkiwix documentation master file, created by
sphinx-quickstart on Fri Jul 24 15:40:50 2020.
You can adapt this file completely to your liking, but it should at least
contain the root `toctree` directive.
Welcome to libkiwix's documentation!
==================================
.. toctree::
:maxdepth: 2
:caption: Contents:
usage
api/ref_api
widget

7
docs/meson.build Normal file
View File

@@ -0,0 +1,7 @@
sphinx = find_program('sphinx-build', native:true)
sphinx_target = run_target('doc',
command: [sphinx, '-bhtml',
meson.current_source_dir(),
meson.current_build_dir()])

2
docs/requirements.txt Normal file
View File

@@ -0,0 +1,2 @@
breathe
exhale

15
docs/usage.rst Normal file
View File

@@ -0,0 +1,15 @@
Libkiwix programming
====================
Introduction
------------
libkiwix is written in C++. To use the library, you need the include files of libkiwix have
to link against libzim.
Errors are handled with exceptions. When something goes wrong, libkiwix throws an error,
which is always derived from std::exception.
All classes are defined in the namespace kiwix.
libkiwix is a set of tools to manage zim files and provide some common functionnality.

82
docs/widget.rst Normal file
View File

@@ -0,0 +1,82 @@
Kiwix serve widget
====================
Introduction
------------
The kiwix-serve widget provides an easy to embed way to show the `kiwix-serve` homepage.
Usage
-----
To use the widget, simply add an iframe with its `src` attribute set to the `widget` endpoint.
Example HTML Page ::
<!DOCTYPE html>
<html lang="en">
<head>
<title>Widget Test</title>
</head>
<body>
<iframe src="http://192.168.18.8:8080/widget?disabledesc&disablefilter&disabledownload" width=1000 height=1000></iframe>
</body>
</html>
This creates an iframe with the kiwix-serve homepage contents.
Arguments are explained below.
Possible Arguments
-------------------
Currently, the following arguments are supported.
disabledesc (value = N/A)
Disables the description part of a tile.
disablefilter (value = N/A)
Disables the search filters: language, category, tag and search function.
disableclick (value = N/A)
Disables clicking the book to open it for reading.
disabledownload (value = N/A)
Disables the download button (if avaialable at all) on the tile.
Custom CSS and JS
-----------------
You can add your custom CSS rules and Javascript code to the widget.
To do that, use the following code as template::
<iframe id="receiver" src="http://192.168.18.8:8080/widget?disabledesc=&disablefilter=&disabledownload=" width="1000" height="1000">
<p>Your browser does not support iframes.</p>
</iframe>
<script>
window.onload = function() {
var receiver = document.getElementById('receiver').contentWindow;
function sendMessage() {
let msg = {
css: `
.book__header {
color:red;
}`,
js: `
function widgetTest() {
console.log("Testing widget");
}
widgetTest();
`
}
receiver.postMessage(msg, 'http://192.168.18.8:8080/widget');
}
sendMessage();
}
</script>
The CSS/JS fields are optional, you may send both or only one.

View File

@@ -7,6 +7,7 @@ files=(
"include/common/otherTools.h"
"include/common/regexTools.h"
"include/common/networkTools.h"
"include/common/archiveTools.h"
"include/manager.h"
"include/reader.h"
"include/kiwix.h"
@@ -22,6 +23,7 @@ files=(
"src/common/pathTools.cpp"
"src/common/regexTools.cpp"
"src/common/otherTools.cpp"
"src/common/archiveTools.cpp"
"src/common/networkTools.cpp"
"src/common/stringTools.cpp"
"src/xapianSearcher.cpp"

View File

@@ -21,28 +21,54 @@
#define KIWIX_BOOK_H
#include <string>
#include <vector>
#include <memory>
#include <mutex>
#include "common.h"
namespace pugi {
class xml_node;
}
namespace zim {
class Archive;
}
namespace kiwix
{
class OPDSDumper;
class Reader;
/**
* A class to store information about a book (a zim file)
*/
class Book
{
public:
public: // types
class Illustration
{
friend class Book;
public:
uint16_t width = 48;
uint16_t height = 48;
std::string mimeType;
std::string url;
const std::string& getData() const;
private:
mutable std::string data;
mutable std::mutex mutex;
};
typedef std::vector<std::shared_ptr<const Illustration>> Illustrations;
public: // functions
Book();
~Book();
bool update(const Book& other);
void update(const Reader& reader);
void update(const zim::Archive& archive);
void updateFromXml(const pugi::xml_node& node, const std::string& baseDir);
void updateFromOpds(const pugi::xml_node& node, const std::string& urlHost);
std::string getHumanReadableIdFromPath() const;
@@ -59,6 +85,7 @@ class Book
const std::string& getDate() const { return m_date; }
const std::string& getUrl() const { return m_url; }
const std::string& getName() const { return m_name; }
std::string getCategory() const;
const std::string& getTags() const { return m_tags; }
std::string getTagStr(const std::string& tagName) const;
bool getTagBool(const std::string& tagName) const;
@@ -67,9 +94,13 @@ class Book
const uint64_t& getArticleCount() const { return m_articleCount; }
const uint64_t& getMediaCount() const { return m_mediaCount; }
const uint64_t& getSize() const { return m_size; }
const std::string& getFavicon() const;
const std::string& getFaviconUrl() const { return m_faviconUrl; }
const std::string& getFaviconMimeType() const { return m_faviconMimeType; }
DEPRECATED const std::string& getFavicon() const;
DEPRECATED const std::string& getFaviconUrl() const;
DEPRECATED const std::string& getFaviconMimeType() const;
Illustrations getIllustrations() const;
std::shared_ptr<const Illustration> getIllustration(unsigned int size) const;
const std::string& getDownloadId() const { return m_downloadId; }
void setReadOnly(bool readOnly) { m_readOnly = readOnly; }
@@ -90,17 +121,20 @@ class Book
void setArticleCount(uint64_t articleCount) { m_articleCount = articleCount; }
void setMediaCount(uint64_t mediaCount) { m_mediaCount = mediaCount; }
void setSize(uint64_t size) { m_size = size; }
void setFavicon(const std::string& favicon) { m_favicon = favicon; }
void setFaviconMimeType(const std::string& faviconMimeType) { m_faviconMimeType = faviconMimeType; }
void setDownloadId(const std::string& downloadId) { m_downloadId = downloadId; }
protected:
private: // functions
std::string getCategoryFromTags() const;
const Illustration& getDefaultIllustration() const;
protected: // data
std::string m_id;
std::string m_downloadId;
std::string m_path;
bool m_pathValid = false;
std::string m_title;
std::string m_description;
std::string m_category;
std::string m_language;
std::string m_creator;
std::string m_publisher;
@@ -114,9 +148,11 @@ class Book
uint64_t m_mediaCount = 0;
bool m_readOnly = false;
uint64_t m_size = 0;
mutable std::string m_favicon;
std::string m_faviconUrl;
std::string m_faviconMimeType;
Illustrations m_illustrations;
// Used as the return value of getDefaultIllustration() when no default
// illustration is found in the book
static const Illustration missingDefaultIllustration;
};
}

View File

@@ -23,7 +23,6 @@
#include <string>
#include <vector>
#include <map>
#include <pthread.h>
#include <memory>
#include <stdexcept>

View File

@@ -1,192 +0,0 @@
/*
* Copyright 2018 Matthieu Gautier <mgautier@kymeria.fr>
*
* This program is free software; you can redistribute it and/or modify
* it under the terms of the GNU General Public License as published by
* the Free Software Foundation; either version 3 of the License, or
* any later version.
*
* This program is distributed in the hope that it will be useful,
* but WITHOUT ANY WARRANTY; without even the implied warranty of
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
* GNU General Public License for more details.
*
* You should have received a copy of the GNU General Public License
* along with this program; if not, write to the Free Software
* Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston,
* MA 02110-1301, USA.
*/
#ifndef KIWIX_ENTRY_H
#define KIWIX_ENTRY_H
#include <stdio.h>
#include <zim/article.h>
#include <exception>
#include <string>
#include "common.h"
using namespace std;
namespace kiwix
{
class NoEntry : public std::exception {};
/**
* A entry represent an.. entry in a zim file.
*/
class Entry
{
public:
/**
* Default constructor.
*
* Construct an invalid entry.
*/
Entry() = default;
/**
* Construct an entry making reference to an zim article.
*
* @param article a zim::Article object
*/
Entry(zim::Article article);
virtual ~Entry() = default;
/**
* Get the path of the entry.
*
* The path is the "key" of an entry.
*
* @return the path of the entry.
*/
std::string getPath() const;
/**
* Get the title of the entry.
*
* @return the title of the entry.
*/
std::string getTitle() const;
/**
* Get the content of the entry.
*
* The string is a copy of the content.
* If you don't want to do a copy, use get_blob.
*
* @return the content of the entry.
*/
std::string getContent() const;
/**
* Get the blob of the entry.
*
* A blob make reference to the content without copying it.
*
* @param offset The starting offset of the blob.
* @return the blob of the entry.
*/
zim::Blob getBlob(offset_type offset = 0) const;
/**
* Get the blob of the entry.
*
* A blob make reference to the content without copying it.
*
* @param offset The starting offset of the blob.
* @param size The size of the blob.
* @return the blob of the entry.
*/
zim::Blob getBlob(offset_type offset, size_type size) const;
/**
* Get the info for direct access to the content of the entry.
*
* Some entry (ie binary ones) have their content plain stored
* in the zim file. Knowing the offset where the content is stored
* an user can directly read the content in the zim file bypassing the
* kiwix-lib/libzim.
*
* @return A pair specifying where to read the content.
* The string is the real file to read (may be different that .zim
* file if zim is cut).
* The offset is the offset to read in the file.
* Return <"",0> if is not possible to read directly.
*/
std::pair<std::string, offset_type> getDirectAccessInfo() const;
/**
* Get the size of the entry.
*
* @return the size of the entry.
*/
size_type getSize() const;
/**
* Get the mime_type of the entry.
*
* @return the mime_type of the entry.
*/
std::string getMimetype() const;
/**
* Get if the entry is a redirect entry.
*
* @return True if the entry is a redirect.
*/
bool isRedirect() const;
/**
* Get if the entry is a link target entry.
*
* @return True if the entry is a link target.
*/
bool isLinkTarget() const;
/**
* Get if the entry is a deleted entry.
*
* @return True if the entry is a deleted entry.
*/
bool isDeleted() const;
/**
* Get the entry pointed by this entry.
*
* @return the entry pointed.
* @throw NoEntry if the entry is not a redirected entry.
*/
Entry getRedirectEntry() const;
/**
* Get the final entry pointed by this entry.
*
* Follow the redirection until a "not redirecting" entry is found.
* If the entry is not a redirected entry, return the entry itself.
*
* @return the final entry.
*/
Entry getFinalEntry() const;
/**
* Convert the entry to a boolean value.
*
* @return True if the entry is valid.
*/
explicit operator bool() const { return good(); }
private:
zim::Article article;
mutable zim::Article final_article;
bool good() const { return article.good(); }
};
}
#endif // KIWIX_ENTRY_H

View File

@@ -22,4 +22,4 @@
#include "library.h"
#endif
#endif

View File

@@ -24,6 +24,9 @@
#include <vector>
#include <map>
#include <memory>
#include <mutex>
#include <zim/archive.h>
#include <zim/search.h>
#include "book.h"
#include "bookmark.h"
@@ -35,6 +38,7 @@ namespace kiwix
{
class OPDSDumper;
class Library;
enum supportedListSortBy { UNSORTED, TITLE, SIZE, DATE, CREATOR, PUBLISHER };
enum supportedListMode {
@@ -48,18 +52,23 @@ enum supportedListMode {
};
class Filter {
private:
public: // types
using Tags = std::vector<std::string>;
private: // data
uint64_t activeFilters;
std::vector<std::string> _acceptTags;
std::vector<std::string> _rejectTags;
Tags _acceptTags;
Tags _rejectTags;
std::string _category;
std::string _lang;
std::string _publisher;
std::string _creator;
size_t _maxSize;
std::string _query;
bool _queryIsPartial;
std::string _name;
public:
public: // functions
Filter();
~Filter() = default;
@@ -93,33 +102,87 @@ class Filter {
/**
* Set the filter to only accept book with corresponding tag.
*/
Filter& acceptTags(std::vector<std::string> tags);
Filter& rejectTags(std::vector<std::string> tags);
Filter& acceptTags(const Tags& tags);
Filter& rejectTags(const Tags& tags);
Filter& category(std::string category);
Filter& lang(std::string lang);
Filter& publisher(std::string publisher);
Filter& creator(std::string creator);
Filter& maxSize(size_t size);
Filter& query(std::string query);
Filter& query(std::string query, bool partial=true);
Filter& name(std::string name);
bool hasQuery() const;
const std::string& getQuery() const { return _query; }
bool queryIsPartial() const { return _queryIsPartial; }
bool hasName() const;
const std::string& getName() const { return _name; }
bool hasCategory() const;
const std::string& getCategory() const { return _category; }
bool hasLang() const;
const std::string& getLang() const { return _lang; }
bool hasPublisher() const;
const std::string& getPublisher() const { return _publisher; }
bool hasCreator() const;
const std::string& getCreator() const { return _creator; }
const Tags& getAcceptTags() const { return _acceptTags; }
const Tags& getRejectTags() const { return _rejectTags; }
private: // functions
friend class Library;
bool accept(const Book& book) const;
};
class ZimSearcher : public zim::Searcher
{
public:
explicit ZimSearcher(zim::Searcher&& searcher)
: zim::Searcher(searcher)
{}
std::unique_lock<std::mutex> getLock() {
return std::unique_lock<std::mutex>(m_mutex);
}
virtual ~ZimSearcher() = default;
private:
std::mutex m_mutex;
};
/**
* A Library store several books.
*/
class Library
{
std::map<std::string, kiwix::Book> m_books;
std::map<std::string, std::shared_ptr<Reader>> m_readers;
std::vector<kiwix::Bookmark> m_bookmarks;
// all data fields must be added in LibraryBase
mutable std::mutex m_mutex;
public:
typedef uint64_t Revision;
typedef std::vector<std::string> BookIdCollection;
typedef std::map<std::string, int> AttributeCounts;
typedef std::set<std::string> BookIdSet;
public:
Library();
~Library();
/**
* Library is not a copiable object. However it can be moved.
*/
Library(const Library& ) = delete;
Library(Library&& );
void operator=(const Library& ) = delete;
Library& operator=(Library&& );
/**
* Add a book to the library.
*
@@ -132,6 +195,11 @@ class Library
*/
bool addBook(const Book& book);
/**
* A self-explanatory alias for addBook()
*/
bool addOrUpdateBook(const Book& book) { return addBook(book); }
/**
* Add a bookmark to the library.
*
@@ -148,9 +216,18 @@ class Library
*/
bool removeBookmark(const std::string& zimId, const std::string& url);
Book& getBookById(const std::string& id);
Book& getBookByPath(const std::string& path);
std::shared_ptr<Reader> getReaderById(const std::string& id);
// XXX: This is a non-thread-safe operation
const Book& getBookById(const std::string& id) const;
// XXX: This is a non-thread-safe operation
const Book& getBookByPath(const std::string& path) const;
Book getBookByIdThreadSafe(const std::string& id) const;
std::shared_ptr<zim::Archive> getArchiveById(const std::string& id);
std::shared_ptr<ZimSearcher> getSearcherById(const std::string& id) {
return getSearcherByIds(BookIdSet{id});
}
std::shared_ptr<ZimSearcher> getSearcherByIds(const BookIdSet& ids);
/**
* Remove a book from the library.
@@ -166,7 +243,7 @@ class Library
* @param path the path of the file to write to.
* @return True if the library has been correctly saved.
*/
bool writeToFile(const std::string& path);
bool writeToFile(const std::string& path) const;
/**
* Write the library bookmarks to a file.
@@ -174,7 +251,7 @@ class Library
* @param path the path of the file to write to.
* @return True if the library has been correctly saved.
*/
bool writeBookmarksToFile(const std::string& path);
bool writeBookmarksToFile(const std::string& path) const;
/**
* Get the number of book in the library.
@@ -183,53 +260,56 @@ class Library
* @param remoteBooks If we must count remote books (books with an url)
* @return The number of books.
*/
unsigned int getBookCount(const bool localBooks, const bool remoteBooks);
unsigned int getBookCount(const bool localBooks, const bool remoteBooks) const;
/**
* Get all langagues of the books in the library.
* Get all languagues of the books in the library.
*
* @return A list of languages.
*/
std::vector<std::string> getBooksLanguages();
std::vector<std::string> getBooksLanguages() const;
/**
* Get all languagues of the books in the library with counts.
*
* @return A list of languages with the count of books in each language.
*/
AttributeCounts getBooksLanguagesWithCounts() const;
/**
* Get all categories of the books in the library.
*
* @return A list of categories.
*/
std::vector<std::string> getBooksCategories() const;
/**
* Get all book creators of the books in the library.
*
* @return A list of book creators.
*/
std::vector<std::string> getBooksCreators();
std::vector<std::string> getBooksCreators() const;
/**
* Get all book publishers of the books in the library.
*
* @return A list of book publishers.
*/
std::vector<std::string> getBooksPublishers();
std::vector<std::string> getBooksPublishers() const;
/**
* Get all bookmarks.
*
* @return A list of bookmarks
*/
const std::vector<kiwix::Bookmark> getBookmarks(bool onlyValidBookmarks = true);
const std::vector<kiwix::Bookmark> getBookmarks(bool onlyValidBookmarks = true) const;
/**
* Get all book ids of the books in the library.
*
* @return A list of book ids.
*/
std::vector<std::string> getBooksIds();
/**
* Filter the library and generate a new one with the keep elements.
*
* This is equivalent to `listBookIds(ALL, UNSORTED, search)`.
*
* @param search List only books with search in the title or description.
* @return The list of bookIds corresponding to the query.
*/
DEPRECATED std::vector<std::string> filter(const std::string& search);
BookIdCollection getBooksIds() const;
/**
* Filter the library and return the id of the keep elements.
@@ -237,7 +317,7 @@ class Library
* @param filter The filter to use.
* @return The list of bookIds corresponding to the filter.
*/
std::vector<std::string> filter(const Filter& filter);
BookIdCollection filter(const Filter& filter) const;
/**
@@ -247,43 +327,44 @@ class Library
* @param comparator how to sort the books
* @return The sorted list of books
*/
void sort(std::vector<std::string>& bookIds, supportedListSortBy sortBy, bool ascending);
void sort(BookIdCollection& bookIds, supportedListSortBy sortBy, bool ascending) const;
/**
* List books in the library.
* Return the current revision of the library.
*
* @param mode The mode of listing :
* - LOCAL  : list only local books (with a path).
* - REMOTE : list only remote books (with an url).
* - VALID  : list only valid books (without a path or with a
* path pointing to a valid zim file).
* - NOLOCAL : list only books without valid path.
* - NOREMOTE : list only books without url.
* - NOVALID : list only books not valid.
* - ALL : Do not do any filter (LOCAL or REMOTE)
* - Flags can be combined.
* @param sortBy Attribute to sort by the book list.
* @param search List only books with search in the title, description.
* @param language List only books in this language.
* @param creator List only books of this creator.
* @param publisher List only books of this publisher.
* @param maxSize Do not list book bigger than maxSize.
* Set to 0 to cancel this filter.
* @return The list of bookIds corresponding to the query.
* The revision of the library is updated (incremented by one) only by
* the addBook() operation.
*
* @return Current revision of the library.
*/
DEPRECATED std::vector<std::string> listBooksIds(
int supportedListMode = ALL,
supportedListSortBy sortBy = UNSORTED,
const std::string& search = "",
const std::string& language = "",
const std::string& creator = "",
const std::string& publisher = "",
const std::vector<std::string>& tags = {},
size_t maxSize = 0);
Revision getRevision() const;
/**
* Remove books that have not been updated since the specified revision.
*
* @param rev the library revision to use
* @return Count of books that were removed by this operation.
*/
uint32_t removeBooksNotUpdatedSince(Revision rev);
friend class OPDSDumper;
friend class libXMLDumper;
private: // types
typedef const std::string& (Book::*BookStrPropMemFn)() const;
struct Impl;
private: // functions
AttributeCounts getBookAttributeCounts(BookStrPropMemFn p) const;
std::vector<std::string> getBookPropValueSet(BookStrPropMemFn p) const;
BookIdCollection filterViaBookDB(const Filter& filter) const;
void updateBookDB(const Book& book);
void dropCache(const std::string& bookId);
private: //data
std::unique_ptr<Impl> mp_impl;
};
}
#endif

View File

@@ -38,7 +38,7 @@ class LibXMLDumper
{
public:
LibXMLDumper() = default;
LibXMLDumper(Library* library);
LibXMLDumper(const Library* library);
~LibXMLDumper();
/**
@@ -69,10 +69,10 @@ class LibXMLDumper
*
* @param library The library to dump.
*/
void setLibrary(Library* library) { this->library = library; }
void setLibrary(const Library* library) { this->library = library; }
protected:
kiwix::Library* library;
const kiwix::Library* library;
std::string baseDir;
private:
void handleBook(Book book, pugi::xml_node root_node);

View File

@@ -22,10 +22,10 @@
#include "book.h"
#include "library.h"
#include "reader.h"
#include <string>
#include <vector>
#include <memory>
namespace pugi {
class xml_document;
@@ -34,26 +34,25 @@ class xml_document;
namespace kiwix
{
class LibraryManipulator {
public:
virtual ~LibraryManipulator() {}
virtual bool addBookToLibrary(Book book) = 0;
virtual void addBookmarkToLibrary(Bookmark bookmark) = 0;
};
class LibraryManipulator
{
public: // functions
explicit LibraryManipulator(Library* library);
virtual ~LibraryManipulator();
class DefaultLibraryManipulator : public LibraryManipulator {
public:
DefaultLibraryManipulator(Library* library) :
library(library) {}
virtual ~DefaultLibraryManipulator() {}
bool addBookToLibrary(Book book) {
return library->addBook(book);
}
void addBookmarkToLibrary(Bookmark bookmark) {
library->addBookmark(bookmark);
}
private:
kiwix::Library* library;
Library& getLibrary() const { return library; }
bool addBookToLibrary(const Book& book);
void addBookmarkToLibrary(const Bookmark& bookmark);
uint32_t removeBooksNotUpdatedSince(Library::Revision rev);
protected: // overrides
virtual void bookWasAddedToLibrary(const Book& book);
virtual void bookmarkWasAddedToLibrary(const Bookmark& bookmark);
virtual void booksWereRemovedFromLibrary();
private: // data
kiwix::Library& library;
};
/**
@@ -61,10 +60,12 @@ class DefaultLibraryManipulator : public LibraryManipulator {
*/
class Manager
{
public:
Manager(LibraryManipulator* manipulator);
Manager(Library* library);
~Manager();
public: // types
typedef std::vector<std::string> Paths;
public: // functions
explicit Manager(LibraryManipulator* manipulator);
explicit Manager(Library* library);
/**
* Read a `library.xml` and add book in the file to the library.
@@ -72,10 +73,22 @@ class Manager
* @param path The (utf8) path to the `library.xml`.
* @param readOnly Set if the libray path could be overwritten latter with
* updated content.
* @param trustLibrary use book metadata coming from XML.
* @return True if file has been properly parsed.
*/
bool readFile(const std::string& path, bool readOnly = true, bool trustLibrary = true);
/**
* Sync the contents of the library with one or more `library.xml` files.
*
* The metadata of the library files is trusted unconditionally.
* Any books not present in the input library.xml files are removed
* from the library.
*
* @param paths The (utf8) paths to the `library.xml` files.
*/
void reload(const Paths& paths);
/**
* Load a library content store in the string.
*
@@ -150,8 +163,7 @@ class Manager
uint64_t m_itemsPerPage = 0;
protected:
kiwix::LibraryManipulator* manipulator;
bool mustDeleteManipulator;
std::shared_ptr<kiwix::LibraryManipulator> manipulator;
bool readBookFromPath(const std::string& path, Book* book);
bool parseXmlDom(const pugi::xml_document& doc,

View File

@@ -7,25 +7,12 @@ headers = [
'libxml_dumper.h',
'opds_dumper.h',
'downloader.h',
'reader.h',
'entry.h',
'searcher.h',
'search_renderer.h',
'server.h',
'kiwixserve.h',
'name_mapper.h'
'name_mapper.h',
'tools.h',
'version.h'
]
install_headers(headers, subdir:'kiwix')
install_headers(
'tools/base64.h',
'tools/networkTools.h',
'tools/otherTools.h',
'tools/pathTools.h',
'tools/regexTools.h',
'tools/stringTools.h',
'tools/lock.h',
subdir:'kiwix/tools'
)

View File

@@ -22,6 +22,8 @@
#include <string>
#include <map>
#include <memory>
#include <mutex>
namespace kiwix
{
@@ -31,15 +33,15 @@ class Library;
class NameMapper {
public:
virtual ~NameMapper() = default;
virtual std::string getNameForId(const std::string& id) = 0;
virtual std::string getIdForName(const std::string& name) = 0;
virtual std::string getNameForId(const std::string& id) const = 0;
virtual std::string getIdForName(const std::string& name) const = 0;
};
class IdNameMapper : public NameMapper {
public:
virtual std::string getNameForId(const std::string& id) { return id; };
virtual std::string getIdForName(const std::string& name) { return name; };
virtual std::string getNameForId(const std::string& id) const { return id; };
virtual std::string getIdForName(const std::string& name) const { return name; };
};
class HumanReadableNameMapper : public NameMapper {
@@ -50,11 +52,29 @@ class HumanReadableNameMapper : public NameMapper {
public:
HumanReadableNameMapper(kiwix::Library& library, bool withAlias);
virtual ~HumanReadableNameMapper() = default;
virtual std::string getNameForId(const std::string& id);
virtual std::string getIdForName(const std::string& name);
virtual std::string getNameForId(const std::string& id) const;
virtual std::string getIdForName(const std::string& name) const;
};
class UpdatableNameMapper : public NameMapper {
typedef std::shared_ptr<NameMapper> NameMapperHandle;
public:
UpdatableNameMapper(Library& library, bool withAlias);
virtual std::string getNameForId(const std::string& id) const;
virtual std::string getIdForName(const std::string& name) const;
void update();
private:
NameMapperHandle currentNameMapper() const;
private:
mutable std::mutex mutex;
Library& library;
NameMapperHandle nameMapper;
const bool withAlias;
};
}

View File

@@ -1,6 +1,5 @@
/*
* Copyright (C) 2013 Emmanuel Engelhart <kelson@kiwix.org>
* Copyright (C) 2017 Matthieu Gautier <mgautier@kymeria.fr>
* Copyright 2021 Veloman Yunkan <veloman.yunkan@gmail.com>
*
* This program is free software; you can redistribute it and/or modify
* it under the terms of the GNU General Public License as published by
@@ -18,9 +17,17 @@
* MA 02110-1301, USA.
*/
package org.kiwix.kiwixlib;
#ifndef KIWIX_OPDS_CATALOG_H
#define KIWIX_OPDS_CATALOG_H
public class JNIICU
#include "library.h"
namespace kiwix
{
static public native void setDataDirectory(String icuDataDir);
}
std::string getSearchUrl(const Filter& f);
} // namespace kiwix
#endif // KIWIX_OPDS_CATALOG_H

View File

@@ -26,11 +26,7 @@
#include <pugixml.hpp>
#include "tools/base64.h"
#include "tools/pathTools.h"
#include "tools/regexTools.h"
#include "library.h"
#include "reader.h"
using namespace std;
@@ -51,24 +47,50 @@ class OPDSDumper
/**
* Dump the OPDS feed.
*
* @param id The id of the library.
* @param bookIds the ids of the books to include in the feed
* @param query the query used to obtain the list of book ids
* @return The OPDS feed.
*/
std::string dumpOPDSFeed(const std::vector<std::string>& bookIds);
std::string dumpOPDSFeed(const std::vector<std::string>& bookIds, const std::string& query) const;
/**
* Set the id of the opds stream.
* Dump the OPDS feed.
*
* @param bookIds the ids of the books to include in the feed
* @param query the query used to obtain the list of book ids
* @param partial whether the feed should include partial or complete entries
* @return The OPDS feed.
*/
std::string dumpOPDSFeedV2(const std::vector<std::string>& bookIds, const std::string& query, bool partial) const;
/**
* Dump the OPDS complete entry document.
*
* @param bookId the id of the book
* @return The OPDS complete entry document.
*/
std::string dumpOPDSCompleteEntry(const std::string& bookId) const;
/**
* Dump the categories OPDS feed.
*
* @return The OPDS feed.
*/
std::string categoriesOPDSFeed() const;
/**
* Dump the languages OPDS feed.
*
* @return The OPDS feed.
*/
std::string languagesOPDSFeed() const;
/**
* Set the id of the library.
*
* @param id the id to use.
*/
void setId(const std::string& id) { this->id = id;}
/**
* Set the title oft the opds stream.
*
* @param title the title to use.
*/
void setTitle(const std::string& title) { this->title = title; }
void setLibraryId(const std::string& id) { this->libraryId = id;}
/**
* Set the root location used when generating url.
@@ -77,13 +99,6 @@ class OPDSDumper
*/
void setRootLocation(const std::string& rootLocation) { this->rootLocation = rootLocation; }
/**
* Set the search url.
*
* @param searchUrl the search url to use.
*/
void setSearchDescriptionUrl(const std::string& searchDescriptionUrl) { this->searchDescriptionUrl = searchDescriptionUrl; }
/**
* Set some informations about the search results.
*
@@ -93,27 +108,13 @@ class OPDSDumper
*/
void setOpenSearchInfo(int totalResult, int startIndex, int count);
/**
* Set the library to dump.
*
* @param library The library to dump.
*/
void setLibrary(Library* library) { this->library = library; }
protected:
kiwix::Library* library;
std::string id;
std::string title;
std::string date;
std::string libraryId;
std::string rootLocation;
std::string searchDescriptionUrl;
int m_totalResults;
int m_startIndex;
int m_count;
bool m_isSearchResult = false;
private:
pugi::xml_node handleBook(Book book, pugi::xml_node root_node);
};
}

View File

@@ -1,570 +0,0 @@
/*
* Copyright 2011 Emmanuel Engelhart <kelson@kiwix.org>
*
* This program is free software; you can redistribute it and/or modify
* it under the terms of the GNU General Public License as published by
* the Free Software Foundation; either version 3 of the License, or
* any later version.
*
* This program is distributed in the hope that it will be useful,
* but WITHOUT ANY WARRANTY; without even the implied warranty of
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
* GNU General Public License for more details.
*
* You should have received a copy of the GNU General Public License
* along with this program; if not, write to the Free Software
* Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston,
* MA 02110-1301, USA.
*/
#ifndef KIWIX_READER_H
#define KIWIX_READER_H
#include <stdio.h>
#include <zim/article.h>
#include <zim/file.h>
#include <zim/fileiterator.h>
#include <zim/zim.h>
#include <exception>
#include <map>
#include <sstream>
#include <string>
#include "common.h"
#include "entry.h"
#include "tools/pathTools.h"
#include "tools/stringTools.h"
using namespace std;
namespace kiwix
{
/**
* The Reader class is the class who allow to get an entry content from a zim
* file.
*/
class Reader
{
public:
/**
* Create a Reader to read a zim file specified by zimFilePath.
*
* @param zimFilePath The path to the zim file to read.
* The zim file can be splitted (.zimaa, .zimab, ...).
* In this case, the file path must still point to the
* unsplitted path as if the file were not splitted
* (.zim extesion).
*/
Reader(const string zimFilePath);
~Reader();
/**
* Get the number of "displayable" entries in the zim file.
*
* @return If the zim file has a /M/Counter metadata, return the number of
* entries with the 'text/html' MIMEtype specified in the metadata.
* Else return the number of entries in the 'A' namespace.
*/
unsigned int getArticleCount() const;
/**
* Get the number of media in the zim file.
*
* @return If the zim file has a /M/Counter metadata, return the number of
* entries with the 'image/jpeg', 'image/gif' and 'image/png' in
* the metadata.
* Else return the number of entries in the 'I' namespace.
*/
unsigned int getMediaCount() const;
/**
* Get the number of all entries in the zim file.
*
* @return Return the number of all the entries, whatever their MIMEtype or
* their namespace.
*/
unsigned int getGlobalCount() const;
/**
* Get the path of the zim file.
*
* @return the path of the zim file as given in the constructor.
*/
string getZimFilePath() const;
/**
* Get the Id of the zim file.
*
* @return The uuid stored in the zim file.
*/
string getId() const;
/**
* Get the url of a random page.
*
* Deprecated : Use `getRandomPage` instead.
*
* @return Url of a random page. The page is picked from all entries in
* the 'A' namespace.
* The main page is excluded from the potential results.
*/
DEPRECATED string getRandomPageUrl() const;
/**
* Get a random page.
*
* @return A random Entry. The entry is picked from all entries in
* the 'A' namespace.
* The main entry is excluded from the potential results.
*/
Entry getRandomPage() const;
/**
* Get the url of the first page.
*
* Deprecated : Use `getFirstPage` instead.
*
* @return Url of the first entry in the 'A' namespace.
*/
DEPRECATED string getFirstPageUrl() const;
/**
* Get the entry of the first page.
*
* @return The first entry in the 'A' namespace.
*/
Entry getFirstPage() const;
/**
* Get the url of the main page.
*
* Deprecated : Use `getMainPage` instead.
*
* @return Url of the main page as specified in the zim file.
*/
DEPRECATED string getMainPageUrl() const;
/**
* Get the entry of the main page.
*
* @return Entry of the main page as specified in the zim file.
*/
Entry getMainPage() const;
/**
* Get the content of a metadata.
*
* @param[in] name The name of the metadata.
* @param[out] value The value will be set to the content of the metadata.
* @return True if it was possible to get the content of the metadata.
*/
bool getMetadata(const string& name, string& value) const;
/**
* Get the name of the zim file.
*
* @return The name of the zim file as specified in the zim metadata.
*/
string getName() const;
/**
* Get the title of the zim file.
*
* @return The title of zim file as specified in the zim metadata.
* If no title has been set, return a title computed from the
* file path.
*/
string getTitle() const;
/**
* Get the creator of the zim file.
*
* @return The creator of the zim file as specified in the zim metadata.
*/
string getCreator() const;
/**
* Get the publisher of the zim file.
*
* @return The publisher of the zim file as specified in the zim metadata.
*/
string getPublisher() const;
/**
* Get the date of the zim file.
*
* @return The date of the zim file as specified in the zim metadata.
*/
string getDate() const;
/**
* Get the description of the zim file.
*
* @return The description of the zim file as specified in the zim metadata.
* If no description has been set, return the subtitle.
*/
string getDescription() const;
/**
* Get the long description of the zim file.
*
* @return The long description of the zim file as specifed in the zim metadata.
*/
string getLongDescription() const;
/**
* Get the language of the zim file.
*
* @return The language of the zim file as specified in the zim metadata.
*/
string getLanguage() const;
/**
* Get the license of the zim file.
*
* @return The license of the zim file as specified in the zim metadata.
*/
string getLicense() const;
/**
* Get the tags of the zim file.
*
* @param original If true, return the original tags as specified in the zim metadata.
* Else, try to convert it to the new 'normalized' format.
* @return The tags of the zim file.
*/
string getTags(bool original=false) const;
/**
* Get the value (as a string) of a specific tag.
*
* According to https://wiki.openzim.org/wiki/Tags
*
* @return The value of the specified tag.
* @throw std::out_of_range if the specified tag is not found.
*/
string getTagStr(const std::string& tagName) const;
/**
* Get the boolean value of a specific tag.
*
* According to https://wiki.openzim.org/wiki/Tags
*
* @return The boolean value of the specified tag.
* @throw std::out_of_range if the specified tag is not found.
* std::domain_error if the value of the tag cannot be convert to bool.
*/
bool getTagBool(const std::string& tagName) const;
/**
* Get the relations of the zim file.
*
* @return The relation of the zim file as specified in the zim metadata.
*/
string getRelation() const;
/**
* Get the flavour of the zim file.
*
* @return The flavour of the zim file as specified in the zim metadata.
*/
string getFlavour() const;
/**
* Get the source of the zim file.
*
* @return The source of the zim file as specified in the zim metadata.
*/
string getSource() const;
/**
* Get the scraper of the zim file.
*
* @return The scraper of the zim file as specified in the zim metadata.
*/
string getScraper() const;
/**
* Get the origId of the zim file.
*
* The origId is only used in the case of patch zim file and is the Id
* of the original zim file.
*
* @return The origId of the zim file as specified in the zim metadata.
*/
string getOrigId() const;
/**
* Get the favicon of the zim file.
*
* @param[out] content The content of the favicon.
* @param[out] mimeType The mimeType of the favicon.
* @return True if a favicon has been found.
*/
bool getFavicon(string& content, string& mimeType) const;
/**
* Get an entry associated to an path.
*
* @param path The path of the entry.
* @return The entry.
* @throw NoEntry If no entry correspond to the path.
*/
Entry getEntryFromPath(const std::string& path) const;
/**
* Get an entry associated to an url encoded path.
*
* Equivalent to `getEntryFromPath(urlDecode(path));`
*
* @param path The url encoded path.
* @return The entry.
* @throw NoEntry If no entry correspond to the path.
*/
Entry getEntryFromEncodedPath(const std::string& path) const;
/**
* Get un entry associated to a title.
*
* @param title The title.
* @return The entry
* throw NoEntry If no entry correspond to the url.
*/
Entry getEntryFromTitle(const std::string& title) const;
/**
* Get the url of a page specified by a title.
*
* @param[in] title the title of the page.
* @param[out] url the url of the page.
* @return True if the page can be found.
*/
DEPRECATED bool getPageUrlFromTitle(const string& title, string& url) const;
/**
* Get the mimetype of a entry specified by a url.
*
* @param[in] url the url of the entry.
* @param[out] mimeType the mimeType of the entry.
* @return True if the mimeType has been found.
*/
DEPRECATED bool getMimeTypeByUrl(const string& url, string& mimeType) const;
/**
* Get the content of an entry specifed by a url.
*
* Alias to `getContentByEncodedUrl`
*/
DEPRECATED bool getContentByUrl(const string& url,
string& content,
string& title,
unsigned int& contentLength,
string& contentType) const;
/**
* Get the content of an entry specified by a url encoded url.
*
* Equivalent to getContentByDecodedUrl(urlDecode(url), ...).
*/
DEPRECATED bool getContentByEncodedUrl(const string& url,
string& content,
string& title,
unsigned int& contentLength,
string& contentType,
string& baseUrl) const;
/**
* Get the content of an entry specified by an url encoded url.
*
* Equivalent to getContentByEncodedUrl but without baseUrl.
*/
DEPRECATED bool getContentByEncodedUrl(const string& url,
string& content,
string& title,
unsigned int& contentLength,
string& contentType) const;
/**
* Get the content of an entry specified by a url.
*
* @param[in] url The url of the entry.
* @param[out] content The content of the entry.
* @param[out] title the title of the entry.
* @param[out] contentLength The size of the entry (size of content).
* @param[out] contentType The mimeType of the entry.
* @param[out] baseUrl Return the true url of the entry.
* If the specified entry is a redirection, contains
* the url of the targeted entry.
* @return True if the entry has been found.
*/
DEPRECATED bool getContentByDecodedUrl(const string& url,
string& content,
string& title,
unsigned int& contentLength,
string& contentType,
string& baseUrl) const;
/**
* Get the content of an entry specified by a url.
*
* Equivalent to getContentByDecodedUrl but withou the baseUrl.
*/
DEPRECATED bool getContentByDecodedUrl(const string& url,
string& content,
string& title,
unsigned int& contentLength,
string& contentType) const;
/**
* Search for entries with title starting with prefix (case sensitive).
*
* Suggestions are stored in an internal vector and can be retrieved using
* `getNextSuggestion` method.
*
* @param prefix The prefix to search.
* @param suggestionsCount How many suggestions to search for.
* @param reset If true, remove previous suggestions in the internal vector.
* If false, add suggestions to the internal vector
* (until internal vector size is suggestionCount (or no more
* suggestion))
* @return True if some suggestions where added to the internal vector.
*/
bool searchSuggestions(const string& prefix,
unsigned int suggestionsCount,
const bool reset = true);
/**
* Search for entries for the given prefix.
*
* If the zim file has a internal fulltext index, the suggestions will be
* searched using it.
* Else the suggestions will be search using `searchSuggestions` while trying
* to be smart about case sensitivity (using `getTitleVariants`).
*
* In any case, suggestions are stored in an internal vector and can be
* retrieved using `getNextSuggestion` method.
* The internal vector will be reset.
*
* @param prefix The prefix to search for.
* @param suggestionsCount How many suggestions to search for.
*/
bool searchSuggestionsSmart(const string& prefix,
unsigned int suggestionsCount);
/**
* Check if the url exists in the zim file.
*
* Deprecated : Use `pathExists` instead.
*
* @param url the url to check.
* @return True if the url exits in the zim file.
*/
DEPRECATED bool urlExists(const string& url) const;
/**
* Check if the path exists in the zim file.
*
* @param path the path to check.
* @return True if the path exists in the zim file.
*/
bool pathExists(const string& path) const;
/**
* Check if the zim file has a embedded fulltext index.
*
* @return True if the zim file has a embedded fulltext index
* and is not split (else the fulltext is not accessible).
*/
bool hasFulltextIndex() const;
/**
* Get potential case title variations for a title.
*
* @param title a title.
* @return the list of variantions.
*/
std::vector<std::string> getTitleVariants(const std::string& title) const;
/**
* Get the next suggestion title.
*
* @param[out] title the title of the suggestion.
* @return True if title has been set.
*/
bool getNextSuggestion(string& title);
/**
* Get the next suggestion title and url.
*
* @param[out] title the title of the suggestion.
* @param[out] url the url of the suggestion.
* @return True if title and url have been set.
*/
bool getNextSuggestion(string& title, string& url);
/**
* Get if we can check zim file integrity (has a checksum).
*
* @return True if zim file have a checksum.
*/
bool canCheckIntegrity() const;
/**
* Check is zim file is corrupted.
*
* @return True if zim file is corrupted.
*/
bool isCorrupted() const;
/**
* Parse a full url into a namespace and url.
*
* @param[in] url The full url ("/N/url").
* @param[out] ns The namespace (N).
* @param[out] title The url (url).
* @return True
*/
DEPRECATED bool parseUrl(const string& url, char* ns, string& title) const;
/**
* Return the total size of the zim file.
*
* If zim file is split, return the sum of all parts' size.
*
* @return Size of the size file is KiB.
*/
unsigned int getFileSize() const;
/**
* Get the zim file handler.
*
* @return The libzim file handler.
*/
zim::File* getZimFileHandler() const;
/**
* Get the zim article object associated to a url.
*
* @param[in] url The url of the article.
* @param[out] article The libzim article object.
* @return True if the url is good (article.good()).
*/
DEPRECATED bool getArticleObjectByDecodedUrl(const string& url,
zim::Article& article) const;
protected:
zim::File* zimFileHandler;
zim::size_type firstArticleOffset;
zim::size_type lastArticleOffset;
zim::size_type nsACount;
zim::size_type nsICount;
std::string zimFilePath;
std::vector<std::vector<std::string>> suggestions;
std::vector<std::vector<std::string>>::iterator suggestionsOffset;
private:
std::map<const std::string, unsigned int> parseCounterMetadata() const;
};
}
#endif

View File

@@ -21,11 +21,12 @@
#define KIWIX_SEARCH_RENDERER_H
#include <string>
#include <zim/search.h>
#include "library.h"
namespace kiwix
{
class Searcher;
class NameMapper;
/**
* The SearcherRenderer class is used to render a search result to a html page.
@@ -34,21 +35,43 @@ class SearchRenderer
{
public:
/**
* The default constructor.
* Construct a SearchRenderer from a SearchResultSet.
*
* @param humanReadableName The global zim's humanReadableName.
* Used to generate pagination links.
* The constructed version of the SearchRenderer will not introduce
* the book name for each result. It is better to use the other constructor
* with a Library pointer to have a better html page.
*
* @param srs The `SearchResultSet` to render.
* @param mapper The `NameMapper` to use to do the rendering.
* @param start The start offset used for the srs.
* @param estimatedResultCount The estimatedResultCount of the whole search
*/
SearchRenderer(Searcher* searcher, NameMapper* mapper);
SearchRenderer(zim::SearchResultSet srs, NameMapper* mapper,
unsigned int start, unsigned int estimatedResultCount);
/**
* Construct a SearchRenderer from a SearchResultSet.
*
* @param srs The `SearchResultSet` to render.
* @param mapper The `NameMapper` to use to do the rendering.
* @param library The `Library` to use to look up book details for search results.
* @param start The start offset used for the srs.
* @param estimatedResultCount The estimatedResultCount of the whole search
*/
SearchRenderer(zim::SearchResultSet srs, NameMapper* mapper, Library* library,
unsigned int start, unsigned int estimatedResultCount);
~SearchRenderer();
/**
* Set the search pattern used to do the search
*/
void setSearchPattern(const std::string& pattern);
/**
* Set the search content id.
* Set the querystring used to select books
*/
void setSearchContent(const std::string& name);
void setSearchBookQuery(const std::string& bookQuery);
/**
* Set protocol prefix.
@@ -60,23 +83,38 @@ class SearchRenderer
*/
void setSearchProtocolPrefix(const std::string& prefix);
/**
* set result count per page
*/
void setPageLength(unsigned int pageLength){
this->pageLength = pageLength;
}
std::string renderTemplate(const std::string& tmpl_str);
/**
* Generate the html page with the resutls of the search.
*/
std::string getHtml();
/**
* Generate the xml page with the resutls of the search.
*/
std::string getXml();
protected:
std::string beautifyInteger(const unsigned int number);
Searcher* mp_searcher;
zim::SearchResultSet m_srs;
NameMapper* mp_nameMapper;
std::string searchContent;
Library* mp_library;
std::string searchBookQuery;
std::string searchPattern;
std::string protocolPrefix;
std::string searchProtocolPrefix;
unsigned int resultCountPerPage;
unsigned int pageLength;
unsigned int estimatedResultCount;
unsigned int resultStart;
unsigned int resultEnd;
};

View File

@@ -1,172 +0,0 @@
/*
* Copyright 2011 Emmanuel Engelhart <kelson@kiwix.org>
*
* This program is free software; you can redistribute it and/or modify
* it under the terms of the GNU General Public License as published by
* the Free Software Foundation; either version 3 of the License, or
* any later version.
*
* This program is distributed in the hope that it will be useful,
* but WITHOUT ANY WARRANTY; without even the implied warranty of
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
* GNU General Public License for more details.
*
* You should have received a copy of the GNU General Public License
* along with this program; if not, write to the Free Software
* Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston,
* MA 02110-1301, USA.
*/
#ifndef KIWIX_SEARCHER_H
#define KIWIX_SEARCHER_H
#include <stdio.h>
#include <stdlib.h>
#include <unicode/putil.h>
#include <algorithm>
#include <cctype>
#include <locale>
#include <string>
#include <vector>
#include <vector>
#include "tools/pathTools.h"
#include "tools/stringTools.h"
#include "kiwix_config.h"
using namespace std;
namespace kiwix
{
class Reader;
class Result
{
public:
virtual ~Result(){};
virtual std::string get_url() = 0;
virtual std::string get_title() = 0;
virtual int get_score() = 0;
virtual std::string get_snippet() = 0;
virtual std::string get_content() = 0;
virtual int get_wordCount() = 0;
virtual int get_size() = 0;
virtual int get_readerIndex() = 0;
};
struct SearcherInternal;
/**
* The Searcher class is reponsible to do different kind of search using the
* fulltext index.
*/
class Searcher
{
public:
/**
* The default constructor.
*/
Searcher();
~Searcher();
/**
* Add a reader (containing embedded fulltext index) to the search.
*
* @param reader The Reader for the zim containing the fulltext index.
* @return true if the reader has been added.
* false if the reader cannot be added (no embedded fulltext index present)
*/
bool add_reader(Reader* reader);
Reader* get_reader(int index);
/**
* Start a search on the zim associated to the Searcher.
*
* Search results should be retrived using the getNextResult method.
*
* @param search The search query.
* @param resultStart the start offset of the search results (used for pagination).
* @param resultEnd the end offset of the search results (used for pagination).
* @param verbose print some info on stdout if true.
*/
void search(const std::string& search,
unsigned int resultStart,
unsigned int resultEnd,
const bool verbose = false);
/**
* Start a geographique search.
* The search return result for entry in a disc of center latitude/longitude
* and radius distance.
*
* Search results should be retrived using the getNextResult method.
*
* @param latitude The latitude of the center point.
* @param longitude The longitude of the center point.
* @param distance The radius of the disc.
* @param resultStart the start offset of the search results (used for pagination).
* @param resultEnd the end offset of the search results (used for pagination).
* @param verbose print some info on stdout if true.
*/
void geo_search(float latitude, float longitude, float distance,
unsigned int resultStart,
unsigned int resultEnd,
const bool verbose = false);
/**
* Start a suggestion search.
* The search made depend of the "version" of the embedded index.
* - If the index is newer enough and have a title namespace, the search is
* made in the titles only.
* - Else the search is made on the whole article content.
* In any case, the search is made "partial" (as adding '*' at the end of the query)
*
* @param search The search query.
* @param verbose print some info on stdout if true.
*/
void suggestions(std::string& search, const bool verbose = false);
/**
* Get the next result of a started search.
* This is the method to use to loop hover the search results.
*/
Result* getNextResult();
/**
* Restart the previous search.
* Next call to getNextResult will return the first result.
*/
void restart_search();
/**
* Get a estimation of the result count.
*/
unsigned int getEstimatedResultCount();
unsigned int getResultStart() { return resultStart; }
unsigned int getResultEnd() { return resultEnd; }
protected:
std::string beautifyInteger(const unsigned int number);
void closeIndex();
void searchInIndex(string& search,
const unsigned int resultStart,
const unsigned int resultEnd,
const bool verbose = false);
std::vector<Reader*> readers;
SearcherInternal* internal;
std::string searchPattern;
unsigned int estimatedResultCount;
unsigned int resultStart;
unsigned int resultEnd;
private:
void reset();
};
}
#endif

View File

@@ -54,23 +54,31 @@ namespace kiwix
void setAddress(const std::string& addr) { m_addr = addr; }
void setPort(int port) { m_port = port; }
void setNbThreads(int threads) { m_nbThreads = threads; }
void setMultiZimSearchLimit(unsigned int limit) { m_multizimSearchLimit = limit; }
void setIpConnectionLimit(int limit) { m_ipConnectionLimit = limit; }
void setVerbose(bool verbose) { m_verbose = verbose; }
void setIndexTemplateString(const std::string& indexTemplateString) { m_indexTemplateString = indexTemplateString; }
void setTaskbar(bool withTaskbar, bool withLibraryButton)
{ m_withTaskbar = withTaskbar; m_withLibraryButton = withLibraryButton; }
void setBlockExternalLinks(bool blockExternalLinks)
{ m_blockExternalLinks = blockExternalLinks; }
int getPort();
std::string getAddress();
protected:
Library* mp_library;
NameMapper* mp_nameMapper;
std::string m_root = "";
std::string m_addr = "";
std::string m_indexTemplateString = "";
int m_port = 80;
int m_nbThreads = 1;
unsigned int m_multizimSearchLimit = 0;
bool m_verbose = false;
bool m_withTaskbar = true;
bool m_withLibraryButton = true;
bool m_blockExternalLinks = false;
int m_ipConnectionLimit = 0;
std::unique_ptr<InternalServer> mp_server;
};
}

220
include/tools.h Normal file
View File

@@ -0,0 +1,220 @@
/*
* Copyright 2021 Matthieu Gautier <mgautier@kymeria.fr>
*
* This program is free software; you can redistribute it and/or modify
* it under the terms of the GNU General Public License as published by
* the Free Software Foundation; either version 3 of the License, or
* any later version.
*
* This program is distributed in the hope that it will be useful,
* but WITHOUT ANY WARRANTY; without even the implied warranty of
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
* GNU General Public License for more details.
*
* You should have received a copy of the GNU General Public License
* along with this program; if not, write to the Free Software
* Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston,
* MA 02110-1301, USA.
*/
#ifndef KIWIX_TOOLS_H
#define KIWIX_TOOLS_H
#include <string>
#include <vector>
#include <map>
namespace kiwix {
/**
* Return the current directory.
*
* @return the current directory (utf8 encoded)
*/
std::string getCurrentDirectory();
/**
* Return the data directory.
*
* The data directory is a directory where to put data (zim files, ...)
* It depends of the platform and it may be changed by user using environment variable.
*
* The resolution order is :
* - `KIWIX_DATA_DIR` env variable (if set).
* - On Windows :
* . `$APPDATA/kiwix` if $APPDATA is set
* . `$USERPROFILE/kiwix` if $USERPROFILE is set
* - Else :
* . `$XDG_DATA_HOME/kiwix`if $XDG_DATA_HOME is set
* . `$HOME/.local/share/kiwx` if $HOWE is set
* - current directory
*
* @return the path of the data directory (utf8 encoded)
*/
std::string getDataDirectory();
/** Return the path of the executable
*
* Some application may be packaged in auto extractible archive (Appimage) and the
* real executable is different of the path of the archive.
* If `realPathOnly` is true, return the path of the real executable instead of the
* archive launched by the user.
*
* @param realPathOnly If we must return the real path of the executable.
* @return the path of the executable (utf8 encoded)
*/
std::string getExecutablePath(bool realPathOnly = false);
/** Tell if the path is a relative path.
*
* This function is provided as a small helper. It is probably better to use native tools
* to manipulate paths.
*
* @param path A utf8 encoded path.
* @return true if the path is relative.
*/
bool isRelativePath(const std::string& path);
/** Append a path to another one.
*
* This function is provided as a small helper. It is probably better to use native tools
* to manipulate paths.
*
* @param basePath the base path.
* @param relativePath a path to add to the base path, must be a relative path.
* @return The concatenation of the paths, using the right separator.
*/
std::string appendToDirectory(const std::string& basePath, const std::string& relativePath);
/** Remove the last element of a path.
*
* This function is provided as a small helper. It is probably better to use native tools
* to manipulate paths.
*
* @param path a path.
* @return The parent directory (or empty string if none).
*/
std::string removeLastPathElement(const std::string& path);
/** Get the last element of a path.
*
* This function is provided as a small helper. It is probably better to use native tools
* to manipulate paths.
*
* @param path a path.
* @return The base name of the path or empty string if none (ending with a separator).
*/
std::string getLastPathElement(const std::string& path);
/** Compute the absolute path of a relative path based on another one
*
* Equivalent to appendToDirectory followed by a normalization of the path.
*
* This function is provided as a small helper. It is probably better to use native tools
* to manipulate paths.
*
* @param path the base path (if empty, current directory is taken).
* @param relativePath the relative path.
* @return a absolute path.
*/
std::string computeAbsolutePath(const std::string& path, const std::string& relativePath);
/** Compute the relative path of a path relative to another one
*
* This function is provided as a small helper. It is probably better to use native tools
* to manipulate paths.
*
* @param path the base path.
* @param absolutePath the absolute path to find the relative path for.
* @return a relative path (pointing to absolutePath, relative to path).
*/
std::string computeRelativePath(const std::string& path, const std::string& absolutePath);
/** Sleep the current thread.
*
* This function is provided as a small helper. It is probably better to use native tools.
*
* @param milliseconds The number of milliseconds to wait for.
*/
void sleep(unsigned int milliseconds);
/** Split a string
*
* This function is provided as a small helper. It is probably better to use native tools.
*
* Assuming text = "foo:;bar;baz,oups;"
*
* split(text, ":;", true, true) => ["foo", ":", ";", "bar", ";", "baz,oups", ";"]
* split(text, ":;", true, false) => ["foo", "bar", "baz,oups"] (default)
* split(text, ":;", false, true) => ["foo", ":", "", ";", "bar", ";", "baz,oups", ";", ""]
* split(text, ":;", false, false) => ["foo", "", "bar", "baz,oups", ""]
*
* @param str The string to split.
* @param delims A string of potential delimiters.
* Each charater in the string can be a individual delimiters.
* @param dropEmpty true if empty part must be dropped from the result.
* @param keepDelim true if delimiter must be included from the result.
* @return a list of part (potentially containing delimiters)
*/
std::vector<std::string> split(const std::string& str, const std::string& delims, bool dropEmpty=true, bool keepDelim = false);
/** Convert language code from iso2 code to iso3
*
* This function is provided as a small helper. It is probably better to use native tools
* to manipulate locales.
*
* @param a2code a iso2 code string.
* @return the corresponding iso3 code.
* @throw std::out_of_range if iso2 code is not known.
*/
std::string converta2toa3(const std::string& a2code);
/** Extracts content from given file.
*
* This function provides content of a file provided it's path.
*
* @param path The absolute path provided in string format.
* @return Content of corresponding file in string format.
*/
std::string getFileContent(const std::string& path);
/** Checks if file exists.
*
* This function returns boolean stating if file exists.
*
* @param path The absolute path provided in string format.
* @return Boolean representing if file exists or not.
*/
bool fileExists(const std::string& path);
/** Checks if file is readable.
*
* This function returns boolean stating if file is readable.
*
* @param path The absolute path provided in string format.
* @return Boolean representing if file is readale or not.
*/
bool fileReadable(const std::string& path);
/** Provides mimetype from filename.
*
* This function provides mimetype from file-name.
*
* @param filename string containing filename.
* @return mimetype from filename in string format.
*/
std::string getMimeTypeForFile(const std::string& filename);
/** Provides all available network interfaces
*
* This function provides the available IPv4 network interfaces
*/
std::map<std::string, std::string> getNetworkInterfaces();
/** Provides the best IP address
* This function provides the best IP address from the list given by getNetworkInterfaces
*/
std::string getBestPublicIp();
}
#endif // KIWIX_TOOLS_H

View File

@@ -1,46 +0,0 @@
#ifndef KIWIXLIB_TOOL_LOCK_H
#define KIWIXLIB_TOOL_LOCK_H
#include <pthread.h>
namespace kiwix {
class Lock
{
public:
explicit Lock(pthread_mutex_t* mutex) :
mp_mutex(mutex)
{
pthread_mutex_lock(mp_mutex);
}
~Lock() {
if (mp_mutex != nullptr) {
pthread_mutex_unlock(mp_mutex);
}
}
Lock(Lock && other) :
mp_mutex(other.mp_mutex)
{
other.mp_mutex = nullptr;
}
Lock & operator=(Lock && other)
{
mp_mutex = other.mp_mutex;
other.mp_mutex = nullptr;
return *this;
}
private:
pthread_mutex_t* mp_mutex;
Lock(Lock const &) = delete;
Lock & operator=(Lock const &) = delete;
};
}
#endif //KIWIXLIB_TOOL_LOCK_H

View File

@@ -1,5 +1,5 @@
/*
* Copyright (C) 2013 Emmanuel Engelhart <kelson@kiwix.org>
* Copyright 2021 Emmanuel Engelhart <kelson@kiwix.org>
*
* This program is free software; you can redistribute it and/or modify
* it under the terms of the GNU General Public License as published by
@@ -17,21 +17,18 @@
* MA 02110-1301, USA.
*/
package org.kiwix.kiwixlib;
#ifndef KIWIX_VERSION_H
#define KIWIX_VERSION_H
public class JNIKiwixString
#include <string>
#include <vector>
#include <iostream>
namespace kiwix
{
public String value;
public JNIKiwixString(String value) {
this.value = value;
}
public JNIKiwixString() {
this("");
}
public String getValue() {
return value;
}
typedef std::vector<std::pair<std::string, std::string>> LibVersions;
LibVersions getVersions();
void printVersions(std::ostream& out = std::cout);
}
#endif // KIWIX_VERSION_H

View File

@@ -1,46 +1,62 @@
project('kiwix-lib', 'cpp',
version : '9.2', # Also change this in android-kiwix-lib-publisher/kiwixLibAndroid/build.gradle
license : 'GPL',
project('libkiwix', 'cpp',
version : '11.0.0',
license : 'GPLv3+',
default_options : ['c_std=c11', 'cpp_std=c++11', 'werror=true'])
compiler = meson.get_compiler('cpp')
wrapper = get_option('wrapper')
static_deps = get_option('static-linkage') or get_option('default_library') == 'static'
static_deps = 'android' in wrapper or 'java' in wrapper or get_option('default_library') == 'static'
if 'android' in wrapper
extra_libs = ['-llog']
# See https://github.com/kiwix/libkiwix/issues/371
if ['arm', 'mips', 'm68k', 'ppc', 'sh4'].contains(host_machine.cpu_family())
extra_libs = ['-latomic']
else
extra_libs = []
endif
if 'java' in wrapper
add_languages('java')
if (compiler.get_id() == 'gcc' and build_machine.system() == 'linux') or host_machine.system() == 'freebsd'
# C++ std::thread is implemented using pthread on linux by gcc
thread_dep = dependency('threads')
else
thread_dep = dependency('', required:false)
endif
thread_dep = dependency('threads')
libicu_dep = dependency('icu-i18n', static:static_deps)
libzim_dep = dependency('libzim', version : '>=6.1.1', static:static_deps)
pugixml_dep = dependency('pugixml', static:static_deps)
libcurl_dep = dependency('libcurl', static:static_deps)
microhttpd_dep = dependency('libmicrohttpd', static:static_deps)
zlib_dep = dependency('zlib', static:static_deps)
xapian_dep = dependency('xapian-core', static:static_deps)
if not compiler.has_header('mustache.hpp')
if compiler.has_header('mustache.hpp')
extra_include = []
elif compiler.has_header('mustache.hpp', args: '-I/usr/include/kainjow')
extra_include = ['/usr/include/kainjow']
else
error('Cannot found header mustache.hpp')
endif
libzim_dep = dependency('libzim', version : '>=7.2.0', static:static_deps)
if not compiler.has_header_symbol('zim/zim.h', 'LIBZIM_WITH_XAPIAN')
error('Libzim seems to be compiled without xapian. Xapian support is mandatory.')
endif
extra_cflags = ''
if target_machine.system() == 'windows' and static_deps
if host_machine.system() == 'windows' and static_deps
add_project_arguments('-DCURL_STATICLIB', language : 'cpp')
extra_cflags += '-DCURL_STATICLIB'
endif
all_deps = [thread_dep, libicu_dep, libzim_dep, pugixml_dep, libcurl_dep, microhttpd_dep]
if host_machine.system() == 'windows'
add_project_arguments('-DNOMINMAX', language: 'cpp')
endif
inc = include_directories('include')
all_deps = [thread_dep, libicu_dep, libzim_dep, pugixml_dep, libcurl_dep, microhttpd_dep, zlib_dep, xapian_dep]
inc = include_directories('include', extra_include)
conf = configuration_data()
conf.set('VERSION', '"@0@"'.format(meson.project_version()))
conf.set('LIBKIWIX_VERSION', '"@0@"'.format(meson.project_version()))
if build_machine.system() == 'windows'
extra_link_args = ['-lshlwapi', '-lwinmm']
@@ -53,8 +69,11 @@ subdir('scripts')
subdir('static')
subdir('src')
subdir('test')
if get_option('doc')
subdir('docs')
endif
pkg_requires = ['libzim', 'icu-i18n', 'pugixml', 'libcurl']
pkg_requires = ['libzim', 'icu-i18n', 'pugixml', 'libcurl', 'libmicrohttpd', 'xapian-core']
pkg_conf = configuration_data()
pkg_conf.set('prefix', get_option('prefix'))

View File

@@ -1,2 +1,4 @@
option('wrapper', type:'array', choices:['java', 'android'], value:[],
description: 'The wrapper to generate.')
option('static-linkage', type : 'boolean', value : false,
description : 'Link statically with the dependencies.')
option('doc', type : 'boolean', value : false,
description : 'Build the documentations.')

148
scripts/kiwix-compile-i18n Executable file
View File

@@ -0,0 +1,148 @@
#!/usr/bin/env python3
'''
Copyright 2022 Veloman Yunkan <veloman.yunkan@gmail.com>
This program is free software; you can redistribute it and/or modify
it under the terms of the GNU General Public License as published by
the Free Software Foundation; either version 3 of the License, or any
later version.
This program is distributed in the hope that it will be useful, but
WITHOUT ANY WARRANTY; without even the implied warranty of
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
General Public License for more details.
You should have received a copy of the GNU General Public License
along with this program; if not, write to the Free Software
Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA
02110-1301, USA.
'''
import argparse
import os.path
import re
import json
def to_identifier(name):
ident = re.sub(r'[^0-9a-zA-Z]', '_', name)
if ident[0].isnumeric():
return "_"+ident
return ident
def lang_code(filename):
filename = os.path.basename(filename)
lang = to_identifier(os.path.splitext(filename)[0])
print(filename, '->', lang)
return lang
from string import Template
def expand_cxx_template(t, **kwargs):
return Template(t).substitute(**kwargs)
def cxx_string_literal(s):
# Taking advantage of the fact the JSON string escape rules match
# those of C++
return 'u8' + json.dumps(s)
string_table_cxx_template = '''
const I18nString $TABLE_NAME[] = {
$TABLE_ENTRIES
};
'''
lang_table_entry_cxx_template = '''
{
$LANG_STRING_LITERAL,
ARRAY_ELEMENT_COUNT($STRING_TABLE_NAME),
$STRING_TABLE_NAME
}'''
cxxfile_template = '''// This file is automatically generated. Do not modify it.
#include "server/i18n.h"
namespace kiwix {
namespace i18n {
namespace
{
$STRING_DATA
} // unnamed namespace
#define ARRAY_ELEMENT_COUNT(a) (sizeof(a)/sizeof(a[0]))
extern const I18nStringTable stringTables[] = {
$LANG_TABLE
};
extern const size_t langCount = $LANG_COUNT;
} // namespace i18n
} // namespace kiwix
'''
class Resource:
def __init__(self, filename):
filename = filename.strip()
self.filename = filename
self.lang_code = lang_code(filename)
with open(filename, 'r', encoding='utf-8') as f:
self.data = f.read()
def get_string_table_name(self):
return "string_table_for_" + self.lang_code
def get_string_table(self):
table_entries = ",\n ".join(self.get_string_table_entries())
return expand_cxx_template(string_table_cxx_template,
TABLE_NAME=self.get_string_table_name(),
TABLE_ENTRIES=table_entries)
def get_string_table_entries(self):
d = json.loads(self.data)
for k in sorted(d.keys()):
if k != "@metadata":
key_string = cxx_string_literal(k)
value_string = cxx_string_literal(d[k])
yield '{ ' + key_string + ', ' + value_string + ' }'
def get_lang_table_entry(self):
return expand_cxx_template(lang_table_entry_cxx_template,
LANG_STRING_LITERAL=cxx_string_literal(self.lang_code),
STRING_TABLE_NAME=self.get_string_table_name())
def gen_c_file(resources):
string_data = []
lang_table = []
for r in resources:
string_data.append(r.get_string_table())
lang_table.append(r.get_lang_table_entry())
return expand_cxx_template(cxxfile_template,
STRING_DATA="\n".join(string_data),
LANG_TABLE=",\n ".join(lang_table),
LANG_COUNT=len(resources)
)
if __name__ == "__main__":
parser = argparse.ArgumentParser()
parser.add_argument('--cxxfile',
required=True,
help='The Cpp file name to generate')
parser.add_argument('i18n_resource_files', nargs='+',
help='The list of resources to compile.')
args = parser.parse_args()
resources = [Resource(filename) for filename in args.i18n_resource_files]
with open(args.cxxfile, 'w') as f:
f.write(gen_c_file(resources))

View File

@@ -0,0 +1,18 @@
.TH KIWIX-COMPILE-I18N "1" "January 2022" "Kiwix" "User Commands"
.SH NAME
kiwix-compile-i18n \- helper to compile Kiwix i18n (internationalization) data
.SH SYNOPSIS
\fBkiwix\-compile\-i18n\fR [\-h] \-\-cxxfile CXXFILE i18n_resource_files ...\fR
.SH DESCRIPTION
.TP
i18n_resource_files ...
The list of i18n resources to compile.
.TP
\fB\-h\fR, \fB\-\-help\fR
show a help message and exit
.TP
\fB\-\-cxxfile\fR CXXFILE
The Cpp file name to generate
.TP
.SH AUTHOR
Veloman Yunkan <veloman.yunkan@gmail.com>

View File

@@ -102,7 +102,7 @@ class Resource:
master_c_template = """//This file is automaically generated. Do not modify it.
master_c_template = """//This file is automatically generated. Do not modify it.
#include <stdlib.h>
#include <fstream>

View File

@@ -0,0 +1,20 @@
.TH KIWIX-COMPILE-RESOURCES "1" "August 2017" "Kiwix" "User Commands"
.SH NAME
kiwix-compile-resources \- helper to compile and generate some Kiwix resources
.SH SYNOPSIS
\fBkiwix\-compile\-resources\fR [\-h] [\-\-cxxfile CXXFILE] [\-\-hfile HFILE] resource_file\fR
.SH DESCRIPTION
.TP
resource_file
The list of resources to compile.
.TP
\fB\-h\fR, \fB\-\-help\fR
show a help message and exit
.TP
\fB\-\-cxxfile\fR CXXFILE
The Cpp file name to generate
.TP
\fB\-\-hfile\fR HFILE
The h file name to generate
.SH AUTHOR
Matthieu Gautier <mgautier@kymeria.fr>

127
scripts/kiwix-resources Executable file
View File

@@ -0,0 +1,127 @@
#!/usr/bin/env python3
'''
Copyright 2022 Veloman Yunkan <veloman.yunkan@gmail.com>
This program is free software; you can redistribute it and/or modify
it under the terms of the GNU General Public License as published by
the Free Software Foundation; either version 3 of the License, or any
later version.
This program is distributed in the hope that it will be useful, but
WITHOUT ANY WARRANTY; without even the implied warranty of
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
General Public License for more details.
You should have received a copy of the GNU General Public License
along with this program; if not, write to the Free Software
Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA
02110-1301, USA.
'''
import argparse
import hashlib
import os.path
import re
def read_resource_file(resource_file_path):
with open(resource_file_path, 'r') as f:
return [line.strip() for line in f]
def list_resources(resource_file_path):
for resource_path in read_resource_file(resource_file_path):
print(resource_path)
def compute_resource_revision(resource_path):
with open(os.path.join(OUT_DIR, resource_path), 'rb') as f:
return hashlib.sha1(f.read()).hexdigest()[:8]
resource_revisions = {}
def get_resource_revision(res):
if not res in resource_revisions:
preprocess_resource(res)
resource_revisions[res] = compute_resource_revision(res)
return resource_revisions[res]
RESOURCE_WITH_CACHEID_URL_PATTERN=r'(?P<pre>.*/(?P<resource>skin/[^"?]+)\?)KIWIXCACHEID(?P<post>[^"]*)'
def set_cacheid(resource_matchobj):
pre = resource_matchobj.group('pre')
resource = resource_matchobj.group('resource')
post = resource_matchobj.group('post')
cacheid = 'cacheid=' + get_resource_revision(resource)
return pre + cacheid + post
def preprocess_text(s):
if 'KIWIXCACHEID' in s:
s = re.sub(RESOURCE_WITH_CACHEID_URL_PATTERN, set_cacheid, s)
assert not 'KIWIXCACHEID' in s
return s
def get_preprocessed_resource(srcpath):
"""Get the transformed content of a resource
If the resource at srcpath is modified by preprocessing then this function
returns the transformed content of the resource. Otherwise it returns None.
"""
try:
with open(srcpath, 'r') as resource_file:
content = resource_file.read()
preprocessed_content = preprocess_text(content)
return preprocessed_content if preprocessed_content != content else None
except UnicodeDecodeError:
# It was a binary resource
return None
def symlink_resource(src, resource_path):
if os.path.exists(resource_path):
if os.path.islink(resource_path) and os.readlink(resource_path) == src:
return
os.remove(resource_path)
os.symlink(src, resource_path)
def preprocess_resource(resource_path):
print('Preprocessing', resource_path, '...')
resource_dir = os.path.dirname(resource_path)
if resource_dir != '':
os.makedirs(os.path.join(OUT_DIR, resource_dir), exist_ok=True)
srcpath = os.path.join(BASE_DIR, resource_path)
outpath = os.path.join(OUT_DIR, resource_path)
if os.path.exists(outpath):
os.remove(outpath)
preprocessed_content = get_preprocessed_resource(srcpath)
if preprocessed_content is None:
symlink_resource(srcpath, outpath)
else:
with open(outpath, 'w') as target:
print(preprocessed_content, end='', file=target)
def copy_file(src_path, dst_path):
with open(src_path, 'rb') as src:
with open(dst_path, 'wb') as dst:
dst.write(src.read())
def preprocess_resources(resource_file_path):
resource_filename = os.path.basename(resource_file_path)
for resource in read_resource_file(resource_file_path):
preprocess_resource(resource)
copy_file(resource_file_path, os.path.join(OUT_DIR, resource_filename))
if __name__ == "__main__":
parser = argparse.ArgumentParser()
commands = parser.add_mutually_exclusive_group()
commands.add_argument('--list-all', action='store_true')
commands.add_argument('--preprocess', action='store_true')
parser.add_argument('--outdir')
parser.add_argument('resource_file')
args = parser.parse_args()
BASE_DIR = os.path.dirname(os.path.realpath(args.resource_file))
OUT_DIR = args.outdir
if args.list_all:
list_resources(args.resource_file)
elif args.preprocess:
preprocess_resources(args.resource_file)

View File

@@ -1,4 +1,13 @@
res_manager = find_program('kiwix-resources')
res_compiler = find_program('kiwix-compile-resources')
install_data(res_compiler.path(), install_dir:get_option('bindir'))
install_man('kiwix-compile-resources.1')
i18n_compiler = find_program('kiwix-compile-i18n')
install_data(i18n_compiler.path(), install_dir:get_option('bindir'))
install_man('kiwix-compile-i18n.1')

View File

@@ -3,13 +3,15 @@
#include "aria2.h"
#include "xmlrpc.h"
#include <iostream>
#include <algorithm>
#include <sstream>
#include <thread>
#include <chrono>
#include <tools/otherTools.h>
#include <tools/pathTools.h>
#include <tools/stringTools.h>
#include <downloader.h> // For AriaError
#include "tools.h"
#include "tools/pathTools.h"
#include "tools/stringTools.h"
#include "tools/otherTools.h"
#include "downloader.h" // For AriaError
#ifdef _WIN32
# define ARIA2_CMD "aria2c.exe"
@@ -19,15 +21,20 @@
#endif
#define LOG_ARIA_ERROR() \
{ \
std::cerr << "ERROR: aria2 RPC request failed. (" << res << ")." << std::endl; \
std::cerr << (m_curlErrorBuffer[0] ? m_curlErrorBuffer.get() : curl_easy_strerror(res)) << std::endl; \
}
namespace kiwix {
Aria2::Aria2():
mp_aria(nullptr),
m_port(42042),
m_secret("kiwixariarpc"),
mp_curl(nullptr),
m_lock(PTHREAD_MUTEX_INITIALIZER)
m_secret(getNewRpcSecret()),
m_curlErrorBuffer(new char[CURL_ERROR_SIZE]),
mp_curl(nullptr)
{
m_downloadDir = getDataDirectory();
makeDirectory(m_downloadDir);
@@ -58,11 +65,12 @@ Aria2::Aria2():
// Try to use a potential installed aria2c.
callCmd.push_back(ARIA2_CMD);
}
callCmd.push_back("--follow-metalink=mem");
callCmd.push_back("--enable-rpc");
callCmd.push_back(rpc_secret.c_str());
callCmd.push_back(rpc_port.c_str());
callCmd.push_back(download_dir.c_str());
if (fileExists(session_file)) {
if (fileReadable(session_file)) {
callCmd.push_back(inputFile.c_str());
}
callCmd.push_back(session.c_str());
@@ -84,27 +92,21 @@ Aria2::Aria2():
}
mp_aria = Subprocess::run(callCmd);
mp_curl = curl_easy_init();
char errbuf[CURL_ERROR_SIZE];
curl_easy_setopt(mp_curl, CURLOPT_URL, "http://localhost/rpc");
curl_easy_setopt(mp_curl, CURLOPT_PORT, m_port);
curl_easy_setopt(mp_curl, CURLOPT_POST, 1L);
curl_easy_setopt(mp_curl, CURLOPT_ERRORBUFFER, errbuf);
curl_easy_setopt(mp_curl, CURLOPT_ERRORBUFFER, m_curlErrorBuffer.get());
int watchdog = 50;
while(--watchdog) {
sleep(10);
errbuf[0] = 0;
m_curlErrorBuffer[0] = 0;
auto res = curl_easy_perform(mp_curl);
if (res == CURLE_OK) {
break;
} else if (watchdog == 1) {
std::cerr <<" curl_easy_perform() failed." << std::endl;
fprintf(stderr, "\nlibcurl: (%d) ", res);
if (errbuf[0] != 0) {
std::cerr << errbuf << std::endl;
} else {
std::cerr << curl_easy_strerror(res) << std::endl;
}
LOG_ARIA_ERROR();
}
}
if (!watchdog) {
@@ -115,6 +117,7 @@ Aria2::Aria2():
Aria2::~Aria2()
{
std::unique_lock<std::mutex> lock(m_lock);
curl_easy_cleanup(mp_curl);
}
@@ -126,38 +129,44 @@ void Aria2::close()
size_t write_callback_to_iss(char* ptr, size_t size, size_t nmemb, void* userdata)
{
auto str = static_cast<std::stringstream*>(userdata);
str->write(ptr, nmemb);
auto outStream = static_cast<std::stringstream*>(userdata);
outStream->write(ptr, nmemb);
return nmemb;
}
std::string Aria2::doRequest(const MethodCall& methodCall)
{
pthread_mutex_lock(&m_lock);
auto requestContent = methodCall.toString();
std::stringstream stringstream;
std::stringstream outStream;
CURLcode res;
curl_easy_setopt(mp_curl, CURLOPT_POSTFIELDSIZE, requestContent.size());
curl_easy_setopt(mp_curl, CURLOPT_POSTFIELDS, requestContent.c_str());
curl_easy_setopt(mp_curl, CURLOPT_WRITEFUNCTION, &write_callback_to_iss);
curl_easy_setopt(mp_curl, CURLOPT_WRITEDATA, &stringstream);
res = curl_easy_perform(mp_curl);
if (res == CURLE_OK) {
long response_code;
long response_code;
{
std::unique_lock<std::mutex> lock(m_lock);
curl_easy_setopt(mp_curl, CURLOPT_POSTFIELDSIZE, requestContent.size());
curl_easy_setopt(mp_curl, CURLOPT_POSTFIELDS, requestContent.c_str());
curl_easy_setopt(mp_curl, CURLOPT_WRITEFUNCTION, &write_callback_to_iss);
curl_easy_setopt(mp_curl, CURLOPT_WRITEDATA, &outStream);
m_curlErrorBuffer[0] = 0;
res = curl_easy_perform(mp_curl);
if (res != CURLE_OK) {
LOG_ARIA_ERROR();
throw std::runtime_error("Cannot perform request");
}
curl_easy_getinfo(mp_curl, CURLINFO_RESPONSE_CODE, &response_code);
pthread_mutex_unlock(&m_lock);
if (response_code != 200) {
throw std::runtime_error("Invalid return code from aria");
}
auto responseContent = stringstream.str();
MethodResponse response(responseContent);
if (response.isFault()) {
throw AriaError(response.getFault().getFaultString());
}
return responseContent;
}
pthread_mutex_unlock(&m_lock);
throw std::runtime_error("Cannot perform request");
auto responseContent = outStream.str();
if (response_code != 200) {
std::cerr << "ERROR: Invalid return code (" << response_code << ") from aria :" << std::endl;
std::cerr << responseContent << std::endl;
throw std::runtime_error("Invalid return code from aria");
}
MethodResponse response(responseContent);
if (response.isFault()) {
throw AriaError(response.getFault().getFaultString());
}
return responseContent;
}
std::string Aria2::addUri(const std::vector<std::string>& uris, const std::vector<std::pair<std::string, std::string>>& options)
@@ -188,6 +197,13 @@ std::string Aria2::tellStatus(const std::string& gid, const std::vector<std::str
return doRequest(methodCall);
}
std::string Aria2::getNewRpcSecret()
{
std::string uuid = gen_uuid("");
uuid.erase(std::remove(uuid.begin(), uuid.end(), '-'));
return uuid.substr(0, 9);
}
std::vector<std::string> Aria2::tellActive()
{
MethodCall methodCall("aria2.tellActive", m_secret);

View File

@@ -12,8 +12,8 @@
#include "xmlrpc.h"
#include <memory>
#include <mutex>
#include <curl/curl.h>
#include <pthread.h>
namespace kiwix {
@@ -24,8 +24,9 @@ class Aria2
int m_port;
std::string m_secret;
std::string m_downloadDir;
std::unique_ptr<char[]> m_curlErrorBuffer;
CURL* mp_curl;
pthread_mutex_t m_lock;
std::mutex m_lock;
std::string doRequest(const MethodCall& methodCall);
@@ -36,6 +37,7 @@ class Aria2
std::string addUri(const std::vector<std::string>& uri, const std::vector<std::pair<std::string, std::string>>& options = {});
std::string tellStatus(const std::string& gid, const std::vector<std::string>& statusKey);
static std::string getNewRpcSecret();
std::vector<std::string> tellActive();
std::vector<std::string> tellWaiting();
void saveSession();

View File

@@ -18,13 +18,18 @@
*/
#include "book.h"
#include "reader.h"
#include "tools.h"
#include "tools/base64.h"
#include "tools/regexTools.h"
#include "tools/networkTools.h"
#include "tools/otherTools.h"
#include "tools/stringTools.h"
#include "tools/pathTools.h"
#include "tools/archiveTools.h"
#include <zim/archive.h>
#include <zim/item.h>
#include <pugixml.hpp>
namespace kiwix
@@ -35,11 +40,17 @@ Book::Book() :
m_readOnly(false)
{
}
/* Destructor */
Book::~Book()
{
}
Book::Illustrations Book::getIllustrations() const
{
return m_illustrations;
}
bool Book::update(const kiwix::Book& other)
{
if (m_readOnly)
@@ -48,53 +59,38 @@ bool Book::update(const kiwix::Book& other)
if (m_id != other.m_id)
return false;
m_readOnly = other.m_readOnly;
m_path = other.m_path;
m_pathValid = other.m_pathValid;
m_title = other.m_title;
m_description = other.m_description;
m_language = other.m_language;
m_creator = other.m_creator;
m_publisher = other.m_publisher;
m_date = other.m_date;
m_url = other.m_url;
m_name = other.m_name;
m_flavour = other.m_flavour;
m_tags = other.m_tags;
m_origId = other.m_origId;
m_articleCount = other.m_articleCount;
m_mediaCount = other.m_mediaCount;
m_size = other.m_size;
m_favicon = other.m_favicon;
m_faviconMimeType = other.m_faviconMimeType;
m_faviconUrl = other.m_faviconUrl;
m_downloadId = other.m_downloadId;
*this = other;
return true;
}
void Book::update(const kiwix::Reader& reader)
{
m_path = reader.getZimFilePath();
m_pathValid = true;
m_id = reader.getId();
m_title = reader.getTitle();
m_description = reader.getDescription();
m_language = reader.getLanguage();
m_creator = reader.getCreator();
m_publisher = reader.getPublisher();
m_date = reader.getDate();
m_name = reader.getName();
m_flavour = reader.getFlavour();
m_tags = reader.getTags();
m_origId = reader.getOrigId();
m_articleCount = reader.getArticleCount();
m_mediaCount = reader.getMediaCount();
m_size = static_cast<uint64_t>(reader.getFileSize()) << 10;
void Book::update(const zim::Archive& archive) {
m_path = archive.getFilename();
m_pathValid = true;
m_id = getArchiveId(archive);
m_title = getArchiveTitle(archive);
m_description = getMetaDescription(archive);
m_language = getMetaLanguage(archive);
m_creator = getMetaCreator(archive);
m_publisher = getMetaPublisher(archive);
m_date = getMetaDate(archive);
m_name = getMetaName(archive);
m_flavour = getMetaFlavour(archive);
m_tags = getMetaTags(archive);
m_category = getCategoryFromTags();
m_articleCount = getArchiveArticleCount(archive);
m_mediaCount = getArchiveMediaCount(archive);
m_size = static_cast<uint64_t>(getArchiveFileSize(archive)) << 10;
reader.getFavicon(m_favicon, m_faviconMimeType);
m_illustrations.clear();
for ( const auto illustrationSize : archive.getIllustrationSizes() ) {
const auto illustration = std::make_shared<Illustration>();
const zim::Item illustrationItem = archive.getIllustrationItem(illustrationSize);
illustration->width = illustration->height = illustrationSize;
illustration->mimeType = illustrationItem.getMimetype();
illustration->data = illustrationItem.getData();
// NOTE: illustration->url is left uninitialized
m_illustrations.push_back(illustration);
}
}
#define ATTR(name) node.attribute(name).value()
@@ -106,7 +102,7 @@ void Book::updateFromXml(const pugi::xml_node& node, const std::string& baseDir)
path = computeAbsolutePath(baseDir, path);
}
m_path = path;
m_pathValid = fileExists(path);
m_pathValid = fileReadable(path);
m_title = ATTR("title");
m_description = ATTR("description");
m_language = ATTR("language");
@@ -121,12 +117,19 @@ void Book::updateFromXml(const pugi::xml_node& node, const std::string& baseDir)
m_articleCount = strtoull(ATTR("articleCount"), 0, 0);
m_mediaCount = strtoull(ATTR("mediaCount"), 0, 0);
m_size = strtoull(ATTR("size"), 0, 0) << 10;
m_favicon = base64_decode(ATTR("favicon"));
m_faviconMimeType = ATTR("faviconMimeType");
m_faviconUrl = ATTR("faviconUrl");
std::string favicon_mimetype = ATTR("faviconMimeType");
if (! favicon_mimetype.empty()) {
const auto favicon = std::make_shared<Illustration>();
favicon->data = base64_decode(ATTR("favicon"));
favicon->mimeType = favicon_mimetype;
favicon->url = ATTR("faviconUrl");
m_illustrations.assign(1, favicon);
}
try {
m_downloadId = ATTR("downloadId");
} catch(...) {}
const auto catattr = node.attribute("category");
m_category = catattr.empty() ? getCategoryFromTags() : catattr.value();
}
#undef ATTR
@@ -152,10 +155,14 @@ void Book::updateFromOpds(const pugi::xml_node& node, const std::string& urlHost
m_language = VALUE("language");
m_creator = node.child("author").child("name").child_value();
m_publisher = node.child("publisher").child("name").child_value();
m_date = fromOpdsDate(VALUE("updated"));
const std::string dcIssuedDate = VALUE("dc:issued");
m_date = dcIssuedDate.empty() ? VALUE("updated") : dcIssuedDate;
m_date = fromOpdsDate(m_date);
m_name = VALUE("name");
m_flavour = VALUE("flavour");
m_tags = VALUE("tags");
const auto catnode = node.child("category");
m_category = catnode.empty() ? getCategoryFromTags() : catnode.child_value();
m_articleCount = strtoull(VALUE("articleCount"), 0, 0);
m_mediaCount = strtoull(VALUE("mediaCount"), 0, 0);
for(auto linkNode = node.child("link"); linkNode;
@@ -167,8 +174,11 @@ void Book::updateFromOpds(const pugi::xml_node& node, const std::string& urlHost
m_size = strtoull(linkNode.attribute("length").value(), 0, 0);
}
if (rel == "http://opds-spec.org/image/thumbnail") {
m_faviconUrl = urlHost + linkNode.attribute("href").value();
m_faviconMimeType = linkNode.attribute("type").value();
const auto favicon = std::make_shared<Illustration>();
favicon->data.clear();
favicon->url = urlHost + linkNode.attribute("href").value();
favicon->mimeType = linkNode.attribute("type").value();
m_illustrations.assign(1, favicon);
}
}
@@ -179,7 +189,7 @@ std::string Book::getHumanReadableIdFromPath() const
{
std::string id = m_path;
if (!id.empty()) {
kiwix::removeAccents(id);
id = kiwix::removeAccents(id);
#ifdef _WIN32
id = replaceRegex(id, "", "^.*\\\\");
@@ -201,15 +211,54 @@ void Book::setPath(const std::string& path)
: path;
}
const std::string& Book::getFavicon() const {
if (m_favicon.empty() && !m_faviconUrl.empty()) {
try {
m_favicon = download(m_faviconUrl);
} catch(...) {
std::cerr << "Cannot download favicon from " << m_faviconUrl;
const Book::Illustration Book::missingDefaultIllustration;
std::shared_ptr<const Book::Illustration> Book::getIllustration(unsigned int size) const
{
for ( const auto& ilPtr : m_illustrations ) {
if (ilPtr->width == size && ilPtr->height == size) {
return ilPtr;
}
}
return m_favicon;
throw std::runtime_error("Cannot find illustration");
}
const Book::Illustration& Book::getDefaultIllustration() const
{
try {
return *getIllustration(48);
} catch (...) {
return missingDefaultIllustration;
}
}
const std::string& Book::Illustration::getData() const
{
if (data.empty() && !url.empty()) {
const std::lock_guard<std::mutex> l(mutex);
if ( data.empty() ) {
try {
data = download(url);
} catch(...) {
std::cerr << "Cannot download favicon from " << url;
}
}
}
return data;
}
const std::string& Book::getFavicon() const {
return getDefaultIllustration().getData();
}
const std::string& Book::getFaviconUrl() const
{
return getDefaultIllustration().url;
}
const std::string& Book::getFaviconMimeType() const
{
return getDefaultIllustration().mimeType;
}
std::string Book::getTagStr(const std::string& tagName) const {
@@ -220,4 +269,21 @@ bool Book::getTagBool(const std::string& tagName) const {
return convertStrToBool(getTagStr(tagName));
}
std::string Book::getCategory() const
{
return m_category;
}
std::string Book::getCategoryFromTags() const
{
try
{
return getTagStr("category");
}
catch ( const std::out_of_range& )
{
return "";
}
}
}

View File

@@ -1,3 +1,3 @@
#mesondefine VERSION
#mesondefine LIBKIWIX_VERSION

View File

@@ -133,7 +133,7 @@ Downloader::Downloader() :
m_knownDownloads[gid]->updateStatus();
}
} catch (std::exception& e) {
std::cerr << "aria2 tellActive failed : " << e.what();
std::cerr << "aria2 tellActive failed : " << e.what() << std::endl;
}
try {
for (auto gid : mp_aria->tellWaiting()) {
@@ -141,7 +141,7 @@ Downloader::Downloader() :
m_knownDownloads[gid]->updateStatus();
}
} catch (std::exception& e) {
std::cerr << "aria2 tellWaiting failed : " << e.what();
std::cerr << "aria2 tellWaiting failed : " << e.what() << std::endl;
}
}

View File

@@ -1,140 +0,0 @@
/*
* Copyright 2011 Emmanuel Engelhart <kelson@kiwix.org>
*
* This program is free software; you can redistribute it and/or modify
* it under the terms of the GNU General Public License as published by
* the Free Software Foundation; either version 3 of the License, or
* any later version.
*
* This program is distributed in the hope that it will be useful,
* but WITHOUT ANY WARRANTY; without even the implied warranty of
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
* GNU General Public License for more details.
*
* You should have received a copy of the GNU General Public License
* along with this program; if not, write to the Free Software
* Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston,
* MA 02110-1301, USA.
*/
#include "reader.h"
#include <time.h>
#include <zim/search.h>
namespace kiwix
{
Entry::Entry(zim::Article article)
: article(article)
{
}
#define RETURN_IF_INVALID(WHAT) if(!good()) { return (WHAT); }
std::string Entry::getPath() const
{
RETURN_IF_INVALID("");
return article.getLongUrl();
}
std::string Entry::getTitle() const
{
RETURN_IF_INVALID("");
return article.getTitle();
}
std::string Entry::getContent() const
{
RETURN_IF_INVALID("");
return article.getData();
}
zim::Blob Entry::getBlob(offset_type offset) const
{
RETURN_IF_INVALID(zim::Blob());
return article.getData(offset);
}
zim::Blob Entry::getBlob(offset_type offset, size_type size) const
{
RETURN_IF_INVALID(zim::Blob());
return article.getData(offset, size);
}
std::pair<std::string, offset_type> Entry::getDirectAccessInfo() const
{
RETURN_IF_INVALID(std::make_pair("", 0));
return article.getDirectAccessInformation();
}
size_type Entry::getSize() const
{
RETURN_IF_INVALID(0);
return article.getArticleSize();
}
std::string Entry::getMimetype() const
{
RETURN_IF_INVALID("");
try {
return article.getMimeType();
} catch (exception& e) {
return "application/octet-stream";
}
}
bool Entry::isRedirect() const
{
RETURN_IF_INVALID(false);
return article.isRedirect();
}
bool Entry::isLinkTarget() const
{
RETURN_IF_INVALID(false);
return article.isLinktarget();
}
bool Entry::isDeleted() const
{
RETURN_IF_INVALID(false);
return article.isDeleted();
}
Entry Entry::getRedirectEntry() const
{
RETURN_IF_INVALID(Entry());
if ( !article.isRedirect() ) {
throw NoEntry();
}
auto targeted_article = article.getRedirectArticle();
if ( !targeted_article.good()) {
throw NoEntry();
}
return targeted_article;
}
Entry Entry::getFinalEntry() const
{
RETURN_IF_INVALID(Entry());
if (final_article.good()) {
return final_article;
}
int loopCounter = 42;
final_article = article;
while (final_article.isRedirect() && loopCounter--) {
final_article = final_article.getRedirectArticle();
if ( !final_article.good()) {
throw NoEntry();
}
}
// Prevent infinite loops.
if (final_article.isRedirect()) {
throw NoEntry();
}
return final_article;
}
}

View File

@@ -9,6 +9,7 @@
# include <unistd.h>
#endif
#include "tools.h"
#include "tools/pathTools.h"
#include "tools/stringTools.h"

View File

@@ -19,54 +19,184 @@
#include "library.h"
#include "book.h"
#include "reader.h"
#include "libxml_dumper.h"
#include "tools.h"
#include "tools/base64.h"
#include "tools/regexTools.h"
#include "tools/pathTools.h"
#include "tools/stringTools.h"
#include "tools/otherTools.h"
#include "tools/concurrent_cache.h"
#include <pugixml.hpp>
#include <algorithm>
#include <set>
#include <cmath>
#include <unicode/locid.h>
#include <xapian.h>
namespace kiwix
{
/* Constructor */
Library::Library()
namespace
{
std::string iso639_3ToXapian(const std::string& lang) {
return icu::Locale(lang.c_str()).getLanguage();
};
std::string normalizeText(const std::string& text)
{
return removeAccents(text);
}
/* Destructor */
Library::~Library()
bool booksReferToTheSameArchive(const Book& book1, const Book& book2)
{
return book1.isPathValid()
&& book2.isPathValid()
&& book1.getPath() == book2.getPath();
}
template<typename Key, typename Value>
class MultiKeyCache: public ConcurrentCache<std::set<Key>, Value>
{
public:
explicit MultiKeyCache(size_t maxEntries)
: ConcurrentCache<std::set<Key>, Value>(maxEntries)
{}
bool drop(const Key& key)
{
std::unique_lock<std::mutex> l(this->lock_);
bool removed = false;
for(auto& cache_key: this->impl_.keys()) {
if(cache_key.find(key)!=cache_key.end()) {
removed |= this->impl_.drop(cache_key);
}
}
return removed;
}
};
} // unnamed namespace
struct Library::Impl
{
struct Entry : Book
{
Library::Revision lastUpdatedRevision = 0;
};
Library::Revision m_revision;
std::map<std::string, Entry> m_books;
using ArchiveCache = ConcurrentCache<std::string, std::shared_ptr<zim::Archive>>;
std::unique_ptr<ArchiveCache> mp_archiveCache;
using SearcherCache = MultiKeyCache<std::string, std::shared_ptr<ZimSearcher>>;
std::unique_ptr<SearcherCache> mp_searcherCache;
std::vector<kiwix::Bookmark> m_bookmarks;
Xapian::WritableDatabase m_bookDB;
unsigned int getBookCount(const bool localBooks, const bool remoteBooks) const;
Impl();
~Impl();
Impl(Impl&& );
Impl& operator=(Impl&& );
};
Library::Impl::Impl()
: mp_archiveCache(new ArchiveCache(std::max(getEnvVar<int>("KIWIX_ARCHIVE_CACHE_SIZE", 1), 1))),
mp_searcherCache(new SearcherCache(std::max(getEnvVar<int>("KIWIX_SEARCHER_CACHE_SIZE", 1), 1))),
m_bookDB("", Xapian::DB_BACKEND_INMEMORY)
{
}
Library::Impl::~Impl()
{
}
Library::Impl::Impl(Library::Impl&& ) = default;
Library::Impl& Library::Impl::operator=(Library::Impl&& ) = default;
unsigned int
Library::Impl::getBookCount(const bool localBooks, const bool remoteBooks) const
{
unsigned int result = 0;
for (auto& pair: m_books) {
auto& book = pair.second;
if ((!book.getPath().empty() && localBooks)
|| (!book.getUrl().empty() && remoteBooks)) {
result++;
}
}
return result;
}
/* Constructor */
Library::Library()
: mp_impl(new Library::Impl)
{
}
Library::Library(Library&& other)
: mp_impl(std::move(other.mp_impl))
{
}
Library& Library::operator=(Library&& other)
{
mp_impl = std::move(other.mp_impl);
return *this;
}
/* Destructor */
Library::~Library() = default;
bool Library::addBook(const Book& book)
{
std::lock_guard<std::mutex> lock(m_mutex);
++mp_impl->m_revision;
/* Try to find it */
updateBookDB(book);
try {
auto& oldbook = m_books.at(book.getId());
oldbook.update(book);
auto& oldbook = mp_impl->m_books.at(book.getId());
if ( ! booksReferToTheSameArchive(oldbook, book) ) {
dropCache(book.getId());
}
oldbook.update(book); // XXX: This may have no effect if oldbook is readonly
// XXX: Then m_bookDB will become out-of-sync with
// XXX: the real contents of the library.
oldbook.lastUpdatedRevision = mp_impl->m_revision;
return false;
} catch (std::out_of_range&) {
m_books[book.getId()] = book;
auto& newEntry = mp_impl->m_books[book.getId()];
static_cast<Book&>(newEntry) = book;
newEntry.lastUpdatedRevision = mp_impl->m_revision;
size_t new_cache_size = static_cast<size_t>(std::ceil(mp_impl->getBookCount(true, true)*0.1));
if (getEnvVar<int>("KIWIX_ARCHIVE_CACHE_SIZE", -1) <= 0) {
mp_impl->mp_archiveCache->setMaxSize(new_cache_size);
}
if (getEnvVar<int>("KIWIX_SEARCHER_CACHE_SIZE", -1) <= 0) {
mp_impl->mp_searcherCache->setMaxSize(new_cache_size);
}
return true;
}
}
void Library::addBookmark(const Bookmark& bookmark)
{
m_bookmarks.push_back(bookmark);
std::lock_guard<std::mutex> lock(m_mutex);
mp_impl->m_bookmarks.push_back(bookmark);
}
bool Library::removeBookmark(const std::string& zimId, const std::string& url)
{
for(auto it=m_bookmarks.begin(); it!=m_bookmarks.end(); it++) {
std::lock_guard<std::mutex> lock(m_mutex);
for(auto it=mp_impl->m_bookmarks.begin(); it!=mp_impl->m_bookmarks.end(); it++) {
if (it->getBookId() == zimId && it->getUrl() == url) {
m_bookmarks.erase(it);
mp_impl->m_bookmarks.erase(it);
return true;
}
}
@@ -74,20 +204,70 @@ bool Library::removeBookmark(const std::string& zimId, const std::string& url)
}
void Library::dropCache(const std::string& id)
{
mp_impl->mp_archiveCache->drop(id);
mp_impl->mp_searcherCache->drop(id);
}
bool Library::removeBookById(const std::string& id)
{
return m_books.erase(id) == 1;
std::lock_guard<std::mutex> lock(m_mutex);
mp_impl->m_bookDB.delete_document("Q" + id);
dropCache(id);
// We do not change the cache size here
// Most of the time, the book is remove in case of library refresh, it is
// often associated with addBook calls (which will properly set the cache size)
// Having a too big cache is not a problem here (or it would have been before)
// (And setMaxSize doesn't actually reduce the cache size, extra cached items
// will be removed in put or getOrPut).
return mp_impl->m_books.erase(id) == 1;
}
Book& Library::getBookById(const std::string& id)
Library::Revision Library::getRevision() const
{
return m_books.at(id);
std::lock_guard<std::mutex> lock(m_mutex);
return mp_impl->m_revision;
}
Book& Library::getBookByPath(const std::string& path)
uint32_t Library::removeBooksNotUpdatedSince(Revision libraryRevision)
{
for(auto& it: m_books) {
BookIdCollection booksToRemove;
{
std::lock_guard<std::mutex> lock(m_mutex);
for ( const auto& entry : mp_impl->m_books) {
if ( entry.second.lastUpdatedRevision <= libraryRevision ) {
booksToRemove.push_back(entry.first);
}
}
}
uint32_t countOfRemovedBooks = 0;
for ( const auto& id : booksToRemove ) {
if ( removeBookById(id) )
++countOfRemovedBooks;
}
return countOfRemovedBooks;
}
const Book& Library::getBookById(const std::string& id) const
{
// XXX: Doesn't make sense to lock this operation since it cannot
// XXX: guarantee thread-safety because of its return type
return mp_impl->m_books.at(id);
}
Book Library::getBookByIdThreadSafe(const std::string& id) const
{
std::lock_guard<std::mutex> lock(m_mutex);
return getBookById(id);
}
const Book& Library::getBookByPath(const std::string& path) const
{
// XXX: Doesn't make sense to lock this operation since it cannot
// XXX: guarantee thread-safety because of its return type
for(auto& it: mp_impl->m_books) {
auto& book = it.second;
if (book.getPath() == path)
return book;
@@ -97,113 +277,141 @@ Book& Library::getBookByPath(const std::string& path)
throw std::out_of_range(ss.str());
}
std::shared_ptr<Reader> Library::getReaderById(const std::string& id)
std::shared_ptr<zim::Archive> Library::getArchiveById(const std::string& id)
{
try {
return m_readers.at(id);
} catch (std::out_of_range& e) {}
auto book = getBookById(id);
if (!book.isPathValid())
return mp_impl->mp_archiveCache->getOrPut(id,
[&](){
auto book = getBookById(id);
if (!book.isPathValid()) {
throw std::invalid_argument("");
}
return std::make_shared<zim::Archive>(book.getPath());
});
} catch (std::invalid_argument&) {
return nullptr;
auto sptr = make_shared<Reader>(book.getPath());
m_readers[id] = sptr;
return sptr;
}
}
std::shared_ptr<ZimSearcher> Library::getSearcherByIds(const BookIdSet& ids)
{
assert(!ids.empty());
try {
return mp_impl->mp_searcherCache->getOrPut(ids,
[&](){
std::vector<zim::Archive> archives;
for(auto& id:ids) {
auto archive = getArchiveById(id);
if(!archive) {
throw std::invalid_argument("");
}
archives.push_back(*archive);
}
return std::make_shared<ZimSearcher>(zim::Searcher(archives));
});
} catch (std::invalid_argument&) {
return nullptr;
}
}
unsigned int Library::getBookCount(const bool localBooks,
const bool remoteBooks)
const bool remoteBooks) const
{
unsigned int result = 0;
for (auto& pair: m_books) {
auto& book = pair.second;
if ((!book.getPath().empty() && localBooks)
|| (book.getPath().empty() && remoteBooks)) {
result++;
std::lock_guard<std::mutex> lock(m_mutex);
return mp_impl->getBookCount(localBooks, remoteBooks);
}
bool Library::writeToFile(const std::string& path) const
{
const auto allBookIds = getBooksIds();
auto baseDir = removeLastPathElement(path);
LibXMLDumper dumper(this);
dumper.setBaseDir(baseDir);
std::string xml;
{
std::lock_guard<std::mutex> lock(m_mutex);
xml = dumper.dumpLibXMLContent(allBookIds);
};
return writeTextFile(path, xml);
}
bool Library::writeBookmarksToFile(const std::string& path) const
{
LibXMLDumper dumper(this);
// NOTE: LibXMLDumper::dumpLibXMLBookmark uses Library in a thread-safe way
const std::string xml = dumper.dumpLibXMLBookmark();
return writeTextFile(path, xml);
}
Library::AttributeCounts Library::getBookAttributeCounts(BookStrPropMemFn p) const
{
std::lock_guard<std::mutex> lock(m_mutex);
AttributeCounts propValueCounts;
for (const auto& pair: mp_impl->m_books) {
const auto& book = pair.second;
if (book.getOrigId().empty()) {
propValueCounts[(book.*p)()] += 1;
}
}
return propValueCounts;
}
std::vector<std::string> Library::getBookPropValueSet(BookStrPropMemFn p) const
{
std::vector<std::string> result;
for ( const auto& kv : getBookAttributeCounts(p) ) {
result.push_back(kv.first);
}
return result;
}
bool Library::writeToFile(const std::string& path)
std::vector<std::string> Library::getBooksLanguages() const
{
auto baseDir = removeLastPathElement(path);
LibXMLDumper dumper(this);
dumper.setBaseDir(baseDir);
return writeTextFile(path, dumper.dumpLibXMLContent(getBooksIds()));
return getBookPropValueSet(&Book::getLanguage);
}
bool Library::writeBookmarksToFile(const std::string& path)
Library::AttributeCounts Library::getBooksLanguagesWithCounts() const
{
LibXMLDumper dumper(this);
return writeTextFile(path, dumper.dumpLibXMLBookmark());
return getBookAttributeCounts(&Book::getLanguage);
}
std::vector<std::string> Library::getBooksLanguages()
std::vector<std::string> Library::getBooksCategories() const
{
std::vector<std::string> booksLanguages;
std::map<std::string, bool> booksLanguagesMap;
std::lock_guard<std::mutex> lock(m_mutex);
std::set<std::string> categories;
for (auto& pair: m_books) {
auto& book = pair.second;
auto& language = book.getLanguage();
if (booksLanguagesMap.find(language) == booksLanguagesMap.end()) {
if (book.getOrigId().empty()) {
booksLanguagesMap[language] = true;
booksLanguages.push_back(language);
}
for (const auto& pair: mp_impl->m_books) {
const auto& book = pair.second;
const auto& c = book.getCategory();
if ( !c.empty() ) {
categories.insert(c);
}
}
return booksLanguages;
return std::vector<std::string>(categories.begin(), categories.end());
}
std::vector<std::string> Library::getBooksCreators()
std::vector<std::string> Library::getBooksCreators() const
{
std::vector<std::string> booksCreators;
std::map<std::string, bool> booksCreatorsMap;
for (auto& pair: m_books) {
auto& book = pair.second;
auto& creator = book.getCreator();
if (booksCreatorsMap.find(creator) == booksCreatorsMap.end()) {
if (book.getOrigId().empty()) {
booksCreatorsMap[creator] = true;
booksCreators.push_back(creator);
}
}
}
return booksCreators;
return getBookPropValueSet(&Book::getCreator);
}
std::vector<std::string> Library::getBooksPublishers()
std::vector<std::string> Library::getBooksPublishers() const
{
std::vector<std::string> booksPublishers;
std::map<std::string, bool> booksPublishersMap;
for (auto& pair:m_books) {
auto& book = pair.second;
auto& publisher = book.getPublisher();
if (booksPublishersMap.find(publisher) == booksPublishersMap.end()) {
if (book.getOrigId().empty()) {
booksPublishersMap[publisher] = true;
booksPublishers.push_back(publisher);
}
}
}
return booksPublishers;
return getBookPropValueSet(&Book::getPublisher);
}
const std::vector<kiwix::Bookmark> Library::getBookmarks(bool onlyValidBookmarks)
const std::vector<kiwix::Bookmark> Library::getBookmarks(bool onlyValidBookmarks) const
{
if (!onlyValidBookmarks) {
return m_bookmarks;
return mp_impl->m_bookmarks;
}
std::vector<kiwix::Bookmark> validBookmarks;
auto booksId = getBooksIds();
for(auto& bookmark:m_bookmarks) {
std::lock_guard<std::mutex> lock(m_mutex);
for(auto& bookmark:mp_impl->m_bookmarks) {
if (std::find(booksId.begin(), booksId.end(), bookmark.getBookId()) != booksId.end()) {
validBookmarks.push_back(bookmark);
}
@@ -211,37 +419,216 @@ const std::vector<kiwix::Bookmark> Library::getBookmarks(bool onlyValidBookmarks
return validBookmarks;
}
std::vector<std::string> Library::getBooksIds()
Library::BookIdCollection Library::getBooksIds() const
{
std::vector<std::string> bookIds;
std::lock_guard<std::mutex> lock(m_mutex);
BookIdCollection bookIds;
for (auto& pair: m_books) {
for (auto& pair: mp_impl->m_books) {
bookIds.push_back(pair.first);
}
return bookIds;
}
std::vector<std::string> Library::filter(const std::string& search)
void Library::updateBookDB(const Book& book)
{
if (search.empty()) {
return getBooksIds();
Xapian::Stem stemmer;
Xapian::TermGenerator indexer;
const std::string lang = book.getLanguage();
try {
stemmer = Xapian::Stem(iso639_3ToXapian(lang));
indexer.set_stemmer(stemmer);
indexer.set_stemming_strategy(Xapian::TermGenerator::STEM_SOME);
} catch (...) {}
Xapian::Document doc;
indexer.set_document(doc);
const std::string title = normalizeText(book.getTitle());
const std::string desc = normalizeText(book.getDescription());
// Index title and description without prefixes for general search
indexer.index_text(title);
indexer.increase_termpos();
indexer.index_text(desc);
// Index all fields for field-based search
indexer.index_text(title, 1, "S");
indexer.index_text(desc, 1, "XD");
indexer.index_text(lang, 1, "L");
indexer.index_text(normalizeText(book.getCreator()), 1, "A");
indexer.index_text(normalizeText(book.getPublisher()), 1, "XP");
indexer.index_text(normalizeText(book.getName()), 1, "XN");
indexer.index_text(normalizeText(book.getCategory()), 1, "XC");
for ( const auto& tag : split(normalizeText(book.getTags()), ";") ) {
doc.add_boolean_term("XT" + tag);
if ( tag[0] != '_' ) {
indexer.increase_termpos();
indexer.index_text(tag);
}
}
return filter(Filter().query(search));
const std::string idterm = "Q" + book.getId();
doc.add_boolean_term(idterm);
doc.set_data(book.getId());
mp_impl->m_bookDB.replace_document(idterm, doc);
}
namespace
{
bool willSelectEverything(const Xapian::Query& query)
{
return query.get_type() == Xapian::Query::LEAF_MATCH_ALL;
}
std::vector<std::string> Library::filter(const Filter& filter)
Xapian::Query buildXapianQueryFromFilterQuery(const Filter& filter)
{
std::vector<std::string> bookIds;
for(auto& pair:m_books) {
auto book = pair.second;
if(filter.accept(book)) {
bookIds.push_back(pair.first);
if ( !filter.hasQuery() || filter.getQuery().empty() ) {
// This is a thread-safe way to construct an equivalent of
// a Xapian::Query::MatchAll query
return Xapian::Query(std::string());
}
Xapian::QueryParser queryParser;
queryParser.set_default_op(Xapian::Query::OP_AND);
queryParser.add_prefix("title", "S");
queryParser.add_prefix("description", "XD");
queryParser.add_prefix("name", "XN");
queryParser.add_prefix("category", "XC");
queryParser.add_prefix("lang", "L");
queryParser.add_prefix("publisher", "XP");
queryParser.add_prefix("creator", "A");
queryParser.add_prefix("tag", "XT");
const auto partialQueryFlag = filter.queryIsPartial()
? Xapian::QueryParser::FLAG_PARTIAL
: 0;
// Language assumed for the query is not known for sure so stemming
// is not applied
//queryParser.set_stemmer(Xapian::Stem(iso639_3ToXapian(???)));
//queryParser.set_stemming_strategy(Xapian::QueryParser::STEM_SOME);
const auto flags = Xapian::QueryParser::FLAG_PHRASE
| Xapian::QueryParser::FLAG_BOOLEAN
| Xapian::QueryParser::FLAG_BOOLEAN_ANY_CASE
| Xapian::QueryParser::FLAG_LOVEHATE
| Xapian::QueryParser::FLAG_WILDCARD
| partialQueryFlag;
return queryParser.parse_query(normalizeText(filter.getQuery()), flags);
}
Xapian::Query nameQuery(const std::string& name)
{
return Xapian::Query("XN" + normalizeText(name));
}
Xapian::Query categoryQuery(const std::string& category)
{
return Xapian::Query("XC" + normalizeText(category));
}
Xapian::Query langQuery(const std::string& lang)
{
return Xapian::Query("L" + normalizeText(lang));
}
Xapian::Query publisherQuery(const std::string& publisher)
{
Xapian::QueryParser queryParser;
queryParser.set_default_op(Xapian::Query::OP_OR);
queryParser.set_stemming_strategy(Xapian::QueryParser::STEM_NONE);
const auto flags = 0;
const auto q = queryParser.parse_query(normalizeText(publisher), flags, "XP");
return Xapian::Query(Xapian::Query::OP_PHRASE, q.get_terms_begin(), q.get_terms_end(), q.get_length());
}
Xapian::Query creatorQuery(const std::string& creator)
{
Xapian::QueryParser queryParser;
queryParser.set_default_op(Xapian::Query::OP_OR);
queryParser.set_stemming_strategy(Xapian::QueryParser::STEM_NONE);
const auto flags = 0;
const auto q = queryParser.parse_query(normalizeText(creator), flags, "A");
return Xapian::Query(Xapian::Query::OP_PHRASE, q.get_terms_begin(), q.get_terms_end(), q.get_length());
}
Xapian::Query tagsQuery(const Filter::Tags& acceptTags, const Filter::Tags& rejectTags)
{
Xapian::Query q = Xapian::Query(std::string());
if (!acceptTags.empty()) {
for ( const auto& tag : acceptTags )
q &= Xapian::Query("XT" + normalizeText(tag));
}
if (!rejectTags.empty()) {
for ( const auto& tag : rejectTags )
q = Xapian::Query(Xapian::Query::OP_AND_NOT, q, "XT" + normalizeText(tag));
}
return q;
}
Xapian::Query buildXapianQuery(const Filter& filter)
{
auto q = buildXapianQueryFromFilterQuery(filter);
if ( filter.hasName() ) {
q = Xapian::Query(Xapian::Query::OP_AND, q, nameQuery(filter.getName()));
}
if ( filter.hasCategory() ) {
q = Xapian::Query(Xapian::Query::OP_AND, q, categoryQuery(filter.getCategory()));
}
if ( filter.hasLang() ) {
q = Xapian::Query(Xapian::Query::OP_AND, q, langQuery(filter.getLang()));
}
if ( filter.hasPublisher() ) {
q = Xapian::Query(Xapian::Query::OP_AND, q, publisherQuery(filter.getPublisher()));
}
if ( filter.hasCreator() ) {
q = Xapian::Query(Xapian::Query::OP_AND, q, creatorQuery(filter.getCreator()));
}
if ( !filter.getAcceptTags().empty() || !filter.getRejectTags().empty() ) {
const auto tq = tagsQuery(filter.getAcceptTags(), filter.getRejectTags());
q = Xapian::Query(Xapian::Query::OP_AND, q, tq);;
}
return q;
}
} // unnamed namespace
Library::BookIdCollection Library::filterViaBookDB(const Filter& filter) const
{
const auto query = buildXapianQuery(filter);
if ( willSelectEverything(query) )
return getBooksIds();
BookIdCollection bookIds;
std::lock_guard<std::mutex> lock(m_mutex);
Xapian::Enquire enquire(mp_impl->m_bookDB);
enquire.set_query(query);
const auto results = enquire.get_mset(0, mp_impl->m_books.size());
for ( auto it = results.begin(); it != results.end(); ++it ) {
bookIds.push_back(it.get_document().get_data());
}
return bookIds;
}
Library::BookIdCollection Library::filter(const Filter& filter) const
{
BookIdCollection result;
const auto preliminaryResult = filterViaBookDB(filter);
std::lock_guard<std::mutex> lock(m_mutex);
for(auto id : preliminaryResult) {
if(filter.accept(mp_impl->m_books.at(id))) {
result.push_back(id);
}
}
return bookIds;
return result;
}
template<supportedListSortBy SORT>
@@ -257,13 +644,13 @@ struct KEY_TYPE<SIZE> {
template<supportedListSortBy sort>
class Comparator {
private:
Library* lib;
bool ascending;
const Library* const lib;
const bool ascending;
inline typename KEY_TYPE<sort>::TYPE get_key(const std::string& id);
public:
Comparator(Library* lib, bool ascending) : lib(lib), ascending(ascending) {}
Comparator(const Library* lib, bool ascending) : lib(lib), ascending(ascending) {}
inline bool operator() (const std::string& id1, const std::string& id2) {
if (ascending) {
return get_key(id1) < get_key(id2);
@@ -303,8 +690,13 @@ std::string Comparator<PUBLISHER>::get_key(const std::string& id)
return lib->getBookById(id).getPublisher();
}
void Library::sort(std::vector<std::string>& bookIds, supportedListSortBy sort, bool ascending)
void Library::sort(BookIdCollection& bookIds, supportedListSortBy sort, bool ascending) const
{
// NOTE: Can reimplement this method in a way that doesn't require locking
// NOTE: for the entire duration of the sort. Will need to obtain (under a
// NOTE: lock) the required atributes from the books once, and then the
// NOTE: sorting will run on a copy of data without locking.
std::lock_guard<std::mutex> lock(m_mutex);
switch(sort) {
case TITLE:
std::sort(bookIds.begin(), bookIds.end(), Comparator<TITLE>(this, ascending));
@@ -327,48 +719,6 @@ void Library::sort(std::vector<std::string>& bookIds, supportedListSortBy sort,
}
std::vector<std::string> Library::listBooksIds(
int mode,
supportedListSortBy sortBy,
const std::string& search,
const std::string& language,
const std::string& creator,
const std::string& publisher,
const std::vector<std::string>& tags,
size_t maxSize) {
Filter _filter;
if (mode & LOCAL)
_filter.local(true);
if (mode & NOLOCAL)
_filter.local(false);
if (mode & VALID)
_filter.valid(true);
if (mode & NOVALID)
_filter.valid(false);
if (mode & REMOTE)
_filter.remote(true);
if (mode & NOREMOTE)
_filter.remote(false);
if (!tags.empty())
_filter.acceptTags(tags);
if (maxSize != 0)
_filter.maxSize(maxSize);
if (!language.empty())
_filter.lang(language);
if (!publisher.empty())
_filter.publisher(publisher);
if (!creator.empty())
_filter.creator(creator);
if (!search.empty())
_filter.query(search);
auto bookIds = filter(_filter);
sort(bookIds, sortBy, true);
return bookIds;
}
Filter::Filter()
: activeFilters(0),
_maxSize(0)
@@ -391,6 +741,7 @@ enum filterTypes {
MAXSIZE = FLAG(11),
QUERY = FLAG(12),
NAME = FLAG(13),
CATEGORY = FLAG(14),
};
Filter& Filter::local(bool accept)
@@ -429,20 +780,27 @@ Filter& Filter::valid(bool accept)
return *this;
}
Filter& Filter::acceptTags(std::vector<std::string> tags)
Filter& Filter::acceptTags(const Tags& tags)
{
_acceptTags = tags;
activeFilters |= ACCEPTTAGS;
return *this;
}
Filter& Filter::rejectTags(std::vector<std::string> tags)
Filter& Filter::rejectTags(const Tags& tags)
{
_rejectTags = tags;
activeFilters |= REJECTTAGS;
return *this;
}
Filter& Filter::category(std::string category)
{
_category = category;
activeFilters |= CATEGORY;
return *this;
}
Filter& Filter::lang(std::string lang)
{
_lang = lang;
@@ -471,9 +829,10 @@ Filter& Filter::maxSize(size_t maxSize)
return *this;
}
Filter& Filter::query(std::string query)
Filter& Filter::query(std::string query, bool partial)
{
_query = query;
_queryIsPartial = partial;
activeFilters |= QUERY;
return *this;
}
@@ -487,6 +846,36 @@ Filter& Filter::name(std::string name)
#define ACTIVE(X) (activeFilters & (X))
#define FILTER(TAG, TEST) if (ACTIVE(TAG) && !(TEST)) { return false; }
bool Filter::hasQuery() const
{
return ACTIVE(QUERY);
}
bool Filter::hasName() const
{
return ACTIVE(NAME);
}
bool Filter::hasCategory() const
{
return ACTIVE(CATEGORY);
}
bool Filter::hasLang() const
{
return ACTIVE(LANG);
}
bool Filter::hasPublisher() const
{
return ACTIVE(_PUBLISHER);
}
bool Filter::hasCreator() const
{
return ACTIVE(_CREATOR);
}
bool Filter::accept(const Book& book) const
{
auto local = !book.getPath().empty();
@@ -502,40 +891,8 @@ bool Filter::accept(const Book& book) const
FILTER(_NOREMOTE, !remote)
FILTER(MAXSIZE, book.getSize() <= _maxSize)
FILTER(LANG, book.getLanguage() == _lang)
FILTER(_PUBLISHER, book.getPublisher() == _publisher)
FILTER(_CREATOR, book.getCreator() == _creator)
FILTER(NAME, book.getName() == _name)
if (ACTIVE(ACCEPTTAGS)) {
if (!_acceptTags.empty()) {
auto vBookTags = split(book.getTags(), ";");
std::set<std::string> sBookTags(vBookTags.begin(), vBookTags.end());
for (auto& t: _acceptTags) {
if (sBookTags.find(t) == sBookTags.end()) {
return false;
}
}
}
}
if (ACTIVE(REJECTTAGS)) {
if (!_rejectTags.empty()) {
auto vBookTags = split(book.getTags(), ";");
std::set<std::string> sBookTags(vBookTags.begin(), vBookTags.end());
for (auto& t: _rejectTags) {
if (sBookTags.find(t) != sBookTags.end()) {
return false;
}
}
}
}
if ( ACTIVE(QUERY)
&& !(matchRegex(book.getTitle(), "\\Q" + _query + "\\E")
|| matchRegex(book.getDescription(), "\\Q" + _query + "\\E")))
return false;
return true;
}
}

View File

@@ -20,15 +20,15 @@
#include "libxml_dumper.h"
#include "book.h"
#include "tools.h"
#include "tools/base64.h"
#include "tools/stringTools.h"
#include "tools/otherTools.h"
#include "tools/pathTools.h"
namespace kiwix
{
/* Constructor */
LibXMLDumper::LibXMLDumper(Library* library)
LibXMLDumper::LibXMLDumper(const Library* library)
: library(library)
{
}
@@ -60,10 +60,13 @@ void LibXMLDumper::handleBook(Book book, pugi::xml_node root_node) {
ADD_ATTR_NOT_EMPTY(entry_node, "name", book.getName());
ADD_ATTR_NOT_EMPTY(entry_node, "flavour", book.getFlavour());
ADD_ATTR_NOT_EMPTY(entry_node, "tags", book.getTags());
ADD_ATTR_NOT_EMPTY(entry_node, "faviconMimeType", book.getFaviconMimeType());
ADD_ATTR_NOT_EMPTY(entry_node, "faviconUrl", book.getFaviconUrl());
if (!book.getFavicon().empty())
ADD_ATTRIBUTE(entry_node, "favicon", base64_encode(book.getFavicon()));
try {
auto defaultIllustration = book.getIllustration(48);
ADD_ATTR_NOT_EMPTY(entry_node, "faviconMimeType", defaultIllustration->mimeType);
ADD_ATTR_NOT_EMPTY(entry_node, "faviconUrl", defaultIllustration->url);
if (!defaultIllustration->getData().empty())
ADD_ATTRIBUTE(entry_node, "favicon", base64_encode(defaultIllustration->getData()));
} catch(...) {}
} else {
ADD_ATTRIBUTE(entry_node, "origId", book.getOrigId());
}
@@ -91,7 +94,7 @@ void LibXMLDumper::handleBookmark(Bookmark bookmark, pugi::xml_node root_node) {
auto book_node = entry_node.append_child("book");
try {
auto book = library->getBookById(bookmark.getBookId());
auto book = library->getBookByIdThreadSafe(bookmark.getBookId());
ADD_TEXT_ENTRY(book_node, "id", book.getId());
ADD_TEXT_ENTRY(book_node, "title", book.getTitle());
ADD_TEXT_ENTRY(book_node, "language", book.getLanguage());

View File

@@ -19,34 +19,88 @@
#include "manager.h"
#include "tools.h"
#include "tools/pathTools.h"
#include <pugixml.hpp>
namespace kiwix
{
namespace
{
struct NoDelete
{
template<class T> void operator()(T*) {}
};
} // unnamed namespace
////////////////////////////////////////////////////////////////////////////////
// LibraryManipulator
////////////////////////////////////////////////////////////////////////////////
LibraryManipulator::LibraryManipulator(Library* library)
: library(*library)
{}
LibraryManipulator::~LibraryManipulator()
{}
bool LibraryManipulator::addBookToLibrary(const Book& book)
{
const auto ret = library.addBook(book);
if ( ret ) {
bookWasAddedToLibrary(book);
}
return ret;
}
void LibraryManipulator::addBookmarkToLibrary(const Bookmark& bookmark)
{
library.addBookmark(bookmark);
bookmarkWasAddedToLibrary(bookmark);
}
uint32_t LibraryManipulator::removeBooksNotUpdatedSince(Library::Revision rev)
{
const auto n = library.removeBooksNotUpdatedSince(rev);
if ( n != 0 ) {
booksWereRemovedFromLibrary();
}
return n;
}
void LibraryManipulator::bookWasAddedToLibrary(const Book& book)
{
}
void LibraryManipulator::bookmarkWasAddedToLibrary(const Bookmark& bookmark)
{
}
void LibraryManipulator::booksWereRemovedFromLibrary()
{
}
////////////////////////////////////////////////////////////////////////////////
// Manager
////////////////////////////////////////////////////////////////////////////////
/* Constructor */
Manager::Manager(LibraryManipulator* manipulator):
writableLibraryPath(""),
manipulator(manipulator),
mustDeleteManipulator(false)
manipulator(manipulator, NoDelete())
{
}
Manager::Manager(Library* library) :
writableLibraryPath(""),
manipulator(new DefaultLibraryManipulator(library)),
mustDeleteManipulator(true)
manipulator(new LibraryManipulator(library))
{
}
/* Destructor */
Manager::~Manager()
{
if (mustDeleteManipulator) {
delete manipulator;
}
}
bool Manager::parseXmlDom(const pugi::xml_document& doc,
bool readOnly,
const std::string& libraryPath,
@@ -80,7 +134,7 @@ bool Manager::readXml(const std::string& xml,
{
pugi::xml_document doc;
pugi::xml_parse_result result
= doc.load_buffer_inplace((void*)xml.data(), xml.size());
= doc.load_buffer((void*)xml.data(), xml.size());
if (result) {
this->parseXmlDom(doc, readOnly, libraryPath, trustLibrary);
@@ -124,7 +178,7 @@ bool Manager::readOpds(const std::string& content, const std::string& urlHost)
{
pugi::xml_document doc;
pugi::xml_parse_result result
= doc.load_buffer_inplace((void*)content.data(), content.size());
= doc.load_buffer((void*)content.data(), content.size());
if (result) {
this->parseOpdsDom(doc, urlHost);
@@ -214,8 +268,8 @@ bool Manager::readBookFromPath(const std::string& path, kiwix::Book* book)
tmp_path = computeAbsolutePath(getCurrentDirectory(), path);
}
try {
kiwix::Reader reader(tmp_path);
book->update(reader);
zim::Archive archive(tmp_path);
book->update(archive);
book->setPathValid(true);
} catch (const std::exception& e) {
book->setPathValid(false);
@@ -248,4 +302,21 @@ bool Manager::readBookmarkFile(const std::string& path)
return true;
}
void Manager::reload(const Paths& paths)
{
const auto libRevision = manipulator->getLibrary().getRevision();
for (std::string path : paths) {
if (!path.empty()) {
if ( kiwix::isRelativePath(path) )
path = kiwix::computeAbsolutePath(kiwix::getCurrentDirectory(), path);
if (!readFile(path, false, true)) {
throw std::runtime_error("Failed to load the XML library file '" + path + "'.");
}
}
}
manipulator->removeBooksNotUpdatedSince(libRevision);
}
}

View File

@@ -6,10 +6,7 @@ kiwix_sources = [
'libxml_dumper.cpp',
'opds_dumper.cpp',
'downloader.cpp',
'reader.cpp',
'entry.cpp',
'server.cpp',
'searcher.cpp',
'search_renderer.cpp',
'subprocess.cpp',
'aria2.cpp',
@@ -19,13 +16,21 @@ kiwix_sources = [
'tools/stringTools.cpp',
'tools/networkTools.cpp',
'tools/otherTools.cpp',
'tools/archiveTools.cpp',
'kiwixserve.cpp',
'name_mapper.cpp',
'server/byte_range.cpp',
'server/etag.cpp',
'server/request_context.cpp',
'server/response.cpp'
'server/response.cpp',
'server/internalServer.cpp',
'server/internalServer_catalog_v2.cpp',
'server/i18n.cpp',
'opds_catalog.cpp',
'version.cpp'
]
kiwix_sources += lib_resources
kiwix_sources += i18n_resources
if host_machine.system() == 'windows'
kiwix_sources += 'subprocess_windows.cpp'
@@ -33,15 +38,7 @@ else
kiwix_sources += 'subprocess_unix.cpp'
endif
if 'android' in wrapper
install_dir = 'kiwix-lib/jniLibs/' + meson.get_cross_property('android_abi')
else
install_dir = get_option('libdir')
endif
if 'android' in wrapper or 'java' in wrapper
subdir('wrapper/java')
endif
install_dir = get_option('libdir')
config_h = configure_file(output : 'kiwix_config.h',
configuration : conf,
@@ -52,6 +49,7 @@ kiwixlib = library('kiwix',
kiwix_sources,
include_directories : inc,
dependencies : all_deps,
link_args: extra_libs,
version: meson.project_version(),
install: true,
install_dir: install_dir)

View File

@@ -1,5 +1,5 @@
/*
* Copyright (C) 2013 Emmanuel Engelhart <kelson@kiwix.org>
* Copyright 2020 Emmanuel Engelhart <kelson@kiwix.org>
*
* This program is free software; you can redistribute it and/or modify
* it under the terms of the GNU General Public License as published by
@@ -17,9 +17,8 @@
* MA 02110-1301, USA.
*/
package org.kiwix.kiwixlib;
#include <microhttpd.h>
public class JNIKiwixInt
{
public int value;
}
#if MHD_VERSION < 0x00097002
typedef int MHD_Result;
#endif

View File

@@ -51,12 +51,54 @@ HumanReadableNameMapper::HumanReadableNameMapper(kiwix::Library& library, bool w
}
}
std::string HumanReadableNameMapper::getNameForId(const std::string& id) {
std::string HumanReadableNameMapper::getNameForId(const std::string& id) const {
return m_idToName.at(id);
}
std::string HumanReadableNameMapper::getIdForName(const std::string& name) {
std::string HumanReadableNameMapper::getIdForName(const std::string& name) const {
return m_nameToId.at(name);
}
////////////////////////////////////////////////////////////////////////////////
// UpdatableNameMapper
////////////////////////////////////////////////////////////////////////////////
UpdatableNameMapper::UpdatableNameMapper(Library& lib, bool withAlias)
: library(lib)
, withAlias(withAlias)
{
update();
}
void UpdatableNameMapper::update()
{
const auto newNameMapper = new HumanReadableNameMapper(library, withAlias);
std::lock_guard<std::mutex> lock(mutex);
nameMapper.reset(newNameMapper);
}
UpdatableNameMapper::NameMapperHandle
UpdatableNameMapper::currentNameMapper() const
{
// Return a copy of the handle to the current NameMapper object. It will
// ensure that the object survives any call to UpdatableNameMapper::update()
// made before the completion of any pending operation on that object.
std::lock_guard<std::mutex> lock(mutex);
return nameMapper;
}
std::string UpdatableNameMapper::getNameForId(const std::string& id) const
{
// Ensure that the current nameMapper object survives a concurrent call
// to UpdatableNameMapper::update()
return currentNameMapper()->getNameForId(id);
}
std::string UpdatableNameMapper::getIdForName(const std::string& name) const
{
// Ensure that the current nameMapper object survives a concurrent call
// to UpdatableNameMapper::update()
return currentNameMapper()->getIdForName(name);
}
}

74
src/opds_catalog.cpp Normal file
View File

@@ -0,0 +1,74 @@
/*
* Copyright 2021 Veloman Yunkan <veloman.yunkan@gmail.com>
*
* This program is free software; you can redistribute it and/or modify
* it under the terms of the GNU General Public License as published by
* the Free Software Foundation; either version 3 of the License, or
* any later version.
*
* This program is distributed in the hope that it will be useful,
* but WITHOUT ANY WARRANTY; without even the implied warranty of
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
* GNU General Public License for more details.
*
* You should have received a copy of the GNU General Public License
* along with this program; if not, write to the Free Software
* Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston,
* MA 02110-1301, USA.
*/
#include "opds_catalog.h"
#include "tools/stringTools.h"
#include <sstream>
namespace kiwix
{
namespace
{
const char opdsSearchEndpoint[] = "/catalog/v2/entries";
enum Separator { AMP };
std::ostringstream& operator<<(std::ostringstream& oss, Separator sep)
{
if ( oss.tellp() > 0 )
oss << "&";
return oss;
}
std::string buildSearchString(const Filter& f)
{
std::ostringstream oss;
if ( f.hasQuery() )
oss << AMP << "q=" << urlEncode(f.getQuery());
if ( f.hasCategory() )
oss << AMP << "category=" << urlEncode(f.getCategory());
if ( f.hasLang() )
oss << AMP << "lang=" << urlEncode(f.getLang());
if ( f.hasName() )
oss << AMP << "name=" << urlEncode(f.getName());
if ( !f.getAcceptTags().empty() )
oss << AMP << "tag=" << urlEncode(join(f.getAcceptTags(), ";"));
return oss.str();
}
} // unnamed namespace
std::string getSearchUrl(const Filter& f)
{
const std::string searchString = buildSearchString(f);
if ( searchString.empty() )
return opdsSearchEndpoint;
else
return opdsSearchEndpoint + ("?" + searchString);
}
} // namespace kiwix

View File

@@ -20,11 +20,15 @@
#include "opds_dumper.h"
#include "book.h"
#include "kiwixlib-resources.h"
#include <mustache.hpp>
#include "tools/stringTools.h"
#include "tools/otherTools.h"
#include <iomanip>
namespace kiwix
{
/* Constructor */
OPDSDumper::OPDSDumper(Library* library)
: library(library)
@@ -35,120 +39,240 @@ OPDSDumper::~OPDSDumper()
{
}
std::string gen_date_str()
{
auto now = time(0);
auto tm = localtime(&now);
std::stringstream is;
is << std::setw(2) << std::setfill('0')
<< 1900+tm->tm_year << "-"
<< std::setw(2) << std::setfill('0') << tm->tm_mon << "-"
<< std::setw(2) << std::setfill('0') << tm->tm_mday << "T"
<< std::setw(2) << std::setfill('0') << tm->tm_hour << ":"
<< std::setw(2) << std::setfill('0') << tm->tm_min << ":"
<< std::setw(2) << std::setfill('0') << tm->tm_sec << "Z";
return is.str();
}
static std::string gen_date_from_yyyy_mm_dd(const std::string& date)
{
std::stringstream is;
is << date << "T00:00::00:Z";
return is.str();
}
void OPDSDumper::setOpenSearchInfo(int totalResults, int startIndex, int count)
{
m_totalResults = totalResults;
m_startIndex = startIndex,
m_count = count;
m_isSearchResult = true;
}
#define ADD_TEXT_ENTRY(node, child, value) (node).append_child((child)).append_child(pugi::node_pcdata).set_value((value).c_str())
pugi::xml_node OPDSDumper::handleBook(Book book, pugi::xml_node root_node) {
auto entry_node = root_node.append_child("entry");
ADD_TEXT_ENTRY(entry_node, "id", "urn:uuid:"+book.getId());
ADD_TEXT_ENTRY(entry_node, "title", book.getTitle());
ADD_TEXT_ENTRY(entry_node, "summary", book.getDescription());
ADD_TEXT_ENTRY(entry_node, "language", book.getLanguage());
ADD_TEXT_ENTRY(entry_node, "updated", gen_date_from_yyyy_mm_dd(book.getDate()));
ADD_TEXT_ENTRY(entry_node, "name", book.getName());
ADD_TEXT_ENTRY(entry_node, "flavour", book.getFlavour());
ADD_TEXT_ENTRY(entry_node, "tags", book.getTags());
ADD_TEXT_ENTRY(entry_node, "articleCount", to_string(book.getArticleCount()));
ADD_TEXT_ENTRY(entry_node, "mediaCount", to_string(book.getMediaCount()));
ADD_TEXT_ENTRY(entry_node, "icon", rootLocation + "/meta?name=favicon&content=" + book.getHumanReadableIdFromPath());
auto content_node = entry_node.append_child("link");
content_node.append_attribute("type") = "text/html";
content_node.append_attribute("href") = (rootLocation + "/" + book.getHumanReadableIdFromPath()).c_str();
auto author_node = entry_node.append_child("author");
ADD_TEXT_ENTRY(author_node, "name", book.getCreator());
auto publisher_node = entry_node.append_child("publisher");
ADD_TEXT_ENTRY(publisher_node, "name", book.getPublisher());
if (! book.getUrl().empty()) {
auto acquisition_link = entry_node.append_child("link");
acquisition_link.append_attribute("rel") = "http://opds-spec.org/acquisition/open-access";
acquisition_link.append_attribute("type") = "application/x-zim";
acquisition_link.append_attribute("href") = book.getUrl().c_str();
acquisition_link.append_attribute("length") = to_string(book.getSize()).c_str();
}
if (! book.getFaviconMimeType().empty() ) {
auto image_link = entry_node.append_child("link");
image_link.append_attribute("rel") = "http://opds-spec.org/image/thumbnail";
image_link.append_attribute("type") = book.getFaviconMimeType().c_str();
image_link.append_attribute("href") = (rootLocation + "/meta?name=favicon&content=" + book.getHumanReadableIdFromPath()).c_str();
}
return entry_node;
}
string OPDSDumper::dumpOPDSFeed(const std::vector<std::string>& bookIds)
namespace
{
date = gen_date_str();
pugi::xml_document doc;
auto root_node = doc.append_child("feed");
root_node.append_attribute("xmlns") = "http://www.w3.org/2005/Atom";
root_node.append_attribute("xmlns:opds") = "http://opds-spec.org/2010/catalog";
typedef kainjow::mustache::data MustacheData;
typedef kainjow::mustache::list BooksData;
typedef kainjow::mustache::list IllustrationInfo;
ADD_TEXT_ENTRY(root_node, "id", id);
IllustrationInfo getBookIllustrationInfo(const Book& book)
{
kainjow::mustache::list illustrations;
if ( book.isPathValid() ) {
for ( const auto& illustration : book.getIllustrations() ) {
// For now, we are handling only sizexsize@1 illustration.
// So we can simply pass one size to mustache.
illustrations.push_back(kainjow::mustache::object{
{"icon_size", to_string(illustration->width)},
{"icon_mimetype", illustration->mimeType}
});
}
}
return illustrations;
}
ADD_TEXT_ENTRY(root_node, "title", title);
ADD_TEXT_ENTRY(root_node, "updated", date);
kainjow::mustache::object getSingleBookData(const Book& book)
{
const auto bookDate = book.getDate() + "T00:00:00Z";
return kainjow::mustache::object{
{"id", book.getId()},
{"name", book.getName()},
{"title", book.getTitle()},
{"description", book.getDescription()},
{"language", book.getLanguage()},
{"content_id", urlEncode(book.getHumanReadableIdFromPath(), true)},
{"updated", bookDate}, // XXX: this should be the entry update datetime
{"book_date", bookDate},
{"category", book.getCategory()},
{"flavour", book.getFlavour()},
{"tags", book.getTags()},
{"article_count", to_string(book.getArticleCount())},
{"media_count", to_string(book.getMediaCount())},
{"author_name", book.getCreator()},
{"publisher_name", book.getPublisher()},
{"url", onlyAsNonEmptyMustacheValue(book.getUrl())},
{"size", to_string(book.getSize())},
{"icons", getBookIllustrationInfo(book)},
};
}
if (m_isSearchResult) {
ADD_TEXT_ENTRY(root_node, "totalResults", to_string(m_totalResults));
ADD_TEXT_ENTRY(root_node, "startIndex", to_string(m_startIndex));
ADD_TEXT_ENTRY(root_node, "itemsPerPage", to_string(m_count));
}
std::string getSingleBookEntryXML(const Book& book, bool withXMLHeader, const std::string& rootLocation, const std::string& endpointRoot, bool partial)
{
auto data = getSingleBookData(book);
data["with_xml_header"] = MustacheData(withXMLHeader);
data["dump_partial_entries"] = MustacheData(partial);
data["endpoint_root"] = endpointRoot;
data["root"] = rootLocation;
return render_template(RESOURCE::templates::catalog_v2_entry_xml, data);
}
auto self_link_node = root_node.append_child("link");
self_link_node.append_attribute("rel") = "self";
self_link_node.append_attribute("href") = "";
self_link_node.append_attribute("type") = "application/atom+xml";
if (!searchDescriptionUrl.empty() ) {
auto search_link = root_node.append_child("link");
search_link.append_attribute("rel") = "search";
search_link.append_attribute("type") = "application/opensearchdescription+xml";
search_link.append_attribute("href") = searchDescriptionUrl.c_str();
}
if (library) {
for (auto& bookId: bookIds) {
handleBook(library->getBookById(bookId), root_node);
BooksData getBooksData(const Library* library, const std::vector<std::string>& bookIds, const std::string& rootLocation, const std::string& endpointRoot, bool partial)
{
BooksData booksData;
for ( const auto& bookId : bookIds ) {
try {
const Book book = library->getBookByIdThreadSafe(bookId);
booksData.push_back(kainjow::mustache::object{
{"entry", getSingleBookEntryXML(book, false, rootLocation, endpointRoot, partial)}
});
} catch ( const std::out_of_range& ) {
// the book was removed from the library since its id was obtained
// ignore it
}
}
return nodeToString(root_node);
return booksData;
}
std::map<std::string, std::string> iso639_3 = {
{"atj", "atikamekw"},
{"azb", "آذربایجان دیلی"},
{"bcl", "central bikol"},
{"bgs", "tagabawa"},
{"bxr", "буряад хэлэн"},
{"cbk", "chavacano"},
{"cdo", "閩東語"},
{"dag", "Dagbani"},
{"diq", "dimli"},
{"dty", "डोटेली"},
{"eml", "emiliân-rumagnōl"},
{"fbs", "српскохрватски"},
{"ido", "ido"},
{"kbp", "kabɩ"},
{"kld", "Gamilaraay"},
{"lbe", "лакку маз"},
{"lbj", "ལ་དྭགས་སྐད་"},
{"map", "Austronesian"},
{"mhr", "марий йылме"},
{"mnw", "ဘာသာမန်"},
{"myn", "mayan"},
{"nah", "nahuatl"},
{"nai", "north American Indian"},
{"nds", "plattdütsch"},
{"nrm", "bhasa narom"},
{"olo", "livvi"},
{"pih", "Pitcairn-Norfolk"},
{"pnb", "Western Panjabi"},
{"rmr", "Caló"},
{"rmy", "romani shib"},
{"roa", "romance languages"},
{"twi", "twi"}
};
std::once_flag fillLanguagesFlag;
void fillLanguagesMap()
{
for (auto icuLangPtr = icu::Locale::getISOLanguages(); *icuLangPtr != NULL; ++icuLangPtr) {
const ICULanguageInfo lang(*icuLangPtr);
iso639_3.insert({lang.iso3Code(), lang.selfName()});
}
}
std::string getLanguageSelfName(const std::string& lang) {
const auto itr = iso639_3.find(lang);
if (itr != iso639_3.end()) {
return itr->second;
}
return lang;
};
} // unnamed namespace
string OPDSDumper::dumpOPDSFeed(const std::vector<std::string>& bookIds, const std::string& query) const
{
const auto booksData = getBooksData(library, bookIds, rootLocation, "", false);
const kainjow::mustache::object template_data{
{"date", gen_date_str()},
{"root", rootLocation},
{"feed_id", gen_uuid(libraryId + "/catalog/search?"+query)},
{"filter", onlyAsNonEmptyMustacheValue(query)},
{"totalResults", to_string(m_totalResults)},
{"startIndex", to_string(m_startIndex)},
{"itemsPerPage", to_string(m_count)},
{"books", booksData }
};
return render_template(RESOURCE::templates::catalog_entries_xml, template_data);
}
string OPDSDumper::dumpOPDSFeedV2(const std::vector<std::string>& bookIds, const std::string& query, bool partial) const
{
const auto endpointRoot = rootLocation + "/catalog/v2";
const auto booksData = getBooksData(library, bookIds, rootLocation, endpointRoot, partial);
const char* const endpoint = partial ? "/partial_entries" : "/entries";
const kainjow::mustache::object template_data{
{"date", gen_date_str()},
{"endpoint_root", endpointRoot},
{"feed_id", gen_uuid(libraryId + endpoint + "?" + query)},
{"filter", onlyAsNonEmptyMustacheValue(query)},
{"query", query.empty() ? "" : "?" + urlEncode(query)},
{"totalResults", to_string(m_totalResults)},
{"startIndex", to_string(m_startIndex)},
{"itemsPerPage", to_string(m_count)},
{"books", booksData },
{"dump_partial_entries", MustacheData(partial)}
};
return render_template(RESOURCE::templates::catalog_v2_entries_xml, template_data);
}
std::string OPDSDumper::dumpOPDSCompleteEntry(const std::string& bookId) const
{
return getSingleBookEntryXML(library->getBookById(bookId), true, rootLocation, "", false);
}
std::string OPDSDumper::categoriesOPDSFeed() const
{
const auto now = gen_date_str();
kainjow::mustache::list categoryData;
for ( const auto& category : library->getBooksCategories() ) {
const auto urlencodedCategoryName = urlEncode(category);
categoryData.push_back(kainjow::mustache::object{
{"name", category},
{"urlencoded_name", urlencodedCategoryName},
{"updated", now},
{"id", gen_uuid(libraryId + "/categories/" + urlencodedCategoryName)}
});
}
return render_template(
RESOURCE::templates::catalog_v2_categories_xml,
kainjow::mustache::object{
{"date", now},
{"endpoint_root", rootLocation + "/catalog/v2"},
{"feed_id", gen_uuid(libraryId + "/categories")},
{"categories", categoryData }
}
);
}
std::string OPDSDumper::languagesOPDSFeed() const
{
const auto now = gen_date_str();
kainjow::mustache::list languageData;
std::call_once(fillLanguagesFlag, fillLanguagesMap);
for ( const auto& langAndBookCount : library->getBooksLanguagesWithCounts() ) {
const std::string languageCode = langAndBookCount.first;
const int bookCount = langAndBookCount.second;
const auto languageSelfName = getLanguageSelfName(languageCode);
languageData.push_back(kainjow::mustache::object{
{"lang_code", languageCode},
{"lang_self_name", languageSelfName},
{"book_count", to_string(bookCount)},
{"updated", now},
{"id", gen_uuid(libraryId + "/languages/" + languageCode)}
});
}
return render_template(
RESOURCE::templates::catalog_v2_languages_xml,
kainjow::mustache::object{
{"date", now},
{"endpoint_root", rootLocation + "/catalog/v2"},
{"feed_id", gen_uuid(libraryId + "/languages")},
{"languages", languageData }
}
);
}
}

View File

@@ -1,904 +0,0 @@
/*
* Copyright 2011 Emmanuel Engelhart <kelson@kiwix.org>
*
* This program is free software; you can redistribute it and/or modify
* it under the terms of the GNU General Public License as published by
* the Free Software Foundation; either version 3 of the License, or
* any later version.
*
* This program is distributed in the hope that it will be useful,
* but WITHOUT ANY WARRANTY; without even the implied warranty of
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
* GNU General Public License for more details.
*
* You should have received a copy of the GNU General Public License
* along with this program; if not, write to the Free Software
* Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston,
* MA 02110-1301, USA.
*/
#include "reader.h"
#include <time.h>
#include <zim/search.h>
#include "tools/otherTools.h"
inline char hi(char v)
{
char hex[] = "0123456789abcdef";
return hex[(v >> 4) & 0xf];
}
inline char lo(char v)
{
char hex[] = "0123456789abcdef";
return hex[v & 0xf];
}
std::string hexUUID(std::string in)
{
std::ostringstream out;
for (unsigned n = 0; n < 4; ++n) {
out << hi(in[n]) << lo(in[n]);
}
out << '-';
for (unsigned n = 4; n < 6; ++n) {
out << hi(in[n]) << lo(in[n]);
}
out << '-';
for (unsigned n = 6; n < 8; ++n) {
out << hi(in[n]) << lo(in[n]);
}
out << '-';
for (unsigned n = 8; n < 10; ++n) {
out << hi(in[n]) << lo(in[n]);
}
out << '-';
for (unsigned n = 10; n < 16; ++n) {
out << hi(in[n]) << lo(in[n]);
}
std::string op = out.str();
return op;
}
namespace kiwix
{
/* Constructor */
Reader::Reader(const string zimFilePath) : zimFileHandler(NULL)
{
string tmpZimFilePath = zimFilePath;
/* Remove potential trailing zimaa */
size_t found = tmpZimFilePath.rfind("zimaa");
if (found != string::npos && tmpZimFilePath.size() > 5
&& found == tmpZimFilePath.size() - 5) {
tmpZimFilePath.resize(tmpZimFilePath.size() - 2);
}
this->zimFileHandler = new zim::File(tmpZimFilePath);
if (this->zimFileHandler != NULL) {
this->firstArticleOffset
= this->zimFileHandler->getNamespaceBeginOffset('A');
this->lastArticleOffset = this->zimFileHandler->getNamespaceEndOffset('A');
this->nsACount = this->zimFileHandler->getNamespaceCount('A');
this->nsICount = this->zimFileHandler->getNamespaceCount('I');
this->zimFilePath = zimFilePath;
}
/* initialize random seed: */
srand(time(NULL));
}
/* Destructor */
Reader::~Reader()
{
if (this->zimFileHandler != NULL) {
delete this->zimFileHandler;
}
}
zim::File* Reader::getZimFileHandler() const
{
return this->zimFileHandler;
}
std::map<const std::string, unsigned int> Reader::parseCounterMetadata() const
{
std::map<const std::string, unsigned int> counters;
string mimeType, item, counterString;
unsigned int counter;
zim::Article article = this->zimFileHandler->getArticle('M', "Counter");
if (article.good()) {
stringstream ssContent(article.getData());
while (getline(ssContent, item, ';')) {
stringstream ssItem(item);
getline(ssItem, mimeType, '=');
getline(ssItem, counterString, '=');
if (!counterString.empty() && !mimeType.empty()) {
sscanf(counterString.c_str(), "%u", &counter);
counters.insert(pair<string, int>(mimeType, counter));
}
}
}
return counters;
}
/* Get the count of articles which can be indexed/displayed */
unsigned int Reader::getArticleCount() const
{
std::map<const std::string, unsigned int> counterMap
= this->parseCounterMetadata();
unsigned int counter = 0;
if (counterMap.empty()) {
counter = this->nsACount;
} else {
auto it = counterMap.find("text/html");
if (it != counterMap.end()) {
counter = it->second;
}
}
return counter;
}
/* Get the count of medias content in the ZIM file */
unsigned int Reader::getMediaCount() const
{
std::map<const std::string, unsigned int> counterMap
= this->parseCounterMetadata();
unsigned int counter = 0;
if (counterMap.empty()) {
counter = this->nsICount;
} else {
auto it = counterMap.find("image/jpeg");
if (it != counterMap.end()) {
counter += it->second;
}
it = counterMap.find("image/gif");
if (it != counterMap.end()) {
counter += it->second;
}
it = counterMap.find("image/png");
if (it != counterMap.end()) {
counter += it->second;
}
}
return counter;
}
/* Get the total of all items of a ZIM file, redirects included */
unsigned int Reader::getGlobalCount() const
{
return this->zimFileHandler->getCountArticles();
}
/* Return the UID of the ZIM file */
string Reader::getId() const
{
std::ostringstream s;
s << this->zimFileHandler->getFileheader().getUuid();
return s.str();
}
/* Return a page url from a title */
bool Reader::getPageUrlFromTitle(const string& title, string& url) const
{
try {
auto entry = getEntryFromTitle(title);
entry = entry.getFinalEntry();
url = entry.getPath();
return true;
} catch (NoEntry& e) {
return false;
}
}
/* Return an URL from a title */
string Reader::getRandomPageUrl() const
{
return getRandomPage().getPath();
}
Entry Reader::getRandomPage() const
{
if (!this->zimFileHandler) {
throw NoEntry();
}
zim::Article article;
std::string mainPagePath = this->getMainPage().getPath();
int watchdog = 42;
do {
auto idx = this->firstArticleOffset
+ (zim::size_type)((double)rand() / ((double)RAND_MAX + 1)
* this->nsACount);
article = zimFileHandler->getArticle(idx);
if (!watchdog--) {
throw NoEntry();
}
} while (!article.good() && article.getLongUrl() == mainPagePath);
return article;
}
/* Return the welcome page URL */
string Reader::getMainPageUrl() const
{
return getMainPage().getPath();
}
Entry Reader::getMainPage() const
{
if (!this->zimFileHandler) {
throw NoEntry();
}
zim::Article article;
if (this->zimFileHandler->getFileheader().hasMainPage())
{
article = zimFileHandler->getArticle(
this->zimFileHandler->getFileheader().getMainPage());
}
if (!article.good())
{
return getFirstPage();
}
return article;
}
bool Reader::getFavicon(string& content, string& mimeType) const
{
static const char* const paths[] = {"-/favicon", "-/favicon.png", "I/favicon.png", "I/favicon"};
for (auto &path: paths) {
try {
auto entry = getEntryFromPath(path);
entry = entry.getFinalEntry();
content = entry.getContent();
mimeType = entry.getMimetype();
return true;
} catch(NoEntry& e) {};
}
return false;
}
string Reader::getZimFilePath() const
{
return this->zimFilePath;
}
/* Return a metatag value */
bool Reader::getMetadata(const string& name, string& value) const
{
try {
auto entry = getEntryFromPath("M/"+name);
value = entry.getContent();
return true;
} catch(NoEntry& e) {
return false;
}
}
#define METADATA(NAME) std::string v; getMetadata(NAME, v); return v;
string Reader::getName() const
{
METADATA("Name")
}
string Reader::getTitle() const
{
string value;
this->getMetadata("Title", value);
if (value.empty()) {
value = getLastPathElement(zimFileHandler->getFilename());
std::replace(value.begin(), value.end(), '_', ' ');
size_t pos = value.find(".zim");
value = value.substr(0, pos);
}
return value;
}
string Reader::getCreator() const
{
METADATA("Creator")
}
string Reader::getPublisher() const
{
METADATA("Publisher")
}
string Reader::getDate() const
{
METADATA("Date")
}
string Reader::getDescription() const
{
string value;
this->getMetadata("Description", value);
/* Mediawiki Collection tends to use the "Subtitle" name */
if (value.empty()) {
this->getMetadata("Subtitle", value);
}
return value;
}
string Reader::getLongDescription() const
{
METADATA("LongDescription")
}
string Reader::getLanguage() const
{
METADATA("Language")
}
string Reader::getLicense() const
{
METADATA("License")
}
string Reader::getTags(bool original) const
{
string tags_str;
getMetadata("Tags", tags_str);
if (original) {
return tags_str;
}
auto tags = convertTags(tags_str);
return join(tags, ";");
}
string Reader::getTagStr(const std::string& tagName) const
{
string tags_str;
getMetadata("Tags", tags_str);
return getTagValueFromTagList(convertTags(tags_str), tagName);
}
bool Reader::getTagBool(const std::string& tagName) const
{
return convertStrToBool(getTagStr(tagName));
}
string Reader::getRelation() const
{
METADATA("Relation")
}
string Reader::getFlavour() const
{
METADATA("Flavour")
}
string Reader::getSource() const
{
METADATA("Source")
}
string Reader::getScraper() const
{
METADATA("Scraper")
}
#undef METADATA
string Reader::getOrigId() const
{
string value;
this->getMetadata("startfileuid", value);
if (value.empty()) {
return "";
}
std::string id = value;
std::string origID;
std::string temp = "";
unsigned int k = 0;
char tempArray[16] = "";
for (unsigned int i = 0; i < id.size(); i++) {
if (id[i] == '\n') {
tempArray[k] = atoi(temp.c_str());
temp = "";
k++;
} else {
temp += id[i];
}
}
origID = hexUUID(tempArray);
return origID;
}
/* Return the first page URL */
string Reader::getFirstPageUrl() const
{
return getFirstPage().getPath();
}
Entry Reader::getFirstPage() const
{
if (!this->zimFileHandler) {
throw NoEntry();
}
auto firstPageOffset = zimFileHandler->getNamespaceBeginOffset('A');
auto article = zimFileHandler->getArticle(firstPageOffset);
if (! article.good()) {
throw NoEntry();
}
return article;
}
bool _parseUrl(const string& url, char* ns, string& title)
{
/* Offset to visit the url */
unsigned int urlLength = url.size();
unsigned int offset = 0;
/* Ignore the first '/' */
if (url[offset] == '/')
offset++;
if (url[offset] == '/' || offset >= urlLength)
return false;
/* Get namespace */
*ns = url[offset++];
if (url[offset] != '/' || offset >= urlLength)
return false;
offset++;
if ( offset >= urlLength)
return false;
/* Get content title */
title = url.substr(offset, urlLength - offset);
return true;
}
bool Reader::parseUrl(const string& url, char* ns, string& title) const
{
return _parseUrl(url, ns, title);
}
Entry Reader::getEntryFromPath(const std::string& path) const
{
char ns = 0;
std::string short_url;
if (!this->zimFileHandler) {
throw NoEntry();
}
_parseUrl(path, &ns, short_url);
if (short_url.empty() && ns == 0) {
return getMainPage();
}
auto article = zimFileHandler->getArticle(ns, short_url);
if (!article.good()) {
throw NoEntry();
}
return article;
}
Entry Reader::getEntryFromEncodedPath(const std::string& path) const
{
return getEntryFromPath(urlDecode(path, true));
}
Entry Reader::getEntryFromTitle(const std::string& title) const
{
if (!this->zimFileHandler) {
throw NoEntry();
}
auto article = this->zimFileHandler->getArticleByTitle('A', title);
if (!article.good()) {
throw NoEntry();
}
return article;
}
/* Return article by url */
bool Reader::getArticleObjectByDecodedUrl(const string& url,
zim::Article& article) const
{
if (this->zimFileHandler == NULL) {
return false;
}
/* Parse the url */
char ns = 0;
string urlStr;
_parseUrl(url, &ns, urlStr);
/* Main page */
if (urlStr.empty() && ns == 0) {
_parseUrl(this->getMainPage().getPath(), &ns, urlStr);
}
/* Extract the content from the zim file */
article = zimFileHandler->getArticle(ns, urlStr);
return article.good();
}
/* Return the mimeType without the content */
bool Reader::getMimeTypeByUrl(const string& url, string& mimeType) const
{
try {
auto entry = getEntryFromPath(url);
mimeType = entry.getMimetype();
return true;
} catch (NoEntry& e) {
mimeType = "";
return false;
}
}
bool get_content_by_decoded_url(const Reader& reader,
const string& url,
string& content,
string& title,
unsigned int& contentLength,
string& contentType,
string& baseUrl)
{
content = "";
contentType = "";
contentLength = 0;
try {
auto entry = reader.getEntryFromPath(url);
entry = entry.getFinalEntry();
baseUrl = entry.getPath();
contentType = entry.getMimetype();
content = entry.getContent();
contentLength = entry.getSize();
title = entry.getTitle();
/* Try to set a stub HTML header/footer if necesssary */
if (contentType.find("text/html") != string::npos
&& content.find("<body") == std::string::npos
&& content.find("<BODY") == std::string::npos) {
content = "<html><head><title>" + title +
"</title><meta http-equiv=\"Content-Type\" content=\"text/html; "
"charset=utf-8\" /></head><body>" +
content + "</body></html>";
}
return true;
} catch (NoEntry& e) {
return false;
}
}
/* Get a content from a zim file */
bool Reader::getContentByUrl(const string& url,
string& content,
string& title,
unsigned int& contentLength,
string& contentType) const
{
std::string stubRedirectUrl;
return get_content_by_decoded_url(*this,
kiwix::urlDecode(url),
content,
title,
contentLength,
contentType,
stubRedirectUrl);
}
bool Reader::getContentByEncodedUrl(const string& url,
string& content,
string& title,
unsigned int& contentLength,
string& contentType,
string& baseUrl) const
{
return get_content_by_decoded_url(*this,
kiwix::urlDecode(url),
content,
title,
contentLength,
contentType,
baseUrl);
}
bool Reader::getContentByEncodedUrl(const string& url,
string& content,
string& title,
unsigned int& contentLength,
string& contentType) const
{
std::string stubRedirectUrl;
return get_content_by_decoded_url(*this,
kiwix::urlDecode(url),
content,
title,
contentLength,
contentType,
stubRedirectUrl);
}
bool Reader::getContentByDecodedUrl(const string& url,
string& content,
string& title,
unsigned int& contentLength,
string& contentType) const
{
std::string stubRedirectUrl;
return get_content_by_decoded_url(*this,
url,
content,
title,
contentLength,
contentType,
stubRedirectUrl);
}
bool Reader::getContentByDecodedUrl(const string& url,
string& content,
string& title,
unsigned int& contentLength,
string& contentType,
string& baseUrl) const
{
return get_content_by_decoded_url(*this,
url,
content,
title,
contentLength,
contentType,
baseUrl);
}
/* Check if an article exists */
bool Reader::urlExists(const string& url) const
{
return pathExists(url);
}
bool Reader::pathExists(const string& path) const
{
if (!zimFileHandler)
{
return false;
}
char ns = 0;
string titleStr;
_parseUrl(path, &ns, titleStr);
zim::File::const_iterator findItr = zimFileHandler->find(ns, titleStr);
return findItr != zimFileHandler->end() && findItr->getUrl() == titleStr;
}
/* Does the ZIM file has a fulltext index */
bool Reader::hasFulltextIndex() const
{
if (!zimFileHandler || zimFileHandler->is_multiPart() )
{
return false;
}
return ( pathExists("Z//fulltextIndex/xapian")
|| pathExists("X/fulltext/xapian"));
}
/* Search titles by prefix */
bool Reader::searchSuggestions(const string& prefix,
unsigned int suggestionsCount,
const bool reset)
{
bool retVal = false;
/* Reset the suggestions otherwise check if the suggestions number is less
* than the suggestionsCount */
if (reset) {
this->suggestions.clear();
this->suggestionsOffset = this->suggestions.begin();
} else {
if (this->suggestions.size() > suggestionsCount) {
return false;
}
}
/* Return if no prefix */
if (prefix.size() == 0) {
return false;
}
for (auto articleItr = zimFileHandler->findByTitle('A', prefix);
articleItr != zimFileHandler->end()
&& articleItr->getTitle().compare(0, prefix.size(), prefix) == 0
&& this->suggestions.size() < suggestionsCount;
++articleItr) {
/* Extract the interesting part of article title & url */
std::string normalizedArticleTitle
= kiwix::normalize(articleItr->getTitle());
std::string articleFinalUrl = "/A/" + articleItr->getUrl();
if (articleItr->isRedirect()) {
zim::Article article = *articleItr;
unsigned int loopCounter = 0;
while (article.isRedirect() && loopCounter++ < 42) {
article = article.getRedirectArticle();
}
articleFinalUrl = "/A/" + article.getUrl();
}
/* Go through all already found suggestions and skip if this
article is already in the suggestions list (with an other
title) */
bool insert = true;
std::vector<std::vector<std::string>>::iterator suggestionItr;
for (suggestionItr = this->suggestions.begin();
suggestionItr != this->suggestions.end();
suggestionItr++) {
int result = normalizedArticleTitle.compare((*suggestionItr)[2]);
if (result == 0 && articleFinalUrl.compare((*suggestionItr)[1]) == 0) {
insert = false;
break;
} else if (result < 0) {
break;
}
}
/* Insert if possible */
if (insert) {
std::vector<std::string> suggestion;
suggestion.push_back(articleItr->getTitle());
suggestion.push_back(articleFinalUrl);
suggestion.push_back(normalizedArticleTitle);
this->suggestions.insert(suggestionItr, suggestion);
}
/* Suggestions where found */
retVal = true;
}
/* Set the cursor to the begining */
this->suggestionsOffset = this->suggestions.begin();
return retVal;
}
std::vector<std::string> Reader::getTitleVariants(
const std::string& title) const
{
std::vector<std::string> variants;
variants.push_back(title);
variants.push_back(kiwix::ucFirst(title));
variants.push_back(kiwix::lcFirst(title));
variants.push_back(kiwix::toTitle(title));
return variants;
}
/* Try also a few variations of the prefix to have better results */
bool Reader::searchSuggestionsSmart(const string& prefix,
unsigned int suggestionsCount)
{
std::vector<std::string> variants = this->getTitleVariants(prefix);
bool retVal = false;
this->suggestions.clear();
this->suggestionsOffset = this->suggestions.begin();
/* Try to search in the title using fulltext search database */
const auto suggestionSearch
= this->getZimFileHandler()->suggestions(prefix, 0, suggestionsCount);
if (suggestionSearch->get_matches_estimated()) {
for (auto current = suggestionSearch->begin();
current != suggestionSearch->end();
current++) {
if (!current->good()) {
continue;
}
std::vector<std::string> suggestion;
suggestion.push_back(current->getTitle());
suggestion.push_back("/A/" + current->getUrl());
suggestion.push_back(kiwix::normalize(current->getTitle()));
this->suggestions.push_back(suggestion);
}
this->suggestionsOffset = this->suggestions.begin();
retVal = true;
} else {
for (std::vector<std::string>::iterator variantsItr = variants.begin();
variantsItr != variants.end();
variantsItr++) {
retVal = this->searchSuggestions(*variantsItr, suggestionsCount, false)
|| retVal;
}
}
return retVal;
}
/* Get next suggestion */
bool Reader::getNextSuggestion(string& title)
{
if (this->suggestionsOffset != this->suggestions.end()) {
/* title */
title = (*(this->suggestionsOffset))[0];
/* increment the cursor for the next call */
this->suggestionsOffset++;
return true;
}
return false;
}
bool Reader::getNextSuggestion(string& title, string& url)
{
if (this->suggestionsOffset != this->suggestions.end()) {
/* title */
title = (*(this->suggestionsOffset))[0];
url = (*(this->suggestionsOffset))[1];
/* increment the cursor for the next call */
this->suggestionsOffset++;
return true;
}
return false;
}
/* Check if the file has as checksum */
bool Reader::canCheckIntegrity() const
{
return this->zimFileHandler->getChecksum() != "";
}
/* Return true if corrupted, false otherwise */
bool Reader::isCorrupted() const
{
try {
if (this->zimFileHandler->verify() == true) {
return false;
}
} catch (exception& e) {
cerr << e.what() << endl;
return true;
}
return true;
}
/* Return the file size, works also for splitted files */
unsigned int Reader::getFileSize() const
{
zim::File* file = this->getZimFileHandler();
zim::size_type size = 0;
if (file != NULL) {
size = file->getFilesize();
}
return (size / 1024);
}
}

View File

@@ -21,26 +21,35 @@
#include <cmath>
#include "search_renderer.h"
#include "searcher.h"
#include "reader.h"
#include "library.h"
#include "name_mapper.h"
#include "tools/archiveTools.h"
#include <zim/search.h>
#include <mustache.hpp>
#include "kiwixlib-resources.h"
#include "tools/stringTools.h"
namespace kiwix
{
/* Constructor */
SearchRenderer::SearchRenderer(Searcher* searcher, NameMapper* mapper)
: mp_searcher(searcher),
SearchRenderer::SearchRenderer(zim::SearchResultSet srs, NameMapper* mapper,
unsigned int start, unsigned int estimatedResultCount)
: SearchRenderer(srs, mapper, nullptr, start, estimatedResultCount)
{}
SearchRenderer::SearchRenderer(zim::SearchResultSet srs, NameMapper* mapper, Library* library,
unsigned int start, unsigned int estimatedResultCount)
: m_srs(srs),
mp_nameMapper(mapper),
mp_library(library),
protocolPrefix("zim://"),
searchProtocolPrefix("search://?")
searchProtocolPrefix("search://"),
estimatedResultCount(estimatedResultCount),
resultStart(start)
{}
/* Destructor */
@@ -48,12 +57,12 @@ SearchRenderer::~SearchRenderer() = default;
void SearchRenderer::setSearchPattern(const std::string& pattern)
{
this->searchPattern = pattern;
searchPattern = pattern;
}
void SearchRenderer::setSearchContent(const std::string& name)
void SearchRenderer::setSearchBookQuery(const std::string& bookQuery)
{
this->searchContent = name;
searchBookQuery = bookQuery;
}
void SearchRenderer::setProtocolPrefix(const std::string& prefix)
@@ -66,87 +75,161 @@ void SearchRenderer::setSearchProtocolPrefix(const std::string& prefix)
this->searchProtocolPrefix = prefix;
}
std::string SearchRenderer::getHtml()
{
kainjow::mustache::data results{kainjow::mustache::data::type::list};
mp_searcher->restart_search();
Result* p_result = NULL;
while ((p_result = mp_searcher->getNextResult())) {
kainjow::mustache::data result;
result.set("title", p_result->get_title());
result.set("url", p_result->get_url());
result.set("snippet", p_result->get_snippet());
auto readerIndex = p_result->get_readerIndex();
auto reader = mp_searcher->get_reader(readerIndex);
result.set("resultContentId", mp_nameMapper->getNameForId(reader->getId()));
if (p_result->get_wordCount() >= 0) {
result.set("wordCount", kiwix::beautifyInteger(p_result->get_wordCount()));
}
results.push_back(result);
delete p_result;
std::string extractValueFromQuery(const std::string& query, const std::string& key) {
const std::string p = key + "=";
const size_t i = query.find(p);
if (i == std::string::npos) {
return "";
}
std::string r = query.substr(i + p.size());
return r.substr(0, r.find("&"));
}
// pages
kainjow::mustache::data buildQueryData
(
const std::string& searchProtocolPrefix,
const std::string& pattern,
const std::string& bookQuery
) {
kainjow::mustache::data query;
query.set("pattern", kiwix::encodeDiples(pattern));
std::ostringstream ss;
ss << searchProtocolPrefix << "?pattern=" << urlEncode(pattern, true);
ss << "&" << bookQuery;
query.set("unpaginatedQuery", ss.str());
auto lang = extractValueFromQuery(bookQuery, "books.filter.lang");
if(!lang.empty()) {
query.set("lang", lang);
}
return query;
}
kainjow::mustache::data buildPagination(
unsigned int pageLength,
unsigned int resultsCount,
unsigned int resultsStart
)
{
assert(pageLength!=0);
kainjow::mustache::data pagination;
kainjow::mustache::data pages{kainjow::mustache::data::type::list};
auto resultStart = mp_searcher->getResultStart();
auto resultEnd = mp_searcher->getResultEnd();
auto resultCountPerPage = resultEnd - resultStart;
auto estimatedResultCount = mp_searcher->getEstimatedResultCount();
auto currentPage = 0U;
auto pageStart = 0U;
auto pageEnd = 0U;
auto lastPageStart = 0U;
if (resultCountPerPage) {
currentPage = resultStart/resultCountPerPage;
pageStart = currentPage > 4 ? currentPage-4 : 0;
pageEnd = currentPage + 5;
if (pageEnd > estimatedResultCount / resultCountPerPage) {
pageEnd = estimatedResultCount / resultCountPerPage;
if (resultsCount == 0) {
// Easy case
pagination.set("itemsPerPage", to_string(pageLength));
pagination.set("hasPages", false);
pagination.set("pages", pages);
return pagination;
}
// First we want to display pages starting at a multiple of `pageLength`
// so, let's calculate the start index of the current page.
auto currentPage = resultsStart/pageLength;
auto lastPage = ((resultsCount-1)/pageLength);
auto lastPageStart = lastPage*pageLength;
auto nbPages = lastPage + 1;
auto firstPageGenerated = currentPage > 4 ? currentPage-4 : 0;
auto lastPageGenerated = std::min(currentPage+4, lastPage);
if (nbPages != 1) {
if (firstPageGenerated!=0) {
kainjow::mustache::data page;
page.set("label", "");
page.set("start", to_string(0));
page.set("current", false);
pages.push_back(page);
}
if (estimatedResultCount > resultCountPerPage) {
lastPageStart = static_cast<int>(round(estimatedResultCount/resultCountPerPage)) * resultCountPerPage;
for (auto i=firstPageGenerated; i<=lastPageGenerated; i++) {
kainjow::mustache::data page;
page.set("label", to_string(i+1));
page.set("start", to_string(i*pageLength));
page.set("current", bool(i == currentPage));
pages.push_back(page);
}
if (lastPageGenerated!=lastPage) {
kainjow::mustache::data page;
page.set("label", "");
page.set("start", to_string(lastPageStart));
page.set("current", false);
pages.push_back(page);
}
}
for (unsigned int i = pageStart; i < pageEnd; i++) {
kainjow::mustache::data page;
page.set("label", to_string(i + 1));
page.set("start", to_string(i * resultCountPerPage));
page.set("end", to_string((i + 1) * resultCountPerPage));
pagination.set("itemsPerPage", to_string(pageLength));
pagination.set("hasPages", firstPageGenerated < lastPageGenerated);
pagination.set("pages", pages);
return pagination;
}
if (i == currentPage) {
page.set("selected", true);
std::string SearchRenderer::renderTemplate(const std::string& tmpl_str)
{
const std::string absPathPrefix = protocolPrefix + "content/";
// Build the results list
kainjow::mustache::data items{kainjow::mustache::data::type::list};
for (auto it = m_srs.begin(); it != m_srs.end(); it++) {
kainjow::mustache::data result;
std::string zim_id(it.getZimId());
result.set("title", it.getTitle());
result.set("absolutePath", absPathPrefix + urlEncode(mp_nameMapper->getNameForId(zim_id), true) + "/" + urlEncode(it.getPath()));
result.set("snippet", it.getSnippet());
if (mp_library) {
result.set("bookTitle", mp_library->getBookById(zim_id).getTitle());
}
if (it.getWordCount() >= 0) {
result.set("wordCount", kiwix::beautifyInteger(it.getWordCount()));
}
pages.push_back(page);
}
std::string template_str = RESOURCE::templates::search_result_html;
kainjow::mustache::mustache tmpl(template_str);
items.push_back(result);
}
kainjow::mustache::data results;
results.set("items", items);
results.set("count", kiwix::beautifyInteger(estimatedResultCount));
results.set("hasResults", estimatedResultCount != 0);
results.set("start", kiwix::beautifyInteger(resultStart));
results.set("end", kiwix::beautifyInteger(std::min(resultStart+pageLength-1, estimatedResultCount)));
// pagination
auto pagination = buildPagination(
pageLength,
estimatedResultCount,
resultStart
);
kainjow::mustache::data query = buildQueryData(
searchProtocolPrefix,
searchPattern,
searchBookQuery
);
kainjow::mustache::data allData;
allData.set("protocolPrefix", protocolPrefix);
allData.set("results", results);
allData.set("pages", pages);
allData.set("hasResults", estimatedResultCount != 0);
allData.set("hasPages", pageStart != pageEnd);
allData.set("count", kiwix::beautifyInteger(estimatedResultCount));
allData.set("searchPattern", kiwix::encodeDiples(this->searchPattern));
allData.set("searchPatternEncoded", urlEncode(this->searchPattern));
allData.set("resultStart", to_string(resultStart + 1));
allData.set("resultEnd", to_string(min(resultEnd, estimatedResultCount)));
allData.set("resultRange", to_string(resultCountPerPage));
allData.set("resultLastPageStart", to_string(lastPageStart));
allData.set("lastResult", to_string(estimatedResultCount));
allData.set("protocolPrefix", this->protocolPrefix);
allData.set("searchProtocolPrefix", this->searchProtocolPrefix);
allData.set("contentId", this->searchContent);
allData.set("pagination", pagination);
allData.set("query", query);
kainjow::mustache::mustache tmpl(tmpl_str);
std::stringstream ss;
tmpl.render(allData, [&ss](const std::string& str) { ss << str; });
if (!tmpl.is_valid()) {
throw std::runtime_error("Error while rendering search results: " + tmpl.error_message());
}
return ss.str();
}
std::string SearchRenderer::getHtml()
{
return renderTemplate(RESOURCE::templates::search_result_html);
}
std::string SearchRenderer::getXml()
{
return renderTemplate(RESOURCE::templates::search_result_xml);
}
}

View File

@@ -1,279 +0,0 @@
/*
* Copyright 2011 Emmanuel Engelhart <kelson@kiwix.org>
*
* This program is free software; you can redistribute it and/or modify
* it under the terms of the GNU General Public License as published by
* the Free Software Foundation; either version 3 of the License, or
* any later version.
*
* This program is distributed in the hope that it will be useful,
* but WITHOUT ANY WARRANTY; without even the implied warranty of
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
* GNU General Public License for more details.
*
* You should have received a copy of the GNU General Public License
* along with this program; if not, write to the Free Software
* Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston,
* MA 02110-1301, USA.
*/
#include <cmath>
#include "searcher.h"
#include "reader.h"
#include <zim/search.h>
#include <mustache.hpp>
#include "kiwixlib-resources.h"
#define MAX_SEARCH_LEN 140
namespace kiwix
{
class _Result : public Result
{
public:
_Result(zim::Search::iterator& iterator);
virtual ~_Result(){};
virtual std::string get_url();
virtual std::string get_title();
virtual int get_score();
virtual std::string get_snippet();
virtual std::string get_content();
virtual int get_wordCount();
virtual int get_size();
virtual int get_readerIndex();
private:
zim::Search::iterator iterator;
};
struct SearcherInternal {
const zim::Search* _search;
zim::Search::iterator current_iterator;
SearcherInternal() : _search(NULL) {}
~SearcherInternal()
{
if (_search != NULL) {
delete _search;
}
}
};
/* Constructor */
Searcher::Searcher()
: internal(new SearcherInternal()),
searchPattern(""),
estimatedResultCount(0),
resultStart(0),
resultEnd(0)
{
loadICUExternalTables();
}
/* Destructor */
Searcher::~Searcher()
{
delete internal;
}
bool Searcher::add_reader(Reader* reader)
{
if (!reader->hasFulltextIndex()) {
return false;
}
this->readers.push_back(reader);
return true;
}
Reader* Searcher::get_reader(int readerIndex)
{
return readers.at(readerIndex);
}
/* Search strings in the database */
void Searcher::search(const std::string& search,
unsigned int resultStart,
unsigned int resultEnd,
const bool verbose)
{
this->reset();
if (verbose == true) {
cout << "Performing query `" << search << "'" << endl;
}
this->searchPattern = search;
this->resultStart = resultStart;
this->resultEnd = resultEnd;
/* Try to find results */
if (resultStart != resultEnd) {
/* Perform the search */
string unaccentedSearch = removeAccents(search);
std::vector<const zim::File*> zims;
for (auto current = this->readers.begin(); current != this->readers.end();
current++) {
if ( (*current)->hasFulltextIndex() ) {
zims.push_back((*current)->getZimFileHandler());
}
}
zim::Search* search = new zim::Search(zims);
search->set_verbose(verbose);
search->set_query(unaccentedSearch);
search->set_range(resultStart, resultEnd);
internal->_search = search;
internal->current_iterator = internal->_search->begin();
this->estimatedResultCount = internal->_search->get_matches_estimated();
}
return;
}
void Searcher::geo_search(float latitude, float longitude, float distance,
unsigned int resultStart,
unsigned int resultEnd,
const bool verbose)
{
this->reset();
if (verbose == true) {
cout << "Performing geo query `" << distance << "&(" << latitude << ";" << longitude << ")'" << endl;
}
/* Perform the search */
std::ostringstream oss;
oss << "Articles located less than " << distance << " meters of " << latitude << ";" << longitude;
this->searchPattern = oss.str();
this->resultStart = resultStart;
this->resultEnd = resultEnd;
/* Try to find results */
if (resultStart == resultEnd) {
return;
}
std::vector<const zim::File*> zims;
for (auto current = this->readers.begin(); current != this->readers.end();
current++) {
zims.push_back((*current)->getZimFileHandler());
}
zim::Search* search = new zim::Search(zims);
search->set_verbose(verbose);
search->set_query("");
search->set_georange(latitude, longitude, distance);
search->set_range(resultStart, resultEnd);
internal->_search = search;
internal->current_iterator = internal->_search->begin();
this->estimatedResultCount = internal->_search->get_matches_estimated();
}
void Searcher::restart_search()
{
if (internal->_search) {
internal->current_iterator = internal->_search->begin();
}
}
Result* Searcher::getNextResult()
{
if (internal->_search &&
internal->current_iterator != internal->_search->end()) {
Result* result = new _Result(internal->current_iterator);
internal->current_iterator++;
return result;
}
return NULL;
}
/* Reset the results */
void Searcher::reset()
{
this->estimatedResultCount = 0;
this->searchPattern = "";
return;
}
void Searcher::suggestions(std::string& searchPattern, const bool verbose)
{
this->reset();
if (verbose == true) {
cout << "Performing suggestion query `" << searchPattern << "`" << endl;
}
this->searchPattern = searchPattern;
this->resultStart = 0;
this->resultEnd = 10;
string unaccentedSearch = removeAccents(searchPattern);
std::vector<const zim::File*> zims;
for (auto current = this->readers.begin(); current != this->readers.end();
current++) {
zims.push_back((*current)->getZimFileHandler());
}
zim::Search* search = new zim::Search(zims);
search->set_verbose(verbose);
search->set_query(unaccentedSearch);
search->set_range(resultStart, resultEnd);
search->set_suggestion_mode(true);
internal->_search = search;
internal->current_iterator = internal->_search->begin();
this->estimatedResultCount = internal->_search->get_matches_estimated();
}
/* Return the result count estimation */
unsigned int Searcher::getEstimatedResultCount()
{
return this->estimatedResultCount;
}
_Result::_Result(zim::Search::iterator& iterator)
: iterator(iterator)
{
}
std::string _Result::get_url()
{
return iterator.get_url();
}
std::string _Result::get_title()
{
return iterator.get_title();
}
int _Result::get_score()
{
return iterator.get_score();
}
std::string _Result::get_snippet()
{
return iterator.get_snippet();
}
std::string _Result::get_content()
{
if (iterator->good()) {
return iterator->getData();
}
return "";
}
int _Result::get_size()
{
return iterator.get_size();
}
int _Result::get_wordCount()
{
return iterator.get_wordCount();
}
int _Result::get_readerIndex()
{
return iterator.get_fileIndex();
}
}

View File

@@ -19,140 +19,16 @@
#include "server.h"
#ifdef _WIN32
# if !defined(__MINGW32__) && (_MSC_VER < 1600)
# include "stdint4win.h"
# endif
# include <winsock2.h>
# include <ws2tcpip.h>
# ifdef __GNUC__
// inet_pton is not declared in mingw, even if the function exists.
extern "C" {
WINSOCK_API_LINKAGE INT WSAAPI inet_pton( INT Family, PCSTR pszAddrString, PVOID pAddrBuf);
}
# endif
typedef UINT64 uint64_t;
typedef UINT16 uint16_t;
#endif
extern "C" {
#include <microhttpd.h>
}
#include "tools/otherTools.h"
#include "tools/pathTools.h"
#include "tools/regexTools.h"
#include "tools/stringTools.h"
#include "library.h"
#include "name_mapper.h"
#include "entry.h"
#include "searcher.h"
#include "search_renderer.h"
#include "opds_dumper.h"
#include <zim/uuid.h>
#include <mustache.hpp>
#include <pthread.h>
#include <atomic>
#include <string>
#include <vector>
#include <chrono>
#include "kiwixlib-resources.h"
#ifndef _WIN32
# include <arpa/inet.h>
#endif
#include "server/request_context.h"
#include "server/response.h"
#define MAX_SEARCH_LEN 140
#define KIWIX_MIN_CONTENT_SIZE_TO_DEFLATE 100
#include <zim/item.h>
#include "server/internalServer.h"
namespace kiwix {
static IdNameMapper defaultNameMapper;
typedef kainjow::mustache::data MustacheData;
static int staticHandlerCallback(void* cls,
struct MHD_Connection* connection,
const char* url,
const char* method,
const char* version,
const char* upload_data,
size_t* upload_data_size,
void** cont_cls);
class InternalServer {
public:
InternalServer(Library* library,
NameMapper* nameMapper,
std::string addr,
int port,
std::string root,
int nbThreads,
bool verbose,
bool withTaskbar,
bool withLibraryButton,
bool blockExternalLinks);
virtual ~InternalServer() = default;
int handlerCallback(struct MHD_Connection* connection,
const char* url,
const char* method,
const char* version,
const char* upload_data,
size_t* upload_data_size,
void** cont_cls);
bool start();
void stop();
private: // functions
Response handle_request(const RequestContext& request);
Response build_500(const std::string& msg);
Response build_404(const RequestContext& request, const std::string& zimName);
Response build_304(const RequestContext& request, const ETag& etag) const;
Response build_redirect(const std::string& bookName, const kiwix::Entry& entry) const;
Response build_homepage(const RequestContext& request);
Response handle_skin(const RequestContext& request);
Response handle_catalog(const RequestContext& request);
Response handle_meta(const RequestContext& request);
Response handle_search(const RequestContext& request);
Response handle_suggest(const RequestContext& request);
Response handle_random(const RequestContext& request);
Response handle_captured_external(const RequestContext& request);
Response handle_content(const RequestContext& request);
MustacheData get_default_data() const;
MustacheData homepage_data() const;
Response get_default_response() const;
std::shared_ptr<Reader> get_reader(const std::string& bookName) const;
bool etag_not_needed(const RequestContext& r) const;
ETag get_matching_if_none_match_etag(const RequestContext& request) const;
private: // data
std::string m_addr;
int m_port;
std::string m_root;
int m_nbThreads;
std::atomic_bool m_verbose;
bool m_withTaskbar;
bool m_withLibraryButton;
bool m_blockExternalLinks;
struct MHD_Daemon* mp_daemon;
Library* mp_library;
NameMapper* mp_nameMapper;
std::string m_server_id;
};
Server::Server(Library* library, NameMapper* nameMapper) :
mp_library(library),
mp_nameMapper(nameMapper),
@@ -170,16 +46,21 @@ bool Server::start() {
m_port,
m_root,
m_nbThreads,
m_multizimSearchLimit,
m_verbose,
m_withTaskbar,
m_withLibraryButton,
m_blockExternalLinks));
m_blockExternalLinks,
m_indexTemplateString,
m_ipConnectionLimit));
return mp_server->start();
}
void Server::stop() {
mp_server->stop();
mp_server.reset(nullptr);
if (mp_server) {
mp_server->stop();
mp_server.reset(nullptr);
}
}
void Server::setRoot(const std::string& root)
@@ -193,773 +74,14 @@ void Server::setRoot(const std::string& root)
}
}
InternalServer::InternalServer(Library* library,
NameMapper* nameMapper,
std::string addr,
int port,
std::string root,
int nbThreads,
bool verbose,
bool withTaskbar,
bool withLibraryButton,
bool blockExternalLinks) :
m_addr(addr),
m_port(port),
m_root(root),
m_nbThreads(nbThreads),
m_verbose(verbose),
m_withTaskbar(withTaskbar),
m_withLibraryButton(withLibraryButton),
m_blockExternalLinks(blockExternalLinks),
mp_daemon(nullptr),
mp_library(library),
mp_nameMapper(nameMapper ? nameMapper : &defaultNameMapper)
{}
bool InternalServer::start() {
#ifdef _WIN32
int flags = MHD_USE_SELECT_INTERNALLY;
#else
int flags = MHD_USE_POLL_INTERNALLY;
#endif
if (m_verbose.load())
flags |= MHD_USE_DEBUG;
struct sockaddr_in sockAddr;
memset(&sockAddr, 0, sizeof(sockAddr));
sockAddr.sin_family = AF_INET;
sockAddr.sin_port = htons(m_port);
if (m_addr.empty()) {
if (0 != INADDR_ANY)
sockAddr.sin_addr.s_addr = htonl(INADDR_ANY);
} else {
if (inet_pton(AF_INET, m_addr.c_str(), &(sockAddr.sin_addr.s_addr)) == 0) {
std::cerr << "Ip address " << m_addr << " is not a valid ip address" << std::endl;
return false;
}
}
mp_daemon = MHD_start_daemon(flags,
m_port,
NULL,
NULL,
&staticHandlerCallback,
this,
MHD_OPTION_SOCK_ADDR, &sockAddr,
MHD_OPTION_THREAD_POOL_SIZE, m_nbThreads,
MHD_OPTION_END);
if (mp_daemon == nullptr) {
std::cerr << "Unable to instantiate the HTTP daemon. The port " << m_port
<< " is maybe already occupied or need more permissions to be open. "
"Please try as root or with a port number higher or equal to 1024."
<< std::endl;
return false;
}
auto server_start_time = std::chrono::system_clock::now().time_since_epoch();
m_server_id = kiwix::to_string(server_start_time.count());
return true;
}
void InternalServer::stop()
int Server::getPort()
{
MHD_stop_daemon(mp_daemon);
return mp_server->getPort();
}
static int staticHandlerCallback(void* cls,
struct MHD_Connection* connection,
const char* url,
const char* method,
const char* version,
const char* upload_data,
size_t* upload_data_size,
void** cont_cls)
std::string Server::getAddress()
{
InternalServer* _this = static_cast<InternalServer*>(cls);
return _this->handlerCallback(connection,
url,
method,
version,
upload_data,
upload_data_size,
cont_cls);
}
int InternalServer::handlerCallback(struct MHD_Connection* connection,
const char* url,
const char* method,
const char* version,
const char* upload_data,
size_t* upload_data_size,
void** cont_cls)
{
auto start_time = std::chrono::steady_clock::now();
if (m_verbose.load() ) {
printf("======================\n");
printf("Requesting : \n");
printf("full_url : %s\n", url);
}
RequestContext request(connection, m_root, url, method, version);
if (m_verbose.load() ) {
request.print_debug_info();
}
/* Unexpected method */
if (request.get_method() != RequestMethod::GET
&& request.get_method() != RequestMethod::POST
&& request.get_method() != RequestMethod::HEAD) {
printf("Reject request because of unhandled request method.\n");
printf("----------------------\n");
return MHD_NO;
}
auto response = handle_request(request);
if (response.getReturnCode() == MHD_HTTP_INTERNAL_SERVER_ERROR) {
printf("========== INTERNAL ERROR !! ============\n");
if (!m_verbose.load()) {
printf("Requesting : \n");
printf("full_url : %s\n", url);
request.print_debug_info();
}
}
if (response.getReturnCode() == MHD_HTTP_OK && !etag_not_needed(request))
response.set_server_id(m_server_id);
auto ret = response.send(request, connection);
auto end_time = std::chrono::steady_clock::now();
auto time_span = std::chrono::duration_cast<std::chrono::duration<double>>(end_time - start_time);
if (m_verbose.load()) {
printf("Request time : %fs\n", time_span.count());
printf("----------------------\n");
}
return ret;
}
Response InternalServer::build_304(const RequestContext& request, const ETag& etag) const
{
auto response = get_default_response();
response.set_code(MHD_HTTP_NOT_MODIFIED);
response.set_etag(etag);
response.set_content("");
return response;
}
Response InternalServer::handle_request(const RequestContext& request)
{
try {
if (! request.is_valid_url())
return build_404(request, "");
const ETag etag = get_matching_if_none_match_etag(request);
if ( etag )
return build_304(request, etag);
if (kiwix::startsWith(request.get_url(), "/skin/"))
return handle_skin(request);
if (startsWith(request.get_url(), "/catalog"))
return handle_catalog(request);
if (request.get_url() == "/meta")
return handle_meta(request);
if (request.get_url() == "/search")
return handle_search(request);
if (request.get_url() == "/suggest")
return handle_suggest(request);
if (request.get_url() == "/random")
return handle_random(request);
if (request.get_url() == "/catch/external")
return handle_captured_external(request);
return handle_content(request);
} catch (std::exception& e) {
fprintf(stderr, "===== Unhandled error : %s\n", e.what());
return build_500(e.what());
} catch (...) {
fprintf(stderr, "===== Unhandled unknown error\n");
return build_500("Unknown error");
}
}
MustacheData InternalServer::get_default_data() const
{
MustacheData data;
data.set("root", m_root);
return data;
}
Response InternalServer::get_default_response() const
{
return Response(m_root, m_verbose.load(), m_withTaskbar, m_withLibraryButton, m_blockExternalLinks);
}
Response InternalServer::build_404(const RequestContext& request,
const std::string& bookName)
{
MustacheData results;
results.set("url", request.get_full_url());
auto response = get_default_response();
response.set_template(RESOURCE::templates::_404_html, results);
response.set_mimeType("text/html");
response.set_code(MHD_HTTP_NOT_FOUND);
response.set_compress(true);
response.set_taskbar(bookName, "");
return response;
}
Response InternalServer::build_500(const std::string& msg)
{
MustacheData data;
data.set("error", msg);
Response response(m_root, true, false, false, false);
response.set_template(RESOURCE::templates::_500_html, data);
response.set_mimeType("text/html");
response.set_code(MHD_HTTP_INTERNAL_SERVER_ERROR);
return response;
}
MustacheData InternalServer::homepage_data() const
{
auto data = get_default_data();
MustacheData books{MustacheData::type::list};
for (auto& bookId: mp_library->filter(kiwix::Filter().local(true).valid(true))) {
auto& currentBook = mp_library->getBookById(bookId);
MustacheData book;
book.set("name", mp_nameMapper->getNameForId(bookId));
book.set("title", currentBook.getTitle());
book.set("description", currentBook.getDescription());
book.set("articleCount", beautifyInteger(currentBook.getArticleCount()));
book.set("mediaCount", beautifyInteger(currentBook.getMediaCount()));
books.push_back(book);
}
data.set("books", books);
return data;
}
bool InternalServer::etag_not_needed(const RequestContext& request) const
{
const std::string url = request.get_url();
return kiwix::startsWith(url, "/catalog")
|| url == "/search"
|| url == "/suggest"
|| url == "/random"
|| url == "/catch/external";
}
ETag
InternalServer::get_matching_if_none_match_etag(const RequestContext& r) const
{
try {
const std::string etag_list = r.get_header(MHD_HTTP_HEADER_IF_NONE_MATCH);
return ETag::match(etag_list, m_server_id);
} catch (const std::out_of_range&) {
return ETag();
}
}
Response InternalServer::build_homepage(const RequestContext& request)
{
auto response = get_default_response();
response.set_template(RESOURCE::templates::index_html, homepage_data());
response.set_mimeType("text/html; charset=utf-8");
response.set_compress(true);
response.set_taskbar("", "");
return response;
}
Response InternalServer::handle_meta(const RequestContext& request)
{
std::string bookName;
std::string bookId;
std::string meta_name;
std::shared_ptr<Reader> reader;
try {
bookName = request.get_argument("content");
bookId = mp_nameMapper->getIdForName(bookName);
meta_name = request.get_argument("name");
reader = mp_library->getReaderById(bookId);
} catch (const std::out_of_range& e) {
return build_404(request, bookName);
}
if (reader == nullptr) {
return build_404(request, bookName);
}
std::string content;
std::string mimeType = "text";
if (meta_name == "title") {
content = reader->getTitle();
} else if (meta_name == "description") {
content = reader->getDescription();
} else if (meta_name == "language") {
content = reader->getLanguage();
} else if (meta_name == "name") {
content = reader->getName();
} else if (meta_name == "tags") {
content = reader->getTags();
} else if (meta_name == "date") {
content = reader->getDate();
} else if (meta_name == "creator") {
content = reader->getCreator();
} else if (meta_name == "publisher") {
content = reader->getPublisher();
} else if (meta_name == "favicon") {
reader->getFavicon(content, mimeType);
} else {
return build_404(request, bookName);
}
auto response = get_default_response();
response.set_content(content);
response.set_mimeType(mimeType);
response.set_compress(false);
response.set_cacheable();
return response;
}
Response InternalServer::handle_suggest(const RequestContext& request)
{
if (m_verbose.load()) {
printf("** running handle_suggest\n");
}
std::string content;
std::string mimeType;
unsigned int maxSuggestionCount = 10;
unsigned int suggestionCount = 0;
std::string suggestion;
std::string bookName;
std::string bookId;
std::string term;
std::shared_ptr<Reader> reader;
try {
bookName = request.get_argument("content");
bookId = mp_nameMapper->getIdForName(bookName);
term = request.get_argument("term");
reader = mp_library->getReaderById(bookId);
} catch (const std::out_of_range&) {
return build_404(request, bookName);
}
if (m_verbose.load()) {
printf("Searching suggestions for: \"%s\"\n", term.c_str());
}
MustacheData results{MustacheData::type::list};
bool first = true;
if (reader != nullptr) {
/* Get the suggestions */
reader->searchSuggestionsSmart(term, maxSuggestionCount);
while (reader->getNextSuggestion(suggestion)) {
MustacheData result;
result.set("label", suggestion);
result.set("value", suggestion);
result.set("first", first);
first = false;
results.push_back(result);
suggestionCount++;
}
}
/* Propose the fulltext search if possible */
if (reader->hasFulltextIndex()) {
MustacheData result;
result.set("label", "containing '" + term + "'...");
result.set("value", term + " ");
result.set("first", first);
results.push_back(result);
}
auto data = get_default_data();
data.set("suggestions", results);
auto response = get_default_response();
response.set_template(RESOURCE::templates::suggestion_json, data);
response.set_mimeType("application/json; charset=utf-8");
response.set_compress(true);
return response;
}
Response InternalServer::handle_skin(const RequestContext& request)
{
if (m_verbose.load()) {
printf("** running handle_skin\n");
}
auto response = get_default_response();
auto resourceName = request.get_url().substr(1);
try {
response.set_content(getResource(resourceName));
} catch (const ResourceNotFound& e) {
return build_404(request, "");
}
response.set_mimeType(getMimeTypeForFile(resourceName));
response.set_compress(true);
response.set_cacheable();
return response;
}
Response InternalServer::handle_search(const RequestContext& request)
{
if (m_verbose.load()) {
printf("** running handle_search\n");
}
std::string bookName;
std::string bookId;
try {
bookName = request.get_argument("content");
bookId = mp_nameMapper->getIdForName(bookName);
} catch (const std::out_of_range&) {}
std::string patternString;
try {
patternString = request.get_argument("pattern");
} catch (const std::out_of_range&) {}
/* Retrive geo search */
bool has_geo_query = false;
float latitude = 0;
float longitude = 0;
float distance = 0;
try {
latitude = request.get_argument<float>("latitude");
longitude = request.get_argument<float>("longitude");
distance = request.get_argument<float>("distance");
has_geo_query = true;
} catch(const std::out_of_range&) {}
catch(const std::invalid_argument&) {}
std::shared_ptr<Reader> reader(nullptr);
try {
reader = mp_library->getReaderById(bookId);
} catch (const std::out_of_range&) {}
/* Try first to load directly the article */
if (reader != nullptr && !patternString.empty()) {
std::string patternCorrespondingUrl;
auto variants = reader->getTitleVariants(patternString);
auto variantsItr = variants.begin();
while (patternCorrespondingUrl.empty() && variantsItr != variants.end()) {
try {
auto entry = reader->getEntryFromTitle(*variantsItr);
entry = entry.getFinalEntry();
patternCorrespondingUrl = entry.getPath();
break;
} catch(kiwix::NoEntry& e) {
variantsItr++;
}
}
/* If article found then redirect directly to it */
if (!patternCorrespondingUrl.empty()) {
auto response = get_default_response();
response.set_redirection(m_root + "/" + bookName + "/" + patternCorrespondingUrl);
return response;
}
}
/* Make the search */
auto response = get_default_response();
response.set_mimeType("text/html; charset=utf-8");
response.set_taskbar(bookName, reader ? reader->getTitle() : "");
response.set_compress(true);
if ( (!reader && !bookName.empty())
|| (patternString.empty() && ! has_geo_query) ) {
auto data = get_default_data();
data.set("pattern", encodeDiples(patternString));
response.set_template(RESOURCE::templates::no_search_result_html, data);
response.set_code(MHD_HTTP_NOT_FOUND);
return response;
}
Searcher searcher;
if (reader) {
searcher.add_reader(reader.get());
} else {
for (auto& bookId: mp_library->filter(kiwix::Filter().local(true).valid(true))) {
auto currentReader = mp_library->getReaderById(bookId);
if (currentReader) {
searcher.add_reader(currentReader.get());
}
}
}
auto start = 0;
try {
start = request.get_argument<unsigned int>("start");
} catch (const std::exception&) {}
auto end = 25;
try {
end = request.get_argument<unsigned int>("end");
} catch (const std::exception&) {}
if (start>end) {
auto tmp = start;
start = end;
end = tmp;
}
if (end > start + MAX_SEARCH_LEN) {
end = start + MAX_SEARCH_LEN;
}
/* Get the results */
try {
if (patternString.empty()) {
searcher.geo_search(latitude, longitude, distance,
start, end, m_verbose.load());
} else {
searcher.search(patternString,
start, end, m_verbose.load());
}
SearchRenderer renderer(&searcher, mp_nameMapper);
renderer.setSearchPattern(patternString);
renderer.setSearchContent(bookName);
renderer.setProtocolPrefix(m_root + "/");
renderer.setSearchProtocolPrefix(m_root + "/search?");
response.set_content(renderer.getHtml());
} catch (const std::exception& e) {
std::cerr << e.what() << std::endl;
}
return response;
}
Response InternalServer::handle_random(const RequestContext& request)
{
if (m_verbose.load()) {
printf("** running handle_random\n");
}
std::string bookName;
std::string bookId;
std::shared_ptr<Reader> reader;
try {
bookName = request.get_argument("content");
bookId = mp_nameMapper->getIdForName(bookName);
reader = mp_library->getReaderById(bookId);
} catch (const std::out_of_range&) {
return build_404(request, bookName);
}
if (reader == nullptr) {
return build_404(request, bookName);
}
try {
auto entry = reader->getRandomPage();
return build_redirect(bookName, entry.getFinalEntry());
} catch(kiwix::NoEntry& e) {
return build_404(request, bookName);
}
}
Response InternalServer::handle_captured_external(const RequestContext& request)
{
std::string source = "";
try {
source = kiwix::urlDecode(request.get_argument("source"));
} catch (const std::out_of_range& e) {}
if (source.empty())
return build_404(request, "");
auto data = get_default_data();
data.set("source", source);
auto response = get_default_response();
response.set_template(RESOURCE::templates::captured_external_html, data);
response.set_mimeType("text/html; charset=utf-8");
response.set_compress(true);
response.set_taskbar("", "");
return response;
}
Response InternalServer::handle_catalog(const RequestContext& request)
{
if (m_verbose.load()) {
printf("** running handle_catalog");
}
std::string host;
std::string url;
try {
host = request.get_header("Host");
url = request.get_url_part(1);
} catch (const std::out_of_range&) {
return build_404(request, "");
}
if (url != "searchdescription.xml" && url != "root.xml" && url != "search") {
return build_404(request, "");
}
auto response = get_default_response();
response.set_compress(true);
if (url == "searchdescription.xml") {
response.set_template(RESOURCE::opensearchdescription_xml, get_default_data());
response.set_mimeType("application/opensearchdescription+xml");
return response;
}
zim::Uuid uuid;
kiwix::OPDSDumper opdsDumper;
opdsDumper.setRootLocation(m_root);
opdsDumper.setSearchDescriptionUrl("catalog/searchdescription.xml");
opdsDumper.setLibrary(mp_library);
response.set_mimeType("application/atom+xml; profile=opds-catalog; kind=acquisition; charset=utf-8");
std::vector<std::string> bookIdsToDump;
if (url == "root.xml") {
opdsDumper.setTitle("All zims");
uuid = zim::Uuid::generate(host);
bookIdsToDump = mp_library->filter(kiwix::Filter().valid(true).local(true).remote(true));
} else if (url == "search") {
auto filter = kiwix::Filter().valid(true).local(true).remote(true);
string query("<Empty query>");
size_t count(10);
size_t startIndex(0);
try {
query = request.get_argument("q");
filter.query(query);
} catch (const std::out_of_range&) {}
try {
filter.maxSize(extractFromString<unsigned long>(request.get_argument("maxsize")));
} catch (...) {}
try {
filter.name(request.get_argument("name"));
} catch (const std::out_of_range&) {}
try {
filter.lang(request.get_argument("lang"));
} catch (const std::out_of_range&) {}
try {
count = extractFromString<unsigned long>(request.get_argument("count"));
} catch (...) {}
try {
startIndex = extractFromString<unsigned long>(request.get_argument("start"));
} catch (...) {}
try {
filter.acceptTags(kiwix::split(request.get_argument("tag"), ";"));
} catch (...) {}
try {
filter.rejectTags(kiwix::split(request.get_argument("notag"), ";"));
} catch (...) {}
opdsDumper.setTitle("Search result for " + query);
uuid = zim::Uuid::generate();
bookIdsToDump = mp_library->filter(filter);
auto totalResults = bookIdsToDump.size();
bookIdsToDump.erase(bookIdsToDump.begin(), bookIdsToDump.begin()+startIndex);
if (count>0 && bookIdsToDump.size() > count) {
bookIdsToDump.resize(count);
}
opdsDumper.setOpenSearchInfo(totalResults, startIndex, bookIdsToDump.size());
}
opdsDumper.setId(kiwix::to_string(uuid));
response.set_content(opdsDumper.dumpOPDSFeed(bookIdsToDump));
return response;
}
namespace
{
std::string get_book_name(const RequestContext& request)
{
try {
return request.get_url_part(0);
} catch (const std::out_of_range& e) {
return std::string();
}
}
} // unnamed namespace
std::shared_ptr<Reader>
InternalServer::get_reader(const std::string& bookName) const
{
std::shared_ptr<Reader> reader;
try {
const std::string bookId = mp_nameMapper->getIdForName(bookName);
reader = mp_library->getReaderById(bookId);
} catch (const std::out_of_range& e) {
}
return reader;
}
Response
InternalServer::build_redirect(const std::string& bookName, const kiwix::Entry& entry) const
{
auto response = get_default_response();
response.set_redirection(m_root + "/" + bookName + "/" +
kiwix::urlEncode(entry.getPath()));
return response;
}
Response InternalServer::handle_content(const RequestContext& request)
{
if (m_verbose.load()) {
printf("** running handle_content\n");
}
const std::string bookName = get_book_name(request);
if (bookName.empty())
return build_homepage(request);
const std::shared_ptr<Reader> reader = get_reader(bookName);
if (reader == nullptr) {
return build_404(request, bookName);
}
auto urlStr = request.get_url().substr(bookName.size()+1);
if (urlStr[0] == '/') {
urlStr = urlStr.substr(1);
}
kiwix::Entry entry;
try {
entry = reader->getEntryFromPath(urlStr);
if (entry.isRedirect() || urlStr.empty()) {
// If urlStr is empty, we want to mainPage.
// We must do a redirection to the real page.
return build_redirect(bookName, entry.getFinalEntry());
}
} catch(kiwix::NoEntry& e) {
if (m_verbose.load())
printf("Failed to find %s\n", urlStr.c_str());
return build_404(request, bookName);
}
auto response = get_default_response();
response.set_entry(entry, request);
if (m_verbose.load()) {
printf("Found %s\n", entry.getPath().c_str());
printf("mimeType: %s\n", response.get_mimeType().c_str());
}
if (response.get_mimeType().find("text/html") != string::npos)
response.set_taskbar(bookName, reader->getTitle());
return response;
return mp_server->getAddress();
}
}

127
src/server/byte_range.cpp Normal file
View File

@@ -0,0 +1,127 @@
/*
* Copyright 2020 Veloman Yunkan <veloman.yunkan@gmail.com>
*
* This program is free software; you can redistribute it and/or modify
* it under the terms of the GNU General Public License as published by
* the Free Software Foundation; either version 3 of the License, or
* any later version.
*
* This program is distributed in the hope that it will be useful,
* but WITHOUT ANY WARRANTY; without even the implied warranty of
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
* GNU General Public License for more details.
*
* You should have received a copy of the GNU General Public License
* along with this program; if not, write to the Free Software
* Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston,
* MA 02110-1301, USA.
*/
#include "byte_range.h"
#include "tools/stringTools.h"
#include <cassert>
#include <algorithm>
namespace kiwix {
namespace {
ByteRange parseByteRange(const std::string& rangeStr)
{
std::istringstream iss(rangeStr);
int64_t start, end = INT64_MAX;
if (iss >> start) {
if ( start < 0 ) {
if ( iss.eof() )
return ByteRange(-start);
} else {
char c;
if (iss >> c && c=='-') {
iss >> end; // if this fails, end is not modified, which is OK
if (iss.eof() && start <= end)
return ByteRange(ByteRange::PARSED, start, end);
}
}
}
return ByteRange(ByteRange::INVALID, 0, INT64_MAX);
}
} // unnamed namespace
ByteRange::ByteRange()
: kind_(NONE)
, first_(0)
, last_(INT64_MAX)
{}
ByteRange::ByteRange(Kind kind, int64_t first, int64_t last)
: kind_(kind)
, first_(first)
, last_(last)
{
assert(kind != NONE);
assert(first >= 0);
assert(last >= first || (first == 0 && last == -1));
}
ByteRange::ByteRange(int64_t suffix_length)
: kind_(PARSED)
, first_(-suffix_length)
, last_(INT64_MAX)
{
assert(suffix_length > 0);
}
int64_t ByteRange::first() const
{
assert(kind_ > PARSED);
return first_;
}
int64_t ByteRange::last() const
{
assert(kind_ > PARSED);
return last_;
}
int64_t ByteRange::length() const
{
assert(kind_ > PARSED);
return last_ + 1 - first_;
}
ByteRange ByteRange::parse(const std::string& rangeStr)
{
const std::string byteUnitSpec("bytes=");
if ( ! kiwix::startsWith(rangeStr, byteUnitSpec) )
return ByteRange(INVALID, 0, INT64_MAX);
return parseByteRange(rangeStr.substr(byteUnitSpec.size()));
}
ByteRange ByteRange::resolve(int64_t contentSize) const
{
if ( kind() == NONE )
return ByteRange(RESOLVED_FULL_CONTENT, 0, contentSize-1);
if ( kind() == INVALID )
return ByteRange(RESOLVED_UNSATISFIABLE, 0, contentSize-1);
const int64_t resolved_first = first_ < 0
? std::max(int64_t(0), contentSize + first_)
: first_;
const int64_t resolved_last = std::min(contentSize-1, last_);
if ( resolved_first > resolved_last )
return ByteRange(RESOLVED_UNSATISFIABLE, 0, contentSize-1);
return ByteRange(RESOLVED_PARTIAL_CONTENT, resolved_first, resolved_last);
}
} // namespace kiwix

86
src/server/byte_range.h Normal file
View File

@@ -0,0 +1,86 @@
/*
* Copyright 2020 Veloman Yunkan <veloman.yunkan@gmail.com>
*
* This program is free software; you can redistribute it and/or modify
* it under the terms of the GNU General Public License as published by
* the Free Software Foundation; either version 3 of the License, or
* any later version.
*
* This program is distributed in the hope that it will be useful,
* but WITHOUT ANY WARRANTY; without even the implied warranty of
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
* GNU General Public License for more details.
*
* You should have received a copy of the GNU General Public License
* along with this program; if not, write to the Free Software
* Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston,
* MA 02110-1301, USA.
*/
#ifndef KIWIXLIB_SERVER_BYTE_RANGE_H
#define KIWIXLIB_SERVER_BYTE_RANGE_H
#include <cstdint>
#include <string>
namespace kiwix {
class ByteRange
{
public: // types
// ByteRange is parsed in a request, then it must be resolved (taking
// into account the actual size of the requested resource) before
// being applied in the response.
// The Kind enum represents possible states in such a lifecycle.
enum Kind {
// The request is not a range request (no Range header)
NONE,
// The value of the Range header is not a valid continuous
// range. Note that a valid (according to RFC7233) sequence of multiple
// byte ranges is considered invalid in the current implementation
// (i.e. only single-range partial requests are supported).
INVALID,
// This byte-range has been successfully parsed from the request
PARSED,
// This is a response to a regular (non-range) request
RESOLVED_FULL_CONTENT,
// The range request is invalid or unsatisfiable
RESOLVED_UNSATISFIABLE,
// This is a response to a (satisfiable) range request
RESOLVED_PARTIAL_CONTENT,
};
public: // functions
// Constructs a ByteRange object of NONE kind
ByteRange();
// Constructs a ByteRange object of the given kind (except NONE)
ByteRange(Kind kind, int64_t first, int64_t last);
// Constructs a ByteRange object of PARSED kind corresponding to a
// range request of the form "Range: bytes=-suffix_length"
explicit ByteRange(int64_t suffix_length);
Kind kind() const { return kind_; }
int64_t first() const;
int64_t last() const;
int64_t length() const;
static ByteRange parse(const std::string& rangeStr);
ByteRange resolve(int64_t contentSize) const;
private: // data
Kind kind_;
int64_t first_;
int64_t last_;
};
} // namespace kiwix
#endif //KIWIXLIB_SERVER_BYTE_RANGE_H

114
src/server/i18n.cpp Normal file
View File

@@ -0,0 +1,114 @@
/*
* Copyright 2022 Veloman Yunkan <veloman.yunkan@gmail.com>
*
* This program is free software; you can redistribute it and/or modify
* it under the terms of the GNU General Public License as published by
* the Free Software Foundation; either version 3 of the License, or
* any later version.
*
* This program is distributed in the hope that it will be useful,
* but WITHOUT ANY WARRANTY; without even the implied warranty of
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
* GNU General Public License for more details.
*
* You should have received a copy of the GNU General Public License
* along with this program; if not, write to the Free Software
* Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston,
* MA 02110-1301, USA.
*/
#include "i18n.h"
#include "tools/otherTools.h"
#include <algorithm>
#include <map>
namespace kiwix
{
const char* I18nStringTable::get(const std::string& key) const
{
const I18nString* const begin = entries;
const I18nString* const end = begin + entryCount;
const I18nString* found = std::lower_bound(begin, end, key,
[](const I18nString& a, const std::string& k) {
return a.key < k;
});
return (found == end || found->key != key) ? nullptr : found->value;
}
namespace i18n
{
// this data is generated by the i18n resource compiler
extern const I18nStringTable stringTables[];
extern const size_t langCount;
}
namespace
{
class I18nStringDB
{
public: // functions
I18nStringDB() {
for ( size_t i = 0; i < kiwix::i18n::langCount; ++i ) {
const auto& t = kiwix::i18n::stringTables[i];
lang2TableMap[t.lang] = &t;
}
enStrings = lang2TableMap.at("en");
};
std::string get(const std::string& lang, const std::string& key) const {
const char* s = getStringsFor(lang)->get(key);
if ( s == nullptr ) {
s = enStrings->get(key);
if ( s == nullptr ) {
throw std::runtime_error("Invalid message id");
}
}
return s;
}
private: // functions
const I18nStringTable* getStringsFor(const std::string& lang) const {
try {
return lang2TableMap.at(lang);
} catch(const std::out_of_range&) {
return enStrings;
}
}
private: // data
std::map<std::string, const I18nStringTable*> lang2TableMap;
const I18nStringTable* enStrings;
};
} // unnamed namespace
std::string getTranslatedString(const std::string& lang, const std::string& key)
{
static const I18nStringDB stringDb;
return stringDb.get(lang, key);
}
namespace i18n
{
std::string expandParameterizedString(const std::string& lang,
const std::string& key,
const Parameters& params)
{
const std::string tmpl = getTranslatedString(lang, key);
return render_template(tmpl, params);
}
} // namespace i18n
std::string ParameterizedMessage::getText(const std::string& lang) const
{
return i18n::expandParameterizedString(lang, msgId, params);
}
} // namespace kiwix

94
src/server/i18n.h Normal file
View File

@@ -0,0 +1,94 @@
/*
* Copyright 2022 Veloman Yunkan <veloman.yunkan@gmail.com>
*
* This program is free software; you can redistribute it and/or modify
* it under the terms of the GNU General Public License as published by
* the Free Software Foundation; either version 3 of the License, or
* any later version.
*
* This program is distributed in the hope that it will be useful,
* but WITHOUT ANY WARRANTY; without even the implied warranty of
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
* GNU General Public License for more details.
*
* You should have received a copy of the GNU General Public License
* along with this program; if not, write to the Free Software
* Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston,
* MA 02110-1301, USA.
*/
#ifndef KIWIX_SERVER_I18N
#define KIWIX_SERVER_I18N
#include <string>
#include <mustache.hpp>
namespace kiwix
{
struct I18nString {
const char* const key;
const char* const value;
};
struct I18nStringTable {
const char* const lang;
const size_t entryCount;
const I18nString* const entries;
const char* get(const std::string& key) const;
};
std::string getTranslatedString(const std::string& lang, const std::string& key);
namespace i18n
{
typedef kainjow::mustache::object Parameters;
std::string expandParameterizedString(const std::string& lang,
const std::string& key,
const Parameters& params);
class GetTranslatedString
{
public:
explicit GetTranslatedString(const std::string& lang) : m_lang(lang) {}
std::string operator()(const std::string& key) const
{
return getTranslatedString(m_lang, key);
}
std::string operator()(const std::string& key, const Parameters& params) const
{
return expandParameterizedString(m_lang, key, params);
}
private:
const std::string m_lang;
};
} // namespace i18n
struct ParameterizedMessage
{
public: // types
typedef kainjow::mustache::object Parameters;
public: // functions
ParameterizedMessage(const std::string& msgId, const Parameters& params)
: msgId(msgId)
, params(params)
{}
std::string getText(const std::string& lang) const;
private: // data
const std::string msgId;
const Parameters params;
};
} // namespace kiwix
#endif // KIWIX_SERVER_I18N

View File

File diff suppressed because it is too large Load Diff

193
src/server/internalServer.h Normal file
View File

@@ -0,0 +1,193 @@
/*
* Copyright 2019 Matthieu Gautier <mgautier@kymeria.fr>
*
* This program is free software; you can redistribute it and/or modify
* it under the terms of the GNU General Public License as published by
* the Free Software Foundation; either version 3 of the License, or
* any later version.
*
* This program is distributed in the hope that it will be useful,
* but WITHOUT ANY WARRANTY; without even the implied warranty of
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
* GNU General Public License for more details.
*
* You should have received a copy of the GNU General Public License
* along with this program; if not, write to the Free Software
* Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston,
* MA 02110-1301, USA.
*/
#ifndef KIWIXLIB_SERVER_INTERNALSERVER_H
#define KIWIXLIB_SERVER_INTERNALSERVER_H
extern "C" {
#include "microhttpd_wrapper.h"
}
#include "library.h"
#include "name_mapper.h"
#include <zim/search.h>
#include <zim/suggestion.h>
#include <mustache.hpp>
#include <atomic>
#include <string>
#include "server/request_context.h"
#include "server/response.h"
#include "tools/concurrent_cache.h"
namespace kiwix {
struct GeoQuery {
GeoQuery()
: GeoQuery(0, 0, -1)
{}
GeoQuery(float latitude, float longitude, float distance)
: latitude(latitude), longitude(longitude), distance(distance)
{}
float latitude;
float longitude;
float distance;
explicit operator bool() const {
return distance >= 0;
}
friend bool operator<(const GeoQuery& l, const GeoQuery& r)
{
return std::tie(l.latitude, l.longitude, l.distance)
< std::tie(r.latitude, r.longitude, r.distance); // keep the same order
}
};
class SearchInfo {
public:
SearchInfo(const std::string& pattern, GeoQuery geoQuery, const Library::BookIdSet& bookIds, const std::string& bookFilterString);
zim::Query getZimQuery(bool verbose) const;
const Library::BookIdSet& getBookIds() const { return bookIds; }
friend bool operator<(const SearchInfo& l, const SearchInfo& r)
{
return std::tie(l.bookIds, l.pattern, l.geoQuery)
< std::tie(r.bookIds, r.pattern, r.geoQuery); // keep the same order
}
public: //data
std::string pattern;
GeoQuery geoQuery;
Library::BookIdSet bookIds;
std::string bookFilterQuery;
};
typedef kainjow::mustache::data MustacheData;
typedef ConcurrentCache<SearchInfo, std::shared_ptr<zim::Search>> SearchCache;
typedef ConcurrentCache<std::string, std::shared_ptr<zim::SuggestionSearcher>> SuggestionSearcherCache;
class OPDSDumper;
class InternalServer {
public:
InternalServer(Library* library,
NameMapper* nameMapper,
std::string addr,
int port,
std::string root,
int nbThreads,
unsigned int multizimSearchLimit,
bool verbose,
bool withTaskbar,
bool withLibraryButton,
bool blockExternalLinks,
std::string indexTemplateString,
int ipConnectionLimit);
virtual ~InternalServer();
MHD_Result handlerCallback(struct MHD_Connection* connection,
const char* url,
const char* method,
const char* version,
const char* upload_data,
size_t* upload_data_size,
void** cont_cls);
bool start();
void stop();
std::string getAddress() { return m_addr; }
int getPort() { return m_port; }
private: // functions
std::unique_ptr<Response> handle_request(const RequestContext& request);
std::unique_ptr<Response> build_redirect(const std::string& bookName, const zim::Item& item) const;
std::unique_ptr<Response> build_homepage(const RequestContext& request);
std::unique_ptr<Response> handle_skin(const RequestContext& request);
std::unique_ptr<Response> handle_catalog(const RequestContext& request);
std::unique_ptr<Response> handle_catalog_v2(const RequestContext& request);
std::unique_ptr<Response> handle_catalog_v2_root(const RequestContext& request);
std::unique_ptr<Response> handle_catalog_v2_entries(const RequestContext& request, bool partial);
std::unique_ptr<Response> handle_catalog_v2_complete_entry(const RequestContext& request, const std::string& entryId);
std::unique_ptr<Response> handle_catalog_v2_categories(const RequestContext& request);
std::unique_ptr<Response> handle_catalog_v2_languages(const RequestContext& request);
std::unique_ptr<Response> handle_catalog_v2_illustration(const RequestContext& request);
std::unique_ptr<Response> handle_search(const RequestContext& request);
std::unique_ptr<Response> handle_suggest(const RequestContext& request);
std::unique_ptr<Response> handle_random(const RequestContext& request);
std::unique_ptr<Response> handle_catch(const RequestContext& request);
std::unique_ptr<Response> handle_captured_external(const RequestContext& request);
std::unique_ptr<Response> handle_content(const RequestContext& request);
std::unique_ptr<Response> handle_raw(const RequestContext& request);
std::unique_ptr<Response> handle_locally_customized_resource(const RequestContext& request);
std::unique_ptr<Response> handle_widget(const RequestContext& request);
std::vector<std::string> search_catalog(const RequestContext& request,
kiwix::OPDSDumper& opdsDumper);
MustacheData get_default_data() const;
bool etag_not_needed(const RequestContext& r) const;
ETag get_matching_if_none_match_etag(const RequestContext& request) const;
std::pair<std::string, Library::BookIdSet> selectBooks(const RequestContext& r) const;
SearchInfo getSearchInfo(const RequestContext& r) const;
bool isLocallyCustomizedResource(const std::string& url) const;
private: // data
std::string m_addr;
int m_port;
std::string m_root;
int m_nbThreads;
unsigned int m_multizimSearchLimit;
std::atomic_bool m_verbose;
bool m_withTaskbar;
bool m_withLibraryButton;
bool m_blockExternalLinks;
std::string m_indexTemplateString;
int m_ipConnectionLimit;
struct MHD_Daemon* mp_daemon;
Library* mp_library;
NameMapper* mp_nameMapper;
SearchCache searchCache;
SuggestionSearcherCache suggestionSearcherCache;
std::string m_server_id;
std::string m_library_id;
class CustomizedResources;
std::unique_ptr<CustomizedResources> m_customizedResources;
friend std::unique_ptr<Response> Response::build(const InternalServer& server);
friend std::unique_ptr<ContentResponse> ContentResponse::build(const InternalServer& server, const std::string& content, const std::string& mimetype, bool isHomePage, bool raw);
friend std::unique_ptr<Response> ItemResponse::build(const InternalServer& server, const RequestContext& request, const zim::Item& item, bool raw);
};
}
#endif //KIWIXLIB_SERVER_INTERNALSERVER_H

View File

@@ -0,0 +1,168 @@
/*
* Copyright 2021 Veloman Yunkan <veloman.yunkan@gmail.com>
*
* This program is free software; you can redistribute it and/or modify
* it under the terms of the GNU General Public License as published by
* the Free Software Foundation; either version 3 of the License, or
* any later version.
*
* This program is distributed in the hope that it will be useful,
* but WITHOUT ANY WARRANTY; without even the implied warranty of
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
* GNU General Public License for more details.
*
* You should have received a copy of the GNU General Public License
* along with this program; if not, write to the Free Software
* Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston,
* MA 02110-1301, USA.
*/
#include "internalServer.h"
#include "library.h"
#include "opds_dumper.h"
#include "request_context.h"
#include "response.h"
#include "tools/otherTools.h"
#include "kiwixlib-resources.h"
#include <mustache.hpp>
#include <string>
#include <vector>
namespace kiwix {
std::unique_ptr<Response> InternalServer::handle_catalog_v2(const RequestContext& request)
{
if (m_verbose.load()) {
printf("** running handle_catalog_v2");
}
std::string url;
try {
url = request.get_url_part(2);
} catch (const std::out_of_range&) {
return HTTP404Response(*this, request)
+ urlNotFoundMsg;
}
if (url == "root.xml") {
return handle_catalog_v2_root(request);
} else if (url == "searchdescription.xml") {
const std::string endpoint_root = m_root + "/catalog/v2";
return ContentResponse::build(*this,
RESOURCE::catalog_v2_searchdescription_xml,
kainjow::mustache::object({{"endpoint_root", endpoint_root}}),
"application/opensearchdescription+xml"
);
} else if (url == "entry") {
const std::string entryId = request.get_url_part(3);
return handle_catalog_v2_complete_entry(request, entryId);
} else if (url == "entries") {
return handle_catalog_v2_entries(request, /*partial=*/false);
} else if (url == "partial_entries") {
return handle_catalog_v2_entries(request, /*partial=*/true);
} else if (url == "categories") {
return handle_catalog_v2_categories(request);
} else if (url == "languages") {
return handle_catalog_v2_languages(request);
} else if (url == "illustration") {
return handle_catalog_v2_illustration(request);
} else {
return HTTP404Response(*this, request)
+ urlNotFoundMsg;
}
}
std::unique_ptr<Response> InternalServer::handle_catalog_v2_root(const RequestContext& request)
{
return ContentResponse::build(
*this,
RESOURCE::templates::catalog_v2_root_xml,
kainjow::mustache::object{
{"date", gen_date_str()},
{"endpoint_root", m_root + "/catalog/v2"},
{"feed_id", gen_uuid(m_library_id)},
{"all_entries_feed_id", gen_uuid(m_library_id + "/entries")},
{"partial_entries_feed_id", gen_uuid(m_library_id + "/partial_entries")},
{"category_list_feed_id", gen_uuid(m_library_id + "/categories")},
{"language_list_feed_id", gen_uuid(m_library_id + "/languages")}
},
"application/atom+xml;profile=opds-catalog;kind=navigation"
);
}
std::unique_ptr<Response> InternalServer::handle_catalog_v2_entries(const RequestContext& request, bool partial)
{
OPDSDumper opdsDumper(mp_library);
opdsDumper.setRootLocation(m_root);
opdsDumper.setLibraryId(m_library_id);
const auto bookIds = search_catalog(request, opdsDumper);
const auto opdsFeed = opdsDumper.dumpOPDSFeedV2(bookIds, request.get_query(), partial);
return ContentResponse::build(
*this,
opdsFeed,
"application/atom+xml;profile=opds-catalog;kind=acquisition"
);
}
std::unique_ptr<Response> InternalServer::handle_catalog_v2_complete_entry(const RequestContext& request, const std::string& entryId)
{
try {
mp_library->getBookById(entryId);
} catch (const std::out_of_range&) {
return HTTP404Response(*this, request)
+ urlNotFoundMsg;
}
OPDSDumper opdsDumper(mp_library);
opdsDumper.setRootLocation(m_root);
opdsDumper.setLibraryId(m_library_id);
const auto opdsFeed = opdsDumper.dumpOPDSCompleteEntry(entryId);
return ContentResponse::build(
*this,
opdsFeed,
"application/atom+xml;type=entry;profile=opds-catalog"
);
}
std::unique_ptr<Response> InternalServer::handle_catalog_v2_categories(const RequestContext& request)
{
OPDSDumper opdsDumper(mp_library);
opdsDumper.setRootLocation(m_root);
opdsDumper.setLibraryId(m_library_id);
return ContentResponse::build(
*this,
opdsDumper.categoriesOPDSFeed(),
"application/atom+xml;profile=opds-catalog;kind=navigation"
);
}
std::unique_ptr<Response> InternalServer::handle_catalog_v2_languages(const RequestContext& request)
{
OPDSDumper opdsDumper(mp_library);
opdsDumper.setRootLocation(m_root);
opdsDumper.setLibraryId(m_library_id);
return ContentResponse::build(
*this,
opdsDumper.languagesOPDSFeed(),
"application/atom+xml;profile=opds-catalog;kind=navigation"
);
}
std::unique_ptr<Response> InternalServer::handle_catalog_v2_illustration(const RequestContext& request)
{
try {
const auto bookId = request.get_url_part(3);
auto book = mp_library->getBookByIdThreadSafe(bookId);
auto size = request.get_argument<unsigned int>("size");
auto illustration = book.getIllustration(size);
return ContentResponse::build(*this, illustration->getData(), illustration->mimeType);
} catch(...) {
return HTTP404Response(*this, request)
+ urlNotFoundMsg;
}
}
} // namespace kiwix

View File

@@ -26,6 +26,8 @@
#include <cstdio>
#include <atomic>
#include "tools/stringTools.h"
namespace kiwix {
static std::atomic_ullong s_requestIndex(0);
@@ -73,58 +75,38 @@ RequestContext::RequestContext(struct MHD_Connection* connection,
method(str2RequestMethod(_method)),
version(version),
requestIndex(s_requestIndex++),
acceptEncodingDeflate(false),
accept_range(false),
range_pair(0, -1)
acceptEncodingGzip(false),
byteRange_()
{
MHD_get_connection_values(connection, MHD_HEADER_KIND, &RequestContext::fill_header, this);
MHD_get_connection_values(connection, MHD_GET_ARGUMENT_KIND, &RequestContext::fill_argument, this);
try {
acceptEncodingDeflate =
(get_header(MHD_HTTP_HEADER_ACCEPT_ENCODING).find("deflate") != std::string::npos);
acceptEncodingGzip =
(get_header(MHD_HTTP_HEADER_ACCEPT_ENCODING).find("gzip") != std::string::npos);
} catch (const std::out_of_range&) {}
/*Check if range is requested. */
try {
auto range = get_header(MHD_HTTP_HEADER_RANGE);
int start = 0;
int end = -1;
std::istringstream iss(range);
char c;
iss >> start >> c;
if (iss.good() && c=='-') {
iss >> end;
if (iss.fail()) {
// Something went wrong will extracting.
end = -1;
}
if (iss.eof()) {
accept_range = true;
range_pair = std::pair<int, int>(start, end);
}
}
byteRange_ = ByteRange::parse(get_header(MHD_HTTP_HEADER_RANGE));
} catch (const std::out_of_range&) {}
}
RequestContext::~RequestContext()
{}
int RequestContext::fill_header(void *__this, enum MHD_ValueKind kind,
const char *key, const char *value)
MHD_Result RequestContext::fill_header(void *__this, enum MHD_ValueKind kind,
const char *key, const char *value)
{
RequestContext *_this = static_cast<RequestContext*>(__this);
_this->headers[key] = value;
_this->headers[lcAll(key)] = value;
return MHD_YES;
}
int RequestContext::fill_argument(void *__this, enum MHD_ValueKind kind,
const char *key, const char* value)
MHD_Result RequestContext::fill_argument(void *__this, enum MHD_ValueKind kind,
const char *key, const char* value)
{
RequestContext *_this = static_cast<RequestContext*>(__this);
_this->arguments[key] = value == nullptr ? "" : value;
_this->arguments[key].push_back(value == nullptr ? "" : value);
return MHD_YES;
}
@@ -139,14 +121,20 @@ void RequestContext::print_debug_info() const {
printf(" - %s : '%s'\n", it->first.c_str(), it->second.c_str());
}
printf("arguments :\n");
for (auto it=arguments.begin(); it!=arguments.end(); it++) {
printf(" - %s : '%s'\n", it->first.c_str(), it->second.c_str());
for (auto& pair:arguments) {
printf(" - %s :", pair.first.c_str());
bool first = true;
for (auto& v: pair.second) {
printf("%s %s", first?"":",", v.c_str());
first = false;
}
printf("\n");
}
printf("Parsed : \n");
printf("full_url: %s\n", full_url.c_str());
printf("url : %s\n", url.c_str());
printf("acceptEncodingDeflate : %d\n", acceptEncodingDeflate);
printf("has_range : %d\n", accept_range);
printf("acceptEncodingGzip : %d\n", acceptEncodingGzip);
printf("has_range : %d\n", byteRange_.kind() != ByteRange::NONE);
printf("is_valid_url : %d\n", is_valid_url());
printf(".............\n");
}
@@ -188,21 +176,35 @@ bool RequestContext::is_valid_url() const {
return !url.empty();
}
bool RequestContext::has_range() const {
return accept_range;
}
std::pair<int, int> RequestContext::get_range() const {
return range_pair;
ByteRange RequestContext::get_range() const {
return byteRange_;
}
template<>
std::string RequestContext::get_argument(const std::string& name) const {
return arguments.at(name);
return arguments.at(name)[0];
}
std::string RequestContext::get_header(const std::string& name) const {
return headers.at(name);
return headers.at(lcAll(name));
}
std::string RequestContext::get_user_language() const
{
try {
return get_argument("userlang");
} catch(const std::out_of_range&) {}
try {
return get_header("Accept-Language");
} catch(const std::out_of_range&) {}
return "en";
}
std::string RequestContext::get_requested_format() const
{
return get_optional_param<std::string>("format", "html");
}
}

View File

@@ -25,10 +25,14 @@
#include <string>
#include <sstream>
#include <map>
#include <vector>
#include <stdexcept>
#include "byte_range.h"
#include "tools/stringTools.h"
extern "C" {
#include <microhttpd.h>
#include "microhttpd_wrapper.h"
}
namespace kiwix {
@@ -51,9 +55,6 @@ class IndexError: public std::runtime_error {};
class RequestContext {
public: // types
typedef std::pair<int, int> ByteRange;
public: // functions
RequestContext(struct MHD_Connection* connection,
std::string rootLocation,
@@ -69,10 +70,20 @@ class RequestContext {
std::string get_header(const std::string& name) const;
template<typename T=std::string>
T get_argument(const std::string& name) const {
std::istringstream stream(arguments.at(name));
T v;
stream >> v;
return v;
return extractFromString<T>(get_argument(name));
}
std::vector<std::string> get_arguments(const std::string& name) const {
return arguments.at(name);
}
template<class T>
T get_optional_param(const std::string& name, T default_value) const
{
try {
return get_argument<T>(name);
} catch (...) {}
return default_value;
}
@@ -81,10 +92,33 @@ class RequestContext {
std::string get_url_part(int part) const;
std::string get_full_url() const;
bool has_range() const;
std::string get_query(bool mustEncode = false) const {
return get_query([](const std::string& key) {return true;}, mustEncode);
}
template<class F>
std::string get_query(F filter, bool mustEncode) const {
std::string q;
const char* sep = "";
auto encode = [=](const std::string& value) { return mustEncode?urlEncode(value, true):value; };
for ( const auto& a : arguments ) {
if (!filter(a.first)) {
continue;
}
for (const auto& v: a.second) {
q += sep + encode(a.first) + '=' + encode(v);
sep = "&";
}
}
return q;
}
ByteRange get_range() const;
bool can_compress() const { return acceptEncodingDeflate; }
bool can_compress() const { return acceptEncodingGzip; }
std::string get_user_language() const;
std::string get_requested_format() const;
private: // data
std::string full_url;
@@ -93,16 +127,15 @@ class RequestContext {
std::string version;
unsigned long long requestIndex;
bool acceptEncodingDeflate;
bool acceptEncodingGzip;
bool accept_range;
ByteRange range_pair;
ByteRange byteRange_;
std::map<std::string, std::string> headers;
std::map<std::string, std::string> arguments;
std::map<std::string, std::vector<std::string>> arguments;
private: // functions
static int fill_header(void *, enum MHD_ValueKind, const char*, const char*);
static int fill_argument(void *, enum MHD_ValueKind, const char*, const char*);
static MHD_Result fill_header(void *, enum MHD_ValueKind, const char*, const char*);
static MHD_Result fill_argument(void *, enum MHD_ValueKind, const char*, const char*);
};
template<> std::string RequestContext::get_argument(const std::string& name) const;

View File

@@ -1,19 +1,47 @@
/*
* Copyright 2019 Matthieu Gautier <mgautier@kymeria.fr>
*
* This program is free software; you can redistribute it and/or modify
* it under the terms of the GNU General Public License as published by
* the Free Software Foundation; either version 3 of the License, or
* any later version.
*
* This program is distributed in the hope that it will be useful,
* but WITHOUT ANY WARRANTY; without even the implied warranty of
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
* GNU General Public License for more details.
*
* You should have received a copy of the GNU General Public License
* along with this program; if not, write to the Free Software
* Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston,
* MA 02110-1301, USA.
*/
#include "response.h"
#include "request_context.h"
#include "internalServer.h"
#include "kiwixlib-resources.h"
#include "tools/regexTools.h"
#include "tools/stringTools.h"
#include "tools/otherTools.h"
#include "tools/archiveTools.h"
#include "string.h"
#include <mustache.hpp>
#include <zlib.h>
#include <array>
#define KIWIX_MIN_CONTENT_SIZE_TO_DEFLATE 100
// This is somehow a magic value.
// If this value is too small, we will compress (and lost cpu time) too much
// content.
// If this value is too big, we will not compress enough content and send too
// much data.
// If we assume that MTU is 1500 Bytes it is useless to compress
// content smaller as the content will be sent in one packet anyway.
// 1400 Bytes seems to be a common accepted limit.
#define KIWIX_MIN_CONTENT_SIZE_TO_COMPRESS 1400
namespace kiwix {
@@ -21,52 +49,240 @@ namespace
{
// some utilities
std::string get_mime_type(const kiwix::Entry& entry)
std::string get_mime_type(const zim::Item& item)
{
try {
return entry.getMimetype();
} catch (exception& e) {
return item.getMimetype();
} catch (std::exception& e) {
return "application/octet-stream";
}
}
bool is_compressible_mime_type(const std::string& mimeType)
{
return mimeType.find("text/") != string::npos
|| mimeType.find("application/javascript") != string::npos
|| mimeType.find("application/atom") != string::npos
|| mimeType.find("application/opensearchdescription") != string::npos
|| mimeType.find("application/json") != string::npos;
return mimeType.find("text/") != std::string::npos
|| mimeType.find("application/javascript") != std::string::npos
|| mimeType.find("application/atom") != std::string::npos
|| mimeType.find("application/opensearchdescription") != std::string::npos
|| mimeType.find("application/json") != std::string::npos;
}
int get_range_len(const kiwix::Entry& entry, RequestContext::ByteRange range)
{
return range.second == -1
? entry.getSize() - range.first
: range.second - range.first;
bool compress(std::string &content) {
z_stream strm;
strm.zalloc = Z_NULL;
strm.zfree = Z_NULL;
strm.opaque = Z_NULL;
auto ret = deflateInit2(&strm, Z_DEFAULT_COMPRESSION, Z_DEFLATED, 31, 8,
Z_DEFAULT_STRATEGY);
if (ret != Z_OK) { return false; }
strm.avail_in = static_cast<decltype(strm.avail_in)>(content.size());
strm.next_in =
const_cast<Bytef *>(reinterpret_cast<const Bytef *>(content.data()));
std::string compressed;
std::array<char, 16384> buff{};
do {
strm.avail_out = buff.size();
strm.next_out = reinterpret_cast<Bytef *>(buff.data());
ret = deflate(&strm, Z_FINISH);
assert(ret != Z_STREAM_ERROR);
compressed.append(buff.data(), buff.size() - strm.avail_out);
} while (strm.avail_out == 0);
assert(ret == Z_STREAM_END);
assert(strm.avail_in == 0);
content.swap(compressed);
deflateEnd(&strm);
return true;
}
} // unnamed namespace
Response::Response(const std::string& root, bool verbose, bool withTaskbar, bool withLibraryButton, bool blockExternalLinks)
Response::Response(bool verbose)
: m_verbose(verbose),
m_root(root),
m_content(""),
m_mimeType(""),
m_returnCode(MHD_HTTP_OK),
m_withTaskbar(withTaskbar),
m_withLibraryButton(withLibraryButton),
m_blockExternalLinks(blockExternalLinks),
m_addTaskbar(false),
m_bookName(""),
m_startRange(0),
m_lenRange(0)
m_returnCode(MHD_HTTP_OK)
{
add_header(MHD_HTTP_HEADER_ACCESS_CONTROL_ALLOW_ORIGIN, "*");
}
std::unique_ptr<Response> Response::build(const InternalServer& server)
{
return std::unique_ptr<Response>(new Response(server.m_verbose.load()));
}
std::unique_ptr<Response> Response::build_304(const InternalServer& server, const ETag& etag)
{
auto response = Response::build(server);
response->set_code(MHD_HTTP_NOT_MODIFIED);
response->m_etag = etag;
if ( etag.get_option(ETag::COMPRESSED_CONTENT) ) {
response->add_header(MHD_HTTP_HEADER_VARY, "Accept-Encoding");
}
return response;
}
const UrlNotFoundMsg urlNotFoundMsg;
const InvalidUrlMsg invalidUrlMsg;
std::string ContentResponseBlueprint::getMessage(const std::string& msgId) const
{
return getTranslatedString(m_request.get_user_language(), msgId);
}
std::unique_ptr<ContentResponse> ContentResponseBlueprint::generateResponseObject() const
{
auto r = ContentResponse::build(m_server, m_template, m_data, m_mimeType);
r->set_code(m_httpStatusCode);
if ( m_taskbarInfo ) {
r->set_taskbar(m_taskbarInfo->bookName, m_taskbarInfo->archive);
}
return r;
}
HTTPErrorResponse::HTTPErrorResponse(const InternalServer& server,
const RequestContext& request,
int httpStatusCode,
const std::string& pageTitleMsgId,
const std::string& headingMsgId,
const std::string& cssUrl)
: ContentResponseBlueprint(&server,
&request,
httpStatusCode,
request.get_requested_format() == "html" ? "text/html; charset=utf-8" : "application/xml; charset=utf-8",
request.get_requested_format() == "html" ? RESOURCE::templates::error_html : RESOURCE::templates::error_xml)
{
kainjow::mustache::list emptyList;
this->m_data = kainjow::mustache::object{
{"CSS_URL", onlyAsNonEmptyMustacheValue(cssUrl) },
{"PAGE_TITLE", getMessage(pageTitleMsgId)},
{"PAGE_HEADING", getMessage(headingMsgId)},
{"details", emptyList}
};
}
HTTP404Response::HTTP404Response(const InternalServer& server,
const RequestContext& request)
: HTTPErrorResponse(server,
request,
MHD_HTTP_NOT_FOUND,
"404-page-title",
"404-page-heading")
{
}
HTTPErrorResponse& HTTP404Response::operator+(UrlNotFoundMsg /*unused*/)
{
const std::string requestUrl = m_request.get_full_url();
return *this + ParameterizedMessage("url-not-found", {{"url", requestUrl}});
}
static int print_key_value (void *cls, enum MHD_ValueKind kind,
const char *key, const char *value)
HTTPErrorResponse& HTTPErrorResponse::operator+(const std::string& msg)
{
m_data["details"].push_back({"p", msg});
return *this;
}
HTTPErrorResponse& HTTPErrorResponse::operator+(const ParameterizedMessage& details)
{
return *this + details.getText(m_request.get_user_language());
}
HTTPErrorResponse& HTTPErrorResponse::operator+=(const ParameterizedMessage& details)
{
// operator+() is already a state-modifying operator (akin to operator+=)
return *this + details;
}
HTTP400Response::HTTP400Response(const InternalServer& server,
const RequestContext& request)
: HTTPErrorResponse(server,
request,
MHD_HTTP_BAD_REQUEST,
"400-page-title",
"400-page-heading")
{
}
HTTPErrorResponse& HTTP400Response::operator+(InvalidUrlMsg /*unused*/)
{
std::string requestUrl = m_request.get_full_url();
const auto query = m_request.get_query();
if (!query.empty()) {
requestUrl += "?" + encodeDiples(query);
}
kainjow::mustache::mustache msgTmpl(R"(The requested URL "{{{url}}}" is not a valid request.)");
return *this + msgTmpl.render({"url", requestUrl});
}
HTTP500Response::HTTP500Response(const InternalServer& server,
const RequestContext& request)
: HTTPErrorResponse(server,
request,
MHD_HTTP_INTERNAL_SERVER_ERROR,
"500-page-title",
"500-page-heading")
{
// operator+() is a state-modifying operator (akin to operator+=)
*this + "An internal server error occured. We are sorry about that :/";
}
std::unique_ptr<ContentResponse> HTTP500Response::generateResponseObject() const
{
// We want a 500 response to be a minimalistic one (so that the server doesn't
// have to provide additional resources required for its proper rendering)
// ";raw=true" in the MIME-type below disables response decoration
// (see ContentResponse::contentDecorationAllowed())
const std::string mimeType = "text/html;charset=utf-8;raw=true";
auto r = ContentResponse::build(m_server, m_template, m_data, mimeType);
r->set_code(m_httpStatusCode);
return r;
}
ContentResponseBlueprint& ContentResponseBlueprint::operator+(const TaskbarInfo& taskbarInfo)
{
this->m_taskbarInfo.reset(new TaskbarInfo(taskbarInfo));
return *this;
}
ContentResponseBlueprint& ContentResponseBlueprint::operator+=(const TaskbarInfo& taskbarInfo)
{
// operator+() is already a state-modifying operator (akin to operator+=)
return *this + taskbarInfo;
}
std::unique_ptr<Response> Response::build_416(const InternalServer& server, size_t resourceLength)
{
auto response = Response::build(server);
// [FIXME] (compile with recent enough version of libmicrohttpd)
// response->set_code(MHD_HTTP_RANGE_NOT_SATISFIABLE);
response->set_code(416);
std::ostringstream oss;
oss << "bytes */" << resourceLength;
response->add_header(MHD_HTTP_HEADER_CONTENT_RANGE, oss.str());
return response;
}
std::unique_ptr<Response> Response::build_redirect(const InternalServer& server, const std::string& redirectUrl)
{
auto response = Response::build(server);
response->m_returnCode = MHD_HTTP_FOUND;
response->add_header(MHD_HTTP_HEADER_LOCATION, redirectUrl);
return response;
}
static MHD_Result print_key_value (void *cls, enum MHD_ValueKind kind,
const char *key, const char *value)
{
printf (" - %s: '%s'\n", key, value);
return MHD_YES;
@@ -74,32 +290,32 @@ static int print_key_value (void *cls, enum MHD_ValueKind kind,
struct RunningResponse {
kiwix::Entry entry;
zim::Item item;
int range_start;
RunningResponse(kiwix::Entry entry,
RunningResponse(zim::Item item,
int range_start) :
entry(entry),
item(item),
range_start(range_start)
{}
};
static ssize_t callback_reader_from_entry(void* cls,
static ssize_t callback_reader_from_item(void* cls,
uint64_t pos,
char* buf,
size_t max)
{
RunningResponse* response = static_cast<RunningResponse*>(cls);
size_t max_size_to_set = min<size_t>(
size_t max_size_to_set = std::min<size_t>(
max,
response->entry.getSize() - pos - response->range_start);
response->item.getSize() - pos - response->range_start);
if (max_size_to_set <= 0) {
return MHD_CONTENT_READER_END_WITH_ERROR;
}
zim::Blob blob = response->entry.getBlob(response->range_start+pos, max_size_to_set);
zim::Blob blob = response->item.getData(response->range_start+pos, max_size_to_set);
memcpy(buf, blob.data(), max_size_to_set);
return max_size_to_set;
}
@@ -121,33 +337,24 @@ void print_response_info(int retCode, MHD_Response* response)
}
std::string render_template(const std::string& template_str, kainjow::mustache::data data)
void ContentResponse::introduce_taskbar(const std::string& lang)
{
kainjow::mustache::mustache tmpl(template_str);
kainjow::mustache::data urlencode{kainjow::mustache::lambda2{
[](const std::string& str,const kainjow::mustache::renderer& r) { return urlEncode(r(str), true); }}};
data.set("urlencoded", urlencode);
std::stringstream ss;
tmpl.render(data, [&ss](const std::string& str) { ss << str; });
return ss.str();
}
void Response::introduce_taskbar()
{
if (! m_withTaskbar)
// Taskbar is globally disabled.
return;
kainjow::mustache::data data;
data.set("root", m_root);
data.set("content", m_bookName);
data.set("hascontent", !m_bookName.empty());
data.set("title", m_bookTitle);
data.set("withlibrarybutton", m_withLibraryButton);
auto head_content = render_template(RESOURCE::templates::head_part_html, data);
m_content = appendToFirstOccurence(
i18n::GetTranslatedString t(lang);
kainjow::mustache::object data{
{"root", m_root},
{"content", m_bookName},
{"hascontent", (!m_bookName.empty() && !m_bookTitle.empty())},
{"title", m_bookTitle},
{"withlibrarybutton", m_withLibraryButton},
{"LIBRARY_BUTTON_TEXT", t("library-button-text")},
{"HOME_BUTTON_TEXT", t("home-button-text", {{"BOOK_TITLE", m_bookTitle}}) },
{"RANDOM_PAGE_BUTTON_TEXT", t("random-page-button-text") },
{"SEARCHBOX_TOOLTIP", t("searchbox-tooltip", {{"BOOK_TITLE", m_bookTitle}}) },
};
auto head_content = render_template(RESOURCE::templates::head_taskbar_html, data);
m_content = prependToFirstOccurence(
m_content,
"<head>",
"</head[ \\t]*>",
head_content);
auto taskbar_part = render_template(RESOURCE::templates::taskbar_part_html, data);
@@ -158,136 +365,92 @@ void Response::introduce_taskbar()
}
void Response::inject_externallinks_blocker()
void ContentResponse::inject_externallinks_blocker()
{
kainjow::mustache::data data;
data.set("root", m_root);
auto script_tag = render_template(RESOURCE::templates::external_blocker_part_html, data);
m_content = appendToFirstOccurence(
m_content = prependToFirstOccurence(
m_content,
"<head>",
"</head[ \\t]*>",
script_tag);
}
void ContentResponse::inject_root_link(){
m_content = prependToFirstOccurence(
m_content,
"</head[ \\t]*>",
"<link type=\"root\" href=\"" + m_root + "\">");
}
bool
Response::can_compress(const RequestContext& request) const
ContentResponse::can_compress(const RequestContext& request) const
{
return request.can_compress()
&& is_compressible_mime_type(m_mimeType)
&& (m_content.size() > KIWIX_MIN_CONTENT_SIZE_TO_DEFLATE);
&& (m_content.size() > KIWIX_MIN_CONTENT_SIZE_TO_COMPRESS);
}
MHD_Response*
Response::create_raw_content_mhd_response(const RequestContext& request)
bool
ContentResponse::contentDecorationAllowed() const
{
if (m_addTaskbar) {
introduce_taskbar();
if (m_raw) {
return false;
}
if ( m_blockExternalLinks ) {
inject_externallinks_blocker();
}
bool shouldCompress = m_compress && can_compress(request);
if (shouldCompress) {
std::vector<Bytef> compr_buffer(compressBound(m_content.size()));
uLongf comprLen = compr_buffer.capacity();
int err = compress(&compr_buffer[0],
&comprLen,
(const Bytef*)(m_content.data()),
m_content.size());
if (err == Z_OK && comprLen > 2 && comprLen < (m_content.size() + 2)) {
/* /!\ Internet Explorer has a bug with deflate compression.
It can not handle the first two bytes (compression headers)
We need to chunk them off (move the content 2bytes)
It has no incidence on other browsers
See http://www.subbu.org/blog/2008/03/ie7-deflate-or-not and comments */
m_content = string((char*)&compr_buffer[2], comprLen - 2);
m_etag.set_option(ETag::COMPRESSED_CONTENT);
} else {
shouldCompress = false;
}
}
MHD_Response* response = MHD_create_response_from_buffer(
m_content.size(), const_cast<char*>(m_content.data()), MHD_RESPMEM_MUST_COPY);
// At shis point m_etag.get_option(ETag::COMPRESSED_CONTENT) and
// shouldCompress can have different values. This can happen for a 304 (Not
// Modified) response generated while handling a conditional If-None-Match
// request. In that case the m_etag (together with its COMPRESSED_CONTENT
// option) is obtained from the ETag list of the If-None-Match header and the
// response has no body (which shouldn't be compressed).
if ( m_etag.get_option(ETag::COMPRESSED_CONTENT) ) {
MHD_add_response_header(
response, MHD_HTTP_HEADER_VARY, "Accept-Encoding");
}
if (shouldCompress) {
MHD_add_response_header(
response, MHD_HTTP_HEADER_CONTENT_ENCODING, "deflate");
}
MHD_add_response_header(response, MHD_HTTP_HEADER_CONTENT_TYPE, m_mimeType.c_str());
return response;
}
MHD_Response*
Response::create_redirection_mhd_response() const
{
MHD_Response* response = MHD_create_response_from_buffer(0, nullptr, MHD_RESPMEM_MUST_COPY);
MHD_add_response_header(response, MHD_HTTP_HEADER_LOCATION, m_content.c_str());
return response;
}
MHD_Response*
Response::create_entry_mhd_response() const
{
MHD_Response* response = MHD_create_response_from_callback(m_entry.getSize(),
16384,
callback_reader_from_entry,
new RunningResponse(m_entry, m_startRange),
callback_free_response);
MHD_add_response_header(response,
MHD_HTTP_HEADER_CONTENT_TYPE, m_mimeType.c_str());
MHD_add_response_header(response, MHD_HTTP_HEADER_ACCEPT_RANGES, "bytes");
std::ostringstream oss;
oss << "bytes " << m_startRange << "-" << m_startRange + m_lenRange - 1
<< "/" << m_entry.getSize();
MHD_add_response_header(response,
MHD_HTTP_HEADER_CONTENT_RANGE, oss.str().c_str());
MHD_add_response_header(response,
MHD_HTTP_HEADER_CONTENT_LENGTH, kiwix::to_string(m_lenRange).c_str());
return response;
return (startsWith(m_mimeType, "text/html")
&& m_mimeType.find(";raw=true") == std::string::npos);
}
MHD_Response*
Response::create_mhd_response(const RequestContext& request)
{
switch (m_mode) {
case ResponseMode::RAW_CONTENT :
return create_raw_content_mhd_response(request);
case ResponseMode::REDIRECTION :
return create_redirection_mhd_response();
case ResponseMode::ENTRY :
return create_entry_mhd_response();
}
return nullptr;
MHD_Response* response = MHD_create_response_from_buffer(0, nullptr, MHD_RESPMEM_PERSISTENT);
return response;
}
int Response::send(const RequestContext& request, MHD_Connection* connection)
MHD_Response*
ContentResponse::create_mhd_response(const RequestContext& request)
{
if (contentDecorationAllowed()) {
inject_root_link();
if (m_withTaskbar) {
introduce_taskbar(request.get_user_language());
}
if (m_blockExternalLinks) {
inject_externallinks_blocker();
}
}
const bool isCompressed = can_compress(request) && compress(m_content);
MHD_Response* response = MHD_create_response_from_buffer(
m_content.size(), const_cast<char*>(m_content.data()), MHD_RESPMEM_MUST_COPY);
if (isCompressed) {
m_etag.set_option(ETag::COMPRESSED_CONTENT);
MHD_add_response_header(
response, MHD_HTTP_HEADER_VARY, "Accept-Encoding");
MHD_add_response_header(
response, MHD_HTTP_HEADER_CONTENT_ENCODING, "gzip");
}
return response;
}
MHD_Result Response::send(const RequestContext& request, MHD_Connection* connection)
{
MHD_Response* response = create_mhd_response(request);
MHD_add_response_header(response, "Access-Control-Allow-Origin", "*");
MHD_add_response_header(response, MHD_HTTP_HEADER_CACHE_CONTROL,
m_etag.get_option(ETag::CACHEABLE_ENTITY) ? "max-age=2723040, public" : "no-cache, no-store, must-revalidate");
const std::string etag = m_etag.get_etag();
if ( ! etag.empty() )
MHD_add_response_header(response, MHD_HTTP_HEADER_ETAG, etag.c_str());
MHD_add_response_header(response, MHD_HTTP_HEADER_ETAG, etag.c_str());
for(auto& p: m_customHeaders) {
MHD_add_response_header(response, p.first.c_str(), p.second.c_str());
}
if (m_returnCode == MHD_HTTP_OK && request.has_range())
if (m_returnCode == MHD_HTTP_OK && m_byteRange.kind() == ByteRange::RESOLVED_PARTIAL_CONTENT)
m_returnCode = MHD_HTTP_PARTIAL_CONTENT;
if (m_verbose)
@@ -298,47 +461,115 @@ int Response::send(const RequestContext& request, MHD_Connection* connection)
return ret;
}
void Response::set_template(const std::string& template_str, kainjow::mustache::data data) {
set_content(render_template(template_str, data));
}
void Response::set_content(const std::string& content) {
m_content = content;
m_mode = ResponseMode::RAW_CONTENT;
}
void Response::set_redirection(const std::string& url) {
m_content = url;
m_mode = ResponseMode::REDIRECTION;
m_returnCode = MHD_HTTP_FOUND;
}
void Response::set_entry(const Entry& entry, const RequestContext& request) {
m_entry = entry;
m_mode = ResponseMode::ENTRY;
const std::string mimeType = get_mime_type(entry);
set_mimeType(mimeType);
set_cacheable();
if ( is_compressible_mime_type(mimeType) ) {
zim::Blob raw_content = entry.getBlob();
const std::string content = string(raw_content.data(), raw_content.size());
set_content(content);
set_compress(true);
} else {
const int range_len = get_range_len(entry, request.get_range());
set_range_first(request.get_range().first);
set_range_len(range_len);
}
}
void Response::set_taskbar(const std::string& bookName, const std::string& bookTitle)
void ContentResponse::set_taskbar(const std::string& bookName, const zim::Archive* archive)
{
m_addTaskbar = true;
m_bookName = bookName;
m_bookTitle = bookTitle;
m_bookTitle = archive ? getArchiveTitle(*archive) : "";
}
ContentResponse::ContentResponse(const std::string& root, bool verbose, bool raw, bool withTaskbar, bool withLibraryButton, bool blockExternalLinks, const std::string& content, const std::string& mimetype) :
Response(verbose),
m_root(root),
m_content(content),
m_mimeType(mimetype),
m_raw(raw),
m_withTaskbar(withTaskbar),
m_withLibraryButton(withLibraryButton),
m_blockExternalLinks(blockExternalLinks),
m_bookName(""),
m_bookTitle("")
{
add_header(MHD_HTTP_HEADER_CONTENT_TYPE, m_mimeType);
}
std::unique_ptr<ContentResponse> ContentResponse::build(
const InternalServer& server,
const std::string& content,
const std::string& mimetype,
bool isHomePage,
bool raw)
{
return std::unique_ptr<ContentResponse>(new ContentResponse(
server.m_root,
server.m_verbose.load(),
raw,
server.m_withTaskbar && !isHomePage,
server.m_withLibraryButton,
server.m_blockExternalLinks,
content,
mimetype));
}
std::unique_ptr<ContentResponse> ContentResponse::build(
const InternalServer& server,
const std::string& template_str,
kainjow::mustache::data data,
const std::string& mimetype,
bool isHomePage)
{
auto content = render_template(template_str, data);
return ContentResponse::build(server, content, mimetype, isHomePage);
}
ItemResponse::ItemResponse(bool verbose, const zim::Item& item, const std::string& mimetype, const ByteRange& byterange) :
Response(verbose),
m_item(item),
m_mimeType(mimetype)
{
m_byteRange = byterange;
set_cacheable();
add_header(MHD_HTTP_HEADER_CONTENT_TYPE, m_mimeType);
}
std::unique_ptr<Response> ItemResponse::build(const InternalServer& server, const RequestContext& request, const zim::Item& item, bool raw)
{
const std::string mimetype = get_mime_type(item);
auto byteRange = request.get_range().resolve(item.getSize());
const bool noRange = byteRange.kind() == ByteRange::RESOLVED_FULL_CONTENT;
if (noRange && is_compressible_mime_type(mimetype)) {
// Return a contentResponse
auto response = ContentResponse::build(server, item.getData(), mimetype, /*isHomePage=*/false, raw);
response->set_cacheable();
response->m_byteRange = byteRange;
return std::move(response);
}
if (byteRange.kind() == ByteRange::RESOLVED_UNSATISFIABLE) {
auto response = Response::build_416(server, item.getSize());
response->set_cacheable();
return response;
}
return std::unique_ptr<Response>(new ItemResponse(
server.m_verbose.load(),
item,
mimetype,
byteRange));
}
MHD_Response*
ItemResponse::create_mhd_response(const RequestContext& request)
{
const auto content_length = m_byteRange.length();
MHD_Response* response = MHD_create_response_from_callback(content_length,
16384,
callback_reader_from_item,
new RunningResponse(m_item, m_byteRange.first()),
callback_free_response);
MHD_add_response_header(response, MHD_HTTP_HEADER_ACCEPT_RANGES, "bytes");
if ( m_byteRange.kind() == ByteRange::RESOLVED_PARTIAL_CONTENT ) {
std::ostringstream oss;
oss << "bytes " << m_byteRange.first() << "-" << m_byteRange.last()
<< "/" << m_item.getSize();
MHD_add_response_header(response,
MHD_HTTP_HEADER_CONTENT_RANGE, oss.str().c_str());
}
MHD_add_response_header(response,
MHD_HTTP_HEADER_CONTENT_LENGTH, kiwix::to_string(content_length).c_str());
return response;
}

View File

@@ -22,80 +22,229 @@
#define KIWIXLIB_SERVER_RESPONSE_H
#include <string>
#include <map>
#include <mustache.hpp>
#include "entry.h"
#include "byte_range.h"
#include "etag.h"
#include "i18n.h"
#include <zim/item.h>
extern "C" {
#include <microhttpd.h>
#include "microhttpd_wrapper.h"
}
namespace zim {
class Archive;
} // namespace zim
namespace kiwix {
enum class ResponseMode {
RAW_CONTENT,
REDIRECTION,
ENTRY
};
class InternalServer;
class RequestContext;
class Response {
public:
Response(const std::string& root, bool verbose, bool withTaskbar, bool withLibraryButton, bool blockExternalLinks);
~Response() = default;
Response(bool verbose);
virtual ~Response() = default;
int send(const RequestContext& request, MHD_Connection* connection);
static std::unique_ptr<Response> build(const InternalServer& server);
static std::unique_ptr<Response> build_304(const InternalServer& server, const ETag& etag);
static std::unique_ptr<Response> build_416(const InternalServer& server, size_t resourceLength);
static std::unique_ptr<Response> build_redirect(const InternalServer& server, const std::string& redirectUrl);
void set_template(const std::string& template_str, kainjow::mustache::data data);
void set_content(const std::string& content);
void set_redirection(const std::string& url);
void set_entry(const Entry& entry, const RequestContext& request);
MHD_Result send(const RequestContext& request, MHD_Connection* connection);
void set_mimeType(const std::string& mimeType) { m_mimeType = mimeType; }
void set_code(int code) { m_returnCode = code; }
void set_cacheable() { m_etag.set_option(ETag::CACHEABLE_ENTITY); }
void set_server_id(const std::string& id) { m_etag.set_server_id(id); }
void set_etag(const ETag& etag) { m_etag = etag; }
void set_compress(bool compress) { m_compress = compress; }
void set_taskbar(const std::string& bookName, const std::string& bookTitle);
void set_range_first(uint64_t start) { m_startRange = start; }
void set_range_len(uint64_t len) { m_lenRange = len; }
void add_header(const std::string& name, const std::string& value) { m_customHeaders[name] = value; }
int getReturnCode() const { return m_returnCode; }
std::string get_mimeType() const { return m_mimeType; }
void introduce_taskbar();
void inject_externallinks_blocker();
bool can_compress(const RequestContext& request) const;
private: // functions
MHD_Response* create_mhd_response(const RequestContext& request);
MHD_Response* create_raw_content_mhd_response(const RequestContext& request);
MHD_Response* create_redirection_mhd_response() const;
MHD_Response* create_entry_mhd_response() const;
virtual MHD_Response* create_mhd_response(const RequestContext& request);
MHD_Response* create_error_response(const RequestContext& request) const;
private: // data
protected: // data
bool m_verbose;
ResponseMode m_mode;
int m_returnCode;
ByteRange m_byteRange;
ETag m_etag;
std::map<std::string, std::string> m_customHeaders;
friend class ItemResponse;
};
class ContentResponse : public Response {
public:
ContentResponse(
const std::string& root,
bool verbose,
bool raw,
bool withTaskbar,
bool withLibraryButton,
bool blockExternalLinks,
const std::string& content,
const std::string& mimetype);
static std::unique_ptr<ContentResponse> build(
const InternalServer& server,
const std::string& content,
const std::string& mimetype,
bool isHomePage = false,
bool raw = false);
static std::unique_ptr<ContentResponse> build(
const InternalServer& server,
const std::string& template_str,
kainjow::mustache::data data,
const std::string& mimetype,
bool isHomePage = false);
void set_taskbar(const std::string& bookName, const zim::Archive* archive);
private:
MHD_Response* create_mhd_response(const RequestContext& request);
void introduce_taskbar(const std::string& lang);
void inject_externallinks_blocker();
void inject_root_link();
bool can_compress(const RequestContext& request) const;
bool contentDecorationAllowed() const;
private:
std::string m_root;
std::string m_content;
Entry m_entry;
std::string m_mimeType;
int m_returnCode;
bool m_raw;
bool m_withTaskbar;
bool m_withLibraryButton;
bool m_blockExternalLinks;
bool m_compress;
bool m_addTaskbar;
std::string m_bookName;
std::string m_bookTitle;
uint64_t m_startRange;
uint64_t m_lenRange;
ETag m_etag;
};
struct TaskbarInfo
{
const std::string bookName;
const zim::Archive* const archive;
TaskbarInfo(const std::string& bookName, const zim::Archive* a = nullptr)
: bookName(bookName)
, archive(a)
{}
};
class ContentResponseBlueprint
{
public: // functions
ContentResponseBlueprint(const InternalServer* server,
const RequestContext* request,
int httpStatusCode,
const std::string& mimeType,
const std::string& templateStr)
: m_server(*server)
, m_request(*request)
, m_httpStatusCode(httpStatusCode)
, m_mimeType(mimeType)
, m_template(templateStr)
{}
virtual ~ContentResponseBlueprint() = default;
operator std::unique_ptr<ContentResponse>() const
{
return generateResponseObject();
}
operator std::unique_ptr<Response>() const
{
return operator std::unique_ptr<ContentResponse>();
}
ContentResponseBlueprint& operator+(const TaskbarInfo& taskbarInfo);
ContentResponseBlueprint& operator+=(const TaskbarInfo& taskbarInfo);
protected: // functions
std::string getMessage(const std::string& msgId) const;
virtual std::unique_ptr<ContentResponse> generateResponseObject() const;
public: //data
const InternalServer& m_server;
const RequestContext& m_request;
const int m_httpStatusCode;
const std::string m_mimeType;
const std::string m_template;
kainjow::mustache::data m_data;
std::unique_ptr<TaskbarInfo> m_taskbarInfo;
};
struct HTTPErrorResponse : ContentResponseBlueprint
{
HTTPErrorResponse(const InternalServer& server,
const RequestContext& request,
int httpStatusCode,
const std::string& pageTitleMsgId,
const std::string& headingMsgId,
const std::string& cssUrl = "");
using ContentResponseBlueprint::operator+;
using ContentResponseBlueprint::operator+=;
HTTPErrorResponse& operator+(const std::string& msg);
HTTPErrorResponse& operator+(const ParameterizedMessage& errorDetails);
HTTPErrorResponse& operator+=(const ParameterizedMessage& errorDetails);
};
class UrlNotFoundMsg {};
extern const UrlNotFoundMsg urlNotFoundMsg;
struct HTTP404Response : HTTPErrorResponse
{
HTTP404Response(const InternalServer& server,
const RequestContext& request);
using HTTPErrorResponse::operator+;
HTTPErrorResponse& operator+(UrlNotFoundMsg /*unused*/);
};
class InvalidUrlMsg {};
extern const InvalidUrlMsg invalidUrlMsg;
struct HTTP400Response : HTTPErrorResponse
{
HTTP400Response(const InternalServer& server,
const RequestContext& request);
using HTTPErrorResponse::operator+;
HTTPErrorResponse& operator+(InvalidUrlMsg /*unused*/);
};
struct HTTP500Response : HTTPErrorResponse
{
HTTP500Response(const InternalServer& server,
const RequestContext& request);
private: // overrides
// generateResponseObject() is overriden in order to produce a minimal
// response without any need for additional resources from the server
std::unique_ptr<ContentResponse> generateResponseObject() const override;
};
class ItemResponse : public Response {
public:
ItemResponse(bool verbose, const zim::Item& item, const std::string& mimetype, const ByteRange& byterange);
static std::unique_ptr<Response> build(const InternalServer& server, const RequestContext& request, const zim::Item& item, bool raw = false);
private:
MHD_Response* create_mhd_response(const RequestContext& request);
zim::Item m_item;
std::string m_mimeType;
};
}

View File

@@ -12,45 +12,31 @@
UnixImpl::UnixImpl():
m_pid(0),
m_running(false),
m_mutex(PTHREAD_MUTEX_INITIALIZER),
m_waitingThread()
m_shouldQuit(false)
{
}
UnixImpl::~UnixImpl()
{
kill();
// Android has no pthread_cancel :(
#ifdef __ANDROID__
pthread_kill(m_waitingThread, SIGUSR1);
#else
pthread_cancel(m_waitingThread);
#endif
m_shouldQuit = true;
m_waitingThread.join();
}
#ifdef __ANDROID__
void thread_exit_handler(int sig) {
pthread_exit(0);
}
#endif
void* UnixImpl::waitForPID(void* _self)
{
#ifdef __ANDROID__
struct sigaction actions;
memset(&actions, 0, sizeof(actions));
sigemptyset(&actions.sa_mask);
actions.sa_flags = 0;
actions.sa_handler = thread_exit_handler;
sigaction(SIGUSR1, &actions, NULL);
#endif
UnixImpl* self = static_cast<UnixImpl*>(_self);
waitpid(self->m_pid, NULL, 0);
while (true) {
if (!waitpid(self->m_pid, NULL, WNOHANG)) {
break;
}
if (self->m_shouldQuit) {
return nullptr;
}
std::this_thread::sleep_for(std::chrono::milliseconds(100));
}
pthread_mutex_lock(&self->m_mutex);
self->m_running = false;
pthread_mutex_unlock(&self->m_mutex);
return self;
}
@@ -74,7 +60,7 @@ void UnixImpl::run(commandLine_t& commandLine)
default:
m_pid = pid;
m_running = true;
pthread_create(&m_waitingThread, NULL, waitForPID, this);
m_waitingThread = std::thread(waitForPID, this);
break;
}
}
@@ -86,8 +72,5 @@ bool UnixImpl::kill()
bool UnixImpl::isRunning()
{
pthread_mutex_lock(&m_mutex);
bool ret = m_running;
pthread_mutex_unlock(&m_mutex);
return ret;
return m_running;
}

View File

@@ -3,16 +3,16 @@
#include "subprocess.h"
#include <pthread.h>
#include <atomic>
#include <thread>
class UnixImpl : public SubprocessImpl
{
private:
int m_pid;
bool m_running;
pthread_mutex_t m_mutex;
pthread_t m_waitingThread;
std::atomic<bool> m_running;
std::atomic<bool> m_shouldQuit;
std::thread m_waitingThread;
public:
UnixImpl();

View File

@@ -11,7 +11,8 @@
WinImpl::WinImpl():
m_pid(0),
m_running(false),
m_handle(INVALID_HANDLE_VALUE)
m_subprocessHandle(INVALID_HANDLE_VALUE),
m_waitingThreadHandle(INVALID_HANDLE_VALUE)
{
InitializeCriticalSection(&m_criticalSection);
}
@@ -19,14 +20,15 @@ WinImpl::WinImpl():
WinImpl::~WinImpl()
{
kill();
CloseHandle(m_handle);
WaitForSingleObject(m_waitingThreadHandle, INFINITE);
CloseHandle(m_subprocessHandle);
DeleteCriticalSection(&m_criticalSection);
}
DWORD WINAPI WinImpl::waitForPID(void* _self)
{
WinImpl* self = static_cast<WinImpl*>(_self);
WaitForSingleObject(self->m_handle, INFINITE);
WaitForSingleObject(self->m_subprocessHandle, INFINITE);
EnterCriticalSection(&self->m_criticalSection);
self->m_running = false;
@@ -79,16 +81,16 @@ void WinImpl::run(commandLine_t& commandLine)
&procInfo))
{
m_pid = procInfo.dwProcessId;
m_handle = procInfo.hProcess;
m_subprocessHandle = procInfo.hProcess;
CloseHandle(procInfo.hThread);
m_running = true;
CreateThread(NULL, 0, &waitForPID, this, 0, NULL );
m_waitingThreadHandle = CreateThread(NULL, 0, &waitForPID, this, 0, NULL);
}
}
bool WinImpl::kill()
{
return TerminateProcess(m_handle, 0);
return TerminateProcess(m_subprocessHandle, 0);
}
bool WinImpl::isRunning()

View File

@@ -11,7 +11,8 @@ class WinImpl : public SubprocessImpl
private:
int m_pid;
bool m_running;
HANDLE m_handle;
HANDLE m_subprocessHandle;
HANDLE m_waitingThreadHandle;
CRITICAL_SECTION m_criticalSection;
public:

182
src/tools/archiveTools.cpp Normal file
View File

@@ -0,0 +1,182 @@
/*
* Copyright 2021 Maneesh P M <manu.pm55@gmail.com>
*
* This program is free software; you can redistribute it and/or modify
* it under the terms of the GNU General Public License as published by
* the Free Software Foundation; either version 3 of the License, or
* any later version.
*
* This program is distributed in the hope that it will be useful,
* but WITHOUT ANY WARRANTY; without even the implied warranty of
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
* GNU General Public License for more details.
*
* You should have received a copy of the GNU General Public License
* along with this program; if not, write to the Free Software
* Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston,
* MA 02110-1301, USA.
*/
#include "archiveTools.h"
#include "tools.h"
#include "pathTools.h"
#include "otherTools.h"
#include "stringTools.h"
#include <zim/error.h>
#include <zim/item.h>
namespace kiwix
{
std::string getMetadata(const zim::Archive& archive, const std::string& name) {
try {
return archive.getMetadata(name);
} catch (zim::EntryNotFound& e) {
return "";
}
}
std::string getArchiveTitle(const zim::Archive& archive) {
std::string value = getMetadata(archive, "Title");
if (value.empty()) {
value = getLastPathElement(archive.getFilename());
std::replace(value.begin(), value.end(), '_', ' ');
size_t pos = value.find(".zim");
value = value.substr(0, pos);
}
return value;
}
std::string getMetaDescription(const zim::Archive& archive) {
std::string value;
value = getMetadata(archive, "Description");
/* Mediawiki Collection tends to use the "Subtitle" name */
if (value.empty()) {
value = getMetadata(archive, "Subtitle");
}
return value;
}
std::string getMetaTags(const zim::Archive& archive, bool original) {
std::string tags_str = getMetadata(archive, "Tags");
if (original) {
return tags_str;
}
auto tags = convertTags(tags_str);
return join(tags, ";");
}
std::string getMetaLanguage(const zim::Archive& archive) {
return getMetadata(archive, "Language");
}
std::string getMetaName(const zim::Archive& archive) {
return getMetadata(archive, "Name");
}
std::string getMetaDate(const zim::Archive& archive) {
return getMetadata(archive, "Date");
}
std::string getMetaCreator(const zim::Archive& archive) {
return getMetadata(archive, "Creator");
}
std::string getMetaPublisher(const zim::Archive& archive) {
return getMetadata(archive, "Publisher");
}
std::string getMetaFlavour(const zim::Archive& archive) {
return getMetadata(archive, "Flavour");
}
std::string getArchiveId(const zim::Archive& archive) {
return (std::string) archive.getUuid();
}
bool getArchiveFavicon(const zim::Archive& archive, unsigned size,
std::string& content, std::string& mimeType){
try {
auto item = archive.getIllustrationItem(size);
content = item.getData();
mimeType = item.getMimetype();
return true;
} catch(zim::EntryNotFound& e) {};
return false;
}
// should this be in libzim
unsigned int getArchiveMediaCount(const zim::Archive& archive) {
std::map<const std::string, unsigned int> counterMap = parseArchiveCounter(archive);
unsigned int counter = 0;
for (auto &pair:counterMap) {
if (startsWith(pair.first, "image/") ||
startsWith(pair.first, "video/") ||
startsWith(pair.first, "audio/")) {
counter += pair.second;
}
}
return counter;
}
unsigned int getArchiveArticleCount(const zim::Archive& archive) {
// [HACK]
// getArticleCount() returns different things depending of the "version" of the zim.
// On old zim (<=6), it returns the number of entry in `A` namespace
// On recent zim (>=7), it returns:
// - the number of entry in `C` namespace (==getEntryCount) if no frontArticleIndex is present
// - the number of front article if a frontArticleIndex is present
// The use case >=7 without frontArticleIndex is pretty rare so we don't care
// We can detect if we are reading a zim <= 6 by checking if we have a newNamespaceScheme.
if (archive.hasNewNamespaceScheme()) {
//The articleCount is "good"
return archive.getArticleCount();
} else {
// We have to parse the `M/Counter` metadata
unsigned int counter = 0;
for(const auto& pair:parseArchiveCounter(archive)) {
if (startsWith(pair.first, "text/html")) {
counter += pair.second;
}
}
return counter;
}
}
unsigned int getArchiveFileSize(const zim::Archive& archive) {
return archive.getFilesize() / 1024;
}
zim::Item getFinalItem(const zim::Archive& archive, const zim::Entry& entry)
{
return entry.getItem(true);
}
zim::Entry getEntryFromPath(const zim::Archive& archive, const std::string& path)
{
try {
return archive.getEntryByPath(path);
} catch (zim::EntryNotFound& e) {
if (path.empty() || path == "/") {
return archive.getMainEntry();
}
}
throw zim::EntryNotFound("Cannot find entry for non empty path");
}
MimeCounterType parseArchiveCounter(const zim::Archive& archive) {
try {
auto counterContent = archive.getMetadata("Counter");
return parseMimetypeCounter(counterContent);
} catch (zim::EntryNotFound& e) {
return {};
}
}
} // kiwix

60
src/tools/archiveTools.h Normal file
View File

@@ -0,0 +1,60 @@
/*
* Copyright 2021 Maneesh P M <manu.pm55@gmail.com>
*
* This program is free software; you can redistribute it and/or modify
* it under the terms of the GNU General Public License as published by
* the Free Software Foundation; either version 3 of the License, or
* any later version.
*
* This program is distributed in the hope that it will be useful,
* but WITHOUT ANY WARRANTY; without even the implied warranty of
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
* GNU General Public License for more details.
*
* You should have received a copy of the GNU General Public License
* along with this program; if not, write to the Free Software
* Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston,
* MA 02110-1301, USA.
*/
#ifndef KIWIX_ARCHIVETOOLS_H
#define KIWIX_ARCHIVETOOLS_H
#include <zim/archive.h>
#include <tools/otherTools.h>
/**
* This file contains all the functions that would make handling data related to
* an archive easier.
**/
namespace kiwix
{
std::string getMetadata(const zim::Archive& archive, const std::string& name);
std::string getArchiveTitle(const zim::Archive& archive);
std::string getMetaDescription(const zim::Archive& archive);
std::string getMetaTags(const zim::Archive& archive, bool original = false);
std::string getMetaLanguage(const zim::Archive& archive);
std::string getMetaName(const zim::Archive& archive);
std::string getMetaDate(const zim::Archive& archive);
std::string getMetaCreator(const zim::Archive& archive);
std::string getMetaPublisher(const zim::Archive& archive);
std::string getMetaFlavour(const zim::Archive& archive);
std::string getArchiveId(const zim::Archive& archive);
bool getArchiveFavicon(const zim::Archive& archive, unsigned size,
std::string& content, std::string& mimeType);
unsigned int getArchiveMediaCount(const zim::Archive& archive);
unsigned int getArchiveArticleCount(const zim::Archive& archive);
unsigned int getArchiveFileSize(const zim::Archive& archive);
zim::Item getFinalItem(const zim::Archive& archive, const zim::Entry& entry);
zim::Entry getEntryFromPath(const zim::Archive& archive, const std::string& path);
MimeCounterType parseArchiveCounter(const zim::Archive& archive);
}
#endif

View File

@@ -0,0 +1,208 @@
/*
* Copyright (C) 2021 Matthieu Gautier <mgautier@kymeria.fr>
* Copyright (C) 2020 Veloman Yunkan
*
* This program is free software; you can redistribute it and/or
* modify it under the terms of the GNU General Public License as
* published by the Free Software Foundation; either version 2 of the
* License, or (at your option) any later version.
*
* This program is distributed in the hope that it will be useful, but
* is provided AS IS, WITHOUT ANY WARRANTY; without even the implied
* warranty of MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE, and
* NON-INFRINGEMENT. See the GNU General Public License for more details.
*
* You should have received a copy of the GNU General Public License
* along with this program; if not, write to the Free Software
* Foundation, Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA
*
*/
#ifndef ZIM_CONCURRENT_CACHE_H
#define ZIM_CONCURRENT_CACHE_H
#include "lrucache.h"
#include <future>
#include <mutex>
namespace kiwix
{
/**
ConcurrentCache implements a concurrent thread-safe cache
Compared to kiwix::lru_cache, each access operation is slightly more expensive.
However, different slots of the cache can be safely accessed concurrently
with minimal blocking. Concurrent access to the same element is also
safe, and, in case of a cache miss, will block until that element becomes
available.
*/
template <typename Key, typename Value>
class ConcurrentCache
{
private: // types
typedef std::shared_future<Value> ValuePlaceholder;
typedef lru_cache<Key, ValuePlaceholder> Impl;
public: // types
explicit ConcurrentCache(size_t maxEntries)
: impl_(maxEntries)
{}
// Gets the entry corresponding to the given key. If the entry is not in the
// cache, it is obtained by calling f() (without any arguments) and the
// result is put into the cache.
//
// The cache as a whole is locked only for the duration of accessing
// the respective slot. If, in the case of the a cache miss, the generation
// of the missing element takes a long time, only attempts to access that
// element will block - the rest of the cache remains open to concurrent
// access.
template<class F>
Value getOrPut(const Key& key, F f)
{
std::promise<Value> valuePromise;
std::unique_lock<std::mutex> l(lock_);
const auto x = impl_.getOrPut(key, valuePromise.get_future().share());
l.unlock();
if ( x.miss() ) {
try {
valuePromise.set_value(f());
} catch (std::exception& e) {
drop(key);
throw;
}
}
return x.value().get();
}
bool drop(const Key& key)
{
std::unique_lock<std::mutex> l(lock_);
return impl_.drop(key);
}
size_t setMaxSize(size_t new_size) {
std::unique_lock<std::mutex> l(lock_);
return impl_.setMaxSize(new_size);
}
protected: // data
Impl impl_;
std::mutex lock_;
};
/**
WeakStore represent a thread safe store (map) of weak ptr.
It allows to store weak_ptr from shared_ptr and retrieve shared_ptr from
potential non expired weak_ptr.
It is not limited in size.
*/
template<typename Key, typename Value>
class WeakStore {
private: // types
typedef std::weak_ptr<Value> WeakValue;
public:
explicit WeakStore() = default;
std::shared_ptr<Value> get(const Key& key)
{
std::lock_guard<std::mutex> l(m_lock);
auto it = m_weakMap.find(key);
if (it != m_weakMap.end()) {
auto shared = it->second.lock();
if (shared) {
return shared;
} else {
m_weakMap.erase(it);
}
}
throw std::runtime_error("No weak ptr");
}
void add(const Key& key, std::shared_ptr<Value> shared)
{
std::lock_guard<std::mutex> l(m_lock);
m_weakMap[key] = WeakValue(shared);
}
private: //data
std::map<Key, WeakValue> m_weakMap;
std::mutex m_lock;
};
template <typename Key, typename RawValue>
class ConcurrentCache<Key, std::shared_ptr<RawValue>>
{
private: // types
typedef std::shared_ptr<RawValue> Value;
typedef std::shared_future<Value> ValuePlaceholder;
typedef lru_cache<Key, ValuePlaceholder> Impl;
public: // types
explicit ConcurrentCache(size_t maxEntries)
: impl_(maxEntries)
{}
// Gets the entry corresponding to the given key. If the entry is not in the
// cache, it is obtained by calling f() (without any arguments) and the
// result is put into the cache.
//
// The cache as a whole is locked only for the duration of accessing
// the respective slot. If, in the case of the a cache miss, the generation
// of the missing element takes a long time, only attempts to access that
// element will block - the rest of the cache remains open to concurrent
// access.
template<class F>
Value getOrPut(const Key& key, F f)
{
std::promise<Value> valuePromise;
std::unique_lock<std::mutex> l(lock_);
const auto x = impl_.getOrPut(key, valuePromise.get_future().share());
l.unlock();
if ( x.miss() ) {
// Try to get back the shared_ptr from the weak_ptr first.
try {
valuePromise.set_value(m_weakStore.get(key));
} catch(const std::runtime_error& e) {
try {
const auto value = f();
valuePromise.set_value(value);
m_weakStore.add(key, value);
} catch (std::exception& e) {
drop(key);
throw;
}
}
}
return x.value().get();
}
bool drop(const Key& key)
{
std::unique_lock<std::mutex> l(lock_);
return impl_.drop(key);
}
size_t setMaxSize(size_t new_size) {
std::unique_lock<std::mutex> l(lock_);
return impl_.setMaxSize(new_size);
}
protected: // data
std::mutex lock_;
Impl impl_;
WeakStore<Key, RawValue> m_weakStore;
};
} // namespace kiwix
#endif // ZIM_CONCURRENT_CACHE_H

175
src/tools/lrucache.h Normal file
View File

@@ -0,0 +1,175 @@
/*
* Copyrigth (c) 2021, Matthieu Gautier <mgautier@kymeria.fr>
* Copyright (c) 2020, Veloman Yunkan
* Copyright (c) 2014, lamerman
* All rights reserved.
*
* Redistribution and use in source and binary forms, with or without
* modification, are permitted provided that the following conditions are met:
*
* * Redistributions of source code must retain the above copyright notice, this
* list of conditions and the following disclaimer.
*
* * Redistributions in binary form must reproduce the above copyright notice,
* this list of conditions and the following disclaimer in the documentation
* and/or other materials provided with the distribution.
*
* * Neither the name of lamerman nor the names of its
* contributors may be used to endorse or promote products derived from
* this software without specific prior written permission.
*
* THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS"
* AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
* IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE
* DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE
* FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
* DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR
* SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER
* CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY,
* OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
* OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
*
* File: lrucache.hpp
* Author: Alexander Ponomarev
*
* Created on June 20, 2013, 5:09 PM
*/
#ifndef _LRUCACHE_HPP_INCLUDED_
#define _LRUCACHE_HPP_INCLUDED_
#include <map>
#include <list>
#include <set>
#include <cstddef>
#include <stdexcept>
#include <cassert>
namespace kiwix {
template<typename key_t, typename value_t>
class lru_cache {
public: // types
typedef typename std::pair<key_t, value_t> key_value_pair_t;
typedef typename std::list<key_value_pair_t>::iterator list_iterator_t;
enum AccessStatus {
HIT, // key was found in the cache
PUT, // key was not in the cache but was created by the getOrPut() access
MISS // key was not in the cache; get() access failed
};
class AccessResult
{
const AccessStatus status_;
const value_t val_;
public:
AccessResult(const value_t& val, AccessStatus status)
: status_(status), val_(val)
{}
AccessResult() : status_(MISS), val_() {}
bool hit() const { return status_ == HIT; }
bool miss() const { return !hit(); }
const value_t& value() const
{
if ( status_ == MISS )
throw std::range_error("There is no such key in cache");
return val_;
}
operator const value_t& () const { return value(); }
};
public: // functions
explicit lru_cache(size_t max_size) :
_max_size(max_size) {
}
// If 'key' is present in the cache, returns the associated value,
// otherwise puts the given value into the cache (and returns it with
// a status of a cache miss).
AccessResult getOrPut(const key_t& key, const value_t& value) {
auto it = _cache_items_map.find(key);
if (it != _cache_items_map.end()) {
_cache_items_list.splice(_cache_items_list.begin(), _cache_items_list, it->second);
return AccessResult(it->second->second, HIT);
} else {
putMissing(key, value);
return AccessResult(value, PUT);
}
}
void put(const key_t& key, const value_t& value) {
auto it = _cache_items_map.find(key);
if (it != _cache_items_map.end()) {
_cache_items_list.splice(_cache_items_list.begin(), _cache_items_list, it->second);
it->second->second = value;
} else {
putMissing(key, value);
}
}
AccessResult get(const key_t& key) {
auto it = _cache_items_map.find(key);
if (it == _cache_items_map.end()) {
return AccessResult();
} else {
_cache_items_list.splice(_cache_items_list.begin(), _cache_items_list, it->second);
return AccessResult(it->second->second, HIT);
}
}
bool drop(const key_t& key) {
try {
auto list_it = _cache_items_map.at(key);
_cache_items_list.erase(list_it);
_cache_items_map.erase(key);
return true;
} catch (std::out_of_range& e) {
return false;
}
}
bool exists(const key_t& key) const {
return _cache_items_map.find(key) != _cache_items_map.end();
}
size_t size() const {
return _cache_items_map.size();
}
size_t setMaxSize(size_t new_size) {
size_t previous = _max_size;
_max_size = new_size;
return previous;
}
std::set<key_t> keys() const {
std::set<key_t> keys;
for(auto& item:_cache_items_map) {
keys.insert(item.first);
}
return keys;
}
private: // functions
void putMissing(const key_t& key, const value_t& value) {
assert(_cache_items_map.find(key) == _cache_items_map.end());
_cache_items_list.push_front(key_value_pair_t(key, value));
_cache_items_map[key] = _cache_items_list.begin();
while (_cache_items_map.size() > _max_size) {
_cache_items_map.erase(_cache_items_list.back().first);
_cache_items_list.pop_back();
}
}
private: // data
std::list<key_value_pair_t> _cache_items_list;
std::map<key_t, list_iterator_t> _cache_items_map;
size_t _max_size;
};
} // namespace kiwix
#endif /* _LRUCACHE_HPP_INCLUDED_ */

View File

@@ -1,5 +1,6 @@
/*
* Copyright 2012 Emmanuel Engelhart <kelson@kiwix.org>
* Copyright 2021 Nikhil Tanwar <2002nikhiltanwar@gmail.com>
*
* This program is free software; you can redistribute it and/or modify
* it under the terms of the GNU General Public License as published by
@@ -17,6 +18,7 @@
* MA 02110-1301, USA.
*/
#include "tools.h"
#include <tools/networkTools.h>
#include <stdio.h>
@@ -29,6 +31,17 @@
#include <iostream>
#include <stdexcept>
#ifdef _WIN32
#include <winsock2.h>
#include <ws2tcpip.h>
#include <iostream>
#else
#include <unistd.h>
#include <sys/ioctl.h>
#include <sys/socket.h>
#include <net/if.h>
#include <netdb.h>
#endif
size_t write_callback_to_iss(char* ptr, size_t size, size_t nmemb, void* userdata)
{
@@ -57,3 +70,104 @@ std::string kiwix::download(const std::string& url) {
}
return ss.str();
}
std::map<std::string, std::string> kiwix::getNetworkInterfaces() {
std::map<std::string, std::string> interfaces;
#ifdef _WIN32
SOCKET sd = WSASocket(AF_INET, SOCK_DGRAM, 0, 0, 0, 0);
if (sd == INVALID_SOCKET) {
std::cerr << "Failed to get a socket. Error " << WSAGetLastError() << std::endl;
return interfaces;
}
INTERFACE_INFO InterfaceList[20];
unsigned long nBytesReturned;
if (WSAIoctl(sd, SIO_GET_INTERFACE_LIST, 0, 0, &InterfaceList,
sizeof(InterfaceList), &nBytesReturned, 0, 0) == SOCKET_ERROR) {
std::cerr << "Failed calling WSAIoctl: error " << WSAGetLastError() << std::endl;
return interfaces;
}
int nNumInterfaces = nBytesReturned / sizeof(INTERFACE_INFO);
for (int i = 0; i < nNumInterfaces; ++i) {
sockaddr_in *pAddress;
pAddress = (sockaddr_in *) & (InterfaceList[i].iiAddress.AddressIn);
if(pAddress->sin_family == AF_INET) {
/* Add to the map */
std::string interfaceName = std::string(inet_ntoa(pAddress->sin_addr));
interfaces[interfaceName] = interfaceName;
}
}
#else
/* Get Network interfaces information */
char buf[16384];
struct ifconf ifconf;
int fd = socket(PF_INET, SOCK_DGRAM, 0); /* Only IPV4 */
ifconf.ifc_len = sizeof(buf);
ifconf.ifc_buf=buf;
if(ioctl(fd, SIOCGIFCONF, &ifconf)!=0) {
perror("ioctl(SIOCGIFCONF)");
}
/* Go through each interface */
struct ifreq *ifreq;
ifreq = ifconf.ifc_req;
for (int i = 0; i < ifconf.ifc_len; ) {
if (ifreq->ifr_addr.sa_family == AF_INET) {
/* Get the network interface ip */
char host[128] = { 0 };
const int error = getnameinfo(&(ifreq->ifr_addr), sizeof(ifreq->ifr_addr),
host, sizeof(host),
0, 0, NI_NUMERICHOST);
if (!error) {
std::string interfaceName = std::string(ifreq->ifr_name);
std::string interfaceIp = std::string(host);
/* Add to the map */
interfaces[interfaceName] = interfaceIp;
} else {
perror("getnameinfo()");
}
}
/* some systems have ifr_addr.sa_len and adjust the length that
* way, but not mine. weird */
size_t len;
#ifndef __linux__
len = IFNAMSIZ + ifreq->ifr_addr.sa_len;
#else
len = sizeof(*ifreq);
#endif
ifreq = (struct ifreq*)((char*)ifreq+len);
i += len;
}
#endif
return interfaces;
}
std::string kiwix::getBestPublicIp() {
auto interfaces = getNetworkInterfaces();
#ifndef _WIN32
const char* const prioritizedNames[] =
{ "eth0", "eth1", "wlan0", "wlan1", "en0", "en1" };
for(auto name: prioritizedNames) {
auto it = interfaces.find(name);
if(it != interfaces.end()) {
return it->second;
}
}
#endif
const char* const prefixes[] = { "192.168", "172.16.", "10.0" };
for(auto prefix : prefixes){
for(auto& itr : interfaces) {
auto interfaceIp = itr.second;
if (interfaceIp.find(prefix) == 0) {
return interfaceIp;
}
}
}
return "127.0.0.1";
}

View File

@@ -17,8 +17,14 @@
* MA 02110-1301, USA.
*/
// Implement function declared in tools.h and tools/otherTools.h
#include "tools.h"
#include "tools/otherTools.h"
#include <algorithm>
#include <iomanip>
#ifdef _WIN32
#include <windows.h>
#else
@@ -31,6 +37,8 @@
#include <sstream>
#include <pugixml.hpp>
#include <zim/uuid.h>
static std::map<std::string, std::string> codeisomapping {
{ "aa", "aar" },
@@ -280,3 +288,102 @@ bool kiwix::convertStrToBool(const std::string& value)
throw std::domain_error(ss.str());
}
namespace
{
// The counter metadata format is a list of item separated by a `;` :
// item0;item1;item2
// Each item is a "tuple" mimetype=number.
// However, the mimetype may contains parameters:
// text/html;raw=true;foo=bar
// So the final format may be complex to parse:
// key0=value0;key1;foo=bar=value1;key2=value2
typedef kiwix::MimeCounterType::value_type MimetypeAndCounter;
std::string readFullMimetypeAndCounterString(std::istream& in)
{
std::string mtcStr, params;
getline(in, mtcStr, ';');
if ( mtcStr.find('=') == std::string::npos )
{
do
{
if ( !getline(in, params, ';' ) )
return std::string();
mtcStr += ";" + params;
}
while ( std::count(params.begin(), params.end(), '=') != 2 );
}
return mtcStr;
}
MimetypeAndCounter parseASingleMimetypeCounter(const std::string& s)
{
const std::string::size_type k = s.find_last_of("=");
if ( k != std::string::npos )
{
const std::string mimeType = s.substr(0, k);
std::istringstream counterSS(s.substr(k+1));
unsigned int counter;
if (counterSS >> counter && counterSS.eof())
return MimetypeAndCounter{mimeType, counter};
}
return MimetypeAndCounter{"", 0};
}
} // unnamed namespace
kiwix::MimeCounterType kiwix::parseMimetypeCounter(const std::string& counterData)
{
kiwix::MimeCounterType counters;
std::istringstream ss(counterData);
while (ss)
{
const std::string mtcStr = readFullMimetypeAndCounterString(ss);
const MimetypeAndCounter mtc = parseASingleMimetypeCounter(mtcStr);
if ( !mtc.first.empty() )
counters.insert(mtc);
}
return counters;
}
std::string kiwix::gen_date_str()
{
auto now = std::time(0);
auto tm = std::localtime(&now);
std::stringstream is;
is << std::setw(2) << std::setfill('0')
<< 1900+tm->tm_year << "-"
<< std::setw(2) << std::setfill('0') << tm->tm_mon+1 << "-"
<< std::setw(2) << std::setfill('0') << tm->tm_mday << "T"
<< std::setw(2) << std::setfill('0') << tm->tm_hour << ":"
<< std::setw(2) << std::setfill('0') << tm->tm_min << ":"
<< std::setw(2) << std::setfill('0') << tm->tm_sec << "Z";
return is.str();
}
std::string kiwix::gen_uuid(const std::string& s)
{
return kiwix::to_string(zim::Uuid::generate(s));
}
kainjow::mustache::data kiwix::onlyAsNonEmptyMustacheValue(const std::string& s)
{
return s.empty()
? kainjow::mustache::data(false)
: kainjow::mustache::data(s);
}
std::string kiwix::render_template(const std::string& template_str, kainjow::mustache::data data)
{
kainjow::mustache::mustache tmpl(template_str);
kainjow::mustache::data urlencode{kainjow::mustache::lambda2{
[](const std::string& str,const kainjow::mustache::renderer& r) { return urlEncode(r(str), true); }}};
data.set("urlencoded", urlencode);
std::stringstream ss;
tmpl.render(data, [&ss](const std::string& str) { ss << str; });
return ss.str();
}

View File

@@ -22,6 +22,12 @@
#include <string>
#include <vector>
#include <map>
#include <cstdlib>
#include <zim/zim.h>
#include <mustache.hpp>
#include "stringTools.h"
namespace pugi {
class xml_node;
@@ -29,9 +35,7 @@ namespace pugi {
namespace kiwix
{
void sleep(unsigned int milliseconds);
std::string nodeToString(const pugi::xml_node& node);
std::string converta2toa3(const std::string& a2code);
/*
* Convert all format tag string to new format
@@ -41,6 +45,31 @@ namespace kiwix
const std::string& tagName);
bool convertStrToBool(const std::string& value);
using MimeCounterType = std::map<const std::string, zim::entry_index_type>;
MimeCounterType parseMimetypeCounter(const std::string& counterData);
std::string gen_date_str();
std::string gen_uuid(const std::string& s);
// if s is empty then returns kainjow::mustache::data(false)
// otherwise kainjow::mustache::data(value)
kainjow::mustache::data onlyAsNonEmptyMustacheValue(const std::string& s);
std::string render_template(const std::string& template_str, kainjow::mustache::data data);
template<typename T>
T getEnvVar(const char* name, const T& defaultValue)
{
try {
const char* envString = std::getenv(name);
if (envString == nullptr) {
throw std::runtime_error("Environment variable not set");
}
return extractFromString<T>(envString);
} catch (...) {}
return defaultValue;
}
}
#endif

View File

@@ -17,7 +17,10 @@
* MA 02110-1301, USA.
*/
// Implement method defined in <kiwix/tools.h> and "tools/pathTools.h"
#include "tools.h"
#include "tools/pathTools.h"
#include <stdexcept>
#ifdef __APPLE__
@@ -59,7 +62,6 @@
#define PATH_MAX 1024
#endif
#ifdef _WIN32
std::string WideToUtf8(const std::wstring& wstr)
{
@@ -78,16 +80,22 @@ std::wstring Utf8ToWide(const std::string& str)
}
#endif
bool isRelativePath(const std::string& path)
bool kiwix::isRelativePath(const std::string& path)
{
#ifdef _WIN32
return path.empty() || path.substr(1, 2) == ":\\" ? false : true;
if (path.size() < 3 ) {
return true;
}
if (path.substr(1, 2) == ":\\" || path.substr(0, 2) == "\\\\") {
return false;
}
return true;
#else
return path.empty() || path.substr(0, 1) == "/" ? false : true;
#endif
}
std::vector<std::string> normalizeParts(std::vector<std::string> parts, bool absolute)
std::vector<std::string> normalizeParts(std::vector<std::string>& parts, bool absolute)
{
std::vector<std::string> ret;
#ifdef _WIN32
@@ -95,10 +103,35 @@ std::vector<std::string> normalizeParts(std::vector<std::string> parts, bool abs
//Starts from there.
auto it = find_if(parts.rbegin(), parts.rend(),
[](const std::string& p) ->bool
{ return p.length() == 2 && p[1] == ':'; });
{ return ((p.length() == 2 && p[1] == ':')
|| (p.length() > 2 && p[0] == '\\' && p[1] == '\\')); });
if (it != parts.rend()) {
parts.erase(parts.begin(), it.base()-1);
}
// Special case for samba mount point starting with two "\\" ("\\\\samba\\foo")
if (parts.size() > 2 && parts[0].empty() && parts[1].empty()) {
parts.erase(parts.begin(), parts.begin()+2);
parts[0] = "\\\\" + parts[0];
}
// Special case if we have a samba drive not at first.
// Path is "..\\\\sambdadrive\\..\\.." So we will have an empty part.
auto previous_empty = false;
for (it = parts.rbegin(); it!=parts.rend(); it++) {
if(it->empty()) {
if (previous_empty) {
it++;
break;
} else {
previous_empty = true;
}
} else {
previous_empty = false;
}
}
if (it != parts.rend()) {
parts.erase(parts.begin(), it.base()-1);
parts[0] = "\\\\" + parts[0];
}
#endif
size_t index = 0;
@@ -142,10 +175,12 @@ std::vector<std::string> normalizeParts(std::vector<std::string> parts, bool abs
return ret;
}
std::string computeRelativePath(const std::string& path, const std::string& absolutePath)
std::string kiwix::computeRelativePath(const std::string& path, const std::string& absolutePath)
{
auto pathParts = normalizeParts(kiwix::split(path, SEPARATOR, false), false);
auto absolutePathParts = kiwix::split(absolutePath, SEPARATOR, false);
auto parts = kiwix::split(path, SEPARATOR, false);
auto pathParts = normalizeParts(parts, false);
parts = kiwix::split(absolutePath, SEPARATOR, false);
auto absolutePathParts = normalizeParts(parts, true);
unsigned int commonCount = 0;
while (commonCount < pathParts.size()
@@ -165,24 +200,27 @@ std::string computeRelativePath(const std::string& path, const std::string& abso
return ret;
}
std::string computeAbsolutePath(const std::string& path, const std::string& relativePath)
std::string kiwix::computeAbsolutePath(const std::string& path, const std::string& relativePath)
{
std::string absolutePath = path;
if (path.empty()) {
absolutePath = getCurrentDirectory();
absolutePath = kiwix::getCurrentDirectory();
}
auto absoluteParts = normalizeParts(kiwix::split(absolutePath, SEPARATOR, false), true);
auto relativeParts = kiwix::split(relativePath, SEPARATOR, false);
auto parts = kiwix::split(absolutePath, SEPARATOR, false);
auto absoluteParts = normalizeParts(parts, true);
parts = kiwix::split(relativePath, SEPARATOR, false);
auto relativeParts = normalizeParts(parts, false);
absoluteParts.insert(absoluteParts.end(), relativeParts.begin(), relativeParts.end());
auto ret = kiwix::join(normalizeParts(absoluteParts, true), SEPARATOR);
return ret;
}
std::string removeLastPathElement(const std::string& path)
std::string kiwix::removeLastPathElement(const std::string& path)
{
auto parts = normalizeParts(kiwix::split(path, SEPARATOR, false), false);
auto parts_ = kiwix::split(path, SEPARATOR, false);
auto parts = normalizeParts(parts_, false);
if (!parts.empty()) {
parts.pop_back();
}
@@ -190,7 +228,7 @@ std::string removeLastPathElement(const std::string& path)
return ret;
}
std::string appendToDirectory(const std::string& directoryPath, const std::string& filename)
std::string kiwix::appendToDirectory(const std::string& directoryPath, const std::string& filename)
{
std::string newPath = directoryPath;
if (!directoryPath.empty() && directoryPath.back() != SEPARATOR[0]) {
@@ -200,9 +238,10 @@ std::string appendToDirectory(const std::string& directoryPath, const std::strin
return newPath;
}
std::string getLastPathElement(const std::string& path)
std::string kiwix::getLastPathElement(const std::string& path)
{
auto parts = normalizeParts(kiwix::split(path, SEPARATOR), false);
auto parts_ = kiwix::split(path, SEPARATOR);
auto parts = normalizeParts(parts_, false);
if (parts.empty()) {
return "";
}
@@ -230,7 +269,7 @@ std::string getFileSizeAsString(const std::string& path)
return convert.str();
}
std::string getFileContent(const std::string& path)
std::string kiwix::getFileContent(const std::string& path)
{
#ifdef _WIN32
auto wpath = Utf8ToWide(path);
@@ -263,19 +302,21 @@ std::string getFileContent(const std::string& path)
return content;
}
bool fileExists(const std::string& path)
bool kiwix::fileExists(const std::string& path)
{
#ifdef _WIN32
return PathFileExistsW(Utf8ToWide(path).c_str());
return (_waccess_s(Utf8ToWide(path).c_str(), 0) == 0);
#else
bool flag = false;
std::fstream fin;
fin.open(path.c_str(), std::ios::in);
if (fin.is_open()) {
flag = true;
}
fin.close();
return flag;
return (access(path.c_str(), F_OK) == 0);
#endif
}
bool kiwix::fileReadable(const std::string& path)
{
#ifdef _WIN32
return (_waccess_s(Utf8ToWide(path).c_str(), 4) == 0);
#else
return (access(path.c_str(), R_OK) == 0);
#endif
}
@@ -302,7 +343,7 @@ std::string makeTmpDirectory()
_wmkdir(ctmp);
return WideToUtf8(ctmp);
#else
char _template_array[] = {"/tmp/kiwix-lib_XXXXXX"};
char _template_array[] = {"/tmp/libkiwix_XXXXXX"};
std::string dir = mkdtemp(_template_array);
return dir;
#endif
@@ -329,7 +370,7 @@ bool copyFile(const std::string& sourcePath, const std::string& destPath)
#endif
}
std::string getExecutablePath(bool realPathOnly)
std::string kiwix::getExecutablePath(bool realPathOnly)
{
if (!realPathOnly) {
char* cAppImage = ::getenv("APPIMAGE");
@@ -383,7 +424,7 @@ bool writeTextFile(const std::string& path, const std::string& content)
return true;
}
std::string getCurrentDirectory()
std::string kiwix::getCurrentDirectory()
{
#ifdef _WIN32
wchar_t* a_cwd = _wgetcwd(NULL, 0);
@@ -397,7 +438,7 @@ std::string getCurrentDirectory()
return ret;
}
std::string getDataDirectory()
std::string kiwix::getDataDirectory()
{
// Try to get the dataDir from the `KIWIX_DATA_DIR` env var
#ifdef _WIN32
@@ -466,7 +507,7 @@ static std::map<std::string, std::string> extMimeTypes = {
};
/* Try to get the mimeType from the file extension */
std::string getMimeTypeForFile(const std::string& filename)
std::string kiwix::getMimeTypeForFile(const std::string& filename)
{
std::string mimeType = "text/plain";
auto pos = filename.find_last_of(".");
@@ -487,4 +528,3 @@ std::string getMimeTypeForFile(const std::string& filename)
return mimeType;
}

View File

@@ -26,23 +26,13 @@
std::string WideToUtf8(const std::wstring& wstr);
std::wstring Utf8ToWide(const std::string& str);
#endif
bool isRelativePath(const std::string& path);
std::string computeAbsolutePath(const std::string& path, const std::string& relativePath);
std::string computeRelativePath(const std::string& path, const std::string& absolutePath);
std::string removeLastPathElement(const std::string& path);
std::string appendToDirectory(const std::string& directoryPath, const std::string& filename);
unsigned int getFileSize(const std::string& path);
std::string getFileSizeAsString(const std::string& path);
std::string getFileContent(const std::string& path);
bool fileExists(const std::string& path);
bool makeDirectory(const std::string& path);
std::string makeTmpDirectory();
bool copyFile(const std::string& sourcePath, const std::string& destPath);
std::string getLastPathElement(const std::string& path);
std::string getExecutablePath(bool realPathOnly = false);
std::string getCurrentDirectory();
std::string getDataDirectory();
bool writeTextFile(const std::string& path, const std::string& content);
std::string getMimeTypeForFile(const std::string& filename);
#endif

View File

@@ -18,7 +18,6 @@
*/
#include <tools/regexTools.h>
#include <tools/lock.h>
#include <unicode/regex.h>
#include <unicode/ucnv.h>
@@ -26,10 +25,10 @@
#include <memory>
#include <map>
#include <stdexcept>
#include <pthread.h>
#include <mutex>
std::map<std::string, std::shared_ptr<icu::RegexPattern>> regexCache;
static pthread_mutex_t regexLock = PTHREAD_MUTEX_INITIALIZER;
static std::mutex regexLock;
std::unique_ptr<icu::RegexMatcher> buildMatcher(const std::string& regex, icu::UnicodeString& content)
{
@@ -39,7 +38,7 @@ std::unique_ptr<icu::RegexMatcher> buildMatcher(const std::string& regex, icu::U
pattern = regexCache.at(regex);
} catch (std::out_of_range&) {
// Redo the search with a lock to avoid race condition.
kiwix::Lock l(&regexLock);
std::lock_guard<std::mutex> l(regexLock);
try {
pattern = regexCache.at(regex);
} catch (std::out_of_range&) {
@@ -95,3 +94,22 @@ std::string appendToFirstOccurence(const std::string& content,
return content;
}
std::string prependToFirstOccurence(const std::string& content,
const std::string& regex,
const std::string& replacement)
{
ucnv_setDefaultName("UTF-8");
icu::UnicodeString ucontent(content.c_str());
icu::UnicodeString ureplacement(replacement.c_str());
auto matcher = buildMatcher(regex, ucontent);
if (matcher->find()) {
UErrorCode status = U_ZERO_ERROR;
ucontent.insert(matcher->start(status), ureplacement);
std::string tmp;
ucontent.toUTF8String(tmp);
return tmp;
}
return content;
}

View File

@@ -29,5 +29,8 @@ std::string replaceRegex(const std::string& content,
std::string appendToFirstOccurence(const std::string& content,
const std::string& regex,
const std::string& replacement);
std::string prependToFirstOccurence(const std::string& content,
const std::string& regex,
const std::string& replacement);
#endif

View File

@@ -17,9 +17,11 @@
* MA 02110-1301, USA.
*/
#include <tools/stringTools.h>
// Implement function declared in tools.h and tools/stringTools.h
#include "tools.h"
#include "tools/stringTools.h"
#include <tools/pathTools.h>
#include "tools/pathTools.h"
#include <unicode/normlzr.h>
#include <unicode/rep.h>
#include <unicode/translit.h>
@@ -47,6 +49,24 @@ void kiwix::loadICUExternalTables()
#endif
}
kiwix::ICULanguageInfo::ICULanguageInfo(const std::string& langCode)
: locale(langCode.c_str())
{}
std::string kiwix::ICULanguageInfo::iso3Code() const
{
return locale.getISO3Language();
}
std::string kiwix::ICULanguageInfo::selfName() const
{
icu::UnicodeString langSelfNameICUString;
locale.getDisplayLanguage(locale, langSelfNameICUString);
std::string langSelfName;
langSelfNameICUString.toUTF8String(langSelfName);
return langSelfName;
}
std::string kiwix::removeAccents(const std::string& text)
{
loadICUExternalTables();
@@ -268,7 +288,8 @@ std::string kiwix::urlDecode(const std::string& value, bool component)
/* Split string in a token array */
std::vector<std::string> kiwix::split(const std::string& str,
const std::string& delims,
bool trimEmpty)
bool dropEmpty,
bool keepDelim)
{
std::string::size_type lastPos = 0;
std::string::size_type pos = 0;
@@ -276,14 +297,17 @@ std::vector<std::string> kiwix::split(const std::string& str,
while( (pos = str.find_first_of(delims, lastPos)) < str.length() )
{
auto token = str.substr(lastPos, pos - lastPos);
if (!trimEmpty || !token.empty()) {
if (!dropEmpty || !token.empty()) {
tokens.push_back(token);
}
if (keepDelim) {
tokens.push_back(str.substr(pos, 1));
}
lastPos = pos + 1;
}
auto token = str.substr(lastPos);
if (!trimEmpty || !token.empty()) {
if (!dropEmpty || !token.empty()) {
tokens.push_back(token);
}
return tokens;
@@ -391,3 +415,16 @@ bool kiwix::startsWith(const std::string& base, const std::string& start)
&& std::equal(start.begin(), start.end(), base.begin());
}
std::vector<std::string> kiwix::getTitleVariants(const std::string& title) {
std::vector<std::string> variants;
variants.push_back(title);
variants.push_back(kiwix::ucFirst(title));
variants.push_back(kiwix::lcFirst(title));
variants.push_back(kiwix::toTitle(title));
return variants;
}
template<>
std::string kiwix::extractFromString(const std::string& str) {
return str;
}

View File

@@ -21,10 +21,12 @@
#define KIWIX_STRINGTOOLS_H
#include <unicode/unistr.h>
#include <unicode/locid.h>
#include <string>
#include <vector>
#include <sstream>
#include <stdexcept>
namespace kiwix
{
@@ -40,10 +42,22 @@ std::string encodeDiples(const std::string& str);
std::string removeAccents(const std::string& text);
void loadICUExternalTables();
class ICULanguageInfo
{
public:
explicit ICULanguageInfo(const std::string& langCode);
std::string iso3Code() const;
std::string selfName() const;
private:
const icu::Locale locale;
};
std::string urlEncode(const std::string& value, bool encodeReserved = false);
std::string urlDecode(const std::string& value, bool component = false);
std::vector<std::string> split(const std::string&, const std::string&, bool trimEmpty = true);
std::string join(const std::vector<std::string>& list, const std::string& sep);
std::string ucAll(const std::string& word);
@@ -66,9 +80,17 @@ T extractFromString(const std::string& str) {
std::istringstream iss(str);
T ret;
iss >> ret;
if(iss.fail() || !iss.eof()) {
throw std::invalid_argument("no conversion");
}
return ret;
}
template<>
std::string extractFromString(const std::string& str);
bool startsWith(const std::string& base, const std::string& start);
std::vector<std::string> getTitleVariants(const std::string& title);
} //namespace kiwix
#endif

77
src/version.cpp Normal file
View File

@@ -0,0 +1,77 @@
/*
* Copyright 2021 Emmanuel Engelhart <kelson@kiwix.org>
*
* This program is free software; you can redistribute it and/or modify
* it under the terms of the GNU General Public License as published by
* the Free Software Foundation; either version 3 of the License, or
* any later version.
*
* This program is distributed in the hope that it will be useful,
* but WITHOUT ANY WARRANTY; without even the implied warranty of
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
* GNU General Public License for more details.
*
* You should have received a copy of the GNU General Public License
* along with this program; if not, write to the Free Software
* Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston,
* MA 02110-1301, USA.
*/
#include <iostream>
#include <sstream>
#include <version.h>
#include <zim/zim.h>
#include <kiwix_config.h>
#include <unicode/uversion.h>
#include <pugixml.hpp>
#include <curl/curl.h>
#include <microhttpd.h>
#include <xapian.h>
#include <mustache.hpp>
#include <zlib.h>
namespace kiwix
{
LibVersions getVersions() {
LibVersions versions = {
{ "libkiwix", LIBKIWIX_VERSION },
{ "libzim", LIBZIM_VERSION },
{ "libxapian", XAPIAN_VERSION },
{ "libcurl", LIBCURL_VERSION },
{ "libmicrohttpd", MHD_get_version() },
{ "libz", ZLIB_VERSION }
};
// U_ICU_VERSION does not include the patch level if 0
std::ostringstream libicu_version;
libicu_version << U_ICU_VERSION_MAJOR_NUM << "." << U_ICU_VERSION_MINOR_NUM << "." << U_ICU_VERSION_PATCHLEVEL_NUM;
versions.push_back({ "libicu", libicu_version.str() });
// No human readable version string for pugixml
const unsigned pugixml_major = (PUGIXML_VERSION - PUGIXML_VERSION % 1000) / 1000;
const unsigned pugixml_minor = (PUGIXML_VERSION - pugixml_major * 1000 - PUGIXML_VERSION % 10) / 10;
const unsigned pugixml_patch = PUGIXML_VERSION - pugixml_major * 1000 - pugixml_minor * 10;
std::ostringstream libpugixml_version;
libpugixml_version << pugixml_major << "." << pugixml_minor << "." << pugixml_patch;
versions.push_back({ "libpugixml", libpugixml_version.str() });
// Needs version 5.0 of Mustache
#if defined(KAINJOW_MUSTACHE_VERSION_MAJOR)
std::ostringstream libmustache_version;
libmustache_version << KAINJOW_MUSTACHE_VERSION_MAJOR << "." <<
KAINJOW_MUSTACHE_VERSION_MINOR << "." << KAINJOW_MUSTACHE_VERSION_PATCH;
versions.push_back({ "libmustache", libmustache_version.str() });
#endif
return versions;
}
void printVersions(std::ostream& out) {
LibVersions versions = getVersions();
for (const auto& iter : versions) {
out << (iter != versions.front() ? "+ " : "")
<< iter.first << " " << iter.second << std::endl;
}
}
} //namespace kiwix

View File

@@ -1,13 +0,0 @@
<manifest xmlns:android="http://schemas.android.com/apk/res/android"
package="kiwix.org.kiwixlib"
>
<application android:allowBackup="true"
android:label="@string/app_name"
android:supportsRtl="true"
>
</application>
</manifest>

View File

@@ -1,87 +0,0 @@
/*
* Copyright (C) 2020 Matthieu Gautier <mgautier@kymeria.fr>
*
* This program is free software; you can redistribute it and/or modify
* it under the terms of the GNU General Public License as published by
* the Free Software Foundation; either version 3 of the License, or
* any later version.
*
* This program is distributed in the hope that it will be useful,
* but WITHOUT ANY WARRANTY; without even the implied warranty of
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
* GNU General Public License for more details.
*
* You should have received a copy of the GNU General Public License
* along with this program; if not, write to the Free Software
* Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston,
* MA 02110-1301, USA.
*/
#include <jni.h>
#include "org_kiwix_kiwixlib_Book.h"
#include "utils.h"
#include "book.h"
JNIEXPORT void JNICALL
Java_org_kiwix_kiwixlib_Book_allocate(
JNIEnv* env, jobject thisObj)
{
allocate<kiwix::Book>(env, thisObj);
}
JNIEXPORT void JNICALL
Java_org_kiwix_kiwixlib_Book_dispose(JNIEnv* env, jobject thisObj)
{
dispose<kiwix::Book>(env, thisObj);
}
#define BOOK (getPtr<kiwix::Book>(env, thisObj))
METHOD(void, Book, update__Lorg_kiwix_kiwixlib_Book_2, jobject otherBook)
{
BOOK->update(*getPtr<kiwix::Book>(env, otherBook));
}
METHOD(void, Book, update__Lorg_kiwix_kiwixlib_JNIKiwixReader_2, jobject reader)
{
BOOK->update(**Handle<kiwix::Reader>::getHandle(env, reader));
}
#define GETTER(retType, name) JNIEXPORT retType JNICALL \
Java_org_kiwix_kiwixlib_Book_##name (JNIEnv* env, jobject thisObj) \
{ \
auto cRet = BOOK->name(); \
retType ret = c2jni(cRet, env); \
return ret; \
}
GETTER(jstring, getId)
GETTER(jstring, getPath)
GETTER(jboolean, isPathValid)
GETTER(jstring, getTitle)
GETTER(jstring, getDescription)
GETTER(jstring, getLanguage)
GETTER(jstring, getCreator)
GETTER(jstring, getPublisher)
GETTER(jstring, getDate)
GETTER(jstring, getUrl)
GETTER(jstring, getName)
GETTER(jstring, getFlavour)
GETTER(jstring, getTags)
GETTER(jlong, getArticleCount)
GETTER(jlong, getMediaCount)
GETTER(jlong, getSize)
GETTER(jstring, getFavicon)
GETTER(jstring, getFaviconUrl)
GETTER(jstring, getFaviconMimeType)
METHOD(jstring, Book, getTagStr, jstring tagName) try {
auto cRet = BOOK->getTagStr(jni2c(tagName, env));
return c2jni(cRet, env);
} catch(...) {
return c2jni<std::string>("", env);
}
#undef GETTER

Some files were not shown because too many files have changed in this diff Show More