66 Commits

Author SHA1 Message Date
Jessica Stokes
ec8b2c6111 Add now-required Referer header to __wb/search queries
Fixes #48
2025-10-28 10:07:49 -07:00
Jessica Stokes
d9fee3c460 WebClient: update to synthesise the user-agent using env variables 2024-07-20 14:42:13 -07:00
Jessica Stokes
2084648afe Add AGPL License
fixes #21
2024-07-20 14:42:13 -07:00
Jessica Stokes
87d397f2de Make sure permit world writable temp has correct copyright information 2024-07-20 14:42:13 -07:00
Jessica Stokes
7bff16a815 Expand error handling framework to quash expected errors 2023-01-02 11:27:45 -08:00
Jessica Stokes
7adace5c5e Workaround another mysterious Wayback Machine CDX API bug 2021-08-08 19:56:17 -07:00
Jessica Stokes
c197daca3c Allow world-writable temp in cache clean script 2021-08-05 18:44:24 -07:00
Jessica Stokes
1b53b3e367 Make cache cleaning verbose 2021-08-04 13:17:44 -07:00
Jessica Stokes
459afbef61 Make sitemap.cgi executable (oops) 2021-08-01 22:04:34 -07:00
Jessica Stokes
9a6e6b4a06 Handle some particularly bizarre dates returned by the Wayback Machine 2021-08-01 21:47:13 -07:00
Jessica Stokes
f3e014b2b0 Improve error message for Sitemap 2021-08-01 20:53:18 -07:00
Jessica Stokes
bb63255d65 Add a script to allow cleaning the WebClient cache periodically 2021-08-01 20:48:32 -07:00
Jessica Stokes
92867ace6c Split out WebClient Cache module 2021-08-01 20:38:00 -07:00
Jessica Stokes
9db90412cf Secure requests 2021-08-01 20:34:21 -07:00
Jessica Stokes
9ff5dd899d Add some more comments 2021-08-01 17:20:36 -07:00
Jessica Stokes
9c3f2a780e Filter case-insensitively 2021-08-01 17:19:41 -07:00
Jessica Stokes
a8ec1d0ea2 Add filtering by URL and MIME type 2021-08-01 17:18:39 -07:00
Jessica Stokes
8627ed9371 Paginate the sitemap 2021-08-01 16:49:46 -07:00
Jessica Stokes
c76aa7f8dc Make pluralize function force numbers to be integers 2021-08-01 16:29:19 -07:00
Jessica Stokes
ed07256e71 Initial pass at handling wildcards and generating a site map 2021-08-01 12:42:22 -07:00
Jessica Stokes
6a250b5aa8 Refactor lookup redirect code to only have one redirect block 2021-08-01 11:53:18 -07:00
Jessica Stokes
48fef51428 Rubocop a bunch of stuff 2021-07-31 11:56:20 -07:00
Jessica Stokes
03ffb21ee3 Allow turning off the cache instead of using mocking 2021-07-31 11:45:15 -07:00
Jessica Stokes
7f3ee89372 Wrap CGI modules in a Rack app so we can do in-process Capybara requests 2021-07-14 09:48:16 -07:00
Jessica Stokes
d1ae5202ba Fix a warning from YAML.safe_load 2021-07-14 09:45:26 -07:00
Jessica Stokes
060df35d6d Make all CGI files self-calling modules so we can test them more easily 2021-06-01 20:41:09 -07:00
Jessica Stokes
482f604a45 Add a spec for the LegacyClientEncoding class 2021-05-29 15:37:57 -07:00
Jessica Stokes
ea9b8884f5 Add a spec for the CDX module, playing with spec'ing the whole thing 2021-05-29 13:02:38 -07:00
Jessica Stokes
b9791ae3e5 Add some utility functions for pluralisation and encoding overrides 2021-05-29 09:56:25 -07:00
Jessica Stokes
48e90c49c1 Add a TODO about curl's UTF-8 parameter behaviour 2021-05-29 09:45:47 -07:00
Jessica Stokes
195212d68d Another attempted fix for non-HTTPS redirects on NFSN 2021-05-29 09:38:27 -07:00
Jessica Stokes
ae3e0bc00a Fix non-HTTPS redirects on NFSN 2021-05-29 09:35:42 -07:00
Jessica Stokes
1516654bc4 Fix scheme handling of jump function 2021-05-29 09:30:54 -07:00
Jessica Stokes
e544921ade Add a function to jump to the earliest or most recent snapshots 2021-05-29 09:21:17 -07:00
Jessica Stokes
a1d9f66113 Remove unused imports from history.cgi 2021-05-29 08:58:16 -07:00
Jessica Stokes
a2829d04e8 Update outgoing HTTP request error handling 2021-05-28 21:15:31 -07:00
Jessica Stokes
7fcdf34899 Get Rubocopped 2021-05-28 19:34:58 -07:00
Jessica Stokes
2471131198 Move quotify into legacy encoding helper 2021-05-28 19:33:20 -07:00
Jessica Stokes
459bc199bd Move legacy encoding support into a class 2021-05-28 19:30:23 -07:00
Jessica Stokes
780ce02e67 Update error handling to hide unhandled errors, and show input errors 2021-05-28 19:19:31 -07:00
Jessica Stokes
d24b08f6d5 Implement a custom WebClient class, which caches web responses 2021-05-28 18:51:44 -07:00
Jessica Stokes
f6f9d13dd4 Ensure JSON is imported in the error reporting file 2021-05-28 18:50:36 -07:00
Jessica Stokes
1fc24a321e Move CDX functions to a class 2021-05-28 18:50:13 -07:00
Jessica Stokes
aedd201628 Gracefully handle zero-result pages 2021-05-28 17:17:07 -07:00
Jessica Stokes
4b7c48b5d9 Remove unneeded CGI imports 2021-05-24 17:27:30 -07:00
Jessica Stokes
466f864089 Run Rubocop's defaults over the whole thing 2021-05-24 17:25:26 -07:00
Jessica Stokes
1749427c15 Report errors to stderr instead 2021-05-24 16:49:16 -07:00
Jessica Stokes
6f27953090 Squelch Bugsnag's logging so CGI doesn't break 2021-05-24 15:07:50 -07:00
Jessica Stokes
7e64e42820 Integrate with Bugsnag so errors can be handled better 2021-05-24 14:12:39 -07:00
Jessica Stokes
f881856e58 Implement a patch to fix OpenURI on nearlyfreespeech.net hosting
NFSN's tmp directory is world-writable, but the CGI process runs as a user who is part of "world," which means Ruby's usual security protections here don't make sense. `FORCE_WORLD_WRITABLE_TEMP` should not be set in any normal circumstances, as it would cause a potentially significant security risk.
2021-05-24 13:00:18 -07:00