Mirror of https://github.com/calibrain/shelfmark.git, synced 2026-06-15 16:48:47 -04:00
Backport bug fixes from NemesisHubris/litfinder (addresses #999, #956, #1010, #1021, #1025, #1040) (#1066)
## Backport bug fixes from `NemesisHubris/litfinder`

Forwards a curated set of bug fixes from [NemesisHubris/litfinder](https://github.com/NemesisHubris/litfinder), a community fork of this project, that address open issues here. All commits preserve original authorship via `git cherry-pick`; this PR is a backport rather than original work.

Each fix has been reviewed locally, lint/format-cleaned to match this repo's existing ruff config, and verified with the test suite. Rebrand strings, license switches, and features have been deliberately excluded.

### Upstream issues addressed

- **#999**: Mirror URLs with query params no longer break search requests (strip query string/fragment in `normalize_http_url`)
- **#956**: Apprise notifications now respect the configured proxy (proxy env vars injected before dispatch)
- **#1025**: rTorrent: separate `RTORRENT_AUDIOBOOK_LABEL` setting, falls back to book label if unset
- **#1010**: Stop button in Activity no longer makes the panel disappear (snapshot refresh on cancel)
- **#1021**: Anna's Archive slow-download countdown now caps retries instead of looping forever
- **#1040**: Empty destination directory cleaned up when write probe fails
- **PR #1031**: Language detection from Anna's Archive distant path when listing metadata is missing

### Additional fixes (no open issue but clear bugs)

- **fix: Python 2 `except` syntax across 27 files**: `except X, Y:` is a SyntaxError in Python 3 and prevents affected modules from importing at runtime. Mechanical sweep to `except (X, Y):`.
- **fix(abb): info hash validation with magnet fallback**: adds SHA-1/SHA-256 hex validation on extracted info hashes; falls back to scanning the full page for a magnet link (e.g. posted in comments) when the table value is malformed. Also extends the exact-phrase fallback to manual queries and defaults the ABB listing language to `en` when missing, preventing valid results from being hidden by the language filter. Includes a small test-fixture fix (`test(abb): use valid hex info hashes in scraper test fixtures`) since the existing fixtures used non-hex placeholders that the new validation correctly rejects.
- **fix: Anna's Archive title parser**: handles nested edition spans and filters `lgli` catalog descriptor entries (e.g. "Book/Online Audio") that were polluting search results.

### Deliberately not included

- LitFinder rebranding (UI strings, Apprise app ID, logo). The `fix: three upstream bugs` commit (#999/#956/#1025) was cherry-picked with Apprise app-id, description, and logo-URL strings reverted from "LitFinder" back to "Shelfmark"; noted in the commit body.
- Features from the LitFinder fork (multi-variant title search, multi-book flat-folder grouping, fuzzy text matching, "Leave in Place" output handler, admin display name, custom-source plugin system). These are larger behavior changes that each warrant their own focused review; happy to send any of them separately if of interest.
- LitFinder-specific test environment and CI infrastructure.

### Verification

- Backend: **1879 passed**, 96 skipped (1 preexisting failure on `seleniumbase`-dependent test in local venv; runs fine in the standard Docker image with the `browser` extra)
- Lint, format, dead-code: all clean against this repo's existing ruff/vulture config
- One follow-up cleanup commit (`style: ruff lint and format fixes for ported commits`) brings the cherry-picked code into compliance with this repo's ruff settings; no behavior changes there

### Etiquette / credit

Per-commit authorship preserved by cherry-pick. The only edits to the original commits are:

- `fix: three upstream bugs`: Apprise rebrand strings reverted to "Shelfmark" (noted in commit body, original author retained as `Co-Authored-By` via cherry-pick)
- One follow-up `style:` commit for ruff config alignment

Big thanks to [@NemesisHubris](https://github.com/NemesisHubris) for the original work in LitFinder; this PR exists to make sure these fixes reach Shelfmark's wider user base. Happy to revise scope, split into smaller PRs, or split off the Py2 cleanup separately if that's preferable.

---------

Co-authored-by: NemesisHubris <155838970+NemesisHubris@users.noreply.github.com>
Co-authored-by: CaliBrain <calibrain@l4n.xyz>
@@ -1448,6 +1448,17 @@ def download_source_settings() -> list[SettingsField]:
             ),
             default=False,
         ),
+        CheckboxField(
+            key="DIRECT_DOWNLOAD_LANGUAGE_FROM_PATH",
+            label="Detect Language From Distant Path",
+            description=(
+                "When language metadata is missing or unknown, parse the distant path "
+                "(file path shown in search results) for language tags like [BD FR] or [En]. "
+                "Also enables local language filtering so lgli files without AA language "
+                "metadata are not excluded before the distant path can be checked."
+            ),
+            default=False,
+        ),
         PasswordField(
             key="AA_DONATOR_KEY",
             label="Account Donator Key",
@@ -393,6 +393,41 @@ def _plugin_label(plugin: object, fallback_scheme: str) -> str:
     return " ".join(parts)


+def _apprise_proxy_env() -> dict[str, str]:
+    """Build proxy env vars from app config so Apprise respects the proxy setting."""
+    import os
+
+    from shelfmark.core.config import config as _cfg
+
+    mode = str(_cfg.get("PROXY_MODE", "") or "").lower()
+    env: dict[str, str] = {}
+
+    if mode == "http":
+        http = str(_cfg.get("HTTP_PROXY", "") or "").strip()
+        https = str(_cfg.get("HTTPS_PROXY", "") or "").strip() or http
+        if http:
+            env["HTTP_PROXY"] = http
+            env["http_proxy"] = http
+        if https:
+            env["HTTPS_PROXY"] = https
+            env["https_proxy"] = https
+    elif mode == "socks5":
+        socks = str(_cfg.get("SOCKS5_PROXY", "") or "").strip()
+        if socks:
+            env["HTTP_PROXY"] = socks
+            env["http_proxy"] = socks
+            env["HTTPS_PROXY"] = socks
+            env["https_proxy"] = socks
+
+    no_proxy = str(_cfg.get("NO_PROXY", "") or "").strip()
+    if no_proxy and env:
+        env["NO_PROXY"] = no_proxy
+        env["no_proxy"] = no_proxy
+
+    # Don't override if the user already set these in the environment directly
+    return {k: v for k, v in env.items() if not os.environ.get(k)}
+
+
 def _dispatch_to_apprise(
     urls: Iterable[str],
     *,
@@ -400,6 +435,8 @@ def _dispatch_to_apprise(
     body: str,
     notify_type: object,
 ) -> dict[str, Any]:
+    import os
+
     normalized_urls = _normalize_urls(list(urls))
     url_schemes = _extract_url_schemes(normalized_urls)
     if not normalized_urls:
@@ -408,6 +445,11 @@ def _dispatch_to_apprise(
     if apprise is None:
         return {"success": False, "message": "Apprise is not installed"}

+    proxy_env = _apprise_proxy_env()
+    if proxy_env:
+        logger.debug("Applying proxy env for Apprise dispatch: %s", list(proxy_env.keys()))
+        os.environ.update(proxy_env)
+
     valid_urls = 0
     invalid_urls = 0
     delivered_urls = 0
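
The proxy hunks above build a set of environment variables from the app settings and skip any key the user already exported. A minimal standalone sketch of that selection logic follows; it assumes a plain `dict` in place of Shelfmark's `config` object and takes the environment as an argument, and the name `build_proxy_env` is hypothetical rather than part of the project.

```python
from __future__ import annotations


def build_proxy_env(settings: dict[str, str], environ: dict[str, str]) -> dict[str, str]:
    """Return proxy variables derived from settings, skipping keys already set.

    `settings` stands in for the app config and `environ` for os.environ;
    both are assumptions made for this sketch.
    """
    mode = (settings.get("PROXY_MODE") or "").lower()
    env: dict[str, str] = {}

    if mode == "http":
        http = (settings.get("HTTP_PROXY") or "").strip()
        # HTTPS falls back to the HTTP proxy when no separate value is configured.
        https = (settings.get("HTTPS_PROXY") or "").strip() or http
        if http:
            env["HTTP_PROXY"] = env["http_proxy"] = http
        if https:
            env["HTTPS_PROXY"] = env["https_proxy"] = https
    elif mode == "socks5":
        socks = (settings.get("SOCKS5_PROXY") or "").strip()
        if socks:
            for key in ("HTTP_PROXY", "http_proxy", "HTTPS_PROXY", "https_proxy"):
                env[key] = socks

    # Values exported directly in the environment take precedence over settings.
    return {k: v for k, v in env.items() if not environ.get(k)}


if __name__ == "__main__":
    demo_settings = {"PROXY_MODE": "http", "HTTP_PROXY": "http://proxy.example.com:8080"}
    print(build_proxy_env(demo_settings, environ={"HTTP_PROXY": "http://already-set:3128"}))
```

Passing the environment in explicitly keeps the function free of side effects, which makes the "do not override" rule easy to check in isolation.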
@@ -52,6 +52,13 @@ def normalize_http_url(
         if scheme:
             normalized = f"{scheme}://{normalized}"

+    # Strip query string and fragment — mirrors are used as base URLs for
+    # constructing search requests; params/fragments on the configured URL
+    # produce malformed URLs when paths are appended (issue #999).
+    parsed = urlparse(normalized)
+    if parsed.query or parsed.fragment:
+        normalized = parsed._replace(query="", fragment="").geturl()
+
     if strip_trailing_slash:
         normalized = normalized.rstrip("/")

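
The hunk above drops the query string and fragment from a configured mirror URL before it is used as a base for search requests. A small sketch of the same `urlparse` technique, isolated from the rest of `normalize_http_url`; the function name `strip_query_and_fragment` and the example host are made up for illustration.

```python
from urllib.parse import urlparse


def strip_query_and_fragment(url: str) -> str:
    """Return `url` without its query string or fragment."""
    parsed = urlparse(url)
    if parsed.query or parsed.fragment:
        # ParseResult is a namedtuple, so _replace returns a modified copy.
        return parsed._replace(query="", fragment="").geturl()
    return url


if __name__ == "__main__":
    base = strip_query_and_fragment("https://mirror.example.org/?lang=en#top")
    # Appending a path to the cleaned base no longer lands after "?lang=en".
    print(base.rstrip("/") + "/search?q=dune")
```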
@@ -115,6 +115,7 @@ class RTorrentClient(DownloadClient):
         self._rpc = _create_rtorrent_server_proxy(self._base_url)
         self._download_dir = config_text(config.get("RTORRENT_DOWNLOAD_DIR", ""))
         self._label = config_text(config.get("RTORRENT_LABEL", ""))
+        self._audiobook_label = config_text(config.get("RTORRENT_AUDIOBOOK_LABEL", ""))

     @staticmethod
     def is_configured() -> bool:
@@ -161,7 +162,11 @@ class RTorrentClient(DownloadClient):

             commands = []

-            label = category or self._label
+            is_audiobook = kwargs.get("content_type") == "audiobook"
+            default_label = (
+                self._audiobook_label if is_audiobook and self._audiobook_label else self._label
+            )
+            label = category or default_label
             if label:
                 logger.debug("Setting rTorrent label: %s", label)
                 commands.append(f"d.custom1.set={label}")
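
The label selection in the hunk above has three levels of precedence: an explicit category, then the audiobook label for audiobook downloads, then the book label. A minimal sketch of that precedence as a pure function; `pick_label` and its parameters are hypothetical names, not the client's API.

```python
from __future__ import annotations


def pick_label(
    category: str | None,
    content_type: str | None,
    book_label: str,
    audiobook_label: str,
) -> str:
    """Choose the rTorrent label: category, then audiobook label, then book label."""
    is_audiobook = content_type == "audiobook"
    # An empty audiobook label means "not configured", so fall back to the book label.
    default_label = audiobook_label if is_audiobook and audiobook_label else book_label
    return category or default_label


if __name__ == "__main__":
    print(pick_label(None, "audiobook", "cwabd", "audiobooks"))
    print(pick_label(None, "audiobook", "cwabd", ""))
```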
@@ -748,11 +748,18 @@ def prowlarr_clients_settings() -> list[SettingsField]:
         TextField(
             key="RTORRENT_LABEL",
             label="Book Label",
-            description="Label to assign to book downloads in rTorrent",
+            description="Label to assign to ebook downloads in rTorrent",
             placeholder="cwabd",
             default="cwabd",
             show_when={"field": "PROWLARR_TORRENT_CLIENT", "value": "rtorrent"},
         ),
+        TextField(
+            key="RTORRENT_AUDIOBOOK_LABEL",
+            label="Audiobook Label",
+            description="Label to assign to audiobook downloads in rTorrent (falls back to Book Label if not set)",
+            placeholder="audiobooks",
+            show_when={"field": "PROWLARR_TORRENT_CLIENT", "value": "rtorrent"},
+        ),
         TextField(
             key="RTORRENT_DOWNLOAD_DIR",
             label="Download Directory",
@@ -2,6 +2,7 @@

 from __future__ import annotations

+import contextlib
 import uuid
 from typing import TYPE_CHECKING

@@ -40,9 +41,11 @@ def validate_destination(
         status_callback("error", f"Destination is not a directory: {destination}")
         return False

+    created_by_us = False
     if not destination_exists:
         try:
             run_blocking_io(destination.mkdir, parents=True, exist_ok=True)
+            created_by_us = True
         except (OSError, PermissionError) as exc:
             log_path_permission_context("destination_create", destination)
             logger.warning("Cannot create destination: %s (%s)", destination, exc)
@@ -63,6 +66,9 @@ def validate_destination(
         log_path_permission_context("destination_write_probe", destination)
         logger.warning("Destination not writable: %s (%s)", destination, exc)
         status_callback("error", f"Destination not writable: {destination} ({exc})")
+        if created_by_us:
+            with contextlib.suppress(OSError):
+                run_blocking_io(destination.rmdir)
         return False

     return True
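
The two `validate_destination` hunks above remember whether the directory was created during validation and remove it again if the write probe fails. A self-contained sketch of that pattern using only the standard library; `ensure_writable` and the injectable `probe` callable are assumptions made so the failure path can be exercised, and they do not mirror the project's actual helpers.

```python
from __future__ import annotations

import contextlib
from pathlib import Path
from typing import Callable


def _write_probe(directory: Path) -> None:
    """Write and delete a small file to confirm the directory is writable."""
    probe_file = directory / ".write_probe"
    probe_file.write_text("ok", encoding="utf-8")
    probe_file.unlink()


def ensure_writable(destination: Path, probe: Callable[[Path], None] = _write_probe) -> bool:
    """Create `destination` if needed and verify it is writable.

    If this call created the directory and the probe fails, the empty
    directory is removed again so no stray folder is left behind.
    """
    created_by_us = False
    if not destination.exists():
        try:
            destination.mkdir(parents=True, exist_ok=True)
            created_by_us = True
        except OSError:
            return False

    try:
        probe(destination)
    except OSError:
        if created_by_us:
            # rmdir only succeeds on an empty directory; ignore any failure.
            with contextlib.suppress(OSError):
                destination.rmdir()
        return False
    return True
```

Only the leaf directory is removed; any parents created by `mkdir(parents=True)` stay in place, which matches the single `rmdir` call in the diff.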
@@ -417,6 +417,18 @@ def extract_magnet_link(details_url: str, hostname: str = "audiobookbay.lu") ->
         # Clean up info hash (remove whitespace, ensure uppercase)
         info_hash = re.sub(r"\s+", "", info_hash).upper()

+        # Validate: SHA1 = 40 hex chars, SHA256 = 64 hex chars
+        if not re.match(r"^[0-9A-F]{40}$|^[0-9A-F]{64}$", info_hash):
+            logger.warning("Info Hash invalid (got %r), trying magnet fallback.", info_hash)
+            # Fallback: search entire page for a complete magnet link (e.g. posted in comments)
+            magnet_match = re.search(r"magnet:\?xt=urn:btih:([0-9a-fA-F]{40,64})", detail_html)
+            if magnet_match:
+                info_hash = magnet_match.group(1).upper()
+                logger.info("Found hash via magnet fallback: %s", info_hash)
+            else:
+                logger.warning("No valid magnet link found on page, giving up.")
+                return None
+
         # 2. Extract Trackers
         # Find all <td> containing udp:// or http://
         trackers = []
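
The validation added above accepts a 40-character (SHA-1) or 64-character (SHA-256) hex string and otherwise searches the whole page for a magnet link. A minimal sketch of that two-step resolution, assuming the table value and page HTML are already available as strings; `resolve_info_hash` is a hypothetical name used only for this example.

```python
from __future__ import annotations

import re

# Exactly 40 (SHA-1) or 64 (SHA-256) uppercase hex characters.
_INFO_HASH_RE = re.compile(r"^(?:[0-9A-F]{40}|[0-9A-F]{64})$")
# Same fallback pattern as the diff; note it accepts any length from 40 to 64.
_MAGNET_RE = re.compile(r"magnet:\?xt=urn:btih:([0-9a-fA-F]{40,64})")


def resolve_info_hash(table_value: str, page_html: str) -> str | None:
    """Return a validated info hash, falling back to a magnet link on the page."""
    info_hash = re.sub(r"\s+", "", table_value).upper()
    if _INFO_HASH_RE.match(info_hash):
        return info_hash
    magnet = _MAGNET_RE.search(page_html)
    return magnet.group(1).upper() if magnet else None


if __name__ == "__main__":
    spaced = "ABC 123 DEF 456 789 012 345 678 901 234 567 890 ABC D"
    print(resolve_info_hash(spaced, ""))
```

This also shows why the test fixtures in this commit had to change: placeholder values containing letters outside `A-F` no longer pass validation.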
@@ -238,8 +238,8 @@ class AudiobookBaySource(ReleaseSource):
                 exact_phrase=exact_phrase,
             )

-            # For auto-generated queries, fallback to broad matching if exact phrase returns nothing.
-            if exact_phrase and not results and not plan.manual_query:
+            # Fallback to broad matching if exact phrase returns nothing (manual or auto query).
+            if exact_phrase and not results:
                 logger.info(
                     "No exact phrase results, retrying AudiobookBay search without quotes"
                 )
@@ -288,7 +288,7 @@ class AudiobookBaySource(ReleaseSource):
             size_str = result.get("size")
             size_bytes = parse_size(size_str) if size_str else None
             language_raw = result.get("language")
-            language_code = _map_language(language_raw) if language_raw else None
+            language_code = _map_language(language_raw) if language_raw else "en"
             bitrate = result.get("bitrate")
             bitrate_kbps = _parse_bitrate_to_kbps(bitrate)

@@ -3,9 +3,12 @@
 import itertools
 import json
 import re
+import threading
 import time
+import unicodedata
 from dataclasses import replace
 from http import HTTPStatus
+from pathlib import Path
 from typing import TYPE_CHECKING, ClassVar, NoReturn, TypedDict
 from urllib.parse import quote, urlparse

@@ -197,6 +200,49 @@ _SOURCE_FAILURE_THRESHOLD = 4
 _MIN_VALID_FILE_SIZE = 10 * 1024
 _AA_COUNTDOWN_MAX_SECONDS = 300

+# --- Distant-path language detection ---
+
+_DISTANT_PATH_EXTENSIONS = (
+    "epub",
+    "mobi",
+    "azw3",
+    "fb2",
+    "djvu",
+    "cbz",
+    "cbr",
+    "pdf",
+    "zip",
+    "rar",
+    "m4b",
+    "mp3",
+)
+_DISTANT_PATH_EXTENSION_PATTERN = "|".join(re.escape(e) for e in _DISTANT_PATH_EXTENSIONS)
+_DISTANT_PATH_PATTERN = re.compile(
+    rf"(?:[A-Za-z0-9._-]+/)?[A-Za-z]:(?:\\|/)[^\n\r<>\"]+?\.(?:{_DISTANT_PATH_EXTENSION_PATTERN})\b",
+    re.IGNORECASE,
+)
+_DISTANT_PATH_FALLBACK_PATTERN = re.compile(
+    r"(?:[A-Za-z0-9._-]+/)?[A-Za-z]:(?:\\|/)[^\n\r<>\"]+",
+    re.IGNORECASE,
+)
+_BRACKETED_LANGUAGE_CODE_PATTERN = re.compile(
+    r"\[(?:bd[\s._-]*)?([A-Za-z]{2,3})\]",
+    re.IGNORECASE,
+)
+_KEYED_LANGUAGE_CODE_PATTERN = re.compile(
+    r"\b(?:bd|lang(?:uage)?)\s*[:._-]?\s*([A-Za-z]{2,3})\b",
+    re.IGNORECASE,
+)
+_LANGUAGE_CODE_TOKEN_PATTERN = re.compile(
+    r"(?:^|[\s_./\\\-\[(])([A-Za-z]{2,3})(?=$|[\s_./\\\-)\]])"
+)
+_LANGUAGE_NAME_TOKEN_PATTERN = re.compile(r"[a-z]{4,}(?:-[a-z0-9]+)?")
+_LANGUAGE_ALIAS_TO_CODE: dict[str, str] | None = None
+_LANGUAGE_ALIAS_LOCK = threading.Lock()
+_LANGUAGE_PLACEHOLDERS = frozenset({"", "-", "--", "unknown", "unk", "n/a", "na"})
+# Short codes that appear in common words — require bracket/key context to accept
+_AMBIGUOUS_SHORT_LANGUAGE_CODES = frozenset({"de", "en", "it", "la", "no", "or", "is", "in"})
+
 # Sources that require Cloudflare bypass
 _CF_BYPASS_REQUIRED = frozenset({"aa-slow-nowait", "aa-slow-wait", "zlib", "welib"})

@@ -204,6 +250,189 @@ _CF_BYPASS_REQUIRED = frozenset({"aa-slow-nowait", "aa-slow-wait", "zlib", "weli
 _AA_PAGE_SOURCES = frozenset({"aa-slow-nowait", "aa-slow-wait"})


+def _is_language_from_path_enabled() -> bool:
+    return bool(config.get("DIRECT_DOWNLOAD_LANGUAGE_FROM_PATH", False))
+
+
+def _normalize_language_token(value: str) -> str:
+    normalized = value.strip().lower()
+    for dash in ("‑", "–", "—", "−"):
+        normalized = normalized.replace(dash, "-")
+    return normalized
+
+
+def _fold_text(value: str) -> str:
+    normalized = unicodedata.normalize("NFKD", value)
+    return "".join(c for c in normalized if not unicodedata.combining(c)).lower()
+
+
+def _language_alias_to_code() -> dict[str, str]:
+    """Build alias→code map from bundled language metadata (lazy, cached)."""
+    global _LANGUAGE_ALIAS_TO_CODE
+    cached = _LANGUAGE_ALIAS_TO_CODE
+    if cached is not None:
+        return cached
+
+    with _LANGUAGE_ALIAS_LOCK:
+        cached = _LANGUAGE_ALIAS_TO_CODE
+        if cached is not None:
+            return cached
+
+        mapping: dict[str, str] = {}
+        data_path = Path(__file__).resolve().parents[2] / "data" / "book-languages.json"
+
+        try:
+            raw = json.loads(data_path.read_text(encoding="utf-8"))
+        except (OSError, ValueError, TypeError):
+            _LANGUAGE_ALIAS_TO_CODE = {}
+            return _LANGUAGE_ALIAS_TO_CODE
+
+        if not isinstance(raw, list):
+            _LANGUAGE_ALIAS_TO_CODE = {}
+            return _LANGUAGE_ALIAS_TO_CODE
+
+        for item in raw:
+            if not isinstance(item, dict):
+                continue
+            code = _normalize_language_token(str(item.get("code", "")))
+            name = _normalize_language_token(str(item.get("language", "")))
+            if not code:
+                continue
+            mapping.setdefault(code, code)
+            mapping.setdefault(code.replace("-", "_"), code)
+            mapping.setdefault(code.split("-")[0], code)
+            mapping.setdefault(_fold_text(code), code)
+            if name:
+                mapping.setdefault(name, code)
+                mapping.setdefault(_fold_text(name), code)
+
+        _LANGUAGE_ALIAS_TO_CODE = mapping
+        return _LANGUAGE_ALIAS_TO_CODE
+
+
+def _extract_distant_path(row: Tag, *, enabled: bool) -> str | None:
+    """Extract the Windows-style file path from an AA search result row."""
+    if not enabled:
+        return None
+
+    def _normalize_candidate(text: str) -> str:
+        normalized = re.sub(r"\s*([\\/])\s*", r"\1", text)
+        normalized = re.sub(r":\s*([\\/])", r":\1", normalized)
+        return re.sub(
+            r"\s+\.(epub|mobi|azw3|fb2|djvu|cbz|cbr|pdf|zip|rar|m4b|mp3)\b",
+            r".\1",
+            normalized,
+            flags=re.IGNORECASE,
+        )
+
+    candidates = [row.get_text(" ", strip=True)]
+    for cell in row.find_all("td"):
+        cell_text = cell.get_text(" ", strip=True)
+        if cell_text:
+            candidates.append(cell_text)
+
+    best: str | None = None
+    for text in candidates:
+        for match in _DISTANT_PATH_PATTERN.findall(_normalize_candidate(text)):
+            candidate = match.strip().rstrip(".,;")
+            if best is None or len(candidate) > len(best):
+                best = candidate
+
+    if best is not None:
+        return best
+
+    for text in candidates:
+        for match in _DISTANT_PATH_FALLBACK_PATTERN.findall(_normalize_candidate(text)):
+            candidate = match.strip().rstrip(".,;")
+            if best is None or len(candidate) > len(best):
+                best = candidate
+
+    return best
+
+
+def _detect_language_from_distant_path(path: str | None) -> str | None:
+    """Infer a language code from distant-path tags such as [BD FR] or [Fr]."""
+    if not path:
+        return None
+
+    aliases = _language_alias_to_code()
+    if not aliases:
+        return None
+
+    folded_path = _fold_text(path)
+    strong_candidates: list[str] = []
+
+    for code in _BRACKETED_LANGUAGE_CODE_PATTERN.findall(path):
+        normalized = _normalize_language_token(code)
+        if normalized in aliases:
+            strong_candidates.append(aliases[normalized])
+
+    for code in _KEYED_LANGUAGE_CODE_PATTERN.findall(path):
+        normalized = _normalize_language_token(code)
+        if normalized in aliases:
+            strong_candidates.append(aliases[normalized])
+
+    non_ambiguous = [c for c in strong_candidates if c not in _AMBIGUOUS_SHORT_LANGUAGE_CODES]
+    if non_ambiguous:
+        return non_ambiguous[0]
+
+    for token in _LANGUAGE_NAME_TOKEN_PATTERN.findall(folded_path):
+        normalized = _normalize_language_token(token)
+        if normalized in aliases:
+            candidate = aliases[normalized]
+            if candidate not in _AMBIGUOUS_SHORT_LANGUAGE_CODES:
+                return candidate
+
+    if strong_candidates:
+        return strong_candidates[0]
+
+    for code in _LANGUAGE_CODE_TOKEN_PATTERN.findall(path):
+        normalized = _normalize_language_token(code)
+        if normalized in _AMBIGUOUS_SHORT_LANGUAGE_CODES:
+            continue
+        if normalized in aliases:
+            return aliases[normalized]
+
+    return None
+
+
+def _is_missing_or_placeholder_language(language: str | None) -> bool:
+    if language is None:
+        return True
+    return _normalize_language_token(language) in _LANGUAGE_PLACEHOLDERS
+
+
+def _normalize_requested_languages(languages: list[str] | None) -> set[str]:
+    if not languages:
+        return set()
+    aliases = _language_alias_to_code()
+    normalized: set[str] = set()
+    for value in languages:
+        token = _normalize_language_token(str(value))
+        if not token or token == "all":  # noqa: S105 - "all" is a language sentinel
+            continue
+        normalized.add(aliases.get(token, token))
+    return normalized
+
+
+def _book_matches_requested_languages(book_language: str | None, requested: set[str]) -> bool:
+    """Return True when a book's language matches the requested filter.
+
+    Books with unknown/missing language always pass — the server-side &lang= filter
+    already narrowed the result set, so dropping unlabelled rows hides valid results.
+    """
+    if not requested:
+        return True
+    if not book_language:
+        return True
+    aliases = _language_alias_to_code()
+    normalized_book = aliases.get(
+        _normalize_language_token(book_language),
+        _normalize_language_token(book_language),
+    )
+    return normalized_book in requested
+
+
 def _is_configured_zlib_link(url: str) -> bool:
     """Return True when a URL belongs to a configured Z-Library mirror."""
     from shelfmark.core.mirrors import get_zlib_cookie_domains
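
The first detection step in `_detect_language_from_distant_path` looks for bracketed tags such as `[BD FR]` or `[En]` and maps them through an alias table. The sketch below isolates only that step. It reuses the bracket pattern from the diff, but `ALIASES` is a tiny hand-written stand-in for the map loaded from `book-languages.json`, and `language_from_brackets` is a hypothetical name. The keyed-tag, language-name, and ambiguity handling of the real function are deliberately left out.

```python
from __future__ import annotations

import re

# Same bracket pattern as the diff: optional "BD" prefix, then a 2-3 letter code.
_BRACKETED = re.compile(r"\[(?:bd[\s._-]*)?([A-Za-z]{2,3})\]", re.IGNORECASE)

# Hypothetical subset of aliases; the real map is built from bundled metadata.
ALIASES = {"fr": "fr", "fre": "fr", "en": "en", "eng": "en", "de": "de", "ger": "de"}


def language_from_brackets(path: str) -> str | None:
    """Return the first bracketed tag in `path` that maps to a known language code."""
    for code in _BRACKETED.findall(path):
        normalized = code.strip().lower()
        if normalized in ALIASES:
            return ALIASES[normalized]
    return None


if __name__ == "__main__":
    print(language_from_brackets(r"lgli/V:\comics\[BD FR] Asterix 01.cbz"))
```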
@@ -360,9 +589,17 @@ def search_books(query: str, filters: SearchFilters) -> list[BrowseRecord]:

     filters_query = ""

-    for value in filters.lang or []:
-        if value and value != "all":
-            filters_query += f"&lang={quote(value)}"
+    path_language_enabled = _is_language_from_path_enabled()
+    requested_langs = _normalize_requested_languages(filters.lang)
+
+    # When path-language inference is on and a language is requested, skip the
+    # server-side &lang= filter: lgli files often have no AA language metadata
+    # and would be excluded before we can infer language from the distant path.
+    # Local filtering below handles the narrowing instead.
+    if not (path_language_enabled and requested_langs):
+        for value in filters.lang or []:
+            if value and value != "all":
+                filters_query += f"&lang={quote(value)}"

     if filters.sort and filters.sort != "relevance":
         filters_query += f"&sort={quote(filters.sort)}"
@@ -417,6 +654,9 @@ def search_books(query: str, filters: SearchFilters) -> list[BrowseRecord]:
         if book:
             books.append(book)

+    if path_language_enabled and requested_langs:
+        books = [b for b in books if _book_matches_requested_languages(b.language, requested_langs)]
+
     supported_formats = _get_supported_formats()

     books.sort(
@@ -470,10 +710,23 @@ def _parse_search_result_row(row: Tag) -> BrowseRecord | None:
         if not record_id:
             return None

+        path_language_enabled = _is_language_from_path_enabled()
+        distant_path = _extract_distant_path(row, enabled=path_language_enabled)
+
         preview_img = cells[0].find("img")
         preview = _get_attr(preview_img, "src") if isinstance(preview_img, Tag) else None

-        title = _first_stripped_text(cells[1].find("span"))
+        title_span = cells[1].find("span")
+        if isinstance(title_span, Tag):
+            # AA nests related-edition spans inside the main title span — take only direct text.
+            direct = " ".join(
+                str(c).strip()
+                for c in title_span.children
+                if isinstance(c, NavigableString) and str(c).strip()
+            ).strip()
+            title = direct or _first_stripped_text(title_span)
+        else:
+            title = None
         author = _first_stripped_text(cells[2].find("span"))
         publisher = _first_stripped_text(cells[3].find("span"))
         year = _first_stripped_text(cells[4].find("span"))
@@ -482,18 +735,19 @@ def _parse_search_result_row(row: Tag) -> BrowseRecord | None:
         file_format = _first_stripped_text(cells[9].find("span"))
         size = _first_stripped_text(cells[10].find("span"))

-        if (
-            title is None
-            or author is None
-            or publisher is None
-            or year is None
-            or language is None
-            or content is None
-            or file_format is None
-            or size is None
-        ):
+        # Only title and format are truly required — lgli rows often have sparse metadata
+        if title is None or file_format is None:
             return None

+        # Skip entries where the title is a catalog format descriptor, not a real title
+        # e.g. "Book/Online Audio", "Print book" — lgli metadata pollution
+        if title and "/" in title and len(title) < 40 and not any(c.isdigit() for c in title):
+            return None
+
+        if path_language_enabled and _is_missing_or_placeholder_language(language):
+            detected = _detect_language_from_distant_path(distant_path)
+            language = detected or "unknown"
+
         return BrowseRecord(
             id=record_id,
             title=title,
@@ -506,6 +760,7 @@ def _parse_search_result_row(row: Tag) -> BrowseRecord | None:
             content=content.lower() if content else None,
             format=file_format.lower() if file_format else None,
             size=size,
+            download_path=distant_path,
         )
     except (AttributeError, IndexError, KeyError, TypeError) as e:
         logger.error_trace(f"Error parsing search result row: {e}")
@@ -1228,6 +1483,9 @@ def _get_download_url(
     return downloader.get_absolute_url(link, url)


+_AA_COUNTDOWN_MAX_RETRIES = 3
+
+
 def _extract_slow_download_url(
     soup: BeautifulSoup,
     link: str,
@@ -1236,6 +1494,7 @@ def _extract_slow_download_url(
     status_callback: Callable[[str, str | None], None] | None,
     selector: network.AAMirrorSelector,
     source_context: str | None = None,
+    _countdown_attempts: int = 0,
 ) -> str:
     """Extract download URL from AA slow download pages."""
     html_str = str(soup)
@@ -1300,6 +1559,14 @@ def _extract_slow_download_url(

     countdown_seconds = _extract_countdown_seconds(soup, html_str)
     if countdown_seconds > 0:
+        if _countdown_attempts >= _AA_COUNTDOWN_MAX_RETRIES:
+            logger.warning(
+                "Countdown retry limit (%s) reached for %s, giving up",
+                _AA_COUNTDOWN_MAX_RETRIES,
+                title,
+            )
+            return ""
+
         max_countdown_seconds = 600
         sleep_time = min(countdown_seconds, max_countdown_seconds)
         if countdown_seconds > max_countdown_seconds:
@@ -1308,7 +1575,13 @@ def _extract_slow_download_url(
                 countdown_seconds,
                 max_countdown_seconds,
             )
-        logger.info("AA waitlist: %ss for %s", sleep_time, title)
+        logger.info(
+            "AA waitlist: %ss for %s (attempt %s/%s)",
+            sleep_time,
+            title,
+            _countdown_attempts + 1,
+            _AA_COUNTDOWN_MAX_RETRIES,
+        )

         # Live countdown with status updates
         for remaining in range(sleep_time, 0, -1):
@@ -1329,8 +1602,21 @@ def _extract_slow_download_url(
         if status_callback and source_context:
             status_callback("resolving", f"{source_context} - Fetching")

-        return _get_download_url(
-            link, title, cancel_flag, status_callback, selector, source_context
+        html = downloader.html_get_page(
+            link, selector=selector, cancel_flag=cancel_flag, status_callback=status_callback
+        )
+        if not html:
+            return ""
+        new_soup = BeautifulSoup(_html_response_text(html), "html.parser")
+        return _extract_slow_download_url(
+            new_soup,
+            link,
+            title,
+            cancel_flag,
+            status_callback,
+            selector,
+            source_context,
+            _countdown_attempts + 1,
         )

     link_texts = [a.get_text(strip=True)[:50] for a in soup.find_all("a", href=True)[:10]]
@@ -1645,7 +1931,6 @@ class DirectDownloadSource(ReleaseSource):
             except Exception:
                 logger.exception("Search error")

-        logger.info("Found %s releases via title+author", len(all_results))
         return [_browse_record_to_release(record) for record in all_results]

     def is_available(self) -> bool:
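
The countdown hunks above cap the wait-and-refetch cycle by threading an attempt counter through the recursive call. A minimal sketch of that bounded-retry shape, with page fetching abstracted behind a callable and the sleep omitted; `resolve_with_countdown`, the dictionary page format, and `MAX_RETRIES` are assumptions for illustration, not the module's real interfaces.

```python
from __future__ import annotations

from typing import Callable

MAX_RETRIES = 3  # plays the role of _AA_COUNTDOWN_MAX_RETRIES in the diff


def resolve_with_countdown(fetch_page: Callable[[], dict], attempts: int = 0) -> str:
    """Return a download URL, re-fetching after a countdown at most MAX_RETRIES times.

    `fetch_page` is a hypothetical stand-in that returns either {"url": ...}
    or {"countdown": seconds}.
    """
    page = fetch_page()
    if page.get("url"):
        return page["url"]

    if page.get("countdown", 0) > 0:
        if attempts >= MAX_RETRIES:
            # Give up instead of waiting on yet another countdown page.
            return ""
        # A real implementation would sleep for the countdown here.
        return resolve_with_countdown(fetch_page, attempts + 1)

    return ""


if __name__ == "__main__":
    pages = iter([{"countdown": 5}, {"url": "https://example.org/file.epub"}])
    print(resolve_with_countdown(lambda: next(pages)))
```

Because the counter travels as an argument, each download gets its own budget without any shared state.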
@@ -1491,7 +1491,7 @@ function App() {
   const handleCancel = async (id: string) => {
     try {
       await cancelDownload(id);
-      await fetchStatus();
+      await Promise.all([fetchStatus(), refreshActivitySnapshot()]);
     } catch (error) {
       console.error('Cancel failed:', error);
       showToast('Failed to cancel/clear download', 'error');
@@ -58,7 +58,7 @@ SAMPLE_DETAIL_HTML = """
         <table>
             <tr>
                 <td>Info Hash</td>
-                <td>ABC123DEF456GHI789JKL012MNO345PQR678STU</td>
+                <td>ABC123DEF456789012345678901234567890ABCD</td>
             </tr>
             <tr>
                 <td>Tracker 1</td>
@@ -83,7 +83,7 @@ DETAIL_HTML_NO_TRACKERS = """
         <table>
             <tr>
                 <td>Info Hash</td>
-                <td>ABC123DEF456GHI789JKL012MNO345PQR678STU</td>
+                <td>ABC123DEF456789012345678901234567890ABCD</td>
             </tr>
         </table>
     </body>
@@ -360,7 +360,7 @@ class TestExtractMagnetLink:

         assert magnet_link is not None
         assert magnet_link.startswith("magnet:?xt=urn:btih:")
-        assert "ABC123DEF456GHI789JKL012MNO345PQR678STU" in magnet_link
+        assert "ABC123DEF456789012345678901234567890ABCD" in magnet_link
         assert "udp%3A//tracker.openbittorrent.com%3A80" in magnet_link
         assert "http%3A//tracker.example.com%3A8080" in magnet_link
         assert mock_html_get.call_count == 2
@@ -395,7 +395,7 @@ class TestExtractMagnetLink:

         assert magnet_link is not None
         assert magnet_link.startswith("magnet:?xt=urn:btih:")
-        assert "ABC123DEF456GHI789JKL012MNO345PQR678STU" in magnet_link
+        assert "ABC123DEF456789012345678901234567890ABCD" in magnet_link
         # Should contain default trackers
         assert "udp%3A//tracker.openbittorrent.com%3A80" in magnet_link

@@ -441,7 +441,7 @@ class TestExtractMagnetLink:
         <table>
             <tr>
                 <td>Info Hash</td>
-                <td>ABC 123 DEF 456</td>
+                <td>ABC 123 DEF 456 789 012 345 678 901 234 567 890 ABC D</td>
             </tr>
         </table>
     </body>
@@ -368,6 +368,20 @@ def test_download_source_settings_include_direct_download_toggle():
     assert "Add your own mirror URLs" in toggle_field.description


+def test_download_source_settings_include_distant_path_language_toggle():
+    from shelfmark.config.settings import download_source_settings
+
+    fields = download_source_settings()
+    toggle_field = next(
+        field
+        for field in fields
+        if getattr(field, "key", None) == "DIRECT_DOWNLOAD_LANGUAGE_FROM_PATH"
+    )
+
+    assert toggle_field.default is False
+    assert "distant path" in toggle_field.description.lower()
+
+
 def test_fast_source_options_lock_entries_without_mirror_or_donator_requirements(monkeypatch):
     from shelfmark.config.settings import _get_fast_source_options

@@ -597,3 +597,72 @@ def test_resolve_user_routes_expands_multiselect_event_rows(monkeypatch):
         {"event": "request_fulfilled", "url": "ntfys://ntfy.sh/user-main"},
         {"event": "all", "url": "ntfys://ntfy.sh/user-all"},
     ]
+
+
+class TestAppriseProxyEnv:
+    """Regression tests for issue #956 — proxy settings ignored for notifications."""
+
+    def _patch_config(self, monkeypatch, values):
+        from shelfmark.core import config as config_module
+
+        def _fake_get(key, default="", **_kwargs):
+            return values.get(key, default)
+
+        monkeypatch.setattr(config_module.config, "get", _fake_get)
+
+    def test_http_proxy_mode_injects_proxy_env(self, monkeypatch):
+        self._patch_config(
+            monkeypatch,
+            {
+                "PROXY_MODE": "http",
+                "HTTP_PROXY": "http://proxy.example.com:8080",
+                "HTTPS_PROXY": "",
+                "NO_PROXY": "",
+            },
+        )
+        monkeypatch.delenv("HTTP_PROXY", raising=False)
+        monkeypatch.delenv("HTTPS_PROXY", raising=False)
+
+        result = notifications_module._apprise_proxy_env()
+
+        assert result["HTTP_PROXY"] == "http://proxy.example.com:8080"
+        assert result["HTTPS_PROXY"] == "http://proxy.example.com:8080"
+
+    def test_socks5_proxy_mode_injects_socks_env(self, monkeypatch):
+        self._patch_config(
+            monkeypatch,
+            {
+                "PROXY_MODE": "socks5",
+                "SOCKS5_PROXY": "socks5://proxy.example.com:1080",
+                "NO_PROXY": "",
+            },
+        )
+        monkeypatch.delenv("HTTP_PROXY", raising=False)
+        monkeypatch.delenv("HTTPS_PROXY", raising=False)
+
+        result = notifications_module._apprise_proxy_env()
+
+        assert result["HTTP_PROXY"] == "socks5://proxy.example.com:1080"
+        assert result["HTTPS_PROXY"] == "socks5://proxy.example.com:1080"
+
+    def test_no_proxy_mode_returns_empty_dict(self, monkeypatch):
+        self._patch_config(monkeypatch, {"PROXY_MODE": ""})
+
+        result = notifications_module._apprise_proxy_env()
+
+        assert result == {}
+
+    def test_does_not_override_already_set_env_vars(self, monkeypatch):
|
||||
self._patch_config(
|
||||
monkeypatch,
|
||||
{
|
||||
"PROXY_MODE": "http",
|
||||
"HTTP_PROXY": "http://new-proxy.example.com:8080",
|
||||
"NO_PROXY": "",
|
||||
},
|
||||
)
|
||||
monkeypatch.setenv("HTTP_PROXY", "http://existing-proxy.example.com:3128")
|
||||
|
||||
result = notifications_module._apprise_proxy_env()
|
||||
|
||||
assert "HTTP_PROXY" not in result
|
||||
|
||||
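The behaviour these `TestAppriseProxyEnv` cases pin down can be sketched as a pure function. This is only an illustration under stated assumptions: the function name `proxy_env_from_config`, its two-mapping signature, and the `NO_PROXY` handling are hypothetical. The real `_apprise_proxy_env` reads from the application config and the process environment.

```python
from collections.abc import Mapping


def proxy_env_from_config(
    values: Mapping[str, str], environ: Mapping[str, str]
) -> dict[str, str]:
    """Hypothetical sketch: compute proxy env vars to inject before dispatch."""
    mode = values.get("PROXY_MODE", "")
    if mode == "http":
        http_proxy = values.get("HTTP_PROXY", "")
        # HTTPS falls back to the HTTP proxy when no HTTPS proxy is configured.
        https_proxy = values.get("HTTPS_PROXY", "") or http_proxy
        wanted = {"HTTP_PROXY": http_proxy, "HTTPS_PROXY": https_proxy}
    elif mode == "socks5":
        socks = values.get("SOCKS5_PROXY", "")
        wanted = {"HTTP_PROXY": socks, "HTTPS_PROXY": socks}
    else:
        return {}

    no_proxy = values.get("NO_PROXY", "")
    if no_proxy:
        wanted["NO_PROXY"] = no_proxy  # assumption: passed through when set

    # Never override variables the operator already exported.
    return {k: v for k, v in wanted.items() if v and k not in environ}
```

Keeping the function free of global state makes each case in the tests above a one-line assertion.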
@@ -5,6 +5,31 @@ import types
|
||||
import xmlrpc.client as stdlib_xmlrpc_client
|
||||
|
||||
from shelfmark.core import utils
|
||||
from shelfmark.core.utils import normalize_http_url
|
||||
|
||||
|
||||
class TestNormalizeHttpUrlQueryStripping:
|
||||
"""Regression tests for issue #999 — mirror URLs with query params/fragments."""
|
||||
|
||||
def test_strips_query_string_from_configured_url(self) -> None:
|
||||
result = normalize_http_url("http://mirror.example.com/search?token=abc123")
|
||||
assert result == "http://mirror.example.com/search"
|
||||
|
||||
def test_strips_fragment_from_configured_url(self) -> None:
|
||||
result = normalize_http_url("http://mirror.example.com/search#section")
|
||||
assert result == "http://mirror.example.com/search"
|
||||
|
||||
def test_strips_both_query_and_fragment(self) -> None:
|
||||
result = normalize_http_url("https://mirror.example.com/path?key=val&x=1#top")
|
||||
assert result == "https://mirror.example.com/path"
|
||||
|
||||
def test_plain_url_unchanged(self) -> None:
|
||||
result = normalize_http_url("http://mirror.example.com/search")
|
||||
assert result == "http://mirror.example.com/search"
|
||||
|
||||
def test_trailing_slash_still_stripped_after_query_removal(self) -> None:
|
||||
result = normalize_http_url("http://mirror.example.com/?token=x")
|
||||
assert result == "http://mirror.example.com"
|
||||
|
||||
|
||||
def test_get_hardened_xmlrpc_client_tolerates_patch_runtime_error(monkeypatch) -> None:
|
||||
|
||||
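For reference, the query and fragment stripping exercised by `TestNormalizeHttpUrlQueryStripping` can be approximated with the standard library alone. The function below is a hypothetical stand-in named `strip_query_and_fragment`. It is not the body of `normalize_http_url`, which may perform further normalization.

```python
from urllib.parse import urlsplit, urlunsplit


def strip_query_and_fragment(url: str) -> str:
    """Hypothetical sketch: drop ?query and #fragment, then any trailing slash."""
    parts = urlsplit(url.strip())
    # Rebuild the URL from scheme, host and path only.
    cleaned = urlunsplit((parts.scheme, parts.netloc, parts.path, "", ""))
    return cleaned.rstrip("/")
```

Stripping the trailing slash after the query is removed matters for inputs such as `http://mirror.example.com/?token=x`, where the path is a bare `/`.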
@@ -182,3 +182,170 @@ class TestDirectDownloadSearchQueries:
|
||||
("mistborn custom query", ["en"], ["epub"]),
|
||||
("mistborn custom query", None, ["epub"]),
|
||||
]
|
||||
|
||||
|
||||
# --- Distant-path language detection tests ---
|
||||
|
||||
|
||||
def _patch_path_language(monkeypatch, enabled: bool = True):
|
||||
import shelfmark.release_sources.direct_download as dd
|
||||
|
||||
original_get = dd.config.get
|
||||
|
||||
def _fake_get(key: str, default=None, user_id=None):
|
||||
del user_id
|
||||
if key == "DIRECT_DOWNLOAD_LANGUAGE_FROM_PATH":
|
||||
return enabled
|
||||
return original_get(key, default)
|
||||
|
||||
monkeypatch.setattr(dd.config, "get", _fake_get)
|
||||
return dd
|
||||
|
||||
|
||||
def _row_from_html(html: str):
|
||||
from bs4 import BeautifulSoup
|
||||
|
||||
return BeautifulSoup(html, "html.parser").find("tr")
|
||||
|
||||
|
||||
def _make_row(distant_path: str, language: str = "", record_id: str = "rec-1") -> str:
|
||||
return rf"""
|
||||
<tr>
|
||||
<td><a href="/md5/{record_id}"><img src="cover.jpg"></a></td>
|
||||
<td><span>A Book Title</span></td>
|
||||
<td><span>Author Name</span></td>
|
||||
<td><span>Publisher</span></td>
|
||||
<td><span>2024</span></td>
|
||||
<td><span>-</span></td>
|
||||
<td><span>-</span></td>
|
||||
<td><span>{language}</span></td>
|
||||
<td><span>fiction</span></td>
|
||||
<td><span>epub</span></td>
|
||||
<td><span>1 mb</span></td>
|
||||
<td><span>{distant_path}</span></td>
|
||||
</tr>
|
||||
"""
|
||||
|
||||
|
||||
def test_detects_bracketed_language_from_distant_path(monkeypatch):
|
||||
dd = _patch_path_language(monkeypatch)
|
||||
row = _row_from_html(_make_row(r"lgli/N:\comics1\emule\2021.08.01\[BD FR] Scrameustache.cbz"))
|
||||
record = dd._parse_search_result_row(row)
|
||||
assert record is not None
|
||||
assert record.language == "fr"
|
||||
assert record.download_path is not None
|
||||
|
||||
|
||||
def test_detects_mixed_case_bracketed_language(monkeypatch):
|
||||
dd = _patch_path_language(monkeypatch)
|
||||
row = _row_from_html(_make_row(r"lgli/V:\comics\_0DAY3\[Fr]\BDs [Fr]\!Pdf\S\Book.pdf"))
|
||||
record = dd._parse_search_result_row(row)
|
||||
assert record is not None
|
||||
assert record.language == "fr"
|
||||
|
||||
|
||||
def test_overrides_unknown_language_with_path_detection(monkeypatch):
|
||||
dd = _patch_path_language(monkeypatch)
|
||||
row = _row_from_html(_make_row(r"lgli/V:\comics\_0DAY3\[Fr]\Book.pdf", language="unknown"))
|
||||
record = dd._parse_search_result_row(row)
|
||||
assert record is not None
|
||||
assert record.language == "fr"
|
||||
|
||||
|
||||
def test_sets_unknown_when_path_has_no_language(monkeypatch):
|
||||
dd = _patch_path_language(monkeypatch)
|
||||
row = _row_from_html(_make_row(r"lgli/N:\comics1\emule\NoLanguageHere.epub"))
|
||||
record = dd._parse_search_result_row(row)
|
||||
assert record is not None
|
||||
assert record.language == "unknown"
|
||||
|
||||
|
||||
def test_avoids_en_false_positive_when_french_present(monkeypatch):
|
||||
dd = _patch_path_language(monkeypatch)
|
||||
row = _row_from_html(
|
||||
_make_row(r"lgli/V:\comics\_0DAY2\Stripboeken Frans - BD en Français\[BD Fr] Book.cbr")
|
||||
)
|
||||
record = dd._parse_search_result_row(row)
|
||||
assert record is not None
|
||||
assert record.language == "fr"
|
||||
|
||||
|
||||
def test_keeps_row_with_missing_language_when_toggle_disabled(monkeypatch):
|
||||
dd = _patch_path_language(monkeypatch, enabled=False)
|
||||
row = _row_from_html(_make_row(r"lgli/N:\comics1\[BD FR] Scrameustache.cbz"))
|
||||
record = dd._parse_search_result_row(row)
|
||||
assert record is not None
|
||||
assert record.language is None
|
||||
|
||||
|
||||
def test_keeps_sparse_lgli_row(monkeypatch):
|
||||
"""lgli rows missing author/publisher/year must not be dropped."""
|
||||
dd = _patch_path_language(monkeypatch)
|
||||
html = r"""
|
||||
<tr>
|
||||
<td><a href="/md5/sparse-1"><img src="cover.jpg"></a></td>
|
||||
<td><span>Gos - 1978 - Le scrameustache T06.cbz</span></td>
|
||||
<td></td><td></td><td></td><td></td><td></td><td></td>
|
||||
<td><span>Comic book</span></td>
|
||||
<td><span>cbz</span></td>
|
||||
<td><span>17.4MB</span></td>
|
||||
<td><span>lgli/N:\comics1\ftp\[BD.FR] French Comics\Book.cbz</span></td>
|
||||
</tr>
|
||||
"""
|
||||
record = dd._parse_search_result_row(_row_from_html(html))
|
||||
assert record is not None
|
||||
assert record.id == "sparse-1"
|
||||
assert record.language == "fr"
|
||||
assert record.author is None
|
||||
|
||||
|
||||
def test_search_books_filters_locally_when_path_language_enabled(monkeypatch):
|
||||
dd = _patch_path_language(monkeypatch)
|
||||
monkeypatch.setattr(dd.network, "get_aa_base_url", lambda: "https://mirror.example")
|
||||
monkeypatch.setattr(dd.network, "AAMirrorSelector", lambda: object())
|
||||
|
||||
captured_url: dict[str, str] = {}
|
||||
|
||||
def _fake_html_get_page(url: str, selector, allow_bypasser_fallback=False):
|
||||
del selector, allow_bypasser_fallback
|
||||
captured_url["url"] = url
|
||||
return r"""
|
||||
<table>
|
||||
<tr>
|
||||
<td><a href="/md5/rec-fr"><img src="c.jpg"></a></td>
|
||||
<td><span>Livre FR</span></td><td><span>Auteur</span></td>
|
||||
<td><span>Editeur</span></td><td><span>2025</span></td>
|
||||
<td><span>-</span></td><td><span>-</span></td><td></td>
|
||||
<td><span>fiction</span></td><td><span>pdf</span></td>
|
||||
<td><span>2 mb</span></td>
|
||||
<td><span>lgli/V:\comics\_0DAY3\[Fr]\Book FR.pdf</span></td>
|
||||
</tr>
|
||||
<tr>
|
||||
<td><a href="/md5/rec-en"><img src="c.jpg"></a></td>
|
||||
<td><span>Book EN</span></td><td><span>Author</span></td>
|
||||
<td><span>Publisher</span></td><td><span>2025</span></td>
|
||||
<td><span>-</span></td><td><span>-</span></td><td></td>
|
||||
<td><span>fiction</span></td><td><span>pdf</span></td>
|
||||
<td><span>2 mb</span></td>
|
||||
<td><span>lgli/V:\comics\_0DAY3\[En]\Book EN.pdf</span></td>
|
||||
</tr>
|
||||
</table>
|
||||
"""
|
||||
|
||||
monkeypatch.setattr(dd.downloader, "html_get_page", _fake_html_get_page)
|
||||
|
||||
records = dd.search_books("demo", SearchFilters(lang=["fr"], format=["pdf"]))
|
||||
|
||||
assert "&lang=" not in captured_url["url"]
|
||||
assert len(records) == 1
|
||||
assert records[0].id == "rec-fr"
|
||||
assert records[0].language == "fr"
|
||||
|
||||
|
||||
def test_book_matches_requested_languages_logic():
|
||||
import shelfmark.release_sources.direct_download as dd
|
||||
|
||||
assert dd._book_matches_requested_languages(None, {"fr"}) is True
|
||||
assert dd._book_matches_requested_languages(None, set()) is True
|
||||
assert dd._book_matches_requested_languages("en", {"fr"}) is False
|
||||
assert dd._book_matches_requested_languages("fr", {"fr"}) is True
|
||||
|
||||
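The distant-path tests above all rely on a language tag that appears inside square brackets, such as `[BD FR]`, `[Fr]` or `[BD.FR]`. A rough sketch of that idea follows. The function name `language_from_path` and the small `KNOWN_CODES` set are assumptions made for illustration, and the project's detection logic may use a different vocabulary and additional heuristics.

```python
import re
from typing import Optional

# Assumption: a tiny illustrative set of two-letter codes, not the project's list.
KNOWN_CODES = {"en", "fr", "de", "es", "it"}
_BRACKET_RE = re.compile(r"\[([^\]]+)\]")


def language_from_path(path: str) -> Optional[str]:
    """Hypothetical sketch: find a language code inside [bracketed] path segments."""
    for group in _BRACKET_RE.findall(path):
        # Split "BD FR" or "BD.FR" into alphabetic tokens.
        for token in re.split(r"[^A-Za-z]+", group):
            if token.lower() in KNOWN_CODES:
                return token.lower()
    return None
```

Restricting the search to bracketed segments is what avoids the false positive in a path like `BD en Français\[BD Fr] Book.cbr`, where the bare word `en` sits outside any brackets.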
@@ -349,6 +349,105 @@ class TestRTorrentClientAddDownload:
|
||||
assert "RPC Error" in str(excinfo.value)
|
||||
|
||||
|
||||
class TestRTorrentClientAudiobookLabel:
|
||||
"""Regression tests for issue #1025 — rTorrent audiobook label selection."""
|
||||
|
||||
def _make_client(self, monkeypatch, config_values):
|
||||
monkeypatch.setattr(
|
||||
"shelfmark.download.clients.rtorrent.config.get",
|
||||
make_config_getter(config_values),
|
||||
)
|
||||
mock_rpc = MagicMock()
|
||||
mock_xmlrpc = create_mock_xmlrpc_module()
|
||||
mock_xmlrpc.ServerProxy.return_value = mock_rpc
|
||||
|
||||
mock_torrent_info = MagicMock()
|
||||
mock_torrent_info.torrent_data = None
|
||||
mock_torrent_info.magnet_url = "magnet:?xt=urn:btih:abc123"
|
||||
mock_torrent_info.info_hash = "abc123"
|
||||
mock_torrent_info.is_magnet = True
|
||||
|
||||
return mock_rpc, mock_xmlrpc, mock_torrent_info
|
||||
|
||||
def test_uses_audiobook_label_when_content_type_is_audiobook(self, monkeypatch):
|
||||
config_values = {
|
||||
"RTORRENT_URL": "http://localhost:8080/RPC2",
|
||||
"RTORRENT_LABEL": "books",
|
||||
"RTORRENT_AUDIOBOOK_LABEL": "audiobooks",
|
||||
"RTORRENT_DOWNLOAD_DIR": "/downloads",
|
||||
}
|
||||
mock_rpc, mock_xmlrpc, mock_torrent_info = self._make_client(monkeypatch, config_values)
|
||||
|
||||
with patch.dict("sys.modules", {"xmlrpc.client": mock_xmlrpc}):
|
||||
with patch(
|
||||
"shelfmark.download.clients.torrent_utils.extract_torrent_info",
|
||||
return_value=mock_torrent_info,
|
||||
):
|
||||
if "shelfmark.download.clients.rtorrent" in sys.modules:
|
||||
del sys.modules["shelfmark.download.clients.rtorrent"]
|
||||
from shelfmark.download.clients.rtorrent import RTorrentClient
|
||||
|
||||
client = RTorrentClient()
|
||||
client.add_download(
|
||||
"magnet:?xt=urn:btih:abc123", "Test Audiobook", content_type="audiobook"
|
||||
)
|
||||
|
||||
args = mock_rpc.load.start.call_args[0]
|
||||
assert "d.custom1.set=audiobooks" in args[2]
|
||||
assert "d.custom1.set=books" not in args[2]
|
||||
|
||||
def test_falls_back_to_book_label_when_audiobook_label_not_set(self, monkeypatch):
|
||||
config_values = {
|
||||
"RTORRENT_URL": "http://localhost:8080/RPC2",
|
||||
"RTORRENT_LABEL": "books",
|
||||
"RTORRENT_AUDIOBOOK_LABEL": "",
|
||||
"RTORRENT_DOWNLOAD_DIR": "/downloads",
|
||||
}
|
||||
mock_rpc, mock_xmlrpc, mock_torrent_info = self._make_client(monkeypatch, config_values)
|
||||
|
||||
with patch.dict("sys.modules", {"xmlrpc.client": mock_xmlrpc}):
|
||||
with patch(
|
||||
"shelfmark.download.clients.torrent_utils.extract_torrent_info",
|
||||
return_value=mock_torrent_info,
|
||||
):
|
||||
if "shelfmark.download.clients.rtorrent" in sys.modules:
|
||||
del sys.modules["shelfmark.download.clients.rtorrent"]
|
||||
from shelfmark.download.clients.rtorrent import RTorrentClient
|
||||
|
||||
client = RTorrentClient()
|
||||
client.add_download(
|
||||
"magnet:?xt=urn:btih:abc123", "Test Audiobook", content_type="audiobook"
|
||||
)
|
||||
|
||||
args = mock_rpc.load.start.call_args[0]
|
||||
assert "d.custom1.set=books" in args[2]
|
||||
|
||||
def test_uses_book_label_for_non_audiobook_content(self, monkeypatch):
|
||||
config_values = {
|
||||
"RTORRENT_URL": "http://localhost:8080/RPC2",
|
||||
"RTORRENT_LABEL": "books",
|
||||
"RTORRENT_AUDIOBOOK_LABEL": "audiobooks",
|
||||
"RTORRENT_DOWNLOAD_DIR": "/downloads",
|
||||
}
|
||||
mock_rpc, mock_xmlrpc, mock_torrent_info = self._make_client(monkeypatch, config_values)
|
||||
|
||||
with patch.dict("sys.modules", {"xmlrpc.client": mock_xmlrpc}):
|
||||
with patch(
|
||||
"shelfmark.download.clients.torrent_utils.extract_torrent_info",
|
||||
return_value=mock_torrent_info,
|
||||
):
|
||||
if "shelfmark.download.clients.rtorrent" in sys.modules:
|
||||
del sys.modules["shelfmark.download.clients.rtorrent"]
|
||||
from shelfmark.download.clients.rtorrent import RTorrentClient
|
||||
|
||||
client = RTorrentClient()
|
||||
client.add_download("magnet:?xt=urn:btih:abc123", "Test Book")
|
||||
|
||||
args = mock_rpc.load.start.call_args[0]
|
||||
assert "d.custom1.set=books" in args[2]
|
||||
assert "d.custom1.set=audiobooks" not in args[2]
|
||||
|
||||
|
||||
class TestRTorrentClientGetStatus:
|
||||
"""Tests for RTorrentClient.get_status()."""
|
||||
|
||||
|
||||
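The three `TestRTorrentClientAudiobookLabel` cases reduce to a small selection rule: use the audiobook label only for audiobook content and only when it is non-empty, otherwise use the book label. A minimal sketch is shown here. The helper names are hypothetical, and only the `d.custom1.set=` command prefix is taken from the assertions above.

```python
from typing import Optional


def select_rtorrent_label(
    content_type: Optional[str], book_label: str, audiobook_label: str
) -> str:
    """Hypothetical sketch: pick the label to apply to a new rTorrent download."""
    if content_type == "audiobook" and audiobook_label:
        return audiobook_label
    # Unset or empty audiobook label falls back to the book label.
    return book_label


def label_command(label: str) -> str:
    """Build the command fragment the tests look for in the load.start arguments."""
    return f"d.custom1.set={label}"
```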
@@ -383,6 +383,42 @@ class TestTransmissionClientGetStatus:
|
||||
assert status.complete is True
|
||||
assert "/downloads/Test Torrent" in status.file_path
|
||||
|
||||
def test_get_status_stopped_treated_as_complete(self, monkeypatch):
|
||||
"""Regression: torrents stopped after seeding ratio/idle limit must show complete."""
|
||||
config_values = {
|
||||
"TRANSMISSION_URL": "http://localhost:9091",
|
||||
"TRANSMISSION_USERNAME": "admin",
|
||||
"TRANSMISSION_PASSWORD": "password",
|
||||
"TRANSMISSION_CATEGORY": "test",
|
||||
}
|
||||
monkeypatch.setattr(
|
||||
"shelfmark.download.clients.transmission.config.get",
|
||||
make_config_getter(config_values),
|
||||
)
|
||||
|
||||
mock_torrent = MockTorrent(
|
||||
percent_done=1.0,
|
||||
status="stopped",
|
||||
download_dir="/downloads",
|
||||
)
|
||||
mock_client_instance = MagicMock()
|
||||
mock_client_instance.get_torrent.return_value = mock_torrent
|
||||
|
||||
mock_transmission_rpc = create_mock_transmission_rpc_module()
|
||||
mock_transmission_rpc.Client.return_value = mock_client_instance
|
||||
|
||||
with patch.dict("sys.modules", {"transmission_rpc": mock_transmission_rpc}):
|
||||
if "shelfmark.download.clients.transmission" in sys.modules:
|
||||
del sys.modules["shelfmark.download.clients.transmission"]
|
||||
|
||||
from shelfmark.download.clients.transmission import TransmissionClient
|
||||
|
||||
client = TransmissionClient()
|
||||
status = client.get_status("abc123")
|
||||
|
||||
assert status.complete is True
|
||||
assert status.progress == 100.0
|
||||
|
||||
def test_get_status_not_found(self, monkeypatch):
|
||||
"""Test status for non-existent torrent."""
|
||||
config_values = {
|
||||
|
||||
9
tor.sh
@@ -304,15 +304,20 @@ rotation_monitor() {
|
||||
echo "[*] Circuit rotation #$rotation_count at $(date)"
|
||||
|
||||
# Test DNS resolution through Tor
|
||||
+dns_ok=true
|
||||
if ! timeout 10 nslookup google.com 127.0.0.1 > /dev/null 2>&1; then
|
||||
echo "[!] $(date): DNS resolution slow/failing, rotating circuits..."
|
||||
pkill -HUP tor || true
|
||||
sleep 10
|
||||
+dns_ok=false
|
||||
fi
|
||||
|
||||
# Proactively rotate circuits every 5 minutes to keep them fresh
|
||||
-echo "[*] $(date): Proactive circuit rotation..."
|
||||
-pkill -HUP tor || true
|
||||
+# Skip if we already rotated for DNS failure this cycle
|
||||
+if $dns_ok; then
|
||||
+echo "[*] $(date): Proactive circuit rotation..."
|
||||
+pkill -HUP tor || true
|
||||
+fi
|
||||
|
||||
# Verify Tor is still responsive after rotation
|
||||
sleep 5
|
||||
|
||||