Commit Graph

6326 Commits

Author SHA1 Message Date
Navid EMAD
32dbd716b1 Apply fragment parse-mode to DOMParser
Closes the DOMParser gap left as a follow-up in the previous review-fix
commit. DOMParser.parseFromString built its target Document via the
frame's parser without touching `_parse_mode`, so `Build.created` →
`linkAddedCallback` → `loadExternalStylesheet` saw `_parse_mode ==
.document` and fetched/registered sheets on the LIVE frame document for
every stylesheet link in the parsed string.

Bracket both the text/html and XML branches with the same fragment
parse-mode `parseHtmlAsChildren` uses. The existing gate in
`loadExternalStylesheet` already short-circuits on .fragment, so no
change is needed there. Side benefits: parser-emitted scripts in
DOMParser content stop reaching `scriptAddedCallback` against the live
frame, default-script injection skips DOMParser content, and mutation
observers on the live document no longer fan out for parsed nodes —
all of which match what DOMParser should do per spec.

Regression test extended to cover the DOMParser path alongside the
existing innerHTML case.

Refs #2343
2026-05-19 15:51:35 +02:00
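Note on 32dbd716b1: the bracketing described above can be modelled roughly as below. This is a Python sketch of the save/switch/restore pattern only, not the project's Zig code. `Frame`, `parse_mode`, `fragment_mode`, `load_external_stylesheet` and `dom_parser_parse` are hypothetical stand-ins named after the identifiers in the commit message.

```python
from contextlib import contextmanager


class Frame:
    """Hypothetical stand-in for the frame that owns the parser state."""

    def __init__(self):
        self.parse_mode = "document"
        self.fetched = []  # hrefs that would have been fetched

    @contextmanager
    def fragment_mode(self):
        # Save the current mode, switch to fragment, and always restore it.
        previous = self.parse_mode
        self.parse_mode = "fragment"
        try:
            yield
        finally:
            self.parse_mode = previous

    def load_external_stylesheet(self, href):
        # Existing gate: links seen while parsing a fragment never fetch.
        if self.parse_mode == "fragment":
            return
        self.fetched.append(href)

    def dom_parser_parse(self, hrefs):
        # Bracket the whole parse so parser callbacks observe fragment mode.
        with self.fragment_mode():
            for href in hrefs:
                self.load_external_stylesheet(href)
```

With the bracket in place, links inside DOMParser input are skipped, while a later link on the live document still fetches because the previous mode is restored.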
Navid EMAD
f05efd6719 Harden external stylesheet path per code review
Addresses 8 findings from ultrareview on the external stylesheet feature:

* UAF on CDP teardown during syncRequest. `loadExternalStylesheet`
  pumps the CDP socket inline, so a `Target.closeTarget` arriving
  mid-fetch could drive `Session.removePage` and free the frame
  while we still held `self`. Set `_script_manager.base.is_evaluating`
  around the call — the same bracket every other syncRequest caller
  uses, which is what `Session.removePage`'s reentrancy guard checks.

* Disconnect leak. `link.remove()` left the sheet on
  `document.styleSheets` and in the cascade forever; the disconnect
  walker had a `<style>` branch but no `<link>` mirror. Common SPA
  theme-switch pattern (append new sheet, remove old) was broken.
  Added the parallel `else if` branch.

* Fragment-parsed links. `Build.created` fires for parser-instantiated
  elements before attachment, including innerHTML / outerHTML /
  insertAdjacentHTML / Range.createContextualFragment / <template>
  content. Without a guard those fetched against the live document
  and registered phantom sheets even when the fragment was never
  attached. Added `_parse_mode == .fragment` early-return mirroring
  the existing `nodeIsReady` short-circuit. DOMParser is a separate
  case (parses with `.document` into a different Document) and is
  left as a known follow-up.

* Missing Referer. Every other resource-fetch path
  (ScriptManagerBase, XHR, Fetch, WorkerGlobalScope) routes through
  `Frame.headersForRequest` to attach the cached `Referer` header.
  Many CDNs gate stylesheet delivery on Referer; without it requests
  returned 403/302 and the CSS silently failed. Added the call.

* Header OOM leak. `headers.add` between `newHeaders()` and
  `syncRequest` (which takes ownership) leaked the initial 3-entry
  slist on OOM. Added `errdefer headers.deinit()` mirroring
  RobotsLayer.zig:121-122.

* `_href` mutated before parse could fail. On parse error the cached
  sheet was left with the new URL but old rules dropped — violated
  the "previous sheet intact on failure" invariant the PR description
  promises. Moved the `_href` assignment to after `replaceSync`
  succeeds. Full atomicity would require a scratch-list pattern in
  `CSSStyleSheet.replaceSync` itself; documented as a known limit.

* `_sheet` cached before registration could OOM. If `sheets.add`
  failed, `link._sheet` pointed at an unregistered sheet and every
  future re-fetch short-circuited via the `orelse` branch, leaving
  the sheet permanently unreachable through `document.styleSheets`.
  Assign `link._sheet` only after `sheets.add` succeeds.

* Stale CLI help text claimed `--enable-external-stylesheets` was a
  no-op surface. Removed the obsolete sentence.

New regression tests cover fragment-parse skip and disconnect
removal+re-add. Full suite 694/694 pass.

Refs #2343
2026-05-19 15:51:34 +02:00
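Note on f05efd6719: the `_href` finding comes down to ordering, assigning the new URL only after the parse has succeeded. A minimal Python sketch of that ordering follows. `Sheet`, `ParseError`, `refetch` and the brace-counting "parser" are hypothetical and exist only for illustration. Unlike the real `replaceSync`, this toy parser validates before it mutates, so it does not reproduce the partial-atomicity limit the commit documents.

```python
class ParseError(Exception):
    """Raised by the toy parser below."""


class Sheet:
    def __init__(self, href, text):
        self.href = href
        self.text = text

    def replace_sync(self, text):
        # Toy parser: an unbalanced brace count is treated as a parse error.
        if text.count("{") != text.count("}"):
            raise ParseError(text)
        self.text = text


def refetch(sheet, new_href, body):
    sheet.replace_sync(body)  # may raise; href still holds the old URL
    sheet.href = new_href     # only reached when the parse succeeded
```

The same ordering applies to the `_sheet` finding: cache the pointer only after the registration step that can fail has returned successfully.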
Navid EMAD
4592812027 Reuse cached sheet on link href change
Caught in code review: `loadExternalStylesheet` created a fresh
`CSSStyleSheet` and appended to `document.styleSheets` on every call, so
mutating `link.href` on a connected stylesheet element accumulated stale
sheets — the old rules kept cascading because the previous sheet was
never removed.

Cache the sheet on `Link._sheet` (mirroring `Style._sheet`) and reuse it
via `replaceSync` on re-fetch. First load creates + registers as before;
subsequent loads swap content in place, keeping `document.styleSheets`
length stable.

On fetch failure the cached sheet is untouched — matches browser
behavior where a broken href doesn't invalidate the previously loaded
sheet until the link itself is removed.

Refs #2343
2026-05-19 15:50:11 +02:00
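Note on 4592812027: the create-once, replace-afterwards behaviour can be sketched as follows. This is an illustrative Python model under stated assumptions: `Sheet`, `Link` and the `body is None` convention for a failed fetch are hypothetical, and `style_sheets` is a plain list standing in for `document.styleSheets`.

```python
class Sheet:
    def __init__(self):
        self.text = ""

    def replace_sync(self, text):
        self.text = text


class Link:
    def __init__(self):
        self.sheet = None  # cached sheet, mirroring the idea of `Link._sheet`


def load_external_stylesheet(link, style_sheets, body):
    """`body` is the fetched CSS text, or None when the fetch failed."""
    if body is None:
        return  # fetch failure: leave any cached sheet untouched
    if link.sheet is None:
        # First load: create the sheet and register it.
        sheet = Sheet()
        sheet.replace_sync(body)
        style_sheets.append(sheet)
        link.sheet = sheet
    else:
        # Later loads: swap content in place, list length stays stable.
        link.sheet.replace_sync(body)
```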
Navid EMAD
3e409d49e9 Implement external stylesheet fetch + parse
Wires up --enable-external-stylesheets / LP.configureLoading.externalStylesheets
from the prior surface-only commit. When the flag is set, parser- and
JS-created <link rel=stylesheet> elements now synchronously fetch and parse
their href, register a CSSStyleSheet on document.styleSheets, and feed
StyleManager so checkVisibility() reflects external rules. Flag stays
default-off — scrapers that don't need accurate visibility pay nothing.

Frame.loadExternalStylesheet mirrors ScriptManager.addFromElement: same
HttpClient.syncRequest path, same arena ownership, same per-frame
notification + cookie wiring. Body is routed through CSSStyleSheet.replaceSync,
which already parses, populates cssRules, and calls sheetModified() — no
StyleManager changes needed. 2 MiB hard cap on a single sheet body, status
non-2xx and oversize both fire `error` on the link.

Link.Build.created is added so static head <link> elements reach
linkAddedCallback at all — void elements never trigger nodeComplete, which
is why static `<link>` had no observable effect before. Mirrors Image.

HttpClient.Request.ResourceType gains a `.stylesheet` variant so CDP Network
events report the right type; cdp.fetch.zig switches updated.

Refs #2343
2026-05-19 15:50:11 +02:00
Navid EMAD
6ed41ea346 Add --enable-external-stylesheets flag (no-op surface)
Reserves the CLI flag and LP.configureLoading externalStylesheets field
so drivers can adopt the API before the fetch implementation lands in a
follow-up that depends on #2303.

The bool is intentionally unread in this PR. Mirrors the existing
--disable-subframes / --disable-workers plumbing; the CDP field extends
LP.configureLoading alongside subFrame and worker without breaking
existing callers.

Refs #2343
2026-05-19 15:50:11 +02:00
Karl Seguin
e61eddf956 Merge pull request #2493 from lightpanda-io/nikneym/fix-injection-through-authority
`URL.zig`: fix NUL/CR/LF/TAB character injection through authority
2026-05-19 19:12:16 +08:00
Halil Durak
64a3f3edd7 URL.zig: update tests 2026-05-19 13:55:34 +03:00
Halil Durak
6bc4ebdfed URL.zig: fix NUL/CR/LF/TAB character injection through authority 2026-05-19 12:29:39 +03:00
Karl Seguin
fd0831fe93 Merge pull request #2469 from lightpanda-io/nikneym/samesite-strict-cookie-vulnerability
`Cookie`: honor SameSite=Strict on cross-site navigation
2026-05-19 16:20:08 +08:00
Halil Durak
f17a260d93 prefer initiator_url to calculate SameSite correctly when navigating
changes after rebase
2026-05-19 10:53:25 +03:00
Halil Durak
a8029c079e Cookie.zig: add a test for SameSite=Strict on cross-site navigation 2026-05-19 10:53:24 +03:00
Halil Durak
bdd456f76c Merge pull request #2491 from willmafh/improve-code-readability
more clean validateCookieString function to improve code readability
2026-05-18 17:53:45 +03:00
willmafh
2f66edc9b9 more clean validateCookieString function to improve code readability 2026-05-18 22:29:01 +08:00
Karl Seguin
b83cd9262b Merge pull request #2490 from lightpanda-io/blocking_read_failure_handling
On blocking read failure, break from loop
2026-05-18 21:19:40 +08:00
Karl Seguin
49aa0ad1a9 On blocking read failure, break from loop
Blocking read failure almost certainly means a disconnected client. As-is, that's
an endless loop. Instead, fail the request.
2026-05-18 19:44:25 +08:00
Pierre Tachoire
23a3d5476b Merge pull request #2458 from lightpanda-io/nikneym/cli-help-rework
`help`: rework `help` command
2026-05-18 11:54:29 +02:00
Pierre Tachoire
8b098a3c97 Merge pull request #2488 from lightpanda-io/ci-mcp-smoke-jq-tighten 2026-05-17 12:50:23 +02:00
Adrià Arrufat
8981a6245c ci: tighten mcp-smoke jq assertions
Replace `grep '"id":N' | jq -e ...` with `jq -ec 'select(.id == N) | ...'`.
The grep form also matched `"id":10`, `"id":11`, ... and any tool description
containing that substring; numeric `select` is type-correct. `jq -e` still
fails the job when `select` produces no output (exit 4), so the smoke
semantics are preserved.

Also add `jq --version` up front so the job fails fast and loud if the
`ubuntu-latest` image ever stops shipping jq.
2026-05-17 10:43:03 +02:00
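Note on 8981a6245c: the difference between substring matching and a numeric select can be reproduced outside jq. The Python sketch below uses two made-up JSON-RPC response lines. `grep_id` and `select_id` are hypothetical helpers that imitate `grep '"id":N'` and `jq 'select(.id == N)'` respectively.

```python
import json

# Made-up responses, one JSON document per line.
lines = [
    '{"jsonrpc":"2.0","id":1,"result":{"ok":true}}',
    '{"jsonrpc":"2.0","id":10,"result":{"ok":false}}',
]


def grep_id(lines, n):
    # Substring match: '"id":1' is also a prefix of '"id":10'.
    needle = f'"id":{n}'
    return [line for line in lines if needle in line]


def select_id(lines, n):
    # Parse each line and compare the numeric id field.
    return [line for line in lines if json.loads(line).get("id") == n]
```

Here `grep_id(lines, 1)` returns both lines, while `select_id(lines, 1)` returns only the first.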
Pierre Tachoire
803e4303c2 Merge pull request #2481 from navidemad/ci-mcp-smoke
ci: smoke test the MCP stdio server
2026-05-17 10:39:18 +02:00
Pierre Tachoire
4e80db6cf0 Merge pull request #2483 from navidemad/dockerfile-pipefail-hygiene
Dockerfile: fix curl|sh pipefail; trim builder stage
2026-05-16 19:21:30 +02:00
Pierre Tachoire
a3944a3b40 Merge pull request #2484 from lightpanda-io/e2e_kill_between_steps
Force kill lightpanda between steps to prevent "port already in-use" …
2026-05-16 18:51:36 +02:00
Karl Seguin
ab63cfbf39 Merge pull request #2478 from navidemad/fix-c10-inline-media-evaluation
css: evaluate @media and matchMedia against viewport
2026-05-16 21:42:56 +08:00
Karl Seguin
d870972ceb Small tweaks to @media
- Depth counter when recursing
- Better comment support
- Small perf tweak (e.g. lowercase once into stack buffer before multiple
  compares)
- Few more test cases
2026-05-16 20:52:11 +08:00
Karl Seguin
21e74b46ea Merge pull request #2486 from willmafh/typo-fix
typo fix
2026-05-16 20:39:36 +08:00
willmafh
c52356b6d7 chore: lowercase demo word 2026-05-16 20:07:32 +08:00
willmafh
c1e64232e5 chore: typo fix 2026-05-16 20:05:52 +08:00
Karl Seguin
7f8cb145e6 Merge pull request #2485 from lightpanda-io/nikneym/timers-hash
`Timers`: prefer integer-optimized hashing
2026-05-16 16:52:53 +08:00
Halil Durak
33d594be43 Timers: prefer integer-optimized hashing 2026-05-16 10:19:33 +03:00
Karl Seguin
d926291241 Merge pull request #2467 from lightpanda-io/http_transfer
Cleanup HttpClient.Transfer
2026-05-16 08:52:12 +08:00
Karl Seguin
0b358fd410 Merge pull request #2474 from staylor/fix/2472-frame-id-reset
Fix #2472: scope frame ID generator to Browser, not Session
2026-05-16 08:46:27 +08:00
Karl Seguin
94e8b06583 Merge pull request #2482 from navidemad/make-v8-path
make: forward optional V8_PATH to zig build
2026-05-16 08:41:05 +08:00
Karl Seguin
a5c1068b85 Force kill lightpanda between steps to prevent "port already in-use" error in CI 2026-05-16 08:39:53 +08:00
Navid EMAD
54e09a5ace make: rename V8_PATH to generic ZIGFLAGS
Per review feedback, generalise the optional pass-through so any
`-D...` build option can be forwarded, not just the prebuilt V8 path.
2026-05-16 02:27:52 +02:00
Karl Seguin
5550b61d2d Merge pull request #2480 from navidemad/make-clean
make: add clean target
2026-05-16 07:35:09 +08:00
Karl Seguin
732e19c7b6 add cargo clean to html5ever 2026-05-16 07:34:35 +08:00
Karl Seguin
d3f3e7f335 Merge pull request #2475 from navidemad/fix-a41-json-undefined
js: emit `null` when JSON-stringifying unserializable values
2026-05-16 07:24:14 +08:00
Karl Seguin
2163a2fd5a Merge pull request #2463 from lightpanda-io/nikneym/nav-accept-header
Send `Accept` header when navigating
2026-05-16 06:39:40 +08:00
Navid EMAD
fd0700a572 dockerfile: fix curl|sh pipefail; trim builder stage
- Download rustup to a file then execute, so a failed curl is not
  masked by sh's exit code under /bin/sh (no pipefail).
- Add --no-install-recommends and apt-list cleanup to both apt stages
  (stage 0 drops from 156 to 116 packages, 1144 MB to 605 MB).
- Add --retry 3 --retry-delay 2 to all 4 external downloads.
- Use git clone --depth 1 (28 MB to 9.6 MB working tree).
- Drop -v from tar for minisign and zig extractions (log noise only).

Final shipped image is unchanged; the wins live in the builder stage
and build-cache footprint.
2026-05-15 23:45:06 +02:00
Navid EMAD
f08a1fef12 ci: smoke test the MCP stdio server
Sends initialize + notifications/initialized + tools/list over stdin
and asserts the JSON-RPC responses with jq. Catches regressions in
the agentic surface (./lightpanda mcp) without needing a node client.

Reuses the existing lightpanda-build-release artifact, so the new
job costs about a minute on top of zig-build-release.
2026-05-15 22:53:38 +02:00
Navid EMAD
d1a0203d88 make: forward optional V8_PATH to zig build
Without -Dprebuilt_v8_path, the build/test targets rebuild V8 from
source (10+ minutes per invocation). Contributors who already have a
cached archive can now short-circuit by exporting V8_PATH:

    V8_PATH=v8/libc_v8.a mise exec -- make test

When V8_PATH is empty (default), behavior is unchanged.
2026-05-15 22:53:00 +02:00
Navid EMAD
ee1cbf1bb3 make: add clean target 2026-05-15 22:51:54 +02:00
Navid EMAD
dd5e335262 css: harden media-query evaluator and @media boundary
Address review feedback on PR #2478:

- MediaQuery.zig: strip CSS `/* ... */` comments before tokenization so
  `screen and /*x*/ (min-width: 1px)` evaluates the same as without the
  comment.
- MediaQuery.zig: bound-check `em` / `rem` multiplication via
  `std.math.mul` so a u32-overflowing length (e.g. `268435456em`) fails
  closed instead of panicking in debug or wrapping in release.
- StyleManager.zig: prelude brace search skips `/* ... */` comments, so
  `@media /* { fake */ screen { ... }` splits at the real opening brace
  rather than the one inside the comment.
- Tests: unit tests for stripped comments, em/rem overflow, and
  unimplemented units (cm/mm/pt/in/vw). HTML fixtures cover commented
  preludes/queries and the `replaceSync` cascade path.
2026-05-15 20:49:45 +02:00
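Note on dd5e335262: the comment-aware brace search can be sketched as below. This is a Python illustration, not the StyleManager.zig code. `find_prelude_brace` is a hypothetical name, and returning -1 for an unterminated comment is an assumption made here, not something the commit states.

```python
def find_prelude_brace(css, start=0):
    """Index of the first '{' at or after `start` that is outside /* */, else -1."""
    i = start
    n = len(css)
    while i < n:
        if css.startswith("/*", i):
            end = css.find("*/", i + 2)
            if end == -1:
                return -1  # assumption: an unterminated comment hides everything after it
            i = end + 2    # jump past the closing */
            continue
        if css[i] == "{":
            return i
        i += 1
    return -1
```

For the example from the commit message, a plain search for `{` stops inside the comment, while this search stops at the brace that follows `screen`.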
Navid EMAD
68bd1441af css: evaluate @media and matchMedia against viewport
Inline `@media` rules were parsed but never applied to the cascade, and
`window.matchMedia(q).matches` always returned false. Add a Media Queries
Level 4 subset evaluator (`width`/`height`/`orientation`, lengths in
`px`/`em`/`rem`, comma OR, `and`, `not`, `only`) wired into both surfaces.
External `<link rel="stylesheet">` fetch remains out of scope; the
evaluator reads the same 1920x1080 viewport already exposed by
`Window.innerWidth` / `innerHeight`.

Closes #2477
2026-05-15 20:30:01 +02:00
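Note on 68bd1441af: a much smaller evaluator than the one described above, written in Python to show how comma OR and `and` combine against a fixed viewport. It is a sketch under stated assumptions: it handles only `px` lengths, `min-`/`max-` width and height, `orientation`, and the `screen`/`all` media types. It omits `not`, `only`, `em` and `rem`. `VIEWPORT`, `eval_feature` and `matches` are hypothetical names.

```python
VIEWPORT = {"width": 1920, "height": 1080}  # same size the commit mentions


def eval_feature(feature):
    name, _, value = feature.strip("() ").partition(":")
    name, value = name.strip(), value.strip()
    if name == "orientation":
        portrait = VIEWPORT["height"] >= VIEWPORT["width"]
        return value == ("portrait" if portrait else "landscape")
    for prefix in ("min-", "max-"):
        dimension = name[len(prefix):]
        if name.startswith(prefix) and dimension in VIEWPORT and value.endswith("px"):
            try:
                limit = float(value[:-2])
            except ValueError:
                return False
            actual = VIEWPORT[dimension]
            return actual >= limit if prefix == "min-" else actual <= limit
    return False  # unknown feature or unit: fail closed


def matches(query):
    # Comma separates alternatives (OR); " and " joins conditions (AND).
    for branch in query.lower().split(","):
        parts = [p.strip() for p in branch.split(" and ")]
        if all(p in ("screen", "all") or (p.startswith("(") and eval_feature(p))
               for p in parts):
            return True
    return False
```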
Navid EMAD
353be6382d js: emit null when JSON-stringifying unserializable values
V8's `JSON::Stringify` finishes by calling `Object::ToString` on whatever
`i::JsonStringify` returns. For values that `JSON.stringify` treats as
non-serializable at the top level (`undefined`, functions, symbols),
`i::JsonStringify` yields the undefined sentinel and `ToString` coerces
it to the JS string `"undefined"`. `Value.jsonStringify` then wrote those
9 bytes raw via `writer.writeAll`, embedding a bare `undefined` token in
the JSON stream — invalid per RFC 8259 and rejected by any strict-JSON
CDP client. Detect the sentinel and emit JSON `null` instead, matching
what `JSON.stringify` produces when the same value sits in an array slot
(`JSON.stringify([undefined])` → `"[null]"`).

Closes #2473
2026-05-15 19:21:57 +02:00
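Note on 353be6382d: the fix amounts to detecting "no JSON form" and writing `null` rather than the text `undefined`. A Python model of that decision is below. `UNDEFINED`, `stringify` and `write_json` are hypothetical, and Python's `json.dumps` stands in for V8's stringifier only to keep the sketch runnable.

```python
import json

UNDEFINED = object()  # hypothetical sentinel standing in for JS `undefined`


def stringify(value):
    """Toy stringifier: returns None when the value has no JSON form."""
    if value is UNDEFINED or callable(value):
        return None
    return json.dumps(value)


def write_json(value):
    # Emit JSON null in place of a missing result so the stream stays valid.
    text = stringify(value)
    return "null" if text is None else text
```

The output of `write_json(UNDEFINED)` parses as JSON, which a bare `undefined` token would not.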
Scott Taylor
4bd2edb596 Fix #2472: scope frame ID generator to Browser, not Session
CDP target IDs (`FID-{d:0>10}`) must stay unique for the lifetime of
the CDP connection -- Playwright's `CRBrowser._onAttachedToTarget`
asserts on duplicates and the assertion is fatal (the connection is
unusable afterwards).

Before this fix, `Session.frame_id_gen` reset to 0 in two places:

  1. `tearDownActivePage` explicitly reset to 0 after every page
     teardown (likely intended to mimic pre-pending-page numbering
     within a single Session, but invisible there because the
     immediately-following `installNewActivePage` typically reuses
     the old frame's explicit `frame_id`, see `replaceRootImmediate`).

  2. Fresh Sessions started from the field default of 0. Each
     `Target.createBrowserContext` calls `Browser.newSession`, which
     deinits the old Session and constructs a new one -- so even
     without (1), the next BrowserContext's first page would still
     get `FID-0000000001`.

(2) is what trips Playwright on the second `browser.newContext()`
on a connection: the second context's first frame re-issues
`FID-0000000001`, identical to the first context's frame, and
Playwright's `CRBrowser._onAttachedToTarget` raises
`Duplicate target FID-0000000001`.

Move `frame_id_gen` (and `nextFrameId`) from `Session` to `Browser`,
which is per-CDP-connection. Existing callers (`Session.createPage`,
`Frame.zig:1327`, `Frame.zig:1437`, `Worker.zig:74`) still go through
`Session.nextFrameId` -- it's now a thin pass-through to
`browser.nextFrameId()` -- so no call sites change. Removed the
explicit reset in `tearDownActivePage`; it was redundant within a
Session (root navigation reuses the old frame_id) and harmful across
Sessions.

`loader_id_gen` stays on Session: Loader IDs (`LID-...`) are scoped
per-frame in CDP and Playwright doesn't track them in the target
registry, so the per-Session reset is correct there.

Repro (`playwright-core@1.58.2`):

  for (let i = 1; i <= 3; i++) {
    const ctx = await browser.newContext();
    await ctx.newPage();
    await ctx.close();
  }

Before: cycle 2 throws `Duplicate target FID-0000000001`.
After: 5/5 cycles complete cleanly.

Tests: 653/653 pass. Added regression coverage in
`cdp.target: createTarget assigns unique IDs across BrowserContexts
(issue #2472)` -- verified to fail against the original source
(reverted Browser.zig and Session.zig, kept the test, ran zig build
test: only the new test fails).
2026-05-15 13:17:49 -04:00
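Note on 4bd2edb596: moving the counter to the longer-lived owner can be modelled as below. This Python sketch is not the Zig implementation. `Browser`, `Session`, `next_frame_id` and `target_id` are stand-ins named after the identifiers in the commit message, and `target_id` mirrors the `FID-{d:0>10}` format.

```python
class Browser:
    """One per CDP connection, so the counter lives as long as the connection."""

    def __init__(self):
        self._frame_id_gen = 0

    def next_frame_id(self):
        self._frame_id_gen += 1
        return self._frame_id_gen


class Session:
    """Recreated per browser context; keeps no counter of its own."""

    def __init__(self, browser):
        self.browser = browser

    def next_frame_id(self):
        # Thin pass-through, so existing call sites keep working.
        return self.browser.next_frame_id()


def target_id(n):
    return f"FID-{n:010d}"
```

Creating a fresh `Session` per context against the same `Browser` yields `FID-0000000001`, `FID-0000000002`, `FID-0000000003` instead of repeating the first ID.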
Halil Durak
34557e3993 cli.zig: update doc comment 2026-05-15 18:31:10 +03:00
Halil Durak
658df6e500 cli.zig: support lightpanda help <command>
Another way to get `help` text for a specific command.
2026-05-15 18:31:10 +03:00
Halil Durak
3489129f68 main.zig: changes for new help 2026-05-15 18:31:10 +03:00
Halil Durak
b2d8c2b834 help.zon: introduce help.zon
Separates `help` explanation from configuration.
2026-05-15 18:31:09 +03:00
Halil Durak
f361f12316 cli.zig: change the way help command and sub-command detected
`cli.zig` is now aware of the `help` command in all situations and creates it itself. Instead of using errors, it initializes a `Command` union with the `help` branch active.
2026-05-15 18:31:09 +03:00