Introduce an `Llm` struct to bundle the provider and API key together.
Add `UserError` to identify errors that have already printed human-
readable messages, preventing duplicate logging on exit.
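The idea behind `UserError` can be modelled roughly as below. This is a Python sketch, not the project's Zig code; `exit_code_for` and the `log` callback are hypothetical names used only for illustration.

```python
class UserError(Exception):
    """Raised after a human-readable message has already been printed."""


def exit_code_for(action, log):
    """Run `action` and log a failure only if nobody has reported it yet."""
    try:
        action()
    except UserError:
        # The message was already shown to the user, so exit quietly.
        return 1
    except Exception as err:
        # Any other error has not been reported yet, so log it once here.
        log(f"error: {err}")
        return 1
    return 0
```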
Removes an assertion that can break with custom element callbacks.
https://github.com/lightpanda-io/browser/pull/2429 does not solve this issue
since it isn't a result of when reactions are executed, but just that they
happen.
(I should note that I'm not 100% sure the above statement is correct. It's
possible that our CE reactions are, in some cases at least, too fine-grained.
Maybe some operations, like setInnerHtml, should operate within a more atomic
framework rather than one CE reaction per internal step. But that's a pretty
big change.)
This adds help documentation for the --json flag. This is the only thing that
must be kept from this commit.
It uses our testing.zig to streamline the json testing (instead of string
probing). It removes the JsonEnvelope in favor of an anonymous struct (though,
I'm fine with adding it back in if it's needed to resolve something ambiguous).
Finally, I removed the last unit test as, at that point, it's really just
testing Zig's JSON stringifier (could arguably make the same case for the other
two, but there's some logic there about how nulls/empty might be handled).
Trim down a lot of the comments. Inline remarks in some cases rather than a
large function header.
Removing the `errdefer headers.deinit();` is unfortunate but currently
necessary to avoid a potential double-free if the request gets far enough that
http_client frees it and still returns an error. This is a known issue that
needs to be fixed separately and that impacts multiple call-sites. My "fix"
introduces a possible (very small) memory leak versus a possible crash.
Fold QueryWriter into Writer behind an Opts.filter. Tree mode is unchanged
(filter=null); query mode walks the full subtree (including AX-ignored
nodes per the queryAXTree spec) and emits the flat-match shape. Shared
resolveRole helper handles label-promotion for both paths so the two
can't drift.
Drop the "objectId not yet supported" carve-out: queryAXTree now reuses
dom.getNode, which already resolves nodeId/backendNodeId/objectId.
Move the UA `matchesUaDisplayNoneRule` short-circuit from BEFORE the
author rule walk in `isElementHidden` to AFTER it, gated on
`display_priority == 0` (no author display rule matched). Per CSS
Cascade §6.1 any normal-origin author rule beats UA origin regardless
of specificity, so `.x { display: flex }` on `<div class="x" hidden>`
must report visible. The author rule machinery
(`addRawRule`/`isRelevant`/`checkRules` + `id_rules`/`class_rules`/
`tag_rules`/`other_rules`) already produces the right answer; only the
check order in `isElementHidden` was wrong.
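The corrected check order can be modelled in a few lines. This is a Python sketch rather than the Zig implementation; `is_element_hidden`, `author_display` and `ua_display_none` are hypothetical stand-ins for `isElementHidden`, the result of the author rule walk, and `matchesUaDisplayNoneRule`.

```python
from typing import Optional


def is_element_hidden(author_display: Optional[str], ua_display_none: bool) -> bool:
    """Decide visibility from the author cascade first, the UA rule second.

    author_display: the winning author `display` value, or None when no
    author display rule matched (the `display_priority == 0` case).
    ua_display_none: whether a UA `display: none` rule (e.g. [hidden]) applies.
    """
    if author_display is not None:
        # Any normal-origin author declaration beats the UA origin.
        return author_display == "none"
    # Only fall back to the UA rule when no author rule set `display`.
    return ua_display_none


# <div class="x" hidden> with `.x { display: flex }` must report visible.
assert is_element_hidden("flex", ua_display_none=True) is False
# <div hidden> with no author display rule stays hidden.
assert is_element_hidden(None, ua_display_none=True) is True
```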
Refs #2496
Closes the DOMParser gap left as a follow-up in the previous review-fix
commit. DOMParser.parseFromString built its target Document via the
frame's parser without touching `_parse_mode`, so `Build.created` →
`linkAddedCallback` → `loadExternalStylesheet` saw `_parse_mode ==
.document` and fetched/registered sheets on the LIVE frame document for
every stylesheet link in the parsed string.
Bracket both the text/html and XML branches with the same fragment
parse-mode `parseHtmlAsChildren` uses. The existing gate in
`loadExternalStylesheet` already short-circuits on .fragment, so no
change is needed there. Side benefits: parser-emitted scripts in
DOMParser content stop reaching `scriptAddedCallback` against the live
frame, default-script injection skips DOMParser content, and mutation
observers on the live document no longer fan out for parsed nodes —
all of which match what DOMParser should do per spec.
Regression test extended to cover the DOMParser path alongside the
existing innerHTML case.
Refs #2343
Addresses 8 findings from ultrareview on the external stylesheet feature:
* UAF on CDP teardown during syncRequest. `loadExternalStylesheet`
pumps the CDP socket inline, so a `Target.closeTarget` arriving
mid-fetch could drive `Session.removePage` and free the frame
while we still held `self`. Set `_script_manager.base.is_evaluating`
around the call — the same bracket every other syncRequest caller
uses, which is what `Session.removePage`'s reentrancy guard checks.
* Disconnect leak. `link.remove()` left the sheet on
`document.styleSheets` and in the cascade forever; the disconnect
walker had a `<style>` branch but no `<link>` mirror. Common SPA
theme-switch pattern (append new sheet, remove old) was broken.
Added the parallel `else if` branch.
* Fragment-parsed links. `Build.created` fires for parser-instantiated
elements before attachment, including innerHTML / outerHTML /
insertAdjacentHTML / Range.createContextualFragment / <template>
content. Without a guard those fetched against the live document
and registered phantom sheets even when the fragment was never
attached. Added `_parse_mode == .fragment` early-return mirroring
the existing `nodeIsReady` short-circuit. DOMParser is a separate
case (parses with `.document` into a different Document) and is
left as a known follow-up.
* Missing Referer. Every other resource-fetch path
(ScriptManagerBase, XHR, Fetch, WorkerGlobalScope) routes through
`Frame.headersForRequest` to attach the cached `Referer` header.
Many CDNs gate stylesheet delivery on Referer; without it requests
returned 403/302 and the CSS silently failed. Added the call.
* Header OOM leak. `headers.add` between `newHeaders()` and
`syncRequest` (which takes ownership) leaked the initial 3-entry
slist on OOM. Added `errdefer headers.deinit()` mirroring
RobotsLayer.zig:121-122.
* `_href` mutated before parse could fail. On parse error the cached
sheet was left with the new URL but old rules dropped — violated
the "previous sheet intact on failure" invariant the PR description
promises. Moved the `_href` assignment to after `replaceSync`
succeeds. Full atomicity would require a scratch-list pattern in
`CSSStyleSheet.replaceSync` itself; documented as a known limit.
* `_sheet` cached before registration could OOM. If `sheets.add`
failed, `link._sheet` pointed at an unregistered sheet and every
future re-fetch short-circuited via the `orelse` branch, leaving
the sheet permanently unreachable through `document.styleSheets`.
Assign `link._sheet` only after `sheets.add` succeeds.
* Stale CLI help text claimed `--enable-external-stylesheets` was a
no-op surface. Removed the obsolete sentence.
New regression tests cover fragment-parse skip and disconnect
removal+re-add. Full suite 694/694 pass.
Refs #2343
Caught in code review: `loadExternalStylesheet` created a fresh
`CSSStyleSheet` and appended to `document.styleSheets` on every call, so
mutating `link.href` on a connected stylesheet element accumulated stale
sheets — the old rules kept cascading because the previous sheet was
never removed.
Cache the sheet on `Link._sheet` (mirroring `Style._sheet`) and reuse it
via `replaceSync` on re-fetch. First load creates + registers as before;
subsequent loads swap content in place, keeping `document.styleSheets`
length stable.
On fetch failure the cached sheet is untouched — matches browser
behavior where a broken href doesn't invalidate the previously loaded
sheet until the link itself is removed.
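A minimal Python model of the caching behavior described above, assuming a sheet can be reduced to its text. `Sheet`, `Link` and `load_external_stylesheet` here are illustrative stand-ins, not the Zig types, and `body=None` is an assumed way to represent a failed fetch.

```python
class Sheet:
    def __init__(self):
        self.text = ""

    def replace_sync(self, css_text):
        # Swap the sheet content in place.
        self.text = css_text


class Link:
    def __init__(self):
        self.sheet = None  # cached sheet, playing the role of `Link._sheet`


def load_external_stylesheet(link, style_sheets, body):
    """body is the fetched CSS text, or None when the fetch failed."""
    if body is None:
        # Fetch failure: leave the previously loaded sheet untouched.
        return False
    if link.sheet is None:
        # First load: create the sheet, register it, then cache it.
        sheet = Sheet()
        sheet.replace_sync(body)
        style_sheets.append(sheet)
        link.sheet = sheet
    else:
        # Re-fetch: reuse the cached sheet so the list length stays stable.
        link.sheet.replace_sync(body)
    return True
```

Loading twice through the same `Link` leaves one entry in `style_sheets`, and a failed third load keeps the second body in place.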
Refs #2343
Wires up --enable-external-stylesheets / LP.configureLoading.externalStylesheets
from the prior surface-only commit. When the flag is set, parser- and
JS-created <link rel=stylesheet> elements now synchronously fetch and parse
their href, register a CSSStyleSheet on document.styleSheets, and feed
StyleManager so checkVisibility() reflects external rules. Flag stays
default-off — scrapers that don't need accurate visibility pay nothing.
Frame.loadExternalStylesheet mirrors ScriptManager.addFromElement: same
HttpClient.syncRequest path, same arena ownership, same per-frame
notification + cookie wiring. Body is routed through CSSStyleSheet.replaceSync,
which already parses, populates cssRules, and calls sheetModified() — no
StyleManager changes needed. A single sheet body has a 2 MiB hard cap; a non-2xx
status and an oversize body both fire `error` on the link.
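The acceptance rule for a fetched sheet can be sketched as follows. This is a Python model under two assumptions: the helper name `should_parse` is hypothetical, and a body of exactly 2 MiB is treated as within the cap, which the description above does not specify.

```python
MAX_SHEET_BYTES = 2 * 1024 * 1024  # 2 MiB cap on a single sheet body


def should_parse(status: int, body: bytes) -> bool:
    """Return True when the body may be parsed, False when `error` should fire."""
    if not 200 <= status < 300:
        return False  # non-2xx status
    if len(body) > MAX_SHEET_BYTES:
        return False  # oversize body (boundary handling is an assumption)
    return True
```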
Link.Build.created is added so static head <link> elements reach
linkAddedCallback at all — void elements never trigger nodeComplete, which
is why static `<link>` had no observable effect before. Mirrors Image.
HttpClient.Request.ResourceType gains a `.stylesheet` variant so CDP Network
events report the right type; cdp.fetch.zig switches updated.
Refs #2343
Reserves the CLI flag and LP.configureLoading externalStylesheets field
so drivers can adopt the API before the fetch implementation lands in a
follow-up that depends on #2303.
The bool is intentionally unread in this PR. Mirrors the existing
--disable-subframes / --disable-workers plumbing; the CDP field extends
LP.configureLoading alongside subFrame and worker without breaking
existing callers.
Refs #2343
LogFilter isn't thread-safe, so setting it in a test where the log filter is
read from another thread triggers TSAN. LogFilter.deinit now waits until
the server has no active threads.
Previously, the CDP socket was added to the worker's multi and fully owned
by the worker. While this is simple, it introduced some issues:
1 - Cannot detect a disconnected client during JS processing (e.g. for(;;))
2 - A blocked worker can cause back-pressure that blocks the client. This can
cause a deadlock if the worker is blocked waiting for a CDP message.
In addition to these 2 problems, there was 1 other serious CDP-related issue:
arbitrary CDP messages could be processed during a JavaScript callback. For
example, a Worker calls importScripts while request interception is enabled;
this requires us to tick the HttpClient while waiting for the interception
response. But a client could send Target.closeTarget, which we'd process and
delete the frame, all while importScripts is still blocked. Once importScripts
unblocks, everything is a big UAF since the frame (and its workers) were
cleared by closeTarget.
The CDP socket is now read from the network (main) thread and an OTP-style
mailbox is used. The network thread posts messages to the Worker's inbox and
signals it to wake up. This solves #1 and #2. It doesn't directly solve the
reentrancy issue, but it provides the foundation. Specifically, it introduces
a queue of CDP messages and more control over when/how that queue is
processed. At "safe points" (Runner.tick, HttpClient.tick), any message can
be processed. But when inside a JavaScript callback, we can process only
non-destructive, non-mutating messages. Specifically, we can process only
messages related to request interception.
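The queue discipline can be illustrated with a single-threaded Python model. It leaves out the cross-thread posting and wake-up signal, and the `Mailbox` class is hypothetical. The set of interception methods uses CDP `Fetch` domain names as an assumption about which messages count as interception-related.

```python
from collections import deque

# Assumed set of interception-related methods that are safe inside a JS callback.
INTERCEPTION_METHODS = {
    "Fetch.continueRequest",
    "Fetch.failRequest",
    "Fetch.fulfillRequest",
}


class Mailbox:
    def __init__(self):
        self._queue = deque()

    def post(self, message):
        # Called by the network thread in the real design.
        self._queue.append(message)

    def drain(self, safe_point):
        """Return the messages that may run now; keep the rest queued in order."""
        if safe_point:
            ready = list(self._queue)
            self._queue.clear()
            return ready
        ready, deferred = [], deque()
        for message in self._queue:
            if message["method"] in INTERCEPTION_METHODS:
                ready.append(message)
            else:
                deferred.append(message)
        self._queue = deferred
        return ready
```

With `Target.closeTarget` and `Fetch.continueRequest` both queued, `drain(safe_point=False)` returns only the interception message, and the close is held until the next `drain(safe_point=True)`.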
The `fetch` command is very practical for rendering pages without needing
a long-running browser instance.
However, it hides all details of the fetch, most importantly the HTTP status code.
This is a big caveat when using `lightpanda fetch` in a pipeline.
This introduces a `--json` option to provide a structured output that
contains:
* url
* HTTP status code
* response headers
* rendered content as controlled by the `--dump` option
The proposal is to always output the same JSON format even when not
using `--dump` with an option.
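As an illustration of why the status code matters in a pipeline, a consumer might check the structured output like this. The field names `url`, `status`, `headers` and `content` are assumptions for the sketch; the description above lists the contents of the output but not its exact keys.

```python
import json


def check_fetch_output(raw: str) -> dict:
    """Parse the JSON output and fail the pipeline step on a non-2xx status."""
    doc = json.loads(raw)
    status = doc["status"]  # hypothetical key for the HTTP status code
    if not 200 <= status < 300:
        raise RuntimeError(f"fetch of {doc['url']} returned HTTP {status}")
    return doc
```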
- Replace `Self` with `Recorder` and `Verifier` for improved clarity.
- Add `Spinner.isEnabled()` to encapsulate atomic state access.
- Shorten and refine various comments across the codebase.
Adds the `-a` short flag, improves CLI validation for one-shot mode,
and ensures `--model` takes precedence over `--pick-model`.
BREAKING CHANGE: `--task-attachment` has been renamed to `--attach`.