Commit Graph

6711 Commits

Author SHA1 Message Date
Adrià Arrufat
31e20ae261 Recorder: pass directory to init and reuse arena
Update `Recorder.init` to accept a `std.fs.Dir` and relative path,
decoupling it from the current working directory.

Also add a persistent arena allocator to `Recorder` to reuse
allocations when scrubbing environment variables on each write.
2026-05-21 17:45:54 +02:00
Adrià Arrufat
88d56c63d0 script: optimize formatting and avoid redundant writes
- Skip atomic writes if the content is unchanged.
- Use allocating writer to reduce allocations during formatting.
2026-05-21 17:37:13 +02:00
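The "skip atomic writes if the content is unchanged" idea from the commit above can be sketched as follows. This is a minimal Python illustration, not the project's Zig code; the function name `write_if_changed` and its signature are hypothetical.

```python
import os
import tempfile


def write_if_changed(path: str, content: bytes) -> bool:
    """Atomically write `content` to `path`; return False if nothing was written.

    Hypothetical helper for illustration only.
    """
    try:
        with open(path, "rb") as existing:
            if existing.read() == content:
                return False  # identical content: skip the write entirely
    except FileNotFoundError:
        pass  # no existing file, so a write is needed

    # Write to a temporary file in the same directory, then rename it over the
    # target so readers never observe a partially written file.
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory)
    try:
        with os.fdopen(fd, "wb") as tmp:
            tmp.write(content)
        os.replace(tmp_path, path)
    except BaseException:
        os.unlink(tmp_path)
        raise
    return True
```

Comparing before writing costs one extra read, but it avoids touching the file (and its modification time) when formatting produces no change.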
Adrià Arrufat
3e0a59c96a browser: avoid dangling pointers in lpEnvNames
Walk std.c.environ dynamically and duplicate names into a provided
allocator instead of caching pointers to std.os.environ, which can
be invalidated by setenv.
2026-05-21 17:28:49 +02:00
Adrià Arrufat
55a50746fa script: move helper functions into Command union
Renames `Command.zig` to `command.zig` and moves associated functions
like `parse`, `toToolCall`, and `canHeal` directly into the `Command`
union. This simplifies usage from `Command.Command` to just `Command`.
2026-05-21 17:13:19 +02:00
Adrià Arrufat
0707187fa0 build: bump zenai dependency 2026-05-21 15:11:40 +02:00
Adrià Arrufat
c04062f4de build: bump zenai dependency 2026-05-21 15:04:56 +02:00
Adrià Arrufat
651504efb5 agent: consolidate LLM provider and key resolution
Introduce an `Llm` struct to bundle the provider and API key together.
Add `UserError` to identify errors that have already printed human-
readable messages, preventing duplicate logging on exit.
2026-05-21 15:01:44 +02:00
Adrià Arrufat
7df1e80ce5 config: fix agent mode 2026-05-21 13:49:08 +02:00
Adrià Arrufat
54482831ed Merge branch 'main' into agent 2026-05-20 17:33:07 +02:00
Pierre Tachoire
5905319e78 Merge pull request #2503 from lightpanda-io/general_help
Remove options from main help
2026-05-20 16:25:39 +02:00
Pierre Tachoire
5fce406a13 compact more help options 2026-05-20 16:09:16 +02:00
Pierre Tachoire
af6175cc56 rewrite help in a more compact way
Inspired by go help output
2026-05-20 16:05:17 +02:00
Francis Bouvier
dffa961f45 Remove options from main help 2026-05-20 16:05:13 +02:00
Adrià Arrufat
989932b40a Merge branch 'main' into agent 2026-05-20 14:57:32 +02:00
Pierre Tachoire
f1b0adf923 Merge pull request #2498 from navidemad/fix/author-display-rule-beats-ua-hidden
`StyleManager`: author `display` rule must beat UA `[hidden]` / `display:none`
2026-05-20 14:01:25 +02:00
Pierre Tachoire
bb2d62d189 Merge pull request #2500 from lightpanda-io/parseHtmlAsChildren_assertion
parseHtmlAsChildren handling for unexpected dom (custom element callb…
2026-05-20 14:00:58 +02:00
Pierre Tachoire
6e6b3caf96 Merge pull request #2479 from navidemad/accessibility-query-ax-tree
Implement Accessibility.queryAXTree CDP method (and fix latent frame-binding bug)
2026-05-20 13:59:35 +02:00
Pierre Tachoire
2ad2c9d878 Merge pull request #2487 from navidemad/feat/external-stylesheets-flag
Add --enable-external-stylesheets flag with fetch + parse
2026-05-20 13:41:59 +02:00
Karl Seguin
b6fd09c5ab Merge pull request #2502 from lightpanda-io/max-cdp-conn
by using httpClient, fetch generates a call to Config.maxConnections
2026-05-20 16:28:56 +08:00
Pierre Tachoire
639cb14cb3 Merge pull request #2494 from marchelbling/feat-fetch-json-option
feat: add --json to fetch command
2026-05-20 10:17:21 +02:00
Pierre Tachoire
6eb25d5c44 by using httpClient, fetch generates a call to Config.maxConnections 2026-05-20 10:14:17 +02:00
Karl Seguin
a314984b2e Merge pull request #2495 from lightpanda-io/cdp_inbox
Main/Network reads CDP socket
2026-05-20 14:35:29 +08:00
Karl Seguin
5fb0f5a204 parseHtmlAsChildren handling for unexpected dom (custom element callback)
Removes an assertion that can break with custom element callbacks.
https://github.com/lightpanda-io/browser/pull/2429 does not solve this issue
since it isn't a result of when reactions are executed, but just that they
happen.

(I should note that I'm not 100% sure the above statement is correct. It's
possible that our CE reactions are (in some cases at least) too micro. Maybe
some operations, like setInnerHtml, should operate within a more atomic
framework, vs a CE-reaction per internal step. But that's a pretty big change.)
2026-05-20 14:33:03 +08:00
Pierre Tachoire
29d5a7ae9e Merge pull request #2497 from lightpanda-io/ci-revert-cdp-logs
ci: remove cdp logs from end to end tests
2026-05-20 08:30:12 +02:00
Karl Seguin
e7b16983bb Add help documentation + small tweaks
This adds help documentation for the --json flag. This is the only thing that
must be kept from this commit.

It uses our testing.zig to streamline the json testing (instead of string
probing). It removes the JsonEnvelope in favor of an anonymous struct (though,
I'm fine with adding it back in if it's needed to resolve something ambiguous).

Finally, I removed the last unit test as, at that point, it's really just
testing Zig's JSON stringifier (could arguably make the same case for the other
two, but there's some logic there about how nulls/empty might be handled).
2026-05-20 10:44:41 +08:00
Karl Seguin
2fd2be0e51 Small largely stylistic tweaks
Trim down a lot of the comments. Inline remarks in some cases rather than a
large function header.

Removing the `errdefer headers.deinit();` is unfortunate but currently
necessary to avoid a potential double-free if the request gets far enough that
http_client frees it and still returns an error. This is a known issue that
needs to be fixed separately and that impacts multiple call-sites. My "fix"
introduces a possible (very small) memory leak versus a possible crash.
2026-05-20 10:31:28 +08:00
Karl Seguin
61386059c1 promote invalid CDP message from warn to err 2026-05-20 07:19:57 +08:00
Karl Seguin
037db695ff Merge pull request #2492 from lightpanda-io/cdp_connection
Re-organization CDP connection
2026-05-20 06:45:30 +08:00
Adrià Arrufat
85276df004 deps: update zenai and use thinking_level 2026-05-20 00:16:03 +02:00
Adrià Arrufat
98baccf1db agent: update default Gemini model 2026-05-19 23:56:41 +02:00
Navid EMAD
814ca8ab3f accessibility: unify query+tree writers, route objectId via dom.getNode
Fold QueryWriter into Writer behind an Opts.filter. Tree mode is unchanged
(filter=null); query mode walks the full subtree (including AX-ignored
nodes per the queryAXTree spec) and emits the flat-match shape. Shared
resolveRole helper handles label-promotion for both paths so the two
can't drift.

Drop the "objectId not yet supported" carve-out: queryAXTree now reuses
dom.getNode, which already resolves nodeId/backendNodeId/objectId.
2026-05-19 20:43:54 +02:00
Marc Helbling
fec2bbda7b review 2026-05-19 18:13:34 +02:00
Pierre Tachoire
a15c04de4b ci: remove cdp logs from end to end tests 2026-05-19 17:32:13 +02:00
Navid EMAD
bb8d0593b2 StyleManager: author display rule must beat UA [hidden] / display:none
Move the UA `matchesUaDisplayNoneRule` short-circuit from BEFORE the
author rule walk in `isElementHidden` to AFTER it, gated on
`display_priority == 0` (no author display rule matched). Per CSS
Cascade §6.1 any normal-origin author rule beats UA origin regardless
of specificity, so `.x { display: flex }` on `<div class="x" hidden>`
must report visible. The author rule machinery
(`addRawRule`/`isRelevant`/`checkRules` + `id_rules`/`class_rules`/
`tag_rules`/`other_rules`) already produces the right answer; only the
check order in `isElementHidden` was wrong.

Refs #2496
2026-05-19 17:18:20 +02:00
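The check order described in the commit above can be sketched as a small decision function. This is a deliberately simplified Python model, assuming only normal-origin author rules and the UA `[hidden]` rule; it ignores `!important`, inline styles and inherited visibility. The function name and parameters are hypothetical and do not mirror `isElementHidden`'s real signature.

```python
from typing import Optional


def is_element_hidden(author_display: Optional[str], has_hidden_attr: bool) -> bool:
    """Return True if the element computes to `display: none`.

    author_display: the winning author-origin `display` value, or None when no
    author display rule matched (the `display_priority == 0` case).
    """
    if author_display is not None:
        # Any matching author rule beats the UA origin, regardless of specificity.
        return author_display == "none"
    # No author display rule matched: fall back to UA `[hidden] { display: none }`.
    return has_hidden_attr
```

With this order, `.x { display: flex }` on `<div class="x" hidden>` corresponds to `is_element_hidden("flex", True)`, which reports the element as visible.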
Karl Seguin
345cc9c6c0 zig fmt 2026-05-19 23:10:23 +08:00
Karl Seguin
97c8ca3832 when work is done, don't keep polling, return to process it 2026-05-19 22:39:48 +08:00
Navid EMAD
32dbd716b1 Apply fragment parse-mode to DOMParser
Closes the DOMParser gap left as a follow-up in the previous review-fix
commit. DOMParser.parseFromString built its target Document via the
frame's parser without touching `_parse_mode`, so `Build.created` →
`linkAddedCallback` → `loadExternalStylesheet` saw `_parse_mode ==
.document` and fetched/registered sheets on the LIVE frame document for
every stylesheet link in the parsed string.

Bracket both the text/html and XML branches with the same fragment
parse-mode `parseHtmlAsChildren` uses. The existing gate in
`loadExternalStylesheet` already short-circuits on .fragment, so no
change is needed there. Side benefits: parser-emitted scripts in
DOMParser content stop reaching `scriptAddedCallback` against the live
frame, default-script injection skips DOMParser content, and mutation
observers on the live document no longer fan out for parsed nodes —
all of which match what DOMParser should do per spec.

Regression test extended to cover the DOMParser path alongside the
existing innerHTML case.

Refs #2343
2026-05-19 15:51:35 +02:00
Navid EMAD
f05efd6719 Harden external stylesheet path per code review
Addresses 8 findings from ultrareview on the external stylesheet feature:

* UAF on CDP teardown during syncRequest. `loadExternalStylesheet`
  pumps the CDP socket inline, so a `Target.closeTarget` arriving
  mid-fetch could drive `Session.removePage` and free the frame
  while we still held `self`. Set `_script_manager.base.is_evaluating`
  around the call — the same bracket every other syncRequest caller
  uses, which is what `Session.removePage`'s reentrancy guard checks.

* Disconnect leak. `link.remove()` left the sheet on
  `document.styleSheets` and in the cascade forever; the disconnect
  walker had a `<style>` branch but no `<link>` mirror. Common SPA
  theme-switch pattern (append new sheet, remove old) was broken.
  Added the parallel `else if` branch.

* Fragment-parsed links. `Build.created` fires for parser-instantiated
  elements before attachment, including innerHTML / outerHTML /
  insertAdjacentHTML / Range.createContextualFragment / <template>
  content. Without a guard those fetched against the live document
  and registered phantom sheets even when the fragment was never
  attached. Added `_parse_mode == .fragment` early-return mirroring
  the existing `nodeIsReady` short-circuit. DOMParser is a separate
  case (parses with `.document` into a different Document) and is
  left as a known follow-up.

* Missing Referer. Every other resource-fetch path
  (ScriptManagerBase, XHR, Fetch, WorkerGlobalScope) routes through
  `Frame.headersForRequest` to attach the cached `Referer` header.
  Many CDNs gate stylesheet delivery on Referer; without it requests
  returned 403/302 and the CSS silently failed. Added the call.

* Header OOM leak. `headers.add` between `newHeaders()` and
  `syncRequest` (which takes ownership) leaked the initial 3-entry
  slist on OOM. Added `errdefer headers.deinit()` mirroring
  RobotsLayer.zig:121-122.

* `_href` mutated before parse could fail. On parse error the cached
  sheet was left with the new URL but old rules dropped — violated
  the "previous sheet intact on failure" invariant the PR description
  promises. Moved the `_href` assignment to after `replaceSync`
  succeeds. Full atomicity would require a scratch-list pattern in
  `CSSStyleSheet.replaceSync` itself; documented as a known limit.

* `_sheet` cached before registration could OOM. If `sheets.add`
  failed, `link._sheet` pointed at an unregistered sheet and every
  future re-fetch short-circuited via the `orelse` branch, leaving
  the sheet permanently unreachable through `document.styleSheets`.
  Assign `link._sheet` only after `sheets.add` succeeds.

* Stale CLI help text claimed `--enable-external-stylesheets` was a
  no-op surface. Removed the obsolete sentence.

New regression tests cover fragment-parse skip and disconnect
removal+re-add. Full suite 694/694 pass.

Refs #2343
2026-05-19 15:51:34 +02:00
Navid EMAD
4592812027 Reuse cached sheet on link href change
Caught in code review: `loadExternalStylesheet` created a fresh
`CSSStyleSheet` and appended to `document.styleSheets` on every call, so
mutating `link.href` on a connected stylesheet element accumulated stale
sheets — the old rules kept cascading because the previous sheet was
never removed.

Cache the sheet on `Link._sheet` (mirroring `Style._sheet`) and reuse it
via `replaceSync` on re-fetch. First load creates + registers as before;
subsequent loads swap content in place, keeping `document.styleSheets`
length stable.

On fetch failure the cached sheet is untouched — matches browser
behavior where a broken href doesn't invalidate the previously loaded
sheet until the link itself is removed.

Refs #2343
2026-05-19 15:50:11 +02:00
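The caching behaviour described in the commit above can be sketched as follows. This is a minimal Python model under stated assumptions: `Sheet`, `Link` and `load_stylesheet` are hypothetical stand-ins for `CSSStyleSheet`, the link element and `loadExternalStylesheet`, and a failed fetch is represented by `body=None`.

```python
from typing import List, Optional


class Sheet:
    """Toy stand-in for a stylesheet object; stores raw CSS text only."""

    def __init__(self) -> None:
        self.text = ""

    def replace_sync(self, css_text: str) -> None:
        self.text = css_text


class Link:
    """Toy stand-in for a <link rel=stylesheet> element with a cached sheet."""

    def __init__(self) -> None:
        self.sheet: Optional[Sheet] = None


def load_stylesheet(link: Link, style_sheets: List[Sheet], body: Optional[str]) -> None:
    if body is None:
        return  # fetch failed: the previously loaded sheet stays untouched
    if link.sheet is None:
        # First load: create the sheet and register it once.
        sheet = Sheet()
        sheet.replace_sync(body)
        style_sheets.append(sheet)
        link.sheet = sheet
    else:
        # Re-fetch after an href change: swap content in place.
        link.sheet.replace_sync(body)
```

Calling `load_stylesheet` repeatedly for the same link keeps `len(style_sheets)` at 1, which is the stable-length property the commit message describes.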
Navid EMAD
3e409d49e9 Implement external stylesheet fetch + parse
Wires up --enable-external-stylesheets / LP.configureLoading.externalStylesheets
from the prior surface-only commit. When the flag is set, parser- and
JS-created <link rel=stylesheet> elements now synchronously fetch and parse
their href, register a CSSStyleSheet on document.styleSheets, and feed
StyleManager so checkVisibility() reflects external rules. Flag stays
default-off — scrapers that don't need accurate visibility pay nothing.

Frame.loadExternalStylesheet mirrors ScriptManager.addFromElement: same
HttpClient.syncRequest path, same arena ownership, same per-frame
notification + cookie wiring. Body is routed through CSSStyleSheet.replaceSync,
which already parses, populates cssRules, and calls sheetModified() — no
StyleManager changes needed. 2 MiB hard cap on a single sheet body, status
non-2xx and oversize both fire `error` on the link.

Link.Build.created is added so static head <link> elements reach
linkAddedCallback at all — void elements never trigger nodeComplete, which
is why static `<link>` had no observable effect before. Mirrors Image.

HttpClient.Request.ResourceType gains a `.stylesheet` variant so CDP Network
events report the right type; cdp.fetch.zig switches updated.

Refs #2343
2026-05-19 15:50:11 +02:00
Navid EMAD
6ed41ea346 Add --enable-external-stylesheets flag (no-op surface)
Reserves the CLI flag and LP.configureLoading externalStylesheets field
so drivers can adopt the API before the fetch implementation lands in a
follow-up that depends on #2303.

The bool is intentionally unread in this PR. Mirrors the existing
--disable-subframes / --disable-workers plumbing; the CDP field extends
LP.configureLoading alongside subFrame and worker without breaking
existing callers.

Refs #2343
2026-05-19 15:50:11 +02:00
Karl Seguin
ed05a6b14f test thread safety
LogFilter isn't thread safe, so setting it in a test where the log filter is
read from another thread triggers TSAN. LogFilter.deinit now waits until
the server has no active threads.
2026-05-19 21:26:53 +08:00
Karl Seguin
875c147783 Main/Network reads CDP socket
Previously, the CDP socket was added to the worker's multi and fully owned
by the worker. While this is simple, it introduced some issues:

1 - Cannot detect a disconnected client during JS processing ( for(;;) )

2 - A blocked worker can cause back-pressure that blocks the client. This can
    cause a deadlock if the worker is blocked waiting for a CDP message

In addition to these 2 problems, there was 1 other serious CDP-related issue:
arbitrary CDP messages could be processed during a JavaScript callback. For
example, a Worker calls importScripts while request interception is enabled.
This requires us to tick the HttpClient while waiting for the interception
response. But a client could send Target.closeTarget, which we'd process and
delete the frame, all while importScripts is still blocked. Assuming
importScripts unblocks, everything is a big UAF, since the frame (and its
workers) were cleared by closeTarget.

The CDP socket is now read from the network (main) thread and an OTP-style
mailbox is used. The network thread posts messages to the Worker's inbox and
signals it to wake up. This solves problems 1 and 2. It doesn't directly solve
the reentrancy issue, but it provides the foundation. Specifically, it
introduces a queue of CDP messages and more control over when/how that queue is
processed. At "safe points" (Runner.tick, HttpClient.tick), any message can
be processed. But when inside a JavaScript callback, we can process only
non-destructive, non-mutating messages. Specifically, we can process only
messages related to request interception.
2026-05-19 20:52:21 +08:00
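The mailbox and "safe point" filtering described in the commit above can be sketched as follows. This is a rough Python model, not the project's implementation: the `Inbox` class and its methods are hypothetical, and treating exactly these three CDP `Fetch` methods as "related to request interception" is an assumption made for illustration.

```python
import threading
from collections import deque
from typing import Deque, List

# Assumption: the set of messages that are safe to process inside a JS callback.
INTERCEPTION_METHODS = frozenset(
    {"Fetch.continueRequest", "Fetch.failRequest", "Fetch.fulfillRequest"}
)


class Inbox:
    """Per-worker mailbox filled by the network thread (hypothetical sketch)."""

    def __init__(self) -> None:
        self._messages: Deque[str] = deque()
        self._lock = threading.Lock()
        self.wakeup = threading.Event()

    def post(self, method: str) -> None:
        # Called from the network thread: enqueue, then signal the worker.
        with self._lock:
            self._messages.append(method)
        self.wakeup.set()

    def drain(self, safe_point: bool) -> List[str]:
        # Called from the worker. At a safe point every queued message is
        # returned; inside a JS callback only interception messages are,
        # and everything else stays queued in arrival order.
        ready: List[str] = []
        deferred: Deque[str] = deque()
        with self._lock:
            for method in self._messages:
                if safe_point or method in INTERCEPTION_METHODS:
                    ready.append(method)
                else:
                    deferred.append(method)
            self._messages = deferred
            if not deferred:
                self.wakeup.clear()
        return ready
```

In this model a `Target.closeTarget` that arrives while a callback is blocked is held back until the next safe point, while an interception response posted after it can still be delivered immediately.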
Adrià Arrufat
19e3e7b74e agent: simplify spinner and remove tool failure state 2026-05-19 13:16:51 +02:00
Karl Seguin
e61eddf956 Merge pull request #2493 from lightpanda-io/nikneym/fix-injection-through-authority
`URL.zig`: fix NUL/CR/LF/TAB character injection through authority
2026-05-19 19:12:16 +08:00
Adrià Arrufat
e0af9c4168 refactor: unify tool results and rename CommandExecutor
Unifies tool outcomes into a `ToolResult` struct, replacing `EvalResult`.
Renames `CommandExecutor` to `CommandRunner` and simplifies error handling.
2026-05-19 13:01:39 +02:00
Halil Durak
64a3f3edd7 URL.zig: update tests 2026-05-19 13:55:34 +03:00
Adrià Arrufat
9eed73434c agent: make isHealAllowed exhaustive 2026-05-19 12:09:41 +02:00
Marc Helbling
a89a28a4a2 feat: add --json to fetch command
The `fetch` command is a practical way to render pages without needing a
long-running browser instance.
However, it masks all details of the fetch, most importantly the HTTP status code.
This is a big caveat when using `lightpanda fetch` in a pipeline.

This introduces a `--json` option to provide a structured output that
contains:
* url
* HTTP status code
* response headers
* rendered content as controlled by the `--dump` option

The proposal is to always output the same JSON format even when not
using `--dump` with an option.
2026-05-19 12:08:23 +02:00
Adrià Arrufat
74ba2fb6bd agent: reuse ToolExecutor.buildTools in SlashCommand tests 2026-05-19 12:03:50 +02:00