Commit Graph

106 Commits

Author SHA1 Message Date
Adrià Arrufat
88c8828224 Merge branch 'main' into agent 2026-06-05 12:43:46 +02:00
Adrià Arrufat
ac82551e66 agent: replace -i flag with /save and /load commands
Removes the `-i`/`--interactive` CLI flag and live file-based
recording. Instead, the REPL now supports a `/load <path>` command
to run scripts from disk, and `/save` to export the in-memory
session recording.

The `Recorder` is simplified to be purely in-memory, and the script
runtime is moved to `src/script/Runtime.zig`.

BREAKING CHANGE: The `-i`/`--interactive` flag has been removed. Use
the `/save` and `/load` commands within the REPL instead.
2026-06-03 16:21:10 +02:00
Karl Seguin
1e2e283c0c Add canary to RC
We still occasionally see release overflow errors. I think this is a UAF, since
I cannot find any mismatched acquire/releases. So this adds a canary value to
every RC and includes it in the crash dump.

If the canary value is different than what we set it at, then we have a UAF.
If the canary value is set to the poison we set on release, then we have a
double free.
If the canary value is unchanged, then it's inconclusive.
2026-06-03 18:42:24 +08:00
Adrià Arrufat
044b34d228 refactor: remove legacy PandaScript self-healing and execution
Removes the `--self-heal` CLI option, the `scriptStep` and `scriptHeal`
MCP tools, and associated verification/iterator machinery. Replaces
"PandaScript" terminology with "slash commands" and moves shared
helpers to `tools.zig`.

BREAKING CHANGE: The `--self-heal` CLI flag and the `scriptStep` and
`scriptHeal` MCP tools have been removed.
2026-06-01 17:29:24 +02:00
Francis Bouvier
8824174afa agent: run recorded scripts as JavaScript
Replace recorded agent replay with a standalone JavaScript script runtime.
Install synchronous agent primitives in an isolated V8 context, add console
output, return structured extract values as JS objects/arrays, and route script
execution through the new runtime.

Update recording to emit .js function calls, default /save filenames to .js,
drop script-level extract save support, and refresh agent docs/tutorials for
the new format.

Self-healing is disabled for now.
2026-06-01 15:43:37 +02:00
Karl Seguin
e6332ac121 Close session before freeing notification
With the new CookieStore, the session must be freed before the notification is.
This is how it works in CDP, but in fetch, we were pretty lazy about it. This
caused the notification to be freed first, and then the cookiestore to try to
unregister: UAF.
2026-05-30 08:44:35 +08:00
Adrià Arrufat
4bc5baf7ee agent: improve command diagnostics and fix various bugs
- Add diagnostics to command parsing to report invalid field values.
- Fix UTF-8 truncation in spinner to avoid splitting codepoints.
- Fix swapped arguments in log error formatting.
- Distinguish "null" values from missing elements in verifier.
- Handle MCP tool cancellation and timeouts with specific error codes.
2026-05-25 17:51:46 +02:00
Adrià Arrufat
989932b40a Merge branch 'main' into agent 2026-05-20 14:57:32 +02:00
Karl Seguin
e7b16983bb Add help documentation + small tweaks
This adds help documentation for the --json flag. This is the only thing that
must be kept from this commit.

It uses our testing.zig to streamline the json testing (instead of string
probing). It removes the JsonEnvelope in favor of an anonymous struct (though,
I'm fine with adding it back in if it's needed to resolve something ambiguous).

Finally, I removed the last unit test as, at that point, it's really just
testing Zig's JSON stringifier (could arguably make the same case for the other
two, but there's some logic there about how nulls/empty might be handled).
2026-05-20 10:44:41 +08:00
Marc Helbling
fec2bbda7b review 2026-05-19 18:13:34 +02:00
Marc Helbling
a89a28a4a2 feat: add --json to fetch command
The `fetch` command is very practical to render pages without needing to
have a long running browser instance.
It is however masking all details on the fetch, most importantly the HTTP status code.
This is a big caveat when leveraging `lightpanda fetch` in a pipeline.

This introduces a `--json` option to provide a structured output that
contains:
* url
* HTTP status code
* response headers
* rendered content as controlled by the `--dump` option

The proposal is to always output the same JSON format even when not
using `--dump` with an option.
2026-05-19 12:08:23 +02:00
Adrià Arrufat
d0a8da453b agent: implement graceful Ctrl-C interruption 2026-05-15 18:45:56 +02:00
Adrià Arrufat
eb1986b8eb Merge branch 'main' into agent 2026-05-15 10:51:45 +02:00
Karl Seguin
a59ddeb360 Dump using the latest Frame to prevent segfault during on frame change
Fixes: https://github.com/lightpanda-io/browser/issues/2446
2026-05-14 20:00:20 +08:00
Adrià Arrufat
70edee7063 lightpanda: close session before notification deinit 2026-05-13 17:49:00 +02:00
Adrià Arrufat
c41955ade3 script: move script module to root src directory 2026-05-11 19:56:36 +02:00
Adrià Arrufat
1a84f56160 refactor: relocate PandaScript and improve agent reliability
Moves script logic to `browser/script/` for shared use. Implements
message rollback on API failure, caches environment variables, and
fixes a potential panic in the spinner.

- Relocate Command, Recorder, and Verifier to `src/browser/script/`
- Implement message rollback on API and synthesis failures in Agent
- Cache `LP_*` environment variables process-wide with mutex protection
- Fix potential panic in Spinner during backward clock jumps
- Improve Recorder to handle write failures and multi-line comments
- Update documentation regarding attachments and path safety
2026-05-11 19:49:15 +02:00
Adrià Arrufat
75b7a8ec6d Merge branch 'main' into agent 2026-05-11 17:52:35 +02:00
Halil Durak
246f91d1f8 --inject-script: prefer splice bytes into <head> directly 2026-05-11 15:15:35 +03:00
Halil Durak
c566d0c41c introduce --inject-script and --inject-script-file
* Prefer `--inject-*` prefix.
* Support injecting multiple scripts (also allows using both variants together).
* Instead of executing scripts in JS context, actually insert them to `<head>` for correct dump output.
2026-05-11 15:15:35 +03:00
Halil Durak
39f12a5669 fetch: add support for --script option
Allows passing a JS file as an arg to be executed alongside other scripts.
2026-05-11 15:15:35 +03:00
Adrià Arrufat
c6ccd83ac4 mcp: add pandascript recording and self-healing tools
Adds tools to record sessions and heal scripts over MCP. Refactors
shared logic to `script.zig` and adds a TTY spinner for the agent.
2026-05-07 20:11:40 +02:00
Adrià Arrufat
a00161f54c Merge branch 'main' into agent 2026-05-05 08:05:57 +02:00
Nikolay Govorov
9a312a4177 Refactor server/client/cdp structure 2026-05-04 16:41:22 +01:00
Adrià Arrufat
1d7e92beeb Merge branch 'main' into agent 2026-04-27 06:40:15 +02:00
Karl Seguin
12c2efb811 Adds --terminate-ms command line argument + ctrl-c improvements in fetch
The main.zig path for `fetch` now captures the *Browser so that
browser.env.terminate() can be called. This is a bit more complex than the serve
path because the Browser owns the Isolate and can't be moved from one thread to
another.

With main having access to the browser, two things are now possible:
1 - We can support a --terminate-ms flag (https://github.com/lightpanda-io/browser/issues/2206)
2 - ctrl-c can correctly stop blocked JavaScript processes

1 is implemented via setitimer to set a timer for SIGALRM, avoiding the need to
add another "watcher" thread, or putting a timer in Network.run.
2026-04-25 12:34:06 +08:00
Adrià Arrufat
ec3ff945cd Merge branch 'main' into agent 2026-04-23 12:31:24 +02:00
Karl Seguin
550fb58f3f Introduce Page (container)
Follow up to https://github.com/lightpanda-io/browser/pull/2200

This change is actually pretty mundane, but a bunch of files that used to
take a *Session (e.g. every WebAPI releaseRef and deinit) now take a *Page.

This aims to separate the 2 lifetimes currently managed by Session by moving
the "Page" lifetime to a dedicated container: Page. Ultimately, the goal is to
remove the 1-page-per-session limit of the current design. Not to explicitly
support multiple pages per session (though, that's more possible now), but
in order to better emulate Chrome where, during a navigation event, the old and
new page both exist.
2026-04-23 15:48:13 +08:00
Adrià Arrufat
58308cfdca Merge branch 'main' into agent 2026-04-22 07:13:04 +02:00
Karl Seguin
2275416505 Page -> Frame
This is to pave the way for introducing a new "Page" container, which will take
over the page lifecycle currently burdening Session. The ultimate goal of that
is to allow the Session to have multiple pages (mostly for better transitions
between pages), which is hard to do now since the Session has so much state.

This rename was aggressive, e.g. currentPage() -> currentFrame() so that, when
the new Page container is added, you won't see "currentPage()" and wonder:

  "Does 'currentPage' mean the new Page container, or the Frame (which
  used to be called Page)".
2026-04-22 08:42:18 +08:00
Adrià Arrufat
efc2b5d236 Merge branch 'main' into agent 2026-04-20 09:37:36 +02:00
Karl Seguin
2d20e57f80 Change all @import("...../log.zig") to const log = lp.log;
@import("lightpanda") where needed.

Would also like to do this for String, Page, Session and js which all stand out
as types that are use across the codebase.

I know that a few devs are doing this in new work and I haven't heard anyone
voice an objection.
2026-04-20 12:40:04 +08:00
Adrià Arrufat
2d808336df Merge branch 'main' into agent 2026-04-18 08:28:03 +02:00
Karl Seguin
e6a190c72d Improve Cookie parsing rules
This commit was focused on making small changes to cookie parsing in order to
improve a handful of WPT cases.

Protects against nameless cookies that have a value with certain prefixes (e.g.
`__Secure-`) which could be pased into an incorrect prefix.

It allows tabs in cookie names/values.

It enforces that certain prefix have certain settings.

Along the way:
- Added window.origin
- Ensure URL passed to Request is escaped
- Added log on fetch error
- made --dump wpt only list non-passing results
2026-04-17 19:12:19 +08:00
Adrià Arrufat
49e4ed6fc2 Merge branch 'main' into agent 2026-04-17 09:19:46 +02:00
Karl Seguin
3ca1f230b9 Serialize sameSite
Tweak ergonomics (public functions log internally and are infallible). Use
readFileAlloc directly. Fix possible memory leak with cookie arena - I don't
think you can make a copy of the arena, and then dupe with the original.
2026-04-16 10:35:34 +08:00
Pierre Tachoire
a24fcc6a5c use session arg to load cookies from file 2026-04-15 10:29:53 -04:00
Pierre Tachoire
cc4bd417d2 save cookies at the end of fetch 2026-04-15 10:09:14 -04:00
Matt Van Horn
35991a1b32 refactor: split --cookies-file into --cookie/--cookie-jar per curl convention
Split the single --cookies-file flag into two flags following curl's
convention as requested by @krichprollsch:

- --cookie (read-only): loads cookies at startup for fetch, mcp, and
  serve/CDP commands
- --cookie-jar (write-only): saves cookies on exit for fetch and mcp
  only (CDP cookie-jar deferred per maintainer guidance)

Add cookie integration to MCP server (load in init, save in deinit)
and CDP session creation (load only). The serve command now rejects
--cookie-jar with a clear error message.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-15 10:09:13 -04:00
Matt Van Horn
4d384dfe01 feat: add --cookies-file flag for session persistence
Add a --cookies-file CLI option that loads cookies from a JSON file
at startup and saves them back on exit. This enables AI agents to
maintain login sessions across multiple Lightpanda invocations.

The cookie format matches CDP Network.Cookie (compatible with
Puppeteer's page.cookies() export):
  [{"name":"sid","value":"abc","domain":".example.com","path":"/",
    "expires":1234567890,"secure":true,"httpOnly":true}]

Closes #335

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-15 10:09:12 -04:00
Adrià Arrufat
b99fd47c9d Merge branch 'main' into agent 2026-04-10 17:16:29 +02:00
Karl Seguin
05229fdc53 Use the document's charset to determine if/how to encode querystring
Whenever we resolve a URL, say from `anchor.href`, we should consider the
document's charset when encoding the querystring. This probably isn't the
most important feature, but it makes tens of thousands of WPT cases pass, e.g

/encoding/legacy-mb-tchinese/big5/big5-encode-href-errors-han.html?3001-4000 and
/encoding/legacy-mb-japanese/euc-jp/eucjp-encode-href-errors-han.html?17001-18000

DOM elements previous called `URL.resolveURL(...)`. They now call
`self.asNode().resolveURL(...)`, where `Node#resolveURL` will provide the
document's charset.
2026-04-10 16:47:42 +08:00
Adrià Arrufat
048cb46950 Refactor: extract shared tool execution layer from MCP and agent
Both MCP tools.zig and agent ToolExecutor.zig duplicated the same 20 tool
handlers. This extracts the shared logic into src/browser/tools.zig so new
tools only need to be added once. Reduces total code by ~620 lines while
keeping all 465 tests passing.
2026-04-07 08:05:36 +02:00
Adrià Arrufat
f7a4f1345f Merge branch 'main' into agent 2026-04-07 06:49:56 +02:00
Karl Seguin
de8a5eeec1 zig fmt 2026-04-07 07:28:31 +08:00
Karl Seguin
36d3be5534 add assertion on RC (to catch release overflow) 2026-04-07 07:28:31 +08:00
Adrià Arrufat
a81a24229b Add interactive agent mode with LLM-powered web browsing
Introduces `lightpanda agent` command that provides a REPL where users
can chat with an AI that uses the browser's tools (goto, markdown, click,
fill, etc.) to browse the web. Uses zenai for multi-provider LLM support
(Anthropic, OpenAI, Gemini) and linenoise v2 for terminal line editing.
2026-04-04 07:56:10 +02:00
Karl Seguin
77b60cebb0 Move finalizers to pure reference counting
Takes https://github.com/lightpanda-io/browser/pull/2024 a step further and
changes all reference counting to be explicit.

Up until this point, finalizers_callback was seen as a fail-safe to make sure
that instances were released no matter what. It exists because v8 might never
call a finalizer, so we need to keep track of finalizables and finalize them
on behalf of v8. BUT, it was used as more than a fallback for v8...it allowed
us to be lazy and acquireRef's in Zig without a matching releaseRef (1), because
why not, the finalizer_callback will handle it.

This commit redefines finalizer_callbacks as strictly being a fallback for v8.
If v8 calls the finalizer, then the finalizer callback is removed (2) - we lose
our fail-safe. This means that every acquireRef must be matched with a
releaseRef. Everything is explicit now. The most obvious impact of this is
that on Page.deinit, we have to releaseRef every MO, IO and blob held by the
page.

This change removes a number of special-cases to deal with various ownership
patterns. For example, Iterators are now properly reference counted and when their
RC reaches 0, they can safely releaseRef on their list. This also elimites
use-after-free potential when 2 RC objects reference each other. This should
eliminate some WPT crashes (e.g. /editing/run/insertimage.html)

(1) - We were only ever lazy about releaseRef during shutdown, so this change
won't result in more aggressive collection.

(2) Since 1 object can be referenced from 0-N IsolatedWorlds, it would be more
accurate to say that the finalizer callback is removed when all referencing
IsolatedWorld finalize it.
2026-04-02 17:04:33 +08:00
Karl Seguin
0604056f76 Improve network naming consistency
1.
Runtime.zig -> Network.zig (especially since most places imported it as
`const Network = @import("Runtime.zig")`

2.
const net_http = @import(...) -> const http = @import(...)
2026-04-01 18:46:03 +08:00
Karl Seguin
4ad8282e75 Merge pull request #2047 from lightpanda-io/fancy-wait
Add --wait-selector, --wait-script and --wait-script-file options to …
2026-03-31 20:59:48 +08:00