browser

mirror of https://github.com/lightpanda-io/browser.git synced 2026-06-11 09:35:59 -04:00

Author	SHA1	Message	Date
Karl Seguin	12c2efb811	Adds --terminate-ms command line argument + ctrl-c improvements in fetch The main.zig path for `fetch` now captures the *Browser so that browser.env.terminate() can be called. This is a bit more complex than the serve path because the Browser owns the Isolate and can't be moved from one thread to another. With main having access to the browser, two things are now possible: 1 - We can support a --terminate-ms flag (https://github.com/lightpanda-io/browser/issues/2206) 2 - ctrl-c can correctly stop blocked JavaScript processes 1 is implemented via setitimer to set a timer for SIGALRM, avoiding the need to add another "watcher" thread, or putting a timer in Network.run.	2026-04-25 12:34:06 +08:00
Karl Seguin	550fb58f3f	Introduce Page (container) Follow up to https://github.com/lightpanda-io/browser/pull/2200 This change is actually pretty mundane, but a bunch of files that used to take a Session (e.g. every WebAPI releaseRef and deinit) now take a Page. This aims to separate the 2 lifetimes currently managed by Session by moving the "Page" lifetime to a dedicated container: Page. Ultimately, the goal is to remove the 1-page-per-session limit of the current design. Not to explicitly support multiple pages per session (though, that's more possible now), but in order to better emulate Chrome where, during a navigation event, the old and new page both exist.	2026-04-23 15:48:13 +08:00
Karl Seguin	2275416505	Page -> Frame This is to pave the way for introducing a new "Page" container, which will take over the page lifecycle currently burdening Session. The ultimate goal of that is to allow the Session to have multiple pages (mostly for better transitions between pages), which is hard to do now since the Session has so much state. This rename was aggressive, e.g. currentPage() -> currentFrame() so that, when the new Page container is added, you won't see "currentPage()" and wonder: "Does 'currentPage' mean the new Page container, or the Frame (which used to be called Page)".	2026-04-22 08:42:18 +08:00
Karl Seguin	2d20e57f80	Change all @import("...../log.zig") to const log = lp.log; @import("lightpanda") where needed. Would also like to do this for String, Page, Session and js which all stand out as types that are used across the codebase. I know that a few devs are doing this in new work and I haven't heard anyone voice an objection.	2026-04-20 12:40:04 +08:00
Karl Seguin	e6a190c72d	Improve Cookie parsing rules This commit was focused on making small changes to cookie parsing in order to improve a handful of WPT cases. Protects against nameless cookies that have a value with certain prefixes (e.g. `__Secure-`) which could be parsed into an incorrect prefix. It allows tabs in cookie names/values. It enforces that certain prefixes have certain settings. Along the way: - Added window.origin - Ensure URL passed to Request is escaped - Added log on fetch error - made --dump wpt only list non-passing results	2026-04-17 19:12:19 +08:00
Karl Seguin	3ca1f230b9	Serialize sameSite Tweak ergonomics (public functions log internally and are infallible). Use readFileAlloc directly. Fix possible memory leak with cookie arena - I don't think you can make a copy of the arena, and then dupe with the original.	2026-04-16 10:35:34 +08:00
Pierre Tachoire	a24fcc6a5c	use session arg to load cookies from file	2026-04-15 10:29:53 -04:00
Pierre Tachoire	cc4bd417d2	save cookies at the end of fetch	2026-04-15 10:09:14 -04:00
Matt Van Horn	35991a1b32	refactor: split --cookies-file into --cookie/--cookie-jar per curl convention Split the single --cookies-file flag into two flags following curl's convention as requested by @krichprollsch: - --cookie (read-only): loads cookies at startup for fetch, mcp, and serve/CDP commands - --cookie-jar (write-only): saves cookies on exit for fetch and mcp only (CDP cookie-jar deferred per maintainer guidance) Add cookie integration to MCP server (load in init, save in deinit) and CDP session creation (load only). The serve command now rejects --cookie-jar with a clear error message. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-15 10:09:13 -04:00
Matt Van Horn	4d384dfe01	feat: add --cookies-file flag for session persistence Add a --cookies-file CLI option that loads cookies from a JSON file at startup and saves them back on exit. This enables AI agents to maintain login sessions across multiple Lightpanda invocations. The cookie format matches CDP Network.Cookie (compatible with Puppeteer's page.cookies() export): [{"name":"sid","value":"abc","domain":".example.com","path":"/", "expires":1234567890,"secure":true,"httpOnly":true}] Closes #335 Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-15 10:09:12 -04:00
Karl Seguin	05229fdc53	Use the document's charset to determine if/how to encode querystring Whenever we resolve a URL, say from `anchor.href`, we should consider the document's charset when encoding the querystring. This probably isn't the most important feature, but it makes tens of thousands of WPT cases pass, e.g /encoding/legacy-mb-tchinese/big5/big5-encode-href-errors-han.html?3001-4000 and /encoding/legacy-mb-japanese/euc-jp/eucjp-encode-href-errors-han.html?17001-18000 DOM elements previously called `URL.resolveURL(...)`. They now call `self.asNode().resolveURL(...)`, where `Node#resolveURL` will provide the document's charset.	2026-04-10 16:47:42 +08:00
Karl Seguin	de8a5eeec1	zig fmt	2026-04-07 07:28:31 +08:00
Karl Seguin	36d3be5534	add assertion on RC (to catch release overflow)	2026-04-07 07:28:31 +08:00
Karl Seguin	77b60cebb0	Move finalizers to pure reference counting Takes https://github.com/lightpanda-io/browser/pull/2024 a step further and changes all reference counting to be explicit. Up until this point, finalizers_callback was seen as a fail-safe to make sure that instances were released no matter what. It exists because v8 might never call a finalizer, so we need to keep track of finalizables and finalize them on behalf of v8. BUT, it was used as more than a fallback for v8...it allowed us to be lazy and acquireRef's in Zig without a matching releaseRef (1), because why not, the finalizer_callback will handle it. This commit redefines finalizer_callbacks as strictly being a fallback for v8. If v8 calls the finalizer, then the finalizer callback is removed (2) - we lose our fail-safe. This means that every acquireRef must be matched with a releaseRef. Everything is explicit now. The most obvious impact of this is that on Page.deinit, we have to releaseRef every MO, IO and blob held by the page. This change removes a number of special-cases to deal with various ownership patterns. For example, Iterators are now properly reference counted and when their RC reaches 0, they can safely releaseRef on their list. This also eliminates use-after-free potential when 2 RC objects reference each other. This should eliminate some WPT crashes (e.g. /editing/run/insertimage.html) (1) - We were only ever lazy about releaseRef during shutdown, so this change won't result in more aggressive collection. (2) Since 1 object can be referenced from 0-N IsolatedWorlds, it would be more accurate to say that the finalizer callback is removed when all referencing IsolatedWorlds finalize it.	2026-04-02 17:04:33 +08:00
Karl Seguin	0604056f76	Improve network naming consistency 1. Runtime.zig -> Network.zig (especially since most places imported it as `const Network = @import("Runtime.zig")` 2. const net_http = @import(...) -> const http = @import(...)	2026-04-01 18:46:03 +08:00
Karl Seguin	4ad8282e75	Merge pull request #2047 from lightpanda-io/fancy-wait Add --wait-selector, --wait-script and --wait-script-file options to …	2026-03-31 20:59:48 +08:00
Karl Seguin	af19cbc726	use remaining instead of timeout_ms, else, what's the point? Co-authored-by: Adrià Arrufat <1671644+arrufat@users.noreply.github.com>	2026-03-31 19:42:32 +08:00
Karl Seguin	492fd86bad	Expand the lifetime of the XHR reference We need to take the self-reference to the XHR object as soon as the request is made. Previously, we were waiting until we got the start callback, but v8 could (and does) drop the reference before that happens. Unfortunately, that means we can no longer use _transfer == null to tell if we own a reference or not, so a new boolean was added.	2026-03-31 16:46:31 +08:00
Karl Seguin	ab6c63b24b	Add --wait-selector, --wait-script and --wait-script-file options to fetch These new optional parameters run AFTER --wait-until, allowing the (imo) useful combination of `--wait-until load --wait-script "report.complete === true"`. However, if `--wait-until` IS NOT specified but `--wait-selector/script` IS, then there is no default wait and it'll just check the selector/script. If neither `--wait-selector` nor `--wait-script/--wait-script-file` are specified then `--wait-until` continues to default to `done`. These waiters were added to the Runner, and the existing Action.waitForSelector now uses the runner's version. Selector querying has been split into distinct parse and query functions, so that we can parse once, and query on every tick. We could potentially optimize --wait-script to compile the script once and call it on each tick, but we'd have to detect page navigation to recompile the script in the new context. Something I'd rather optimize separately.	2026-03-31 12:30:46 +08:00
Karl Seguin	ad54437ca3	zig fmt	2026-03-28 21:43:46 +08:00
Karl Seguin	01ecb296e5	Rework finalizers This commit involves a number of changes to finalizers, all aimed towards better consistency and reliability. A big part of this has to do with v8::Inspector's ability to move objects across IsolatedWorlds. There have been a few previous efforts on this, the most significant being https://github.com/lightpanda-io/browser/pull/1901. To recap, a Zig instance can map to 0-N v8::Objects. Where N is the total number of IsolatedWorlds. Generally, IsolatedWorlds between origins are...isolated...but the v8::Inspector isn't bound by this. So a Zig instance cannot be tied to a Context/Identity/IsolatedWorld...it has to live until all references, possibly from different IsolatedWorlds, are released (or the page is reset). Finalizers could previously be managed via reference counting or explicitly toggling the instance as weak/strong. Now, only reference counting is supported. weak/strong can essentially be seen as an acquireRef (rc += 1) and releaseRef (rc -= 1). Explicit setting did make some things easier, like not having to worry so much about double-releasing (e.g. XHR abort being called multiple times), but it was only used in a few places AND it simply doesn't work with objects shared between IsolatedWorlds. It is never a boolean now, as 3 different IsolatedWorlds can each hold a reference. Temps and Globals are tracked on the Session. Previously, they were tracked on the Identity, but that makes no sense. If a Zig instance can outlive an Identity, then any of its Temp references can too. This hasn't been a problem because we've only seen MutationObserver and IntersectionObserver be used cross-origin, but the right CDP script can make this crash with a use-after-free (e.g. `MessageEvent.data` is released when the Identity is done, but `MessageEvent` is still referenced by a different IsolatedWorld). Rather than deinit with a `comptime shutdown: bool`, there is now an explicit `releaseRef` and `deinit`.
Bridge registration has been streamlined. Previously, types had to register their finalizer AND acquireRef/releaseRef/deinit had to be declared on the entire prototype chain, even if these methods just delegated to their proto. Finalizers are now automatically enabled if a type has an `acquireRef` function. If a type has an `acquireRef`, then it must have a `releaseRef` and a `deinit`. So if there's custom cleanup to do in `deinit`, then you also have to define `acquireRef` and `releaseRef` which will just delegate to the _proto. Furthermore these finalizer methods can be defined anywhere on the chain. Previously: ```zig const KeyboardEvent = struct { _proto: Event, ... pub fn deinit(self: KeyboardEvent, session: Session) void { self._proto.deinit(session); } pub fn releaseRef(self: KeyboardEvent, session: Session) void { self._proto.releaseRef(session); } } ``` Now: ```zig const KeyboardEvent = struct { _proto: Event, ... // no deinit, releaseRef, acquireRef } ``` Since the `KeyboardEvent` doesn't participate in finalization directly, it doesn't have to define anything. The bridge will detect the most specific place they are defined and call them there.	2026-03-28 21:11:23 +08:00
Adrià Arrufat	7e778a17d6	MCP/CDP: unify node registration This fixes a bug in MCP where interactive elements were not assigned a backendNodeId, preventing agents from clicking or filling them. Also extracts link collection to a shared browser module.	2026-03-26 23:51:43 +09:00
Adrià Arrufat	260768463b	Merge branch 'main' into osc/feat-mcp-detect-forms	2026-03-24 09:25:47 +09:00
Karl Seguin	c9bc370d6a	Extract Session.wait into a Runner This is done for a couple reasons. The first is just to have things a little more self-contained for eventually supporting more advanced "wait" logic, e.g. waiting for a selector. The other is to provide callers with more fine-grained control. Specifically the ability to manually "tick", so that they can [presumably] do something after every tick. This is needed by the test runner to support more advanced cases (cases that need to test beyond 'load') and it also improves (and fixes a potential use-after-free in) lp.waitForSelector	2026-03-23 12:30:41 +08:00
Matt Van Horn	78c6def2b1	mcp: add detectForms tool for structured form discovery Add a detectForms MCP tool and lp.detectForms CDP command that return structured form metadata from the current page. Each form includes its action URL, HTTP method, and fields with names, types, required status, values, select options, and backendNodeIds for use with the fill tool. This lets AI agents discover and fill forms in a single step instead of calling interactiveElements, filtering for form fields, and guessing which fields belong to which form. New files: - src/browser/forms.zig: FormInfo/FormField structs, collectForms() Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-03-21 08:40:50 -07:00
Karl Seguin	a4cb5031d1	Tweak wait_until option Small tweaks to https://github.com/lightpanda-io/browser/pull/1896 Improve the wait ergonomics with an Option with default parameter. Revert page pointer logic to original (don't think that change was necessary).	2026-03-19 20:29:20 +08:00
shaewe180	09327c3897	feat: fetch add wait_until parameter for page loads options Add `--wait_until` and `--wait_ms` CLI arguments to configure session wait behavior. Updates `Session.wait` to evaluate specific page load states (`load`, `domcontentloaded`, `networkidle`, `fixed`) before completing the wait loop.	2026-03-18 15:08:51 +08:00
Adrià Arrufat	32f450f803	browser: centralize node interaction logic Extracts click, fill, and scroll logic from CDP and MCP domains into a new dedicated actions module to reduce code duplication.	2026-03-16 14:22:15 +09:00
Adrià Arrufat	60699229ca	Merge branch 'main' into semantic-tree	2026-03-11 20:52:39 +09:00
Adrià Arrufat	6c7272061c	cli: enable pruning for semantic_tree_text dump mode Previously, semantic_tree_text hardcoded prune = false, which bypassed the structural node filters and allowed empty none nodes to pollute the root of the text dump.	2026-03-11 10:38:12 +09:00
Nikolay Govorov	3626f70d3e	Merge pull request #1759 from lightpanda-io/wp/mrdimidum/net-poll-runtime Network poll runtime	2026-03-10 23:38:07 +00:00
Adrià Arrufat	d1ee0442ea	Merge branch 'main' into semantic-tree	2026-03-10 21:48:49 +09:00
Adrià Arrufat	56f47ee574	Merge branch 'main' into semantic-tree	2026-03-10 17:26:34 +09:00
egrs	74f0436ac7	merge main, resolve conflicts with getInteractiveElements	2026-03-10 09:25:12 +01:00
egrs	22d31b1527	add LP.getStructuredData CDP command	2026-03-10 09:19:51 +01:00
Nikolay Govorov	687f577562	Move accept loop to common runtime	2026-03-10 03:00:50 +00:00
Nikolay Govorov	8e59ce9e9f	Prepare global NetworkRuntime module	2026-03-10 03:00:47 +00:00
egrs	a417c73bf7	add LP.getInteractiveElements CDP command Returns a structured list of all interactive elements on a page: buttons, links, inputs, ARIA widgets, contenteditable regions, and elements with event listeners. Includes accessible names, roles, listener types, and key attributes. Event listener introspection (both addEventListener and inline handlers) is unique to LP — no other browser exposes this to automation code.	2026-03-09 19:46:12 +01:00
Adrià Arrufat	85ebbe8759	SemanticTree: improve accessibility tree and name calculation - Add more structural roles (banner, navigation, main, list, etc.). - Implement fallback for accessible names (SVG titles, image alt text). - Skip children for leaf-like semantic nodes to reduce redundancy. - Disable pruning in the default semantic tree view.	2026-03-09 21:04:47 +09:00
Adrià Arrufat	3c97332fd8	feat(dump): add semantic_tree and semantic_tree_text formats Adds support for dumping the semantic tree in JSON or text format via the --dump option. Updates the Config enum and usage help.	2026-03-09 18:23:52 +09:00
Adrià Arrufat	248851701f	Refactor: move SemanticTree to core and expose via MCP tools	2026-03-06 15:44:03 +09:00
Adrià Arrufat	0f46277b1f	CDP: implement LP.getSemanticTree for native semantic DOM extraction	2026-03-06 15:29:32 +09:00
Adrià Arrufat	982b8e2d72	mcp: remove redundant mcp from test references	2026-03-02 22:24:17 +09:00
Adrià Arrufat	da51cdd11d	Merge branch 'main' into mcp	2026-03-02 11:55:36 +09:00
Adrià Arrufat	aae9a505e0	mcp: promote Server.zig to file struct	2026-02-28 21:02:49 +09:00
Karl Seguin	45196e022b	Add a "wpt" dump mode Adds a not-documented "wpt" mode to --dump which outputs a formatted report.cases. This is meant to make working on a single WPT test case easier, particularly with some coding tool. Claude recommended this output for its own use. Instead of telling claude to start the browser in serve mode, then run the wptrunner, and merge the two outputs (and then stop the server), you can do: zig build run -- fetch --dump wpt "http://localhost:8000/dom/nodes/CharacterData-appendChild.html" (you still need the wpt server up)	2026-02-28 19:08:58 +08:00
Karl Seguin	21be3db51f	Callers to page.navigate ensure URL is properly encoded. Follow up to https://github.com/lightpanda-io/browser/pull/1646 The encodeURL (renamed to ensureEncoded and exposed in this commit) already handled already-encoded URLs, so this was largely a matter of exposing the functionality. The reason this isn't baked directly into Page.navigate is that, in some places e.g. internal navigation, the URL is already known to be encoded. So it's up to every caller to make sure they are passing a valid URL to navigate.	2026-02-26 12:22:06 +08:00
Adrià Arrufat	8c8a05b8c1	mcp: consolidate tests and cleanup imports	2026-02-26 00:02:49 +09:00
Karl Seguin	a818560344	Add a --with_frames argument to fetch When set (defaults to not set/false), --dump will include iframe contents. I was hoping I could add a mode to strip_mode to this, but since dump is used extensively (e.g. innerHTML), this is something that has to be off by default (for correctness).	2026-02-25 15:29:27 +08:00
Adrià Arrufat	5fea4cf760	mcp: add protocol and router unit tests	2026-02-22 23:15:45 +09:00

72 Commits