browser

mirror of https://github.com/lightpanda-io/browser.git synced 2026-06-11 17:46:32 -04:00

Author	SHA1	Message	Date
Nikolay Govorov	ceee9cfb10	Move Handles stuff to Network	2026-05-26 15:28:54 +01:00
Karl Seguin	e4171bc694	Merge pull request #2501 from lightpanda-io/remove_reentrency_teardown_protection Remove reentrency teardown protection	2026-05-21 10:15:26 +08:00
Pierre Tachoire	6e6b3caf96	Merge pull request #2479 from navidemad/accessibility-query-ax-tree Implement Accessibility.queryAXTree CDP method (and fix latent frame-binding bug)	2026-05-20 13:59:35 +02:00
Karl Seguin	a9cf87e0b0	Remove reentrency teardown protection This largely reverts `92607ad765` (captured in PR: https://github.com/lightpanda-io/browser/pull/2398). https://github.com/lightpanda-io/browser/pull/2495 introduces protection against executing arbitrary CDP commands during JavaScript callbacks. Claude initially made the case for keeping the existing code as a safety net, but sycophanted when I pushed back. My reason for removing it is that it isn't a low-maintenance guard. It's a flag that serves a real purpose (ensuring 1 JS script is finished before executing another one), that has been extended to solve these issues. It needs to be set (and reverted) at every callsite that makes a blocking call, and it needs to be checked (recursively across all frames) in any place that can tear down the page/frame. Claude called the allowlist "load-bearing in a non-obvious way", but I think it's purpose-built specifically for this case. Extended the comment atop `allowDuringSyncWait` so that future-selves remember this.	2026-05-20 15:08:18 +08:00
Navid EMAD	814ca8ab3f	accessibility: unify query+tree writers, route objectId via dom.getNode Fold QueryWriter into Writer behind an Opts.filter. Tree mode is unchanged (filter=null); query mode walks the full subtree (including AX-ignored nodes per the queryAXTree spec) and emits the flat-match shape. Shared resolveRole helper handles label-promotion for both paths so the two can't drift. Drop the "objectId not yet supported" carve-out: queryAXTree now reuses dom.getNode, which already resolves nodeId/backendNodeId/objectId.	2026-05-19 20:43:54 +02:00
Karl Seguin	97c8ca3832	when work is done, don't keep polling, return to process it	2026-05-19 22:39:48 +08:00
Karl Seguin	875c147783	Main/Network reads CDP socket Previously, the CDP socket was added to the worker's multi and fully owned by the worker. While this is simple, it introduced some issues: 1 - Cannot detect a disconnected client during JS processing (for(;;)) 2 - A blocked worker can cause back-pressure that blocks the client. This can cause a deadlock if the worker is blocked waiting for a CDP message. In addition to these 2 problems, there was 1 other serious CDP-related issue: arbitrary CDP messages could be processed during a JavaScript callback. For example, a Worker calls importScripts while request interception is enabled; this requires us to tick the HttpClient waiting for the interception response. But a client could send Target.closeTarget, which we'd process and delete the frame... all while importScripts is still blocked. Assuming importScripts unblocks, everything is a big UAF since the frame (and its workers) were cleared from closeTarget. The CDP socket is now read from the network (main) thread and an OTP-style mailbox is used. The network thread posts messages to the Worker's inbox and signals it to wake up. This solves #1 and #2. It doesn't directly solve the reentrancy issue, but it provides the foundation. Specifically, it introduces a queue of CDP messages and more control over when/how that queue is processed. At "safe points" (Runner.tick, HttpClient.tick), any message can be processed. But, when inside a JavaScript callback, we can process only non-destructive/non-mutating messages. Specifically, we can process only messages related to request interception.	2026-05-19 20:52:21 +08:00
Karl Seguin	8ef6084fdb	Re-organization CDP connection network/WsConnection.zig was poorly named. It didn't represent a generic WS connection, but rather a CDP-specific connection. This splits the generic WS logic into network/WS.zig and the CDP-specific details in cdp/Connection.zig. Some of the connection management in the Server has also been simplified.	2026-05-19 10:08:22 +08:00
Karl Seguin	d926291241	Merge pull request #2467 from lightpanda-io/http_transfer Cleanup HttpClient.Transfer	2026-05-16 08:52:12 +08:00
Navid EMAD	b9601be45e	accessibility: bind AX writers to the node's owning frame axnodeWriter and axnodeQueryWriter both used session.currentFrame(), but the root node may belong to a different frame (cross-frame query from a parent context, iframe content). Name resolution (Label lookup against ownerDocument) and visibility checks (frame._style_manager) are per-frame, so the writer needs to bind to the node's owning frame. Uses the existing Node.ownerFrame(fallback) helper. Fallback is currentFrame for the orphan/detached-node case. Also corrects a pre-existing latent bug in getFullAXTree where the writer ignored the resolved frameId and used currentFrame instead.	2026-05-15 22:26:48 +02:00
Navid EMAD	5f2d897f16	accessibility: implement queryAXTree CDP method Adds Accessibility.queryAXTree for finding AX nodes by role and/or accessible name without serializing the full tree. Motivated by agentic / MCP automation workloads where getFullAXTree round-trips multi-MB JSON on complex pages. QueryWriter walks the DOM subtree rooted at the requested node, reuses the existing role + name resolution + ignore logic from AXNode, and emits a flat array of matches. Reuses the same VisibilityCache + LabelByForIndex + temp arena pattern as axnodeWriter, so no extra retained state. MVP limitations, each a small follow-up PR: - objectId param returns a specific not-yet-supported error - matches emit empty properties/childIds and no parentId - frameId not parsed; queries the current frame	2026-05-15 21:38:17 +02:00
Pierre Tachoire	3803a1f8c6	webmcp: use value.jsonStringify for JSON write	2026-05-15 11:17:53 +02:00
Pierre Tachoire	7c5a3b211f	cdp: cancel inflight webmcp invocation on bc deinit	2026-05-15 08:50:48 +02:00
Pierre Tachoire	5e0901aaf7	cdp: fix invalid arena usage in webmcp	2026-05-15 08:50:47 +02:00
Pierre Tachoire	3ef6e57d58	cdp: adjust invocation id usage for webmcp	2026-05-15 08:50:47 +02:00
Pierre Tachoire	c23d0f4f35	cdp: implement webMCP domain	2026-05-15 08:50:46 +02:00
Karl Seguin	a5162bea8f	Cleanup HttpClient.Transfer This is just moving fields around. The end result is that there's a `transfer.req` and a `transfer.res`. On the Request side, we used to have a nested `params: RequestParam` resulting in a lot of `transfer.req.params.url`. This is now `transfer.req.url`. On the Response side, we had the exact opposite: response fields splattered directly in the transfer, `transfer.response_header`. This is now `transfer.res.header`. There is now an HttpClient.Response, which is the actual final response (which could be for a transfer or something else, e.g. the cache). And an HttpClient.Transfer.Response which captures the inflight response data (and is one of the polymorphic variants of the HttpClient.Response). Probably still not ideal, but I'm not sure how to make it cleaner, and even if this is just an intermediary step, I consider it a small win.	2026-05-15 12:55:47 +08:00
Muki Kiboigo	ac863c7e2b	add Network.requestServedFromCache	2026-05-13 21:47:47 -07:00
Pierre Tachoire	198c4e5a0f	Merge pull request #2444 from lightpanda-io/useless-code cdp: remove dead code	2026-05-13 15:36:16 +02:00
Pierre Tachoire	36b55339cd	cdp: reset browser context arena when bc is removed	2026-05-13 11:26:09 +02:00
Pierre Tachoire	403fe0d293	cdp: remove dead code	2026-05-13 11:18:05 +02:00
Pierre Tachoire	854eb6a62d	Merge pull request #2339 from lightpanda-io/cdp-console cdp: implement Console	2026-05-13 08:28:01 +02:00
Karl Seguin	393141e472	pass arena into handlers (consistent with other handlers)	2026-05-13 11:51:59 +08:00
Karl Seguin	82a4fc752b	HttpClient Improvements 1 - Track owner of a request (for simpler / more accurate abort (TBD)) 2 - Create Transfer upfront, make everything work on Transfer (not Request) This helps remove ambiguity about cleanup and simplifies layers. For example Robots request is just another normal request, not a special case. This gives everything a stable address (the *Transfer which can be looked up by id)	2026-05-12 19:26:24 +08:00
Scott Taylor	92607ad765	Defer page teardown while worker scripts are evaluating Worker scripts can call importScripts(), which performs a synchronous HTTP request via HttpClient.syncRequest. To stay responsive during a long fetch, syncRequest pumps the CDP socket (cdp.blocking_read) while waiting. If a CDP message such as Target.closeTarget arrives on that socket mid-fetch, the previous code path tore down the page immediately: Worker JS -> importScripts -> syncRequest -> blocking_read -> CDP dispatch -> Target.closeTarget -> Session.removePage -> Page.deinit -> Frame.deinit -> Worker.deinit (frees worker arena + identity_map) When control unwound back into the worker's eval, the next operation that hit ctx.identity.identity_map.getOrPut dereferenced the freed metadata pointer and segfaulted (sometimes immediately, sometimes a few connections later as the arena got recycled). Reproducer: any URL that loads dedicated workers calling importScripts during initial eval, driven via puppeteer-core's connectOverCDP. The allbirds.com product page (which loads ~8 web-pixel workers each calling importScripts) reliably triggered it within ~10 connections. Session.removePage already deferred when the frame's own ScriptManager.is_evaluating was set; that guard never tripped because worker scripts don't go through the frame's ScriptManager. Fix: * Worker.loadInitialScript now sets the worker's own _worker_scope._script_manager.is_evaluating around the eval, with save/restore so nested worker evals compose correctly. * WorkerGlobalScope.importScript also sets its own _script_manager.is_evaluating around the syncRequest + runMacrotasks. The typical caller (Worker.loadInitialScript) already sets this around its outer eval, so the outer guard usually covers us; the inner mark is defense-in-depth for callers that reach importScripts() from a setTimeout / microtask outside the loadInitialScript scope. 
* New Frame.anyScriptEvaluating method walks the frame tree (frame ScriptManager + every worker's ScriptManager + child frames) and returns true if any is mid-eval. Session.removePage and CDP.disposeBrowserContext use this in place of the frame-only check, deferring teardown until all evals unwind. Final cleanup happens at CDP.deinit on connection close, matching the existing deferred-teardown contract. Verified by running the puppeteer-core repro back-to-back against a single Lightpanda serve; all returned 200 with the right title, no UAF crashes (was previously crashing within 1-10 runs). All 521 unit tests still pass. Note: a separate, pre-existing latent V8 issue surfaces under stress on this same code path. After many iterations a Runtime.evaluate promise tracked by V8's inspector PromiseHandlerTracker is discarded during garbage collection's first-pass weak callbacks; the discard sends a failure response which triggers v8::String::NewFromOneByte, hitting the debug-only assertion AllowHeapAllocation::IsAllowed() in heap-allocator-inl.h:79 (no allocations allowed during weak callbacks). This reproduces on a baseline build of this PR commit and on a baseline build of just the original two-line is_evaluating fix, i.e. it is not introduced by the deferral logic. The deferral makes it more visible because inspector callbacks now live longer before teardown, so they are more likely to be alive during a GC. Tracking this as a follow-up; the fix here still resolves the UAF that was crashing the server immediately.	2026-05-09 17:26:41 -04:00
Pierre Tachoire	d6c9a5fb83	cdp: add runtime.consoleAPICalled	2026-05-06 18:34:30 +02:00
Pierre Tachoire	595b774f1d	cdp: implement Console.messageAdded event	2026-05-06 18:34:29 +02:00
Nikolay Govorov	e5a9f8ba2e	Fix one more crash	2026-05-04 18:12:47 +01:00
Nikolay Govorov	9a312a4177	Refactor server/client/cdp structure	2026-05-04 16:41:22 +01:00
Pierre Tachoire	c3fe5346c2	cdp: add console enable/disable commands	2026-05-04 14:59:17 +02:00
Pierre Tachoire	080e1e6415	cdp: rename Audit into Audits	2026-05-04 12:42:55 +02:00
Pierre Tachoire	cddabe60f5	cdp: avoid request id conflict between LID- and REQ- Use distinct keys for loader id and request id based captured responses.	2026-05-04 08:59:53 +02:00
Pierre Tachoire	11172a341a	cdp: use loader_id as captured response key for documents	2026-05-04 08:59:50 +02:00
Navid EMAD	fd2f26a065	Merge remote-tracking branch 'origin/main' into fix-a3-handle-javascript-dialog	2026-04-29 00:57:03 +02:00
Muki Kiboigo	85a5c0f927	decrement intercepted and properly deinit on BrowserContext deinit	2026-04-28 07:01:43 -07:00
Muki Kiboigo	3db3281e8e	working authentication with InterceptionLayer	2026-04-28 07:01:40 -07:00
Muki Kiboigo	9c826159a0	crude InterceptionLayer	2026-04-28 07:01:40 -07:00
Navid EMAD	1d806475c4	page: make handleJavaScriptDialog drive confirm/prompt return values Page.handleJavaScriptDialog previously responded -32000 "No dialog is showing" regardless of whether a dialog was open, leaving CDP clients no way to influence the JS-side return value of confirm() / prompt(). PR #2085 wired up the Page.javascriptDialogOpening event but explicitly deferred the return-value override since true Chrome semantics require suspending V8 mid-execution. Add a pre-arm model that fits the auto-dismiss architecture without runtime suspension: handleJavaScriptDialog stashes {accept, promptText} on the BrowserContext; when the next JS dialog dispatches the javascript_dialog_opening notification, the listener pops the stash and fills it into the dispatch's response output param so Window.confirm / prompt return the CDP client's choice. Without a pre-arm, headless auto-dismiss values from PR #2085 are preserved (confirm->false, prompt->null, alert->void). Closes #2260	2026-04-27 07:08:01 +02:00
Karl Seguin	550fb58f3f	Introduce Page (container) Follow up to https://github.com/lightpanda-io/browser/pull/2200 This change is actually pretty mundane, but a bunch of files that used to take a Session (e.g. every WebAPI releaseRef and deinit) now take a Page. This aims to separate the 2 lifetimes currently managed by Session by moving the "Page" lifetime to a dedicated container: Page. Ultimately, the goal is to remove the 1-page-per-session limit of the current design. Not to explicitly support multiple pages per session (though, that's more possible now), but in order to better emulate Chrome where, during a navigation event, the old and new page both exist.	2026-04-23 15:48:13 +08:00
Karl Seguin	73320e163d	Add placeholder handlers for Audit enable/disable CDP methods Might help with: https://github.com/lightpanda-io/browser/issues/2177 I say "might" because there are 2 more methods in Audit which I haven't implemented. This is just the most basic placeholder for now.	2026-04-23 09:19:49 +08:00
Karl Seguin	2275416505	Page -> Frame This is to pave the way for introducing a new "Page" container, which will take over the page lifecycle currently burdening Session. The ultimate goal of that is to allow the Session to have multiple pages (mostly for better transitions between pages), which is hard to do now since the Session has so much state. This rename was aggressive, e.g. currentPage() -> currentFrame() so that, when the new Page container is added, you won't see "currentPage()" and wonder: "Does 'currentPage' mean the new Page container, or the Frame (which used to be called Page)".	2026-04-22 08:42:18 +08:00
Karl Seguin	c159be503a	Merge pull request #2194 from lightpanda-io/import_log Change all @import("...../log.zig") to const log = lp.log;	2026-04-20 15:07:15 +08:00
Adrià Arrufat	983e592b43	cdp: use page arena pool for AXNode writer	2026-04-20 07:42:08 +02:00
Karl Seguin	2d20e57f80	Change all @import("...../log.zig") to const log = lp.log; @import("lightpanda") where needed. Would also like to do this for String, Page, Session and js, which all stand out as types that are used across the codebase. I know that a few devs are doing this in new work and I haven't heard anyone voice an objection.	2026-04-20 12:40:04 +08:00
Adrià Arrufat	b42251d750	ax: route AXNode.Writer scratch allocations through a dedicated arena	2026-04-17 12:53:43 +02:00
Adrià Arrufat	36c1218486	ax: add lazy label index for name resolution	2026-04-17 12:20:10 +02:00
Adrià Arrufat	cee72cabb9	cdp: improve AX tree visibility and label resolution Prunes hidden subtrees from the accessibility tree and implements accessible name resolution via labels. Adds the `labels` property to labellable HTML elements.	2026-04-17 08:33:13 +02:00
Pierre Tachoire	8de5267cd0	Merge pull request #2169 from lightpanda-io/feat/cookies-file Feat/cookies file	2026-04-16 08:21:53 -04:00
Karl Seguin	3ca1f230b9	Serialize sameSite Tweak ergonomics (public functions log internally and are infallible). Use readFileAlloc directly. Fix possible memory leak with cookie arena - I don't think you can make a copy of the arena, and then dupe with the original.	2026-04-16 10:35:34 +08:00
Pierre Tachoire	a24fcc6a5c	use session arg to load cookies from file	2026-04-15 10:29:53 -04:00

63 Commits