browser

mirror of https://github.com/lightpanda-io/browser.git synced 2026-06-11 17:46:32 -04:00

Author	SHA1	Message	Date
Navid EMAD	7237c377d3	browser: send Referer on cross-page navigation requests Anchor click, form submit, and `location.href = ...` assignments queue a navigation through `Frame.scheduleNavigation`, which then tears down the originating page and rebuilds the frame in `Session.processRootQueuedNavigation` before `Frame.navigate` issues the HTTP request. The originator's URL was discarded with the old arena, so the request went out without a Referer header — even though the HTML "navigate" algorithm and Fetch §4.5 require one. `Frame.headersForRequest` (#1449) handled subresource fetches but was never called from the navigation path. Capture the originating frame's URL into a new `referer` field on `NavigateOpts` at scheduling time, dup'd into the `QueuedNavigation` arena so it survives the page tear-down. `Frame.navigate` adds it as a `Referer:` header alongside the existing per-request headers. Iframe initial navigation (`Frame.zig:1282`) also sets `referer = parent.url` since the parent frame outlives that direct `navigate` call. CDP `Page.navigate` (`.reason = .address_bar`) and `Page.reload` continue to omit Referer — matches Chrome. Closes #2281	2026-04-28 03:21:12 +02:00
Navid EMAD	1b9e8ad46c	page: drop POST method/body on redirect so reload doesn't replay it If a POST navigation gets redirected (302/303), the page that actually loads is fetched with GET — but Frame._navigated_options still carries the original POST method, body, and Content-Type. Page.reload would then re-POST the form data to the redirect target, which is both incorrect and dangerous (re-submission of credentials, charges, etc.). Reset method/body/header on _navigated_options inside frameHeaderDoneCallback whenever response.redirectCount() > 0. The full spec-correct version would distinguish 307/308 (preserve method) from 301/302/303 (convert POST→GET), but resubmitting form data is the more dangerous failure mode — conservative reset matches Chrome's practical behavior on reload. Also collapse the prev_body/prev_header extraction in doReload to a single tuple-destructured blk: block (no behavior change). Tests: new cdp.frame: reload after POST→redirect drops the POST drives POST /redirect_to_echo → 302 → /echo_method, then Page.reload, asserts the second request is GET. /redirect_to_echo route added to testing.zig. The existing reload-replays-POST test still passes (no redirect, POST is still replayed).	2026-04-27 22:47:55 +02:00
Navid EMAD	5b1452f162	Merge remote-tracking branch 'origin/main' into fix-a6-page-reload-replay-post # Conflicts: # src/cdp/domains/page.zig	2026-04-27 22:39:54 +02:00
Navid EMAD	00c42dec4e	http: inherit request URL fragment across fragment-less redirect Per RFC 7231 §7.1.2, when a 3xx response carries a Location header without a fragment, the user agent must process the redirect as if the value inherited the fragment of the request URL. URL.resolve follows RFC 3986 §5.3 which drops the base fragment, so handleRedirect now reattaches the original fragment when the resolved target has none. Closes #2263	2026-04-27 08:04:47 +02:00
Navid EMAD	ea6b228f9d	page: replay POST method/body/header on Page.reload doReload built a NavigateOpts with only url + kind=.reload; method/body/header defaulted to GET/null/null, so any prior POST navigation regressed to a GET on reload. The HTML reload navigation re-fetches the document that produced the current entry, and Chrome replays the same HTTP request that loaded the page (including method, body, and Content-Type) — Lightpanda dropped all three. Retain the prior request body and content-type header in Frame.NavigatedOpts (duped into the frame arena), and have doReload capture them into the CDP command's arena before replacePage() frees the old frame. The reload's frame.navigate call carries the replayed method/body/header so the request the page was loaded with is the request that runs again. Closes #2258	2026-04-27 06:28:58 +02:00
Halil Durak	7aca1732fe	Merge pull request #2098 from lightpanda-io/nikneym/cli-parser	2026-04-23 11:17:44 +03:00
Karl Seguin	550fb58f3f	Introduce Page (container) Follow up to https://github.com/lightpanda-io/browser/pull/2200 This change is actually pretty mundane, but a bunch of files that used to take a Session (e.g. every WebAPI releaseRef and deinit) now take a Page. This aims to separate the 2 lifetimes currently managed by Session by moving the "Page" lifetime to a dedicated container: Page. Ultimately, the goal is to remove the 1-page-per-session limit of the current design. Not to explicitly support multiple pages per session (though, that's more possible now), but in order to better emulate Chrome where, during a navigation event, the old and new page both exist.	2026-04-23 15:48:13 +08:00
Halil Durak	156cf9b5a4	`testing.zig`: init directly on `.serve`	2026-04-22 16:07:04 +03:00
Karl Seguin	2275416505	Page -> Frame This is to pave the way for introducing a new "Page" container, which will take over the page lifecycle currently burdening Session. The ultimate goal of that is to allow the Session to have multiple pages (mostly for better transitions between pages), which is hard to do now since the Session has so much state. This rename was aggressive, e.g. currentPage() -> currentFrame() so that, when the new Page container is added, you won't see "currentPage()" and wonder: "Does 'currentPage' mean the new Page container, or the Frame (which used to be called Page)".	2026-04-22 08:42:18 +08:00
Karl Seguin	782ed50c83	@import(".....string.zig").String => lp.String Similar change to https://github.com/lightpanda-io/browser/pull/2194 but for String.	2026-04-20 15:29:37 +08:00
Karl Seguin	2d20e57f80	Change all @import("...../log.zig") to const log = lp.log; @import("lightpanda") where needed. Would also like to do this for String, Page, Session and js which all stand out as types that are used across the codebase. I know that a few devs are doing this in new work and I haven't heard anyone voice an objection.	2026-04-20 12:40:04 +08:00
Karl Seguin	9b93ac690e	Migrate more tests to new async (failing in CI) Also improve test report failure on async failure.	2026-04-17 12:10:35 +08:00
Karl Seguin	aac0a6e6b6	Websocket fixes. This commit fixes a few serious issues with the Websocket implementation. 1 - libcurl recursive api calls Creating a Websocket instance from within a libcurl callback results in libcurl failing with a RecursiveApiCall error. I fixed this more generally by adding a `ready_queue` which connections can use when the `HttpClient` is performing actions. Once `perform` ends, this new `ready_queue` is processed. There might be a more holistic solution to this (we seem to run into RecursiveApiCall everywhere), but since HttpClient is going through heavy changes, this seemed like the smallest possible change to fix it. 2 - "load" blocking Load and IdleNetwork notifications should not block on Websocket connections. To solve this, `HttpClient` now has `http_active` and `ws_active` to replace `active`. Only `http_active` is used for things like "load" triggering. 3 - The above change made the Runner's job more complicated. It used to be binary: you either have active connections or not. Now there are different types of active connections. To keep it simple, and I think probably more correct, the "done-ness" (based on the `wait` parameter) is now independent of active (or not) network activity. If the page's `load_state == .complete`, then the `wait == .done` is considered successful, whether or not we have active connections. 4 - As a consequence of the above, and seemingly unrelated to all of these changes, a number of html tests now use the "new" robust async framework. Most of these tests were using the `testing.onload` (aka `testing.eventually`) which had somewhat...unclear semantics. These tests passed more as a consequence of how we processed a page and being very simple (e.g. just needing 1 micro or macrotask tick). But `eventually` never worked for more complicated cases, and the previous `testing.async` didn't work well. Now, the test runner waits for .load (which, as per #3, can fire more aggressively), which caused many `eventually` tests to fail. Moving these tests to the new `async` is more robust and works with the new aggressive "load".	2026-04-17 11:20:27 +08:00
Karl Seguin	6bf35e1ed4	try to improve test ws shutdown, merge ws tests	2026-04-04 07:00:26 +08:00
Karl Seguin	14dcb7895a	Give websockets their own connection pool, improve websocket message logging	2026-04-04 07:00:24 +08:00
Karl Seguin	5733c35a2d	WebSocket WebAPI Uses libcurl's websocket capabilities to add support for WebSocket. Depends on https://github.com/lightpanda-io/zig-v8-fork/pull/167 Issue: https://github.com/lightpanda-io/browser/issues/1952 This is a WIP because it currently uses the same connection pool used for all HTTP requests. It would be pretty easy for a page to starve the pool and block any progress. We previously stored the Transfer inside of the easy's private data. We now store the Connection, and a Connection now has a `transport` field which is a union for `http: Transfer` or `websocket: Websocket`.	2026-04-04 06:59:28 +08:00
Karl Seguin	ab6c63b24b	Add --wait-selector, --wait-script and --wait-script-file options to fetch These new optional parameters run AFTER --wait-until, allowing the (imo) useful combination of `--wait-until load --wait-script "report.complete === true"`. However, if `--wait-until` IS NOT specified but `--wait-selector/script` IS, then there is no default wait and it'll just check the selector/script. If neither `--wait-selector` nor `--wait-script/--wait-script-file` is specified then `--wait-until` continues to default to `done`. These waiters were added to the Runner, and the existing Action.waitForSelector now uses the runner's version. Selector querying has been split into distinct parse and query functions, so that we can parse once, and query on every tick. We could potentially optimize --wait-script to compile the script once and call it on each tick, but we'd have to detect page navigation to recompile the script in the new context. Something I'd rather optimize separately.	2026-03-31 12:30:46 +08:00
Karl Seguin	cf641ed458	Merge pull request #1990 from lightpanda-io/remove_cdp_generic Remove cdp generic	2026-03-26 07:49:13 +08:00
Karl Seguin	11ed95290b	Improve async tests testing.async(...) is pretty lame. It works for simple cases, where the microtask is very quickly resolved, but otherwise can't block the test from exiting. This adds an overload to testing.async and leverages the new Runner https://github.com/lightpanda-io/browser/pull/1958 to "tick" until completion (or timeout). The overloaded version of testing.async() (called without a callback) will increment a counter which is only decremented when the promise is resolved. The test runner will now `tick` until the counter == 0.	2026-03-26 07:35:05 +08:00
Karl Seguin	0dd0495ab8	Removes CDPT (generic CDP) CDPT used to be a generic so that we could inject Browser, Session, Page and Client. At some point, it [thankfully] became a generic only to inject Client. This commit removes the generic and bakes the *Server.Client instance in CDP. It uses a socketpair for testing. BrowserContext is still generic, but that's generic for a very different reason and, while I'd like to remove that generic too, it belongs in a different PR.	2026-03-25 17:43:30 +08:00
Karl Seguin	c9bc370d6a	Extract Session.wait into a Runner This is done for a couple of reasons. The first is just to have things a little more self-contained for eventually supporting more advanced "wait" logic, e.g. waiting for a selector. The other is to provide callers with more fine-grained control. Specifically the ability to manually "tick", so that they can [presumably] do something after every tick. This is needed by the test runner to support more advanced cases (cases that need to test beyond 'load') and it also improves lp.waitForSelector (and fixes a potential use-after-free in it)	2026-03-23 12:30:41 +08:00
Karl Seguin	a4cb5031d1	Tweak wait_until option Small tweaks to https://github.com/lightpanda-io/browser/pull/1896 Improve the wait ergonomics with an Option with default parameter. Revert page pointer logic to original (don't think that change was necessary).	2026-03-19 20:29:20 +08:00
shaewe180	09327c3897	feat: fetch add wait_until parameter for page loads options Add `--wait_until` and `--wait_ms` CLI arguments to configure session wait behavior. Updates `Session.wait` to evaluate specific page load states (`load`, `domcontentloaded`, `networkidle`, `fixed`) before completing the wait loop.	2026-03-18 15:08:51 +08:00
Adrià Arrufat	b698e2d078	LogFilter: init with slice and silence tests	2026-03-17 13:42:35 +09:00
Adrià Arrufat	3d6d669a50	testing: add LogFilter utility for scoped log suppression	2026-03-12 13:56:53 +09:00
Karl Seguin	94ce5edd20	Frames on the same origin share v8 data Depends on: https://github.com/lightpanda-io/zig-v8-fork/pull/153 In some ways this is an extension of https://github.com/lightpanda-io/browser/pull/1635 but it has more implications with respect to correctness. A js.Context wraps a v8::Context. One of the important things it adds is the identity_map so that, given a Zig instance we always return the same v8::Object. But imagine code running in a frame. This frame has its own Context, and thus its own identity_map. What happens when that frame does: ```js window.top.frame_loaded = true; ``` From Zig's point of view, `Window.getTop` will return the correct Zig instance. It will return the Window referenced by the "root" page. When that instance is passed to the bridge, we'll look for the v8::Object in the Context's `identity_map` but won't find it. The mapping exists in the root context `identity_map`, but not within this frame. So we create a new v8::Object and now our 1 zig instance has N v8::Objects for every page/frame that tries to access it. This breaks cross-frame scripting which should work, at least to some degree, even when frames are on the same origin. This commit adds a `js.Origin` which contains the `identity_map`, along with our other `v8::Global` storage. The `Env` now contains a `js.Origin` lookup, mapping an origin string (e.g. lightpanda.io:443) to an *Origin. When a Page's URL is changed, we call `self.js.setOrigin(new_url)` which will then either get or create an origin from the Env's origin lookup map. js.Origin is reference counted so that it remains valid so long as at least 1 frame references it. There's some special handling for null-origins (i.e. about:blank). At the root, null origins get a distinct/isolated Origin. For a frame, the parent's origin is used. Above, we talked about `identity_map`, but a `js.Context` has 8 other fields to track v8 values, e.g. `global_objects`, `global_functions`, `global_values_temp`, etc. These all must be shared by frames on the same origin. So all of these have also been moved to js.Origin. They've also been merged so that we now have 3 fields: `identity_map`, `globals` and `temps`. Finally, when the origin of a context is changed, we set the v8::Context's SecurityToken (to that origin). This is a key part of how v8 allows cross-context access.	2026-03-11 08:43:40 +08:00
Nikolay Govorov	687f577562	Move accept loop to common runtime	2026-03-10 03:00:50 +00:00
Nikolay Govorov	8e59ce9e9f	Prepare global NetworkRuntime module	2026-03-10 03:00:47 +00:00
Karl Seguin	82e3f126ff	Protect against transfer.abort() being called during callback This was already handled in most cases, but not for a body-less response. It's safe to call transfer.abort() during a callback, so long as the performing flag is set to true. This was set during the normal libcurl callbacks, but for a body-less response, we manually invoke the header_done_callback and were not setting the performing flag.	2026-03-02 11:44:42 +08:00
Karl Seguin	081979be3b	Initial support for frames Missing: - [ ] Navigation support within frames (in fact, as-is, any navigation done inside a frame will almost certainly break things) - [ ] Correct CDP support. I don't know how frames are supposed to be exposed to CDP. Normal navigate events? Distinct CDP frame_ids? - [ ] Cross-origin restrictions. The interaction between frames is supposed to change depending on whether or not they're on the same origin - [ ] Potentially handling src-less frames incorrectly. Might not really matter Adds basic frame support. Initially explored adding a BrowsingContext and embedding it in Page, with the goal of also having it embedded in a to-be created Frame. But it turns out that 98% of Page _was_ BrowsingContext and introducing a BrowsingContext as the primary interaction unit broke pretty much _every_ single WebAPI. So Page was expanded: - Added `_parent: ?Page`, which is `null` for "root" page. - Added `frame: ?IFrame`, which is `null` for the "root" page. This is the HTMLIFrameElement for frame-pages. - Added a _type: enum{root, frame}, which is currently only used to improve the logs - Added a frames: std.ArrayList(*Page). This is a list of frames for the page. Note that a "frame-page" can itself have nested frames. Besides the above, there were 3 "big" changes. 1 - Adding frames (dynamically, parsed) has to create a new page, start navigation, track it (in the frames list). Part of this was just piggybacking off of code that handles <script> 2 - The page "load" event blocks on the frame "load" event. This cascades. When a page triggers its load, it can do: ```zig if (self._parent) |p| { p.iframeLoaded(self); } ``` Pages need to keep track of how many iframes they're waiting to load. When all iframes (and all scripts) are loaded, it can then trigger its own load event. 3 - Our JS execution expects 1 primary entered context (the page's). But we now have multiple page contexts, and we need to be in the correct one based on where javascript is being executed. There is no longer a default entered context. Creating a Local.Scope enters the context, and ls.deinit() exits the context.	2026-02-19 23:47:33 +08:00
Nikolay Govorov	9296c10ca4	Use per-cdp connection HttpClient	2026-02-18 09:22:26 +00:00
Nikolay Govorov	6ccd3f277b	Fix race condition	2026-02-04 13:48:07 +00:00
Nikolay Govorov	a72782f91e	Eliminates duplication in the creation of HTTP headers	2026-02-04 09:08:57 +00:00
Nikolay Govorov	f71aa1cad2	Centralizes configuration, eliminates unnecessary copying of config	2026-02-04 07:57:59 +00:00
Nikolay Govorov	fd8c488dbd	Move Notification from App to BrowserContext	2026-02-04 07:33:45 +00:00
Karl Seguin	176d42f625	add 'arraybuffer' responseType to XHR	2026-02-02 07:45:21 +08:00
Karl Seguin	181f265de5	Rework Inspector usage V8's inspector world is made up of 4 components: Inspector, Client, Channel and Session. Currently, we treat all 4 components as a single unit which is tied to the lifetime of CDP BrowserContext - or, loosely speaking, 1 "Inspector Unit" per page / v8::Context. According to https://web.archive.org/web/20210622022956/https://hyperandroid.com/2020/02/12/v8-inspector-from-an-embedder-standpoint/ and conversation with Gemini, it's more typical to have 1 inspector per isolate. The general breakdown is the Inspector is the top-level manager, the Client is our implementation which controls how the Inspector works (the functions we expose that v8 calls into). These should be tied to the Isolate. Channels and Sessions are more closely tied to Context, where the Channel is v8->zig and the Session is zig->v8. This PR does a few things 1 - It creates 1 Inspector and Client per Isolate (Env.js) 2 - It creates 1 Session/Channel per BrowserContext 3 - It merges v8::Session and v8::Channel into Inspector.Session 4 - It moves the Inspector instance directly into the Env 5 - BrowserContext interacts with the Inspector.Session, not the Inspector 4 is arguably unnecessary with respect to the main goal of this commit, but the end-goal is to tighten the integration. Specifically, rather than CDP having to inform the inspector that a context was created/destroyed, the Env which manages Contexts directly (https://github.com/lightpanda-io/browser/pull/1432) and which now has direct access to the Inspector, is now equipped to keep this in sync.	2026-01-30 15:59:33 +08:00
Karl Seguin	54c45a0cfd	Make js.Bridge aware of string.String for input parameters Avoids having to allocate small strings when going from v8 -> Zig. Also added a discriminatory type, string.Global which uses the arena, rather than the call_arena, if an allocation _is_ necessary. (This is similar to a feature we had before, but was lost in zigdom). Strings from v8 that need to be persisted, can be allocated directly v8 -> arena, rather than v8 -> call_arena -> arena. I think there are a lot of places where we should use string.String - where strings are expected to be short (e.g. attribute names). But started with just document.querySelector and querySelectorAll.	2026-01-26 07:52:27 +08:00
Karl Seguin	62aa564df1	Remove Global v8::Local<V8::Context> When we create a js.Context, we create the underlying v8.Context and store it for the duration of the page lifetime. This works because we have a global HandleScope - the v8.Context (which is really a v8::Local<v8::Context>) is tied to the global HandleScope, effectively making it a global. If we want to remove our global HandleScope, then we can no longer pin the v8.Context in our js.Context. Our js.Context now only holds a v8.Global of the v8.Context (v8::Global<v8::Context>). This PR introduces a new type, js.Local, which takes over a lot of the functionality previously found in either js.Caller or js.Context. The simplest way to think about it is: 1 - For v8 -> zig calls, we create a js.Caller (as always) 2 - For zig -> v8 calls, we go through the js.Context (as always) 3 - The shared functionality, which works on a v8.Context, now belongs to js.Local For #1 (v8 -> zig), creating a js.Local for a js.Caller is really simple and centralized. v8 largely gives us everything we need from the FunctionCallbackInfo or PropertyCallbackInfo. For #2, it's messier, because we can only create a local v8::Context if we have a HandleScope, which we may or may not. Unfortunately, in many cases, what to do becomes the responsibility of the caller and much of the code has to become aware of this local-ness. What does it mean for our code? The impact is on WebAPIs that store .Global. Because the global can't do anything. You always need to convert that .Global to a local (e.g. js.Function.Global -> js.Function). If you're 100% sure the WebAPI is only being invoked by a v8 callback, you can use `page.js.local.?.toLocal(some_global).call(...)` to get the local value. If you're 100% sure the WebAPI is only being invoked by Zig, you need to create `js.Local.Scope` to get access to a local: ```zig var ls: js.Local.Scope = undefined; page.js.localScope(&ls); defer ls.deinit(); ls.toLocal(some_global).call(...) // can also access `&ls.local` for APIs that require a const js.Local ``` For functions that can be invoked by either V8 or Zig, you should generally push the responsibility to the caller by accepting a `local: const js.Local`. If the caller is a v8 callback, it can pass `page.js.local.?`. If the caller is a Zig callback, it can create a `Local.Scope`. As an alternative, it is possible to simply pass the *Page, and check `if page.js.local == null` and, if so, create a Local.Scope. But this should only be done for performance reasons. We currently only do this in 1 place, and it's because the Zig caller doesn't know whether a Local will actually be needed and it's potentially called on every element created by the parser.	2026-01-19 07:28:33 +08:00
Karl Seguin	8e14dacc32	Improve ergonomics of try catch (and Function's tryCall) It now returns a Caught struct which contains all information. The Caught struct can be logged directly, providing more consistent logs for caught errors.	2026-01-14 17:34:02 +08:00
Pierre Tachoire	adfcf7bb2c	tests: re-enable metrics JSON output METRICS=true zig build test	2026-01-08 13:07:27 +01:00
Karl Seguin	7b0e256408	copy history test from legacy	2026-01-05 10:12:41 +08:00
Karl Seguin	087086c308	remove some unused imports	2025-12-26 12:40:20 +08:00
Pierre Tachoire	8f2921f61f	add test for big json number with fetch/xhr	2025-12-25 12:50:44 +01:00
Karl Seguin	d9c53a3def	Page.scheduleNavigation for location changes	2025-12-22 12:19:08 +08:00
Karl Seguin	1639ff1b98	improve XMLHTTPRequest. Legacy xhr.html pass	2025-12-15 17:56:23 +08:00
Muki Kiboigo	ac85341cab	add NavigationKind to navigate	2025-12-09 17:10:59 -08:00
Karl Seguin	0e1b966dce	re-enable CDP dom domain	2025-12-09 13:04:01 +08:00
Karl Seguin	9132bc2375	re-enable CDP node registry	2025-12-09 11:50:33 +08:00
Karl Seguin	1164da5e7a	copyright notices	2025-11-14 10:52:43 +08:00

114 Commits