browser

mirror of https://github.com/lightpanda-io/browser.git synced 2026-06-11 01:25:53 -04:00

Author	SHA1	Message	Date
Karl Seguin	4205cd905b	Clear pending destroy on createPage (a known safepoint). This allows pending destroys that have been accumulated to be cleaned up. In normal operations, this likely isn't going to happen. But we see some unit tests create _many_ pages that never have the chance to be cleaned up. The result is that the next "normal" unit test, which actually runs enough through Runner to trigger the cleanup, pays a huge cleanup price. Arguably, for a test-only solution, we could create a session per test, or have explicit cleanup in the test. But having 1 long-lasting session is useful as it can show us these potential pitfalls AND, it isn't impossible that a real-world case runs into similar issues.	2026-05-15 11:31:53 +08:00
Karl Seguin	2f3a426fb0	Merge pull request #2453 from lightpanda-io/cdp-network-serve-from-cache Adds `Network.requestServedFromCache`	2026-05-14 17:34:46 +08:00
Karl Seguin	b96c24d377	Merge pull request #2455 from lightpanda-io/cdp-response-fromdiskcache Add `fromDiskCache` field to `Network.Response`	2026-05-14 16:01:04 +08:00
Karl Seguin	0624a05205	Merge pull request #2454 from lightpanda-io/cdp-network-cache-clear add `Network.clearBrowserCache` and `Network.canClearBrowserCache`	2026-05-14 15:56:29 +08:00
Karl Seguin	143bffdfec	Merge pull request #2450 from navidemad/fix-bug7-form-idl forms: add enctype + 5 submitter form-* IDL accessors	2026-05-14 13:44:57 +08:00
Karl Seguin	80a09fc0fd	zig fmt	2026-05-14 13:19:17 +08:00
Muki Kiboigo	f2f328cffd	add fromDiskCache field to Network.Response CDP type	2026-05-13 21:57:22 -07:00
Muki Kiboigo	07e7c3d687	add Network.clearBrowserCache and Network.canClearBrowserCache	2026-05-13 21:52:10 -07:00
Muki Kiboigo	ac863c7e2b	add Network.requestServedFromCache	2026-05-13 21:47:47 -07:00
Karl Seguin	14b4449628	use format to write String value	2026-05-14 11:03:12 +08:00
Karl Seguin	373916873f	Merge pull request #2442 from lightpanda-io/worker_message_buffer CI fixes, callback timing correctness	2026-05-14 08:56:36 +08:00
Karl Seguin	96ac9a49ea	Update src/browser/webapi/Worker.zig Co-authored-by: Navid EMAD <navid.emad@yespark.fr>	2026-05-14 08:33:32 +08:00
Karl Seguin	1580ab197f	Merge pull request #2452 from lightpanda-io/event_worker make Event worker-safe	2026-05-14 07:39:37 +08:00
Karl Seguin	bcafa175cb	make Event worker-safe	2026-05-14 07:11:33 +08:00
Navid EMAD	f0cce42757	forms: route Frame.submitForm through Form.normalizeMethod/normalizeEnctype The submitForm encoding path was the last duplicate of the "limited to only known values" canonicalization the previous commit consolidated for the IDL getters. Now it consumes the same Form.normalizeMethod / Form.normalizeEnctype helpers, so a single function owns the canonical mapping (`""` / unknown -> spec default, recognized values pass through unchanged). Side effect of routing through the helper: the `log.warn(.not_implemented, "FormData.encoding", ...)` branch falls out. After commit `4b693db4` added `text/plain`, the only attribute values that still reach the urlencoded fallback are spec-invalid ones, which per HTML §4.10.21.5 silently canonicalize to `application/x-www-form-urlencoded`. The warning was firing for valid spec behavior — Chrome doesn't log either. Behavior-preserving on all observable surfaces: full suite 639/639 green; existing form-submission integration tests (multipart, urlencoded, text/plain, GET-ignores-enctype) all pass unchanged.	2026-05-13 18:14:10 +02:00
Navid EMAD	4b693db480	forms: support enctype=text/plain in form submission Closing the divergence introduced by the new IDL accessors: `submitter.formEnctype` (and `form.enctype`) now return "text/plain" for that attribute value per WHATWG HTML §4.10.21.5, but `Frame.submitForm` previously fell back to urlencoded with a `.not_implemented` log when it saw the same value on the submission path. Implement the spec's text/plain encoding algorithm (HTML §4.10.21.8): - FormData.EncType gains a `.plaintext` variant. - FormData.plaintextEncode writes "name=value CRLF" per entry, no URL-encoding, no escaping — the spec accepts that text/plain is a lossy, human-readable encoding (values containing "=" or CRLF produce an ambiguous wire format by design). - Frame.submitForm recognizes "text/plain" before the urlencoded fallback and sets the Content-Type header to "text/plain; charset=<form-charset>", per spec step 21.4. Two new Zig unit tests cover encoding output (`FormData: plaintext write`, `FormData: plaintext empty body`). Full suite 639/639 green. This is bundled with the IDL accessor commits because returning "text/plain" from the IDL while the submission silently re-encodes as urlencoded is a spec-internal inconsistency the IDL change itself introduces. Reviewers who'd prefer to land just the read-only accessors first should feel free to ask for a split — this commit is self-contained and reverts cleanly.	2026-05-13 18:08:54 +02:00
Navid EMAD	cedfdba0d7	forms: extract normalizeMethod / normalizeEnctype helpers The "limited to only known values" canonicalization (per WHATWG HTML §2.2.2) was duplicated five times: Form.getMethod + Form.getEnctype + {Button,Input}.{getFormMethod,getFormEnctype}. Each callsite differed only in the missing-value default ("" for submitter overrides, "get" / "application/x-www-form-urlencoded" for the form-side). Extract into two pub helpers on Form.zig taking the attribute slice + the missing-value default. The five callers collapse to one-liners. Behavior-preserving: existing form.html / button.html / input-attrs.html fixtures all pass unchanged; full suite 637/637 green. Net: -36 LOC.	2026-05-13 17:58:55 +02:00
Navid EMAD	2fdc82aa05	forms: add enctype + 5 submitter form-* IDL accessors Six form-submission IDL accessors were missing from the JsApi blocks of HTMLFormElement, HTMLButtonElement, and HTMLInputElement, so reads produced undefined instead of the spec-mandated string/boolean. The content-attribute path (clicking a submit button honoring formaction / formmethod / formenctype) was wired up in #2279; this commit adds the matching IDL-property accessors per WHATWG HTML §4.10.18.6 and §4.10.21.5. - Form.enctype: limited to known values, missing+invalid both default to application/x-www-form-urlencoded (mirrors getMethod's shape). - Button/Input formAction: returns frame.url when missing/empty, else the resolved URL (mirrors Form.getAction). - Button/Input formEnctype, formMethod: limited to known values with no missing-value default ("" when missing, canonical invalid-value default application/x-www-form-urlencoded / get when invalid). - Button/Input formTarget: plain reflection, defaults to "". - Button/Input formNoValidate: boolean reflection of formnovalidate. Closes #2449	2026-05-13 17:49:19 +02:00
Karl Seguin	5595f7d298	Merge pull request #2448 from lightpanda-io/script_load_error_handling Don't process scripts that failed to load	2026-05-13 23:19:40 +08:00
Pierre Tachoire	198c4e5a0f	Merge pull request #2444 from lightpanda-io/useless-code cdp: remove dead code 0.3.0	2026-05-13 15:36:16 +02:00
Pierre Tachoire	ffc2baa733	Merge pull request #2431 from lightpanda-io/cdp-double-frame-navigated-event fix(cdp): remove duplicate Page.frameNavigated and fix context regist…	2026-05-13 15:17:27 +02:00
Karl Seguin	7750bc94f6	Apply suggestions from code review Remove no-longer needed setTimeouts in test now that messages are queued. Runner also checks ready_queue when determining doneness. Co-authored-by: Navid EMAD <design.navid@gmail.com>	2026-05-13 20:57:59 +08:00
Karl Seguin	2326071036	Don't [try] to process scripts that failed to load At some point recently, we started to process scripts that fail to load (e.g. 404). This stops such scripts from [trying] to be evaluated, and executes the onerror handler in all script loading paths.	2026-05-13 20:48:08 +08:00
Pierre Tachoire	12971a2420	Merge pull request #2445 from lightpanda-io/reset-bc-arena cdp: reset browser context arena when bc is removed	2026-05-13 14:35:38 +02:00
Pierre Tachoire	5d73d82bf6	cdp: call context created w/ correct is_default_context value Co-authored-by: Navid EMAD <navid.emad@yespark.fr>	2026-05-13 14:11:53 +02:00
Pierre Tachoire	8432cfbfba	cdp: return error in case of missing event's frame Instead of using the root_frame	2026-05-13 12:29:11 +02:00
Karl Seguin	e895ce81e3	Merge pull request #2437 from lightpanda-io/window_frameElement Add window.frameElement	2026-05-13 18:00:08 +08:00
Karl Seguin	3e31fde66c	Merge pull request #2443 from lightpanda-io/url_fixes Fix URLSearchParams constructor	2026-05-13 17:59:50 +08:00
Karl Seguin	625e240f5a	Pump the http_client queue after perform, not just before Client.tick drains self.queue (assigning conns to queued transfers) only at the start. When perform / processMessages releases a batch of conns back to the pool, those conns sit idle until the next tick — a queued transfer that could have run this tick waits one Runner iteration (~20 ms in the test runner) for no reason. Adds a second drainQueue call after perform so newly-freed conns get picked up immediately. In practice this matters whenever httpMaxHostOpen / httpMaxConcurrent is exceeded — pages with N > limit subresources had each "wave" of queue overflow paying one extra tick of latency.	2026-05-13 17:58:49 +08:00
Karl Seguin	c79dd2bf1f	Make runner aware of http_client.queue When connections are queued, the processing cannot be considered done.	2026-05-13 17:55:39 +08:00
Karl Seguin	2bcf9a22d5	Disable cache=true from e2e matrix cache=true is problematic for a few reasons 1 - The current cache implementation is known to cause timing issues; i.e. it executes callbacks synchronously. 2 - Unlike something like robots.txt or proxy, cache tests need to be explicitly tested. The response has to include cache headers and the resource loaded again.	2026-05-13 17:52:46 +08:00
Karl Seguin	afc0942655	Merge pull request #2441 from lightpanda-io/fix-robots-crash Fix crash on `robots.txt` being fulfilled synchronously	2026-05-13 17:39:22 +08:00
Pierre Tachoire	36b55339cd	cdp: reset browser context arena when bc is removed	2026-05-13 11:26:09 +02:00
Pierre Tachoire	403fe0d293	cdp: remove dead code	2026-05-13 11:18:05 +02:00
Karl Seguin	c860a9a9e5	Split xhr-in-worker tests into their own file xhr.html can brush up against the timeout as we add more and more cases. This is particularly true on the slow CI, in debug builds, with TSAN.	2026-05-13 15:59:29 +08:00
Karl Seguin	dd99102f4b	Defer HTTP completion callbacks to next tick Client.makeRequest used to call self.perform(0) after handing the transfer to libcurl. That perform() does two things: drives curl_multi_perform (so bytes hit the wire) AND drains curl_multi_info_read messages, which is what fires the user-facing header/data/done callbacks. The issue is that, even in non-cache cases, a request could be immediately resolved in libcurl, and thus callbacks executed synchronously. By only calling `curl_multi_perform` on a new request, we prevent this from happening.	2026-05-13 15:59:29 +08:00
Karl Seguin	2fcad23834	Buffer worker postMessages received before script load completes	2026-05-13 15:59:29 +08:00
Karl Seguin	6d58af350d	Flag functions and accessors as DontEnum by default Only `own_properties`, e.g. window.CSS should be enumerable.	2026-05-13 15:49:31 +08:00
Karl Seguin	cc4ad53661	Fix URLSearchParams constructor First, KeyValueList.fromJsObject now only iterates own properties. Second URLSearchParams can now be constructed with another URLSearchParams. This is a stopgap. The correct solution is for it to accept any iterator, but as a quick fix for known cases (airbnb.com), this will help.	2026-05-13 14:38:43 +08:00
Pierre Tachoire	854eb6a62d	Merge pull request #2339 from lightpanda-io/cdp-console cdp: implement Console	2026-05-13 08:28:01 +02:00
Muki Kiboigo	4a45b4d866	fix crash on robots.txt request fulfilled immediately	2026-05-12 21:50:05 -07:00
Karl Seguin	bd4f4c89e1	Merge pull request #2440 from staylor/scott/fix-worker-context-exit-with-proxy Add LP.configureLoading worker + --disable-workers opt-out for Web Worker loading	2026-05-13 12:29:43 +08:00
Karl Seguin	10a5597aba	Merge pull request #2435 from navidemad/fix-b12-htmldialogelement-methods dom: implement HTMLDialogElement.{show, showModal, close}	2026-05-13 12:17:20 +08:00
muki	cc927c98ec	Merge pull request #2424 from lightpanda-io/nix-wpt-run Ability to run wpt with Nix/NixOS	2026-05-12 20:58:39 -07:00
Muki Kiboigo	06c2474376	use commit sha instead of branch name	2026-05-12 20:58:13 -07:00
Karl Seguin	393141e472	pass arena into handlers (consistent with other handlers)	2026-05-13 11:51:59 +08:00
Scott Taylor	b2998470c2	Add --disable-workers + LP.configureLoading worker opt-out Adds two ways to opt out of dedicated Web Worker loading entirely. The Worker constructor still returns a Worker object so calling pages don't throw, but no script fetch is initiated and the worker scope's eval never runs (postMessage from the page is queued indefinitely with no handler to drain it). * CDP method LP.configureLoading { worker: bool } -- per-session toggleable at runtime, alongside the existing { subFrame: bool }. Both fields are now optional so callers can flip one without resetting the other to its default. Backwards-compatible. * CLI flag --disable-workers -- process-wide default applying to every session and to the fetch subcommand. Operators can flip it on without any driver changes. Mirrors --disable-subframes (#2401) exactly. ## Motivation Reliably-reproducible SIGABRT in Worker.loadInitialScript whenever a page constructs a Web Worker AND lightpanda is launched with --http_proxy. Crash signature: $msg="V8 fatal callback" location=v8::Context::Exit() message="Cannot exit non-entered context" Stack: _browser.webapi.Worker.loadInitialScript _browser.webapi.Worker.httpDoneCallback _network.layer.InterceptionLayer.InterceptContext.doneCallback _browser.HttpClient.processMessages _browser.HttpClient.perform _browser.HttpClient.tick The Zig-side Enter/Exit pair around the worker's eval doesn't match v8's entered_contexts stack invariant under that timing -- something upstream of the loadInitialScript Exit leaves an extra Enter on the stack, so v8's Utils::ApiCheck (`isolate->context() == env`) fires and the process aborts. Reproducible against any Shopify storefront PDP (e.g. https://weareallbirds.myshopify.com/products/mens-wool-runners) when served through any HTTP proxy -- the proxy just adds enough latency to surface the race; the same code path runs without --http_proxy but the timing window is too tight to reliably hit. 
The Allbirds trigger script is the Shopify web-pixel-extension worker, but ANY Worker the page constructs hits the same code path. The proper fix needs the v8 entered-contexts invariant to be restored end-to-end through the worker eval. That's a deeper dig into how Worker.loadInitialScript / WorkerGlobalScope.importScript / ls.local.runMacrotasks compose with v8's microtask queues across multiple contexts; I tried three intermediate fixes (deferring loadInitialScript via the frame scheduler when other scripts are mid-eval, replacing the post-eval cross-context runMacrotasks with worker-only PerformCheckpoint, and removing runMacrotasks entirely) and none stopped the crash. The bug is fired from inside the synchronous tick path before the post-eval microtask handling runs, which means the leak happens during Script::Run itself and needs more targeted investigation. This PR is the workaround so users hitting the SIGABRT on storefront / analytics-heavy pages have a clean opt-in escape today. For our use case (product catalog extraction) Workers carry no extraction signal -- web-pixel sandboxes, analytics SDKs, marketing tag pixels, etc. -- so disabling them removes a fragile code path without any downside. ## Implementation `Session.worker_loading_enabled: bool = true` -- default matches existing behavior. `Worker.init` short-circuits AFTER constructing the Worker / WorkerGlobalScope / arena bookkeeping (so the JS `new Worker(url)` expression doesn't throw): if (!session.worker_loading_enabled) { log.debug(.browser, "worker disabled", .{ .url = resolved_url }); return self; } Two ways to flip the flag, mirroring the --disable-subframes pattern: 1. LP.configureLoading { worker: bool } -- both subFrame and worker are now optional fields in the params struct, so existing callers passing only { subFrame } continue to work unchanged. 2. --disable-workers CLI flag -- added to CommonOptions (so it applies to serve, fetch, mcp). 
New Config.disableWorkers() getter; Session.init reads it as the initial value. Total diff: +88 / -3 across 4 files (src/Config.zig, src/browser/Session.zig, src/browser/webapi/Worker.zig, src/cdp/domains/lp.zig). ## Verification Reproducer pattern (puppeteer-core 24.42.0 + tiny CONNECT-tunnel proxy on 127.0.0.1:9999, scripts in cdp-repros/): serve --host 127.0.0.1 --port 9222 --http_proxy http://127.0.0.1:9999 serve --host 127.0.0.1 --port 9222 --http_proxy http://127.0.0.1:9999 --disable-workers Driving https://weareallbirds.myshopify.com/products/mens-wool-runners: baseline (no --disable-workers): 5/5 SIGABRT in Worker.loadInitialScript with the v8 fatal callback above. with --disable-workers: 10/10 successful, returns full HTML (~1MB), no crash. Test suite: make test -> 637 of 637 tests passed (was 636/636 + new cdp.lp: configureLoading toggles subFrame and worker independently regression test). zig fmt --check ./.zig ./*/.zig -> clean. ## Notes * The CDP method is the same domain (LP.configureLoading) and same shape as --disable-subframes' driver-side opt-in, so existing Playwright / puppeteer integrations that already toggle subframes don't need a separate code path -- one CDP call can flip both. * worker_loading_enabled = false does NOT remove Worker from the global namespace (so feature-detection like `if (typeof Worker !== 'undefined')` still reports true). It just makes constructed workers no-op. Pages that postMessage to a worker and wait for a response will hang on that promise forever (or until the page is torn down). For our extraction use case that's fine -- we control the worklist timeout anyway -- but it's worth noting if upstream wants to surface the disabled state more strongly (e.g. throw from postMessage, or remove the global entirely behind an even-stricter flag). * Once the underlying v8 entered-contexts invariant is restored in Worker.loadInitialScript, this flag becomes a perf / sandboxing tool rather than a correctness workaround. 
Worth keeping anyway: blocking analytics / pixel workers is a reasonable thing to want. ## Related * #2400 -- the iframe analog to this issue (subframe nav invalidates executionContextId); same workaround pattern. * #2401 -- introduced --disable-subframes / LP.configureLoading { subFrame } that this PR mirrors exactly for workers.	2026-05-12 23:46:45 -04:00
Karl Seguin	6eb90b2920	Add window.frameElement	2026-05-13 10:36:39 +08:00
Karl Seguin	2e159aaf12	Merge pull request #2436 from lightpanda-io/reset_frees_tasks on scheduler.reset, finalize any remaining tasks	2026-05-13 09:06:02 +08:00
Karl Seguin	656e29476e	on scheduler.reset, finalize any remaining tasks	2026-05-13 08:46:13 +08:00
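
Commits cedfdba0d7 and 4b693db480 above describe a "limited to only known values" normalization for `enctype` and a text/plain encoder that writes "name=value CRLF" per entry with no escaping. The following is a minimal Python sketch of that behavior, not the project's Zig code. The names `normalize_enctype` and `plaintext_encode` are hypothetical, and the use of `None` for a missing attribute is an assumption made for illustration.

```python
# Illustrative sketch only; the real helpers live in Zig (Form.zig, FormData).
CRLF = "\r\n"
URLENCODED = "application/x-www-form-urlencoded"
KNOWN_ENCTYPES = (URLENCODED, "multipart/form-data", "text/plain")


def normalize_enctype(attr, missing_default):
    """Canonicalize an enctype attribute value.

    attr is None when the attribute is absent (assumption for this sketch).
    Recognized values pass through lowercased; anything else falls back to
    the urlencoded invalid-value default.
    """
    if attr is None:
        return missing_default
    lowered = attr.lower()
    if lowered in KNOWN_ENCTYPES:
        return lowered
    return URLENCODED


def plaintext_encode(entries):
    """Encode (name, value) pairs as text/plain: name=value CRLF, unescaped."""
    return "".join(f"{name}={value}{CRLF}" for name, value in entries)
```

The form-side getter would pass the urlencoded default as `missing_default`, while a submitter override such as `formEnctype` would pass `""`. A value containing "=" is written as-is, which is the ambiguity the commit message calls out as accepted by design.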


6254 Commits
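
Commit 625e240f5a describes draining the HTTP client queue a second time after perform, so that connections freed during a tick are reused in that same tick. The toy model below is a Python sketch of that reasoning under strong assumptions: every transfer needs exactly one tick of latency before it can complete, and the class and function names (`ToyClient`, `ticks_to_finish`) are hypothetical, not the project's API.

```python
from collections import deque


class ToyClient:
    """Toy model of a connection-limited client driven by tick()."""

    def __init__(self, max_conns, drain_after_perform):
        self.free = max_conns
        self.queue = deque()   # transfers waiting for a connection
        self.active = []       # (name, tick on which the transfer started)
        self.now = 0
        self.drain_after_perform = drain_after_perform

    def request(self, name):
        self.queue.append(name)

    def _drain_queue(self):
        # Assign free connections to queued transfers.
        while self.free > 0 and self.queue:
            self.free -= 1
            self.active.append((self.queue.popleft(), self.now))

    def _perform(self):
        # Assumption: a transfer completes on the first perform that runs
        # on a later tick than the one it started on.
        still_running = [t for t in self.active if t[1] >= self.now]
        self.free += len(self.active) - len(still_running)
        self.active = still_running

    def tick(self):
        self.now += 1
        self._drain_queue()
        self._perform()
        if self.drain_after_perform:
            self._drain_queue()  # reuse connections freed by perform


def ticks_to_finish(n_requests, max_conns, drain_after_perform):
    client = ToyClient(max_conns, drain_after_perform)
    for i in range(n_requests):
        client.request(f"r{i}")
    ticks = 0
    while client.queue or client.active:
        client.tick()
        ticks += 1
    return ticks
```

In this model, six requests over two connections take six ticks with a single drain and four ticks with the second drain, which matches the commit's description of each "wave" of queue overflow paying one extra tick of latency.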