Commit Graph

4409 Commits

Author SHA1 Message Date
Karl Seguin
6723d6642a 'fix' test 2026-04-13 19:51:04 +08:00
Karl Seguin
cdd109a41b Improve loaderId and requestId compatibility
This introduces two slightly related changes.

My understanding is:

- frameId represents the page. Even if the page navigates, it's the same
  frameId. We capture this in Page._frame_id. Nothing here changes.

- loaderId is essentially for a specific document of the page. If the page
  navigates, it should be a different loaderId. We were using a distinct
  loaderId per request. Not sure what problems that caused. But it was wrong.
  This was achieved by exposing Page.id to CDP.

- requestId was mostly correct: unique per request. HOWEVER, for the original
  document, apparently, requestId == loaderId. This change is particularly
  important for various Puppeteer and Playwright behavior. This is a bit
  hacked. CDP will look at the resource_type; if it's .document, it'll return
  the loaderId, else it returns the requestId as it always did.
2026-04-13 18:33:43 +08:00
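
The id selection described in cdd109a41b can be sketched roughly as follows. This is a minimal Python illustration, not the project's code: the `Request` class, the `ResourceType` enum and the function name `cdp_request_id` are all hypothetical names chosen for the example.

```python
from dataclasses import dataclass
from enum import Enum


class ResourceType(Enum):
    # Hypothetical subset of resource types, for illustration only.
    DOCUMENT = "document"
    SCRIPT = "script"
    XHR = "xhr"


@dataclass
class Request:
    request_id: str  # unique per request
    loader_id: str   # shared by every request made for one document
    resource_type: ResourceType


def cdp_request_id(req: Request) -> str:
    """Return the id reported to the CDP client as requestId.

    For the document request itself, requestId == loaderId; every other
    request keeps its own unique id.
    """
    if req.resource_type is ResourceType.DOCUMENT:
        return req.loader_id
    return req.request_id
```

With this shape, a client that correlates the main document's network events with the frame's loaderId sees matching ids, while sub-resource requests stay distinguishable.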
Karl Seguin
cd4e977020 Merge pull request #2150 from lightpanda-io/eventcounts
Add EventCounts API
2026-04-13 18:33:24 +08:00
Karl Seguin
3fe8c3ee8f Merge pull request #2148 from lightpanda-io/fix_typedarray_offset
Correctly treat a view's offset as a byte offset, not an element offset
2026-04-13 18:33:08 +08:00
Karl Seguin
ca061b46b7 Merge pull request #2149 from lightpanda-io/websocket_protocol
Basic protocol support for websocket.
2026-04-13 17:34:28 +08:00
Karl Seguin
577629f536 Merge pull request #2151 from lightpanda-io/text_decoder_streaming
TextDecoder streaming stop
2026-04-13 17:34:16 +08:00
Pierre Tachoire
920ae57f9a cdp: ignore UA containing Mozilla
Instead of returning an error when the UA contains Mozilla, we ignore
the option and log a message.
2026-04-13 11:25:16 +02:00
Pierre Tachoire
1589445ec0 zig fmt 2026-04-13 10:36:35 +02:00
Pierre Tachoire
21e27a257d cdp: add warning for non-implemented params on Emulation.setUserAgentOverride 2026-04-13 10:32:36 +02:00
Pierre Tachoire
05a08f1f97 cdp: forward Network.setUserAgentOverride to Emulation.setUserAgentOverride 2026-04-13 10:23:55 +02:00
zed
9fe3a48c3a test: add tests for setting CDP user agent 2026-04-13 10:23:54 +02:00
zed
5e5a573a9f new: allow CDP change useragent 2026-04-13 10:23:54 +02:00
Karl Seguin
299104ff1d TextDecoder streaming stop
When we stop streaming, we need to use any previously streamed data as part of
the last "unstreamed" chunk. Or, put differently, when stream is false, that
merely stops any subsequent streams; it doesn't discard any previously streamed
data.
2026-04-13 13:06:16 +08:00
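
The behavior described in 299104ff1d has a close analogue in Python's incremental decoders, which can serve as a sketch of the idea. Here `final=False` plays the role of `stream: true` and `final=True` the role of the last, non-streaming call; the helper name `decode_chunks` is made up for the example.

```python
import codecs


def decode_chunks(chunks, encoding="utf-8"):
    """Decode byte chunks, treating every chunk but the last as streamed."""
    decoder = codecs.getincrementaldecoder(encoding)()
    out = []
    for i, chunk in enumerate(chunks):
        is_last = i == len(chunks) - 1
        # Bytes buffered by earlier streamed calls are combined with the
        # final chunk instead of being discarded.
        out.append(decoder.decode(chunk, final=is_last))
    return "".join(out)
```

For example, the three bytes of a euro sign split across two chunks, `[b"\xe2\x82", b"\xac"]`, decode to a single character because the first two bytes are kept until the final call.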
Karl Seguin
592dc3e18d Add EventCounts API 2026-04-13 12:30:49 +08:00
Karl Seguin
5c161260fd don't break union probing 2026-04-13 11:35:32 +08:00
Karl Seguin
28a7e7fe45 Basic protocol support for websocket.
WebSocket clients can send a Protocol which the server can agree to. This isn't
as fancy as it sounds. We just send a specific header on websocket handshake
and then read the response header.
2026-04-13 11:21:59 +08:00
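
As a rough sketch of what 28a7e7fe45 describes: the client lists its subprotocols in the `Sec-WebSocket-Protocol` request header and reads the same header from the response. The function names below are hypothetical, and a real handshake also carries headers omitted here (for example `Sec-WebSocket-Key`).

```python
def handshake_headers(protocols):
    """Build the protocol-related part of a WebSocket upgrade request."""
    headers = {
        "Upgrade": "websocket",
        "Connection": "Upgrade",
        "Sec-WebSocket-Version": "13",
    }
    if protocols:
        headers["Sec-WebSocket-Protocol"] = ", ".join(protocols)
    return headers


def negotiated_protocol(requested, response_headers):
    """Return the subprotocol the server agreed to, or "" if none."""
    for name, value in response_headers.items():
        if name.lower() == "sec-websocket-protocol":
            chosen = value.strip()
            if chosen not in requested:
                # The server may only pick one of the offered protocols.
                raise ValueError(f"server chose unrequested protocol: {chosen}")
            return chosen
    return ""
```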
Karl Seguin
de167861c6 handle null buffers 2026-04-13 10:03:19 +08:00
Karl Seguin
cde8229be5 Correctly treat a view's offset as a byte offset, not an element offset 2026-04-13 09:52:03 +08:00
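
To illustrate the distinction fixed in cde8229be5, here is a small Python sketch of reading from a 32-bit view over a buffer. The helper `read_u32` is hypothetical; the point is only that the view's offset is counted in bytes, while the index is counted in elements.

```python
import struct

ELEMENT_SIZE = 4  # bytes per unsigned 32-bit element


def read_u32(buffer: bytes, byte_offset: int, index: int) -> int:
    """Read element `index` of a u32 view that starts `byte_offset` bytes in."""
    # Only the index is scaled by the element size. Scaling the offset as
    # well would treat it as an element offset and read from the wrong place.
    position = byte_offset + index * ELEMENT_SIZE
    return struct.unpack_from("<I", buffer, position)[0]


# Four little-endian u32 values: 10, 20, 30, 40.
buf = struct.pack("<4I", 10, 20, 30, 40)
```

A view created at byte offset 4 starts at the second value, so `read_u32(buf, 4, 0)` reads 20. Treating the 4 as an element offset would instead point 16 bytes in, past the end of this buffer.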
Karl Seguin
6ebe112525 Add form acceptCharset accessor
Pre-size encoding buffer for possible numeric character insertion.
2026-04-12 19:38:36 +08:00
Karl Seguin
57f20e1831 Encode form data based on the form's (or document's) encoding.
Does something similar to https://github.com/lightpanda-io/browser/pull/2126
but for form submission. It uses the form's accept-charset attribute, or
falls back to the document's charset.
2026-04-12 19:38:36 +08:00
William Chan
e6fd004767 refactor: remove unused _group_depth tracking
Since group depth is not used for indentation in headless mode,
simplify by removing the field. group/groupCollapsed just log,
groupEnd is a no-op.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
2026-04-12 19:37:39 +08:00
William Chan
d5556fda93 feat(console): implement group, groupCollapsed, and groupEnd
Add missing console grouping APIs per the WHATWG Console spec:
- console.group(...data): logs label and increments group depth
- console.groupCollapsed(...data): same as group (headless mode has no
  visual collapsing, so behavior is identical to group)
- console.groupEnd(): decrements group depth, clamped at 0

Depth is tracked via _group_depth (u32) on the Console struct.
Saturation arithmetic (+|=) prevents overflow on runaway group() calls.

Fixes TypeError crashes on sites that use console.group* APIs (e.g.
React/Vite dev builds).

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
2026-04-12 19:37:38 +08:00
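
The depth tracking described in d5556fda93 (and later removed by e6fd004767) amounts to a counter that saturates at both ends. A minimal Python sketch, with a hypothetical `GroupDepth` class standing in for the field on the Console struct:

```python
U32_MAX = 2**32 - 1


class GroupDepth:
    """Counter that saturates at 0 and at the u32 maximum."""

    def __init__(self):
        self.depth = 0

    def group(self):
        # Saturating add: runaway group() calls cannot overflow.
        self.depth = min(self.depth + 1, U32_MAX)

    def group_end(self):
        # Clamped at 0: extra groupEnd() calls are harmless.
        self.depth = max(self.depth - 1, 0)
```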
Karl Seguin
af4363ad8f Map a zig error.RangeError to a proper JS RangeError
Fixes WPT /encoding/api-invalid-label.any.html?1-1000
2026-04-11 15:35:01 +08:00
Karl Seguin
87e9e9190c Handle http response with closed socket
https://github.com/lightpanda-io/browser/pull/1987 added support for a
connection that was closed with a valid response. This commit goes a step further
and removes the requirement for a "connection: close" header.

We see a lot of these in WPT tests, e.g.
/referrer-policy/gen/iframe.http-rp/unset/iframe-tag.http.html
2026-04-11 15:24:48 +08:00
Karl Seguin
63104a7f82 Re-enable debug allocator in debug
Disabled this when looking at memory profiles, and must have accidentally
committed it.
2026-04-11 12:24:19 +08:00
Karl Seguin
da8d206c52 On Page cleanup, capture next linked list node _before_ releasing MO
Also, switch MO and IO to use a "small" arena, as they probably don't require
too many allocations in most normal cases (just observing 1 or 2 things).
2026-04-11 11:39:05 +08:00
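
The ordering issue fixed in da8d206c52 is the usual one when releasing nodes while walking a linked list. A small Python sketch under assumed names (`Node`, `release` and `release_all` are illustrative, not the project's types):

```python
class Node:
    def __init__(self, name):
        self.name = name
        self.next = None
        self.released = False


def release(node):
    """Stand-in for freeing a node: its fields are no longer usable."""
    node.next = None
    node.released = True


def release_all(head):
    """Release every node in the list and return how many were released."""
    count = 0
    node = head
    while node is not None:
        nxt = node.next  # capture the next node BEFORE releasing this one
        release(node)
        node = nxt
        count += 1
    return count
```

Reading `node.next` after `release(node)` would stop the walk after the first node in this sketch; in a language with manual memory management it would be a use-after-free.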
Karl Seguin
3be913750e Cache-Control is public by default
- If private isn't specified, default to public.
- Add some tests
- Optimize parsing by lower-casing once and switch to std.mem
2026-04-11 07:23:56 +08:00
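
A rough sketch of the defaulting rule from 3be913750e: lower-case the header value once, then treat the response as public unless `private` appears. The function name and the returned dictionary shape are assumptions made for the example, and only a few directives are handled.

```python
def parse_cache_control(value: str) -> dict:
    """Parse a Cache-Control header value into a small dict."""
    lowered = value.lower()  # lower-case once, not once per directive
    result = {"public": True, "no_store": False, "max_age": None}
    for directive in lowered.split(","):
        name, _, arg = directive.strip().partition("=")
        name = name.strip()
        if name == "private":
            result["public"] = False
        elif name == "no-store":
            result["no_store"] = True
        elif name == "max-age":
            arg = arg.strip().strip('"')
            if arg.isdigit():
                result["max_age"] = int(arg)
    return result
```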
Pierre Tachoire
071e70e5cc Merge pull request #2133 from lightpanda-io/feat/cdp-json-endpoints
Feat/cdp json endpoints
2026-04-10 17:51:48 +02:00
Pierre Tachoire
88dcac642a Merge pull request #2131 from lightpanda-io/pushstate-pathname
update page URL and location on pushState/replaceState
2026-04-10 17:37:43 +02:00
Matt Van Horn
224a7333f2 fix: use fixed Lightpanda/1.0 for /json/version User-Agent
Replace dynamic build version string with stable Lightpanda/1.0
in the Browser and User-Agent fields of the /json/version response.
The dev version (1.0.0-dev.5492+...) is not useful for CDP clients.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-10 17:36:10 +02:00
Matt Van Horn
416984d32f fix: update integration test for enriched /json/version response
The integration test at "server: get /json/version" was hardcoding
the old response with Content-Length: 48. Updated to verify the
enriched fields structurally since the version string varies at
build time.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-10 17:36:09 +02:00
Matt Van Horn
503ca4ce07 feat: enrich CDP /json/version and add /json/list endpoint
Add Browser, Protocol-Version, and User-Agent fields to the
/json/version CDP endpoint response. Previously it only returned
webSocketDebuggerUrl, while Chrome and other CDP browsers return
7+ fields that automation tools use for capability detection.

Also add /json/list and /json endpoints that return an empty JSON
array, matching the standard CDP endpoint layout that tools like
Puppeteer and chromedp expect.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-10 17:36:05 +02:00
Muki Kiboigo
d47e24ced0 add test for History URL updating 2026-04-10 07:38:21 -07:00
Adrià Arrufat
d6aea1187f Merge branch 'main' into fix/markdown-link-formatting 2026-04-10 16:26:14 +02:00
Muki Kiboigo
08cd9ca799 properly resolve URL before setting Location in History 2026-04-10 07:26:01 -07:00
Adrià Arrufat
c1a65160c1 markdown: test block link aria-label and title handling 2026-04-10 16:14:58 +02:00
Karl Seguin
24e17b6f21 Merge pull request #2130 from lightpanda-io/arena_pool_buckets
Add arena buckets to ArenaPool
2026-04-10 20:33:19 +08:00
Karl Seguin
771df02c49 Merge pull request #2129 from lightpanda-io/non-utf8-querystring-encoding
Non utf8 querystring encoding
2026-04-10 20:33:07 +08:00
Karl Seguin
ddf614a9d5 Add arena buckets to ArenaPool
ArenaPool previously maintained up to 512 16KB buckets. The 16KB retention is
small for things like XHR and scripts, but increasing it to something more
reasonable, like 128KB, would use up to 8x more memory.

This commit adds 4 buckets: 1KB, 4KB, 16KB and 128KB. Callers can request a
tiny, small, medium or large bucket. We end up using less peak memory and
fewer allocations.

Furthermore, callers can request a specific size. This is particularly useful
for WebSocket or Blob where the size could vary greatly (so we'd likely default
to a large bucket), but that could needlessly use up a large arena.

The bucket sizes were derived from analyzing allocations. A significant number
of allocations were very small. Things like ScheduleCallback and
FinalizerCallback are always less than 1K and can be generated in the thousands.
The 16KB retention was wasteful in these cases...better to have a large number
of 1K pools, so that we can have a handful of very large buffers.
2026-04-10 19:09:18 +08:00
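
The sizing argument in ddf614a9d5 can be checked with a little arithmetic, and the "request a specific size" idea can be sketched as a lookup. The bucket names and sizes and the limit of 512 come from the commit message; the function names, and the rule that a size hint maps to the smallest bucket that fits, are assumptions for illustration.

```python
KB = 1024
MAX_POOLED = 512  # arenas the pool previously retained, per the commit message

# (name, retained bytes), smallest first.
BUCKETS = [("tiny", 1 * KB), ("small", 4 * KB), ("medium", 16 * KB), ("large", 128 * KB)]


def worst_case_retained(bucket_bytes: int, count: int = MAX_POOLED) -> int:
    """Upper bound on memory retained by `count` arenas of one size."""
    return bucket_bytes * count


def bucket_for(size_hint: int) -> str:
    """Pick the smallest bucket whose retained size covers the hint."""
    for name, retained in BUCKETS:
        if size_hint <= retained:
            return name
    return "large"  # anything bigger still uses the largest bucket
```

Here `worst_case_retained(16 * KB)` is 8 MiB and `worst_case_retained(128 * KB)` is 64 MiB, which is the 8x increase the message mentions for a single 128KB bucket size.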
Karl Seguin
2cfa1ea035 Merge pull request #2116 from lightpanda-io/gc-snapshot
force an aggressive GC on v8 after snapshot creation
2026-04-10 18:48:08 +08:00
Karl Seguin
05229fdc53 Use the document's charset to determine if/how to encode querystring
Whenever we resolve a URL, say from `anchor.href`, we should consider the
document's charset when encoding the querystring. This probably isn't the
most important feature, but it makes tens of thousands of WPT cases pass, e.g.

/encoding/legacy-mb-tchinese/big5/big5-encode-href-errors-han.html?3001-4000 and
/encoding/legacy-mb-japanese/euc-jp/eucjp-encode-href-errors-han.html?17001-18000

DOM elements previously called `URL.resolveURL(...)`. They now call
`self.asNode().resolveURL(...)`, where `Node#resolveURL` will provide the
document's charset.
2026-04-10 16:47:42 +08:00
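
To make the effect of 05229fdc53 concrete, here is a Python sketch of percent-encoding a query value in the document's charset. The helper name is hypothetical. The fallback for characters the charset cannot represent (a numeric character reference, itself percent-encoded) is my understanding of how browsers handle this case, not something stated in the commit.

```python
from urllib.parse import quote


def encode_query_value(value: str, charset: str) -> str:
    """Percent-encode `value` using the bytes of `charset`.

    Characters that `charset` cannot represent become a numeric character
    reference such as "&#1576;", which is then percent-encoded as well.
    """
    return quote(value, safe="", encoding=charset, errors="xmlcharrefreplace")
```

For a UTF-8 document, `encode_query_value("é", "utf-8")` gives `%C3%A9`. For a Big5 document the same call emits Big5 bytes instead, so the result depends on the document's charset and not only on the URL text.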
Karl Seguin
f7c1710c23 Expose correct charset
document.characterSet, document.charset and document.inputEncoding now expose
the correct charset.
2026-04-10 16:47:42 +08:00
Karl Seguin
828715b751 Improve TextDecoder to support all necessary encoding types
Uses the newly added encoding_rs to implement TextDecoder for all encodings.
Claude wrote 100% of the Rust binding.

Improves various WPT tests, e.g. /encoding/api-basics.any.html.
2026-04-10 16:47:41 +08:00
Pierre Tachoire
d80e4227b4 force an aggressive GC on v8 after snapshot creation 2026-04-10 10:41:57 +02:00
Adrià Arrufat
070ee7df80 Merge branch 'main' into fix-telemetry-decoding 2026-04-10 09:42:21 +02:00
Pierre Tachoire
a4617390de Merge pull request #2104 from lightpanda-io/feat/add-ip-filter
Feat/add ip filter
2026-04-10 08:46:06 +02:00
Matt Van Horn
5cc49e79b8 fix: update block link test to match new link text format
The "browser.markdown: block link" test expected the old format
([](url)). Updated to expect [url](url) since block-content
anchors without aria-label/title now use the href as display text.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-09 20:49:39 -07:00
Matt Van Horn
065e9383d0 fix: use proper link text in markdown dump for block-content anchors
When an anchor wraps block content (divs, images), the markdown dump
produced `([](url))` with empty display text. This is not valid
markdown and provides no useful information to LLMs consuming the
output.

Now uses the anchor's aria-label or title attribute as display text,
falling back to the href itself. Produces `[label](url)` instead.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-09 19:33:08 -07:00
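
The fallback order described in 065e9383d0 (aria-label, then title, then the href itself) can be sketched as below. The function names are hypothetical and the sketch ignores markdown escaping.

```python
def link_text(href, aria_label=None, title=None):
    """Choose display text for an anchor that wraps block content."""
    for candidate in (aria_label, title):
        if candidate and candidate.strip():
            return candidate.strip()
    return href  # last resort: show the URL itself


def markdown_link(href, aria_label=None, title=None):
    return f"[{link_text(href, aria_label, title)}]({href})"
```

With no label or title, `markdown_link("https://example.com")` yields `[https://example.com](https://example.com)` and never an empty `[]`.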
Karl Seguin
0b6f099f43 Merge pull request #2117 from lightpanda-io/startup_order_tweak
Initialize snapshot before network
2026-04-10 08:30:54 +08:00
Pierre Tachoire
91e366cb71 move memoryPressureNotification call on session.resetPage
Run the V8 GC with memoryPressureNotification directly in
session.resetPage to be sure memory is freed right after resources are
removed.
2026-04-10 08:12:51 +08:00