This introduces two slightly related changes.
My understanding is:
- frameId represents the page. Even if the page navigates, it's the same
frameId. We capture this in Page._frame_id. Nothing here changes.
- loaderId is essentially for a specific document of the page. If the page
navigates, it should be a different loaderId. We were using a distinct
loaderId per request. Not sure what problems that caused, but it was wrong.
This is fixed by exposing Page.id to CDP.
- requestId was mostly correct: unique per request. HOWEVER, for the original
document, apparently, requestId == loaderId. This change is particularly
important for various puppeteer and playwright behavior. This is a bit of a
hack. CDP will look at the resource_type: if it's .document, it'll return
the loaderId, else it returns the requestId as it always did.
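A minimal sketch of the id selection described above, written in Python rather than the project's own language. `ResourceType` and `cdp_request_id` are hypothetical names used only for illustration; the only part taken from the description is the rule itself (a document request reports the loaderId, everything else keeps its own requestId).

```python
from enum import Enum, auto


class ResourceType(Enum):
    # Hypothetical subset of resource types, for illustration only.
    DOCUMENT = auto()
    SCRIPT = auto()
    XHR = auto()


def cdp_request_id(resource_type: ResourceType, loader_id: str, request_id: str) -> str:
    """Return the id reported to the CDP client for a request."""
    if resource_type is ResourceType.DOCUMENT:
        # The request for the document itself reports the loaderId.
        return loader_id
    # Every other request keeps its own unique requestId.
    return request_id
```

With this rule, a client that matches the document's requestId against the loaderId sees the same value for both, while sub-resource requests remain unique.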
When we stop streaming, we need to use any previously streamed data as part of
the last "unstreamed" chunk. Or, put differently, when stream is false, that
merely stops any subsequent streams; it doesn't discard any previously streamed
data.
A WebSocket client can send a Protocol which the server can agree to. This isn't
as fancy as it sounds. We just send a specific header on the websocket handshake
and then read the response header.
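A rough sketch of that exchange in Python. It assumes the header in question is `Sec-WebSocket-Protocol` (the subprotocol header defined by RFC 6455); the commit does not name it. The function names and the `example.v1` protocol value in the usage note are hypothetical.

```python
from typing import Dict, Optional

# Assumption: the subprotocol header from RFC 6455.
PROTOCOL_HEADER = "Sec-WebSocket-Protocol"


def handshake_headers(protocol: str) -> Dict[str, str]:
    """Extra header the client adds to the handshake to offer a protocol."""
    return {PROTOCOL_HEADER: protocol}


def agreed_protocol(response_headers: Dict[str, str], offered: str) -> Optional[str]:
    """Return the protocol the server agreed to, or None if it did not agree."""
    for name, value in response_headers.items():
        # HTTP header names are case-insensitive.
        if name.lower() == PROTOCOL_HEADER.lower():
            chosen = value.strip()
            # The server may only echo back a protocol the client offered.
            return chosen if chosen == offered else None
    return None
```

For example, a client would merge `handshake_headers("example.v1")` into its upgrade request, then pass the response headers to `agreed_protocol` to learn whether the server accepted.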
Since group depth is not used for indentation in headless mode,
simplify by removing the field. group/groupCollapsed just log,
groupEnd is a no-op.
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Add missing console grouping APIs per the WHATWG Console spec:
- console.group(...data): logs label and increments group depth
- console.groupCollapsed(...data): same as group (headless mode has no
visual collapsing, so behavior is identical to group)
- console.groupEnd(): decrements group depth, clamped at 0
Depth is tracked via _group_depth (u32) on the Console struct.
Saturating arithmetic (+|=) prevents overflow on runaway group() calls.
Fixes TypeError crashes on sites that use console.group* APIs (e.g.
React/Vite dev builds).
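The depth tracking can be sketched as follows, in Python for brevity. The `logged` list is a hypothetical stand-in for the real log output, and `min`/`max` emulate the saturating add and the clamp at 0 on a u32; this is an illustration, not the actual implementation.

```python
U32_MAX = 2**32 - 1


class Console:
    def __init__(self):
        self._group_depth = 0
        self.logged = []  # hypothetical stand-in for real log output

    def group(self, *data):
        # Log the label, then increment the depth without overflowing.
        self.logged.append(" ".join(str(d) for d in data))
        self._group_depth = min(self._group_depth + 1, U32_MAX)

    def group_collapsed(self, *data):
        # Headless mode has no visual collapsing, so this matches group().
        self.group(*data)

    def group_end(self):
        # Decrement, clamped at 0 so unmatched groupEnd() calls are harmless.
        self._group_depth = max(self._group_depth - 1, 0)
```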
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
https://github.com/lightpanda-io/browser/pull/1987 added support for a
connection that was closed with a valid response. This commit goes a step further
and removes the requirement for a "connection: close" header.
We see a lot of these in WPT tests, e.g.
/referrer-policy/gen/iframe.http-rp/unset/iframe-tag.http.html
Also, switch MO and IO to use a "small" arena, as they probably don't require
too many allocations in most normal cases (just observing 1 or 2 things).
Replace dynamic build version string with stable Lightpanda/1.0
in the Browser and User-Agent fields of the /json/version response.
The dev version (1.0.0-dev.5492+...) is not useful for CDP clients.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
The integration test at "server: get /json/version" was hardcoding
the old response with Content-Length: 48. Updated to verify the
enriched fields structurally since the version string varies at
build time.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Add Browser, Protocol-Version, and User-Agent fields to the
/json/version CDP endpoint response. Previously it only returned
webSocketDebuggerUrl, while Chrome and other CDP browsers return
7+ fields that automation tools use for capability detection.
Also add /json/list and /json endpoints that return an empty JSON
array, matching the standard CDP endpoint layout that tools like
Puppeteer and chromedp expect.
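A sketch of the resulting endpoint layout, in Python. The function names, the `ws_url` argument and the default values are assumptions for illustration: `1.3` is a placeholder protocol version, and the browser string is passed in because the real value comes from the build.

```python
import json
from typing import Optional


def json_version(ws_url: str, browser: str = "Lightpanda/1.0",
                 protocol_version: str = "1.3") -> str:
    """Body for /json/version with the enriched fields."""
    return json.dumps({
        "Browser": browser,
        "Protocol-Version": protocol_version,  # assumed placeholder value
        "User-Agent": browser,
        "webSocketDebuggerUrl": ws_url,
    })


def json_list() -> str:
    """Body for /json/list and /json: an empty JSON array."""
    return json.dumps([])


def handle(path: str, ws_url: str) -> Optional[str]:
    """Hypothetical router: return the response body, or None if unknown."""
    if path == "/json/version":
        return json_version(ws_url)
    if path in ("/json", "/json/list"):
        return json_list()
    return None
```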
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
ArenaPool previously maintained up to 512 16KB buckets. The 16KB retention is
small for things like XHR and scripts, but increasing it to something more
reasonable, like 128KB, would use up to 8x more memory.
This commit adds 4 buckets: 1KB, 4KB, 16KB and 128KB. Callers can request a
tiny, small, medium or large bucket. We end up using less peak memory
and fewer allocations.
Furthermore, callers can request a specific size. This is particularly useful
for WebSocket or Blob where the size could vary greatly (so we'd likely default
to a large bucket), but that could needlessly use up a large arena.
The bucket sizes were derived from analyzing allocations. A significant number
of allocations were very small. Things like ScheduleCallback and
FinalizerCallback are always less than 1K and can be generated in the thousands.
The 16KB retention was wasteful in these cases...better to have a large number
of 1K pools, so that we can have a handful of very large buffers.
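One way a size request could map onto the four buckets, sketched in Python. The rule used here (smallest bucket that fits, with oversized requests falling into the large bucket) is an assumption; the commit only states the bucket sizes and that callers may ask for a specific size. `bucket_for_size` is a hypothetical name.

```python
KB = 1024

# Bucket names and retention sizes from the commit message.
BUCKETS = [
    ("tiny", 1 * KB),
    ("small", 4 * KB),
    ("medium", 16 * KB),
    ("large", 128 * KB),
]


def bucket_for_size(size: int) -> str:
    """Pick the smallest bucket whose retention covers `size` bytes."""
    for name, capacity in BUCKETS:
        if size <= capacity:
            return name
    # Assumption: anything bigger than the largest bucket still uses it.
    return "large"
```

Under this rule a small WebSocket message would draw from a tiny or small arena instead of tying up a 128KB one.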
Whenever we resolve a URL, say from `anchor.href`, we should consider the
document's charset when encoding the querystring. This probably isn't the
most important feature, but it makes tens of thousands of WPT cases pass, e.g.
/encoding/legacy-mb-tchinese/big5/big5-encode-href-errors-han.html?3001-4000 and
/encoding/legacy-mb-japanese/euc-jp/eucjp-encode-href-errors-han.html?17001-18000
DOM elements previously called `URL.resolveURL(...)`. They now call
`self.asNode().resolveURL(...)`, where `Node#resolveURL` will provide the
document's charset.
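To illustrate what charset-aware resolution means, here is a Python sketch built on `urllib.parse`. The function name and signature are hypothetical, Python codec names do not map one-to-one onto WHATWG encoding labels, and the handling of unencodable characters (a percent-encoded numeric character reference) follows the URL standard as I understand it rather than anything stated in this commit.

```python
from urllib.parse import quote, urljoin, urlsplit, urlunsplit


def resolve_url(base: str, href: str, charset: str = "utf-8") -> str:
    """Resolve href against base, encoding the query in the document charset."""
    parts = urlsplit(urljoin(base, href))
    encoded = []
    for ch in parts.query:
        if ord(ch) < 0x80:
            # ASCII is kept as written in this simplified sketch.
            encoded.append(ch)
            continue
        try:
            raw = ch.encode(charset)
        except UnicodeEncodeError:
            # Unencodable: fall back to a numeric character reference.
            raw = "&#{};".format(ord(ch)).encode("ascii")
        encoded.append(quote(raw, safe=""))
    return urlunsplit(parts._replace(query="".join(encoded)))
```

The same href therefore yields a different query string depending on the document: `é` becomes `%C3%A9` in a UTF-8 document and `%E9` in a Latin-1 one.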
Uses the newly added encoding_rs to implement TextDecoder for all encodings.
Claude wrote 100% of the Rust binding.
Improves various WPT tests, e.g. /encoding/api-basics.any.html.
The "browser.markdown: block link" test expected the old format
([](url)). Updated to expect [url](url) since block-content
anchors without aria-label/title now use the href as display text.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
When an anchor wraps block content (divs, images), the markdown dump
produced `([](url))` with empty display text. This is not valid
markdown and provides no useful information to LLMs consuming the
output.
Now uses the anchor's aria-label or title attribute as display text,
falling back to the href itself. Produces `[label](url)` instead.
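The fallback chain can be sketched like this, in Python. `link_markdown` is a hypothetical helper; checking aria-label before title, and treating empty attributes as missing, are assumptions.

```python
from typing import Optional


def link_markdown(href: str, aria_label: Optional[str] = None,
                  title: Optional[str] = None) -> str:
    """Render a block-content anchor as a markdown link."""
    # Assumed priority: aria-label, then title, then the href itself.
    label = aria_label or title or href
    return "[{}]({})".format(label, href)
```

An anchor with neither attribute thus renders as `[url](url)` rather than `[](url)`.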
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>