We'll have two types of contexts: one for pages and one for workers. They'll
[probably] both be js.Context, but they'll have distinct FunctionTemplates
attached to their global. The Worker template will only contain a small subset
of the main Page's types, along with 1 or 2 of its own specific ones.
The Snapshot now creates the templates for both, so that the Env contains the
function templates for both contexts. Furthermore, having a "merged" view like
this ensures that the env.template[N] indices are consistent between the two.
However, the snapshot only attaches the Page-specific types to the snapshot
context. This allows the Page-context to be created as-is (i.e. efficiently).
The worker context will be created lazily, on demand, but from the templates
loaded into the env (since, again, the env contains templates for both).
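
A minimal sketch of the idea above, written in Python purely for illustration. The type names and the `Env` shape are hypothetical and are not the project's actual code. It shows one merged template list shared by both contexts, with the worker context built lazily from it:

```python
from dataclasses import dataclass, field
from typing import Optional

# Hypothetical type names; the real lists live in the snapshot.
PAGE_TYPES = ["Window", "Document", "URLSearchParams"]
WORKER_TYPES = ["WorkerGlobalScope", "URLSearchParams"]


@dataclass
class Env:
    # One merged, ordered list, so an index means the same type in both contexts.
    templates: list = field(default_factory=list)
    _worker_context: Optional[dict] = None

    def index_of(self, name):
        return self.templates.index(name)

    def page_context(self):
        # Stand-in for the context that is restored directly from the snapshot.
        return {name: self.index_of(name) for name in PAGE_TYPES}

    def worker_context(self):
        # Built on first use, from templates already loaded into the env.
        if self._worker_context is None:
            self._worker_context = {name: self.index_of(name) for name in WORKER_TYPES}
        return self._worker_context


def build_env():
    # De-duplicate while keeping order, so shared types get a single index.
    merged = list(dict.fromkeys(PAGE_TYPES + WORKER_TYPES))
    return Env(templates=merged)
```

With this layout, a type shared by both contexts (here `URLSearchParams`) resolves to the same index whichever context looks it up.
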
A Worker has no page. So any API that is accessible to a worker cannot take
a *Page parameter. Such APIs will now take a js.Execution which the context
will own and create from the Page (or from the WorkerGlobalScope when that's
created).
To test this, in addition to introducing the Execution, this change also updates
URLSearchParams, which is accessible to a Worker (and to the Page, obviously). This
change is viral: if URLSearchParams no longer has a *Page but instead
has an *Execution, then any function it calls must also be updated.
So some APIs will take a *Page (those only accessible from a Page) and some will
take an *Execution (those accessible from a Page or Worker). I'm ok with that.
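
As a rough illustration of that split (Python stand-ins with hypothetical names and fields, not the real Zig types), a worker-accessible API receives only an Execution, while a page-only API keeps receiving the Page:

```python
from dataclasses import dataclass, field
from urllib.parse import parse_qs


@dataclass
class Execution:
    # Hypothetical: per-call scratch storage, standing in for call_arena.
    call_arena: list = field(default_factory=list)


@dataclass
class Page:
    title: str
    execution: Execution = field(default_factory=Execution)


@dataclass
class WorkerGlobalScope:
    execution: Execution = field(default_factory=Execution)


def search_params_get(execution, query, key):
    """Worker-safe: needs only an Execution, never a Page."""
    values = parse_qs(query).get(key, [])
    execution.call_arena.append(values)  # pretend the result is arena-allocated
    return values[0] if values else None


def document_title(page):
    """Page-only: a worker has no document, so this keeps taking a Page."""
    return page.title
```

Either owner can hand over its Execution, e.g. `search_params_get(page.execution, "a=1&b=2", "b")` or the same call with `worker.execution`.
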
A lot of private/internal functions take a *Page, because it's simple, but all
they want is a call_arena or something. We'll try to update those as much as
possible. The Page/Execution being injected from the bridge is convenient, but
we should be more specific for internal calls and pass only what's needed.
This introduces two slightly related changes.
My understanding is:
- frameId represents the page. Even if the page navigates, it's the same
frameId. We capture this in Page._frame_id. Nothing here changes.
- loaderId is essentially for a specific document of the page. If the page
navigates, it should be a different loaderId. We were using a distinct
loaderId per request. Not sure what problems that caused. But it was wrong.
This was achieved by exposing Page.id to CDP.
- requestId was mostly correct: unique per request. HOWEVER, for the original
document, apparently, requestId == loaderId. This change is particularly
important for various puppeteer and playwright behavior. This is a bit
hacked: CDP looks at the resource_type and, if it's .document, returns
the loaderId; otherwise it returns the requestId, as it always did.
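
The requestId rule described above can be sketched as follows. This is Python for illustration only; the `Request` record and its field names are assumptions, not the actual CDP code:

```python
from dataclasses import dataclass


@dataclass
class Request:
    request_id: str     # unique per request
    loader_id: str      # shared by every request made for one document
    resource_type: str  # e.g. "document", "script", "xhr"


def cdp_request_id(req):
    """Return the id reported to CDP clients for this request."""
    # The request that loads the document itself reports the loaderId,
    # so that requestId == loaderId for the original document.
    if req.resource_type == "document":
        return req.loader_id
    return req.request_id
```
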
When we stop streaming, we need to use any previously streamed data as part of
the last "unstreamed" chunk. Or, put differently, when stream is false, that
merely stops any subsequent streams, it doesn't discard any previously streamed
data.
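
Assuming this refers to TextDecoder-style decoding with a `stream` option, Python's incremental decoders behave the same way and make a convenient illustration: bytes buffered by earlier streaming calls are consumed by the final, non-streaming call.

```python
import codecs

# The euro sign is three bytes in UTF-8: E2 82 AC.
decoder = codecs.getincrementaldecoder("utf-8")()

# Streaming call: the sequence is incomplete, so nothing is emitted yet.
first = decoder.decode(b"\xe2\x82", final=False)

# Final ("unstreamed") call: the buffered bytes are used, not discarded.
last = decoder.decode(b"\xac", final=True)
```

Here `first` is the empty string and `last` is the euro sign. Discarding the buffered bytes on the final call would instead leave a lone `\xac`, which is not valid UTF-8.
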
A WebSocket client can send a Protocol, which the server can agree to. This isn't
as fancy as it sounds. We just send a specific header in the WebSocket handshake
and then read the response header.
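
A sketch of that exchange, assuming the header in question is the standard `Sec-WebSocket-Protocol` from RFC 6455. The helper names are hypothetical and this is not the project's code:

```python
def build_handshake(host, path, key, protocol=None):
    """Build the client's opening handshake, optionally offering a protocol."""
    lines = [
        f"GET {path} HTTP/1.1",
        f"Host: {host}",
        "Upgrade: websocket",
        "Connection: Upgrade",
        f"Sec-WebSocket-Key: {key}",
        "Sec-WebSocket-Version: 13",
    ]
    if protocol:
        lines.append(f"Sec-WebSocket-Protocol: {protocol}")
    return "\r\n".join(lines) + "\r\n\r\n"


def negotiated_protocol(response_head):
    """Return the protocol the server agreed to, or None if it named none."""
    for line in response_head.split("\r\n")[1:]:  # skip the status line
        name, sep, value = line.partition(":")
        if sep and name.strip().lower() == "sec-websocket-protocol":
            return value.strip()
    return None
```

The header name is matched case-insensitively, since HTTP header names are not case-sensitive.
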
Since group depth is not used for indentation in headless mode,
simplify by removing the field. group/groupCollapsed just log,
groupEnd is a no-op.
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Add missing console grouping APIs per the WHATWG Console spec:
- console.group(...data): logs label and increments group depth
- console.groupCollapsed(...data): same as group (headless mode has no
visual collapsing, so behavior is identical to group)
- console.groupEnd(): decrements group depth, clamped at 0
Depth is tracked via _group_depth (u32) on the Console struct.
Saturation arithmetic (+|=) prevents overflow on runaway group() calls.
Fixes TypeError crashes on sites that use console.group* APIs (e.g.
React/Vite dev builds).
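
For illustration, the depth tracking described in this commit could look like the Python sketch below. The real field is a u32 on a Zig struct; the `lines` list is a hypothetical stand-in for the logger:

```python
U32_MAX = 2**32 - 1


class Console:
    def __init__(self):
        self._group_depth = 0
        self.lines = []  # stand-in for the real log output

    def group(self, *data):
        # Log the label, then increment the depth without ever overflowing.
        self.lines.append(" ".join(str(item) for item in data))
        self._group_depth = min(self._group_depth + 1, U32_MAX)

    # Headless mode has no visual collapsing, so the behavior is identical.
    group_collapsed = group

    def group_end(self):
        # Clamp at 0 so unbalanced groupEnd() calls are harmless.
        self._group_depth = max(self._group_depth - 1, 0)
```
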
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
https://github.com/lightpanda-io/browser/pull/1987 added support for a
connection that was closed with a valid response. This commit goes a step further
and removes the requirement for a "connection: close" header.
We see a lot of these in WPT tests, e.g.
/referrer-policy/gen/iframe.http-rp/unset/iframe-tag.http.html
Also, switch MO and IO to use a "small" arena, as they probably don't require
too many allocations in most normal cases (just observing 1 or 2 things).
Replace dynamic build version string with stable Lightpanda/1.0
in the Browser and User-Agent fields of the /json/version response.
The dev version (1.0.0-dev.5492+...) is not useful for CDP clients.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
The integration test at "server: get /json/version" was hardcoding
the old response with Content-Length: 48. Updated to verify the
enriched fields structurally since the version string varies at
build time.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Add Browser, Protocol-Version, and User-Agent fields to the
/json/version CDP endpoint response. Previously it only returned
webSocketDebuggerUrl, while Chrome and other CDP browsers return
7+ fields that automation tools use for capability detection.
Also add /json/list and /json endpoints that return an empty JSON
array, matching the standard CDP endpoint layout that tools like
Puppeteer and chromedp expect.
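
A sketch of the resulting endpoint bodies, in Python for illustration. The `Protocol-Version` value and the WebSocket URL are assumptions, and `Lightpanda/1.0` is the stable string adopted in a later commit:

```python
import json


def json_version_body(ws_url):
    """Body for /json/version, with the fields named in this commit."""
    return json.dumps({
        "Browser": "Lightpanda/1.0",
        "Protocol-Version": "1.3",  # assumption: the value Chrome reports
        "User-Agent": "Lightpanda/1.0",
        "webSocketDebuggerUrl": ws_url,
    })


def http_response(body):
    """Wrap a JSON body in a minimal HTTP/1.1 response."""
    payload = body.encode("utf-8")
    head = (
        "HTTP/1.1 200 OK\r\n"
        "Content-Type: application/json; charset=UTF-8\r\n"
        f"Content-Length: {len(payload)}\r\n\r\n"
    )
    return head.encode("ascii") + payload


# Hypothetical routing table; /json/list and /json return an empty array.
ROUTES = {
    "/json/version": lambda: json_version_body("ws://127.0.0.1:9222/"),
    "/json/list": lambda: "[]",
    "/json": lambda: "[]",
}
```

Computing Content-Length from the encoded body avoids hardcoding a length that changes whenever a field value does.
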
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
ArenaPool previously maintained up to 512 16KB buckets. The 16KB retention is
small for things like XHR and scripts, but increasing it to something more
reasonable, like 128KB, would use up to 8x more memory.
This commit adds 4 buckets: 1KB, 4KB, 16KB and 128KB. Callers can request a
tiny, small, medium or large bucket. We end up using less peak memory
and making fewer allocations.
Furthermore, callers can request a specific size. This is particularly useful
for WebSocket or Blob where the size could vary greatly (so we'd likely default
to a large bucket), but that could needlessly use up a large arena.
The bucket sizes were derived from analyzing allocations. A significant number
of allocations were very small. Things like ScheduleCallback and
FinalizerCallback are always less than 1K and can be generated in the thousands.
The 16KB retention was wasteful in these cases. It is better to have a large number
of 1K pools, so that we can afford a handful of very large buffers.
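
One way to picture the size-based request is the sketch below. The bucket names and sizes come from this commit, but the selection rule (smallest bucket that fits) is an assumption about how a requested size maps to a bucket:

```python
# (name, retained capacity in bytes), smallest first.
BUCKETS = [
    ("tiny", 1 * 1024),
    ("small", 4 * 1024),
    ("medium", 16 * 1024),
    ("large", 128 * 1024),
]


def bucket_for(size_hint):
    """Pick the smallest bucket whose retained capacity covers size_hint."""
    for name, capacity in BUCKETS:
        if size_hint <= capacity:
            return name
    # Anything bigger still comes from the largest bucket.
    return "large"
```

A 200-byte WebSocket message would then land in a tiny arena instead of tying up a large one, while a multi-megabyte Blob still gets the large bucket.
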
Whenever we resolve a URL, say from `anchor.href`, we should consider the
document's charset when encoding the querystring. This probably isn't the
most important feature, but it makes tens of thousands of WPT cases pass, e.g.
/encoding/legacy-mb-tchinese/big5/big5-encode-href-errors-han.html?3001-4000 and
/encoding/legacy-mb-japanese/euc-jp/eucjp-encode-href-errors-han.html?17001-18000
DOM elements previously called `URL.resolveURL(...)`. They now call
`self.asNode().resolveURL(...)`, where `Node#resolveURL` will provide the
document's charset.
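
To show why the charset matters, here is a small Python sketch that encodes a single query value. It uses Python's own codecs rather than the WHATWG encoders, so it only approximates the behavior, and `encode_query_value` is a hypothetical helper:

```python
from urllib.parse import quote


def encode_query_value(value, charset="utf-8"):
    """Percent-encode one query value using the document's charset."""
    # Characters the charset cannot represent become numeric character
    # references (&#NNNN;), which are then percent-encoded like any other byte.
    raw = value.encode(charset, errors="xmlcharrefreplace")
    return quote(raw, safe="")
```

The same character encodes differently per charset: `encode_query_value("é")` gives `%C3%A9`, while `encode_query_value("é", "iso-8859-1")` gives `%E9`. A character outside the charset, such as `中` in ISO-8859-1, comes out as `%26%2320013%3B`.
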
Uses the newly added encoding_rs to implement TextDecoder for all encodings.
Claude wrote 100% of the Rust binding.
Improves various WPT tests, e.g. /encoding/api-basics.any.html.
The "browser.markdown: block link" test expected the old format
([](url)). Updated to expect [url](url) since block-content
anchors without aria-label/title now use the href as display text.
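
The fallback can be sketched like this. It is illustrative Python, and the precedence of aria-label over title is an assumption:

```python
def block_link_markdown(href, aria_label=None, title=None):
    """Render a block-content anchor as a Markdown link."""
    # With no aria-label or title, fall back to the href as display text
    # instead of emitting an empty label such as [](url).
    text = aria_label or title or href
    return f"[{text}]({href})"
```
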
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>