Commit Graph

5561 Commits

Author SHA1 Message Date
Karl Seguin
2f2f74ef4b Merge pull request #2135 from lightpanda-io/mo_cleanup
On Page cleanup, capture next linked list node _before_ releasing MO
2026-04-13 22:49:59 +08:00
Karl Seguin
9a05fd8af4 Merge pull request #2154 from lightpanda-io/cdp_request_id
Improve loaderId and requestId compatibility
2026-04-13 22:49:27 +08:00
Karl Seguin
ce811c1b48 Merge pull request #2142 from lightpanda-io/formdata_encoding
Encode form data based on the form's (or document's) encoding.
2026-04-13 19:51:17 +08:00
Karl Seguin
6723d6642a 'fix' test 2026-04-13 19:51:04 +08:00
Karl Seguin
cdd109a41b Improve loaderId and requestId compatibility
This introduces two slightly related changes.

My understanding is:

- frameId represents the page. Even if the page navigates, it's the same
  frameId. We capture this in Page._frame_id. Nothing here changes.

- loaderId is essentially for a specific document of the page. If the page
  navigates, it should be a different loaderId. We were using a distinct
  loaderId per request. Not sure what problems that caused, but it was wrong.
  This is fixed by exposing Page.id to CDP.

- requestId was mostly correct: unique per request. HOWEVER, for the original
  document, apparently, requestId == loaderId. This change is particularly
  important for various puppeteer and playwright behavior. This is a bit
  hacked: CDP looks at the resource_type; if it's .document, it returns the
  loaderId, else it returns the requestId, as it always did.
2026-04-13 18:33:43 +08:00
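The requestId rule described in this commit can be sketched roughly as follows. This is a hedged Python illustration, not the project's code: `Request`, `cdp_request_id`, and the id strings are hypothetical names chosen for the example.

```python
from dataclasses import dataclass


@dataclass
class Request:
    """Hypothetical stand-in for a network request tracked by CDP."""
    request_id: str      # unique per request
    loader_id: str       # shared by every request of one document
    resource_type: str   # e.g. "document", "script", "image"


def cdp_request_id(req: Request) -> str:
    """Return the id to report to CDP clients for this request.

    Assumption taken from the commit message: the request for the main
    document reports the loaderId; every other request keeps its own id.
    """
    if req.resource_type == "document":
        return req.loader_id
    return req.request_id


doc = Request("REQ-1", "LID-1", "document")
script = Request("REQ-2", "LID-1", "script")
print(cdp_request_id(doc), cdp_request_id(script))
```

Both requests share one loaderId because they belong to the same document; only the document request reports it as its requestId.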
Karl Seguin
cd4e977020 Merge pull request #2150 from lightpanda-io/eventcounts
Add EventCounts API
2026-04-13 18:33:24 +08:00
Karl Seguin
3fe8c3ee8f Merge pull request #2148 from lightpanda-io/fix_typedarray_offset
Correctly treat a view's offset as a byte offset, not an element offset
2026-04-13 18:33:08 +08:00
Karl Seguin
ca061b46b7 Merge pull request #2149 from lightpanda-io/websocket_protocol
Basic protocol support for websocket.
2026-04-13 17:34:28 +08:00
Karl Seguin
577629f536 Merge pull request #2151 from lightpanda-io/text_decoder_streaming
TextDecoder streaming stop
2026-04-13 17:34:16 +08:00
Karl Seguin
299104ff1d TextDecoder streaming stop
When we stop streaming, we need to use any previously streamed data as part of
the last "unstreamed" chunk. Put differently, setting stream to false merely
stops any subsequent streaming; it doesn't discard any previously streamed
data.
2026-04-13 13:06:16 +08:00
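The same behavior can be illustrated with Python's incremental decoder, used here only as an analogy for `TextDecoder`: `final=False` plays the role of `stream: true`, and `final=True` the role of `stream: false`. This is a sketch of the expected semantics, not the project's implementation.

```python
import codecs

# The euro sign is three bytes in UTF-8; split it across two chunks.
euro = "€".encode("utf-8")
first, last = euro[:2], euro[2:]

decoder = codecs.getincrementaldecoder("utf-8")()

# Streaming call: the incomplete sequence is buffered, nothing is emitted yet.
partial = decoder.decode(first, final=False)

# Final (non-streaming) call: the buffered bytes are combined with the new
# chunk instead of being discarded.
text = decoder.decode(last, final=True)

print(repr(partial), repr(text))
```

If the buffered bytes were dropped on the final call, the last chunk alone would be an invalid UTF-8 sequence.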
Karl Seguin
592dc3e18d Add EventCounts API 2026-04-13 12:30:49 +08:00
Karl Seguin
5c161260fd don't break union probing 2026-04-13 11:35:32 +08:00
Karl Seguin
28a7e7fe45 Basic protocol support for websocket.
A WebSocket client can send a protocol which the server can agree to. This
isn't as fancy as it sounds. We just send a specific header on the WebSocket
handshake and then read the response header.
2026-04-13 11:21:59 +08:00
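A minimal sketch of that header exchange, assuming the header in question is the standard `Sec-WebSocket-Protocol` from RFC 6455. The function names `handshake_headers` and `agreed_protocol` are hypothetical and not taken from the project.

```python
from typing import Optional


def handshake_headers(protocols: list[str]) -> dict[str, str]:
    """Build the protocol-related part of a client handshake.

    A real handshake also carries Sec-WebSocket-Key and
    Sec-WebSocket-Version; they are omitted to keep the sketch short.
    """
    headers = {"Upgrade": "websocket", "Connection": "Upgrade"}
    if protocols:
        headers["Sec-WebSocket-Protocol"] = ", ".join(protocols)
    return headers


def agreed_protocol(requested: list[str],
                    response_headers: dict[str, str]) -> Optional[str]:
    """Return the protocol the server picked, or None if it picked none."""
    lowered = {k.lower(): v.strip() for k, v in response_headers.items()}
    chosen = lowered.get("sec-websocket-protocol")
    # Only accept a value the client actually offered.
    return chosen if chosen in requested else None


print(handshake_headers(["chat", "superchat"]))
print(agreed_protocol(["chat", "superchat"], {"Sec-WebSocket-Protocol": "chat"}))
```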
Karl Seguin
de167861c6 handle null buffers 2026-04-13 10:03:19 +08:00
Karl Seguin
cde8229be5 Correctly treat a view's offset as a byte offset, not an element offset 2026-04-13 09:52:03 +08:00
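The distinction matters because a typed-array view's offset counts bytes, not elements, as in JavaScript's `new Uint32Array(buffer, byteOffset, length)`. The following Python sketch shows the arithmetic under that assumption; `read_u32` is a hypothetical helper, not project code.

```python
import struct

ELEMENT_SIZE = 4  # bytes per uint32

# 16 bytes holding four little-endian 32-bit integers: 10, 20, 30, 40.
buffer = struct.pack("<4I", 10, 20, 30, 40)


def read_u32(buf: bytes, byte_offset: int, index: int) -> int:
    """Read element `index` of a uint32 view that starts at `byte_offset`."""
    return struct.unpack_from("<I", buf, byte_offset + index * ELEMENT_SIZE)[0]


# A view created at byte offset 4 starts at the second integer.
print(read_u32(buffer, 4, 0))
print(read_u32(buffer, 4, 2))
```

Treating the offset of 4 as an element offset would instead start the view at byte 16, which is already past the end of this 16-byte buffer.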
Karl Seguin
e6cffaef72 Merge pull request #2144 from lightpanda-io/feat/console-group
Feat/console group
2026-04-12 20:15:32 +08:00
Karl Seguin
6ebe112525 Add form acceptCharset accessor
Pre-size encoding buffer for possible numeric character insertion.
2026-04-12 19:38:36 +08:00
Karl Seguin
57f20e1831 Encode form data based on the form's (or document's) encoding.
Does something similar to https://github.com/lightpanda-io/browser/pull/2126
but for form submission. It uses the form's accept-charset attribute, or
falls back to the document's charset.
2026-04-12 19:38:36 +08:00
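As a rough illustration of the fallback, the Python sketch below picks the charset and percent-encodes a single field value. `encode_field` is a hypothetical helper; it treats accept-charset as a single label, which is a simplification, and it replaces characters the charset cannot represent with numeric character references, which is an assumption about the intended behavior based on the "numeric character insertion" note in the neighboring commit.

```python
from typing import Optional
from urllib.parse import quote_plus


def encode_field(value: str,
                 accept_charset: Optional[str],
                 document_charset: str) -> str:
    """Percent-encode one form value using the form's or document's charset."""
    # Prefer the form's accept-charset; otherwise use the document's charset.
    charset = accept_charset or document_charset
    # Unencodable characters become references such as "&#8364;" before
    # being percent-encoded.
    return quote_plus(value, encoding=charset, errors="xmlcharrefreplace")


print(encode_field("é", None, "iso-8859-1"))
print(encode_field("é", "utf-8", "iso-8859-1"))
print(encode_field("€", None, "iso-8859-1"))
```

The last call shows why an encoding buffer may need extra room: one unencodable character expands into several bytes.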
William Chan
e6fd004767 refactor: remove unused _group_depth tracking
Since group depth is not used for indentation in headless mode,
simplify by removing the field. group/groupCollapsed just log,
groupEnd is a no-op.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
2026-04-12 19:37:39 +08:00
William Chan
d5556fda93 feat(console): implement group, groupCollapsed, and groupEnd
Add missing console grouping APIs per the WHATWG Console spec:
- console.group(...data): logs label and increments group depth
- console.groupCollapsed(...data): same as group (headless mode has no
  visual collapsing, so behavior is identical to group)
- console.groupEnd(): decrements group depth, clamped at 0

Depth is tracked via _group_depth (u32) on the Console struct.
Saturation arithmetic (+|=) prevents overflow on runaway group() calls.

Fixes TypeError crashes on sites that use console.group* APIs (e.g.
React/Vite dev builds).

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
2026-04-12 19:37:38 +08:00
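The depth bookkeeping described in this commit (later removed by e6fd004767) might look like the sketch below. It is a Python approximation with a hypothetical `Console` class; the explicit `min`/`max` stand in for the saturating arithmetic the message mentions.

```python
U32_MAX = 2**32 - 1  # upper bound of the u32 depth counter


class Console:
    """Hypothetical console that only tracks group depth."""

    def __init__(self) -> None:
        self.group_depth = 0

    def group(self, *data: object) -> None:
        # Saturating increment: never wraps past the u32 maximum.
        self.group_depth = min(self.group_depth + 1, U32_MAX)

    def group_collapsed(self, *data: object) -> None:
        # No visual collapsing in headless mode, so same as group().
        self.group(*data)

    def group_end(self) -> None:
        # Clamped at zero: extra groupEnd() calls are harmless.
        self.group_depth = max(self.group_depth - 1, 0)


console = Console()
console.group("outer")
console.group_collapsed("inner")
console.group_end()
console.group_end()
console.group_end()  # one call too many
print(console.group_depth)
```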
Pierre Tachoire
dc8e917084 Merge pull request #2143 from lightpanda-io/ci-snap-cache
ci: invalidate snapshot cache on src/browser/webapi change
2026-04-12 11:56:26 +02:00
Pierre Tachoire
f7ef2be5cd ci: invalidate snapshot cache on src/browser/webapi change 2026-04-12 11:54:53 +02:00
Karl Seguin
283f3d4986 Merge pull request #2136 from lightpanda-io/debug_allocator
Re-enable debug allocator in debug
2026-04-11 12:58:52 +08:00
Karl Seguin
63104a7f82 Re-enable debug allocator in debug
Disabled this when looking at memory profiles, and must have accidentally
committed it.
2026-04-11 12:24:19 +08:00
Karl Seguin
da8d206c52 On Page cleanup, capture next linked list node _before_ releasing MO
Also, switch MO and IO to use a "small" arena, as they probably don't require
too many allocations in most normal cases (just observing 1 or 2 things).
2026-04-11 11:39:05 +08:00
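The ordering issue fixed here is the classic "read next before freeing" pattern for linked lists. The Python sketch below simulates it with hypothetical `Node`, `release`, and `cleanup` names; clearing `next` inside `release` stands in for the node's memory becoming invalid.

```python
class Node:
    """Hypothetical linked-list node, e.g. one observer registration."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.next = None


def release(node: Node) -> None:
    """Simulate freeing: the node's link is unusable afterwards."""
    node.next = None


def cleanup(head: Node) -> list:
    """Release every node, returning the names in release order."""
    released = []
    node = head
    while node is not None:
        nxt = node.next  # capture BEFORE release invalidates the node
        release(node)
        released.append(node.name)
        node = nxt
    return released


a, b, c = Node("a"), Node("b"), Node("c")
a.next, b.next = b, c
print(cleanup(a))
```

Reading `node.next` after `release(node)` would stop the walk after the first node in this simulation; in a language with manual memory management it would be a use-after-free.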
Pierre Tachoire
071e70e5cc Merge pull request #2133 from lightpanda-io/feat/cdp-json-endpoints
Feat/cdp json endpoints
2026-04-10 17:51:48 +02:00
Pierre Tachoire
88dcac642a Merge pull request #2131 from lightpanda-io/pushstate-pathname
update page URL and location on pushState/replaceState
2026-04-10 17:37:43 +02:00
Matt Van Horn
224a7333f2 fix: use fixed Lightpanda/1.0 for /json/version User-Agent
Replace dynamic build version string with stable Lightpanda/1.0
in the Browser and User-Agent fields of the /json/version response.
The dev version (1.0.0-dev.5492+...) is not useful for CDP clients.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-10 17:36:10 +02:00
Matt Van Horn
416984d32f fix: update integration test for enriched /json/version response
The integration test at "server: get /json/version" was hardcoding
the old response with Content-Length: 48. Updated to verify the
enriched fields structurally since the version string varies at
build time.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-10 17:36:09 +02:00
Matt Van Horn
503ca4ce07 feat: enrich CDP /json/version and add /json/list endpoint
Add Browser, Protocol-Version, and User-Agent fields to the
/json/version CDP endpoint response. Previously it only returned
webSocketDebuggerUrl, while Chrome and other CDP browsers return
7+ fields that automation tools use for capability detection.

Also add /json/list and /json endpoints that return an empty JSON
array, matching the standard CDP endpoint layout that tools like
Puppeteer and chromedp expect.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-10 17:36:05 +02:00
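A client-side check for the enriched response could look roughly like this Python sketch. The field names come from the commit message; `missing_fields` is a hypothetical helper, and the sample values (notably the protocol version and the WebSocket address) are placeholders rather than the server's actual output.

```python
import json

# Field names listed in the commit message.
EXPECTED_FIELDS = ("Browser", "Protocol-Version", "User-Agent",
                   "webSocketDebuggerUrl")


def missing_fields(body: str) -> list:
    """Return the expected /json/version fields absent from a response body."""
    payload = json.loads(body)
    return [name for name in EXPECTED_FIELDS if name not in payload]


# Placeholder response body for illustration only.
sample = json.dumps({
    "Browser": "Lightpanda/1.0",
    "Protocol-Version": "1.3",                       # assumed value
    "User-Agent": "Lightpanda/1.0",
    "webSocketDebuggerUrl": "ws://127.0.0.1:9222/",  # assumed address
})

print(missing_fields(sample))
```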
Adrià Arrufat
bf4e33ee69 Merge pull request #2132 from lightpanda-io/fix/html5ever-cargo-cache-invalidation
build: track html5ever Rust sources as cargo step inputs
2026-04-10 17:14:14 +02:00
Pierre Tachoire
bf75e0cdfb Merge pull request #2123 from mvanhorn/fix/markdown-link-formatting
fix: use proper link text in markdown dump for block-content anchors
2026-04-10 17:09:38 +02:00
Adrià Arrufat
d6cdafc480 build: track html5ever Rust sources as cargo step inputs 2026-04-10 16:42:42 +02:00
Muki Kiboigo
d47e24ced0 add test for History URL updating 2026-04-10 07:38:21 -07:00
Adrià Arrufat
d6aea1187f Merge branch 'main' into fix/markdown-link-formatting 2026-04-10 16:26:14 +02:00
Muki Kiboigo
08cd9ca799 properly resolve URL before setting Location in History 2026-04-10 07:26:01 -07:00
Adrià Arrufat
c1a65160c1 markdown: test block link aria-label and title handling 2026-04-10 16:14:58 +02:00
Karl Seguin
24e17b6f21 Merge pull request #2130 from lightpanda-io/arena_pool_buckets
Add arena buckets to ArenaPool
2026-04-10 20:33:19 +08:00
Karl Seguin
771df02c49 Merge pull request #2129 from lightpanda-io/non-utf8-querystring-encoding
Non utf8 querystring encoding
2026-04-10 20:33:07 +08:00
Karl Seguin
ddf614a9d5 Add arena buckets to ArenaPool
ArenaPool previously maintained up to 512 16KB buckets. The 16KB retention is
small for things like XHR and scripts, but increasing it to something more
reasonable, like 128KB, would use up to 8x more memory.

This commit adds 4 buckets: 1KB, 4KB, 16KB and 128KB. Callers can request a
tiny, small, medium or large bucket. We end up with lower peak memory and
fewer allocations.

Furthermore, callers can request a specific size. This is particularly useful
for WebSocket or Blob where the size could vary greatly (so we'd likely default
to a large bucket), but that could needlessly use up a large arena.

The bucket sizes were derived from analyzing allocations. A significant number
of allocations were very small. Things like ScheduleCallback and
FinalizerCallback are always less than 1K and can be generated in the thousands.
The 16KB retention was wasteful in these cases...better to have a large number
of 1K pools, so that we can have a handful of very large buffers.
2026-04-10 19:09:18 +08:00
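The bucket sizes and the 8x figure can be checked with a short Python sketch. The size-to-bucket mapping in `bucket_for` is an assumption made for illustration (smallest bucket that fits, with oversized requests falling into the largest one); the commit message only says that callers can request a named bucket or a specific size.

```python
KB = 1024

# Bucket names and retention sizes from the commit message.
BUCKETS = [("tiny", 1 * KB), ("small", 4 * KB),
           ("medium", 16 * KB), ("large", 128 * KB)]


def bucket_for(size: int) -> str:
    """Pick the smallest bucket whose retention size covers `size`."""
    for name, capacity in BUCKETS:
        if size <= capacity:
            return name
    return "large"  # assumption: oversized requests use the largest bucket


# Worst-case retained memory with 512 arenas of a single size.
old_retained = 512 * 16 * KB    # 8 MiB
big_retained = 512 * 128 * KB   # 64 MiB

print(bucket_for(200), bucket_for(3000), bucket_for(64 * KB))
print(big_retained // old_retained)
```

A callback that needs a few hundred bytes lands in the tiny bucket instead of pinning a 16KB arena.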
Karl Seguin
2cfa1ea035 Merge pull request #2116 from lightpanda-io/gc-snapshot
force an aggressive GC on v8 after snapshot creation
2026-04-10 18:48:08 +08:00
Karl Seguin
e9b8707bdd Merge pull request #2121 from lightpanda-io/fix-telemetry-decoding
http: add default write callback to prevent stdout pollution
2026-04-10 18:38:43 +08:00
Pierre Tachoire
0f7079ec7a Merge pull request #2128 from lightpanda-io/v8-snapshot-cache
use v8 snapshot cache with wpt
2026-04-10 11:59:05 +02:00
Pierre Tachoire
e53e4579ab ci: use v8 snapshot cache w/ wpt test 2026-04-10 11:35:40 +02:00
Pierre Tachoire
9cd79941bf Merge pull request #2119 from lightpanda-io/wpt-completion
ci: send wpt completion
2026-04-10 11:26:15 +02:00
Pierre Tachoire
36fcb0fd7f ci: use a longer timeout for e2e test
When we have to generate a snapshot, the build duration is longer.
2026-04-10 11:02:04 +02:00
Karl Seguin
7c66240146 chore: trigger CI 2026-04-10 16:47:42 +08:00
Karl Seguin
a5bf1f07af chore: trigger CI 2026-04-10 16:47:42 +08:00
Karl Seguin
05229fdc53 Use the document's charset to determine if/how to encode querystring
Whenever we resolve a URL, say from `anchor.href`, we should consider the
document's charset when encoding the querystring. This probably isn't the
most important feature, but it makes tens of thousands of WPT cases pass, e.g.:

/encoding/legacy-mb-tchinese/big5/big5-encode-href-errors-han.html?3001-4000 and
/encoding/legacy-mb-japanese/euc-jp/eucjp-encode-href-errors-han.html?17001-18000

DOM elements previously called `URL.resolveURL(...)`. They now call
`self.asNode().resolveURL(...)`, where `Node#resolveURL` will provide the
document's charset.
2026-04-10 16:47:42 +08:00
Karl Seguin
f7c1710c23 Expose correct charset
document.characterSet, document.charset and document.inputEncoding now expose
the correct charset.
2026-04-10 16:47:42 +08:00