Commit Graph

121 Commits

Author SHA1 Message Date
Adrià Arrufat
eab9ae0243 RobotsLayer: use managed ArrayList 2026-05-04 08:01:22 +02:00
Patrick Wyatt
47d96ab8ad Display actual port when binding --port 0
This change causes lightpanda to display the actual port number (instead of 0)
when binding a dynamic port (--port 0), which makes automating based on
scraping lightpanda output simple.
2026-04-29 21:44:41 -07:00
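For readers unfamiliar with dynamic ports, here is a minimal Python sketch of the general technique: bind to port 0, then ask the OS which port it assigned. The function name `bind_dynamic_port` is hypothetical and this is not lightpanda's code, only an illustration of the behavior the commit describes.

```python
import socket


def bind_dynamic_port(host: str = "127.0.0.1"):
    """Bind to port 0 and return (socket, actual_port).

    Hypothetical helper for illustration; port 0 asks the OS to pick
    any free port.
    """
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    sock.bind((host, 0))
    sock.listen()
    # getsockname() reports the port the OS actually assigned,
    # which is the value worth printing instead of 0.
    actual_port = sock.getsockname()[1]
    return sock, actual_port


if __name__ == "__main__":
    server, port = bind_dynamic_port()
    print(f"listening on 127.0.0.1:{port}")
    server.close()
```

A script that launches the server can then scrape the printed port instead of guessing a free one.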
Muki Kiboigo
e8c9acd310 fix request arena leak on CacheLayer hit 2026-04-28 09:48:23 -07:00
Muki Kiboigo
1ab445843c better arena management in Robots Layer and Context 2026-04-28 07:01:43 -07:00
Muki Kiboigo
1370f6805b add a note about cdp callback cb 2026-04-28 07:01:42 -07:00
Muki Kiboigo
3fe774fbfb pass error all the way up to Layer chain to clean 2026-04-28 07:01:42 -07:00
Muki Kiboigo
4de1dc5424 properly call error callback in InterceptionLayer 2026-04-28 07:01:42 -07:00
Muki Kiboigo
83b047e66a assert that intercepted isn't 0 before decrementing 2026-04-28 07:01:42 -07:00
Muki Kiboigo
c719a522b8 use lightpanda module log in layers 2026-04-28 07:01:42 -07:00
Muki Kiboigo
152a792c18 use Request Arena in RobotsLayer 2026-04-28 07:01:41 -07:00
Muki Kiboigo
e56036fb50 use Request Arena in CacheLayer 2026-04-28 07:01:41 -07:00
Muki Kiboigo
fc702794c2 use Request Arena in WebBotAuthLayer 2026-04-28 07:01:41 -07:00
Muki Kiboigo
d14b75d93b use Request arena in InterceptionLayer 2026-04-28 07:01:41 -07:00
Muki Kiboigo
bb9e238f6c Requests now use arenas from the arena pool 2026-04-28 07:01:41 -07:00
Muki Kiboigo
175c2cc288 ensure robots params have arena and request id 2026-04-28 07:01:41 -07:00
Muki Kiboigo
87eec578aa use arena pool in InterceptionLayer 2026-04-28 07:01:41 -07:00
Muki Kiboigo
ca08f0c56d remove blocking from RequestParams 2026-04-28 07:01:40 -07:00
Muki Kiboigo
3db3281e8e working authentication with InterceptionLayer 2026-04-28 07:01:40 -07:00
Muki Kiboigo
d0b421b085 partial auth challenge support 2026-04-28 07:01:40 -07:00
Muki Kiboigo
dddd0dfb90 fix request id mismatch on cdp 2026-04-28 07:01:40 -07:00
Muki Kiboigo
0d50f706db more fixing of hanging in cdp interception 2026-04-28 07:01:40 -07:00
Muki Kiboigo
9c826159a0 crude InterceptionLayer 2026-04-28 07:01:40 -07:00
Muki Kiboigo
6d41ea6fd0 move arena up to Request instead of Transfer 2026-04-28 07:01:39 -07:00
Muki Kiboigo
14ad5c9cdc move RequestStart to InterceptionLayer 2026-04-28 07:01:39 -07:00
Muki Kiboigo
e988e49136 remove Context and thread *Client 2026-04-28 07:01:39 -07:00
Muki Kiboigo
46d0b34c54 add RequestParams and SyncRequest 2026-04-28 07:01:39 -07:00
Muki Kiboigo
5dd15aa2cf use layers for Cache, Robots and WebBotAuth 2026-04-28 07:01:39 -07:00
Adrià Arrufat
fdadbaaad5 http: free curl header list on error 2026-04-27 17:32:01 +02:00
Karl Seguin
8509b112b8 Various small fixes
Extracted from https://github.com/lightpanda-io/browser/pull/2242
2026-04-25 13:22:41 +08:00
Nikolay Govorov
c7d004fefb Setup timeout via tcp keepalive 2026-04-24 12:40:21 +01:00
Karl Seguin
813c7d2aa0 Merge pull request #2219 from lightpanda-io/unused_imports
Remove unused imports
2026-04-24 06:49:58 +08:00
Nikolay Govorov
c964604c7a Fix canada.ca problem 2026-04-23 12:15:57 +01:00
Karl Seguin
0a8f4ff75f Remove unused imports
As a general rule, I keep `std` if it's there and unused, mostly for debug.print
debugging.
2026-04-23 16:21:56 +08:00
Karl Seguin
2275416505 Page -> Frame
This is to pave the way for introducing a new "Page" container, which will take
over the page lifecycle currently burdening Session. The ultimate goal of that
is to allow the Session to have multiple pages (mostly for better transitions
between pages), which is hard to do now since the Session has so much state.

This rename was aggressive, e.g. currentPage() -> currentFrame() so that, when
the new Page container is added, you won't see "currentPage()" and wonder:

  "Does 'currentPage' mean the new Page container, or the Frame (which
  used to be called Page)".
2026-04-22 08:42:18 +08:00
Karl Seguin
2d20e57f80 Change all @import("...../log.zig") to const log = lp.log;
@import("lightpanda") where needed.

Would also like to do this for String, Page, Session and js which all stand out
as types that are used across the codebase.

I know that a few devs are doing this in new work and I haven't heard anyone
voice an objection.
2026-04-20 12:40:04 +08:00
Karl Seguin
aac0a6e6b6 Websocket fixes.
This commit fixes a few serious issues with the Websocket implementation.

1 - libcurl recursive api calls
Creating a Websocket instance from within a libcurl callback results in libcurl
failing with a RecursiveApiCall error. I fixed this more generally by adding a
`ready_queue` which connections can use when the `HttpClient` is performing
actions. Once `perform` ends, this new `ready_queue` is processed. There might
be a more holistic solution to this (we seem to run into RecursiveApiCall
everywhere), but since HttpClient is going through heavy changes, this seemed
like the smallest possible change to fix it.

2 - "load" blocking
Load and IdleNetwork notifications should not block on Websocket connections. To
solve this, `HttpClient` now has `http_active` and `ws_active` to replace `active`.
Only `http_active` is used for things like "load" triggering.

3 - The above change made the Runner's job more complicated. It used to be
binary: you either have active connections or not. Now there are different types
of active connections. To keep it simple, and I think probably more correct,
the "done-ness" (based on the `wait` parameter) is now independent of active
(or not) network activity. If the page's `load_state == .complete`, then the
`wait == .done` is considered successful, whether or not we have active
connections.

4 - As a consequence of the above, and seemingly unrelated to all of these
changes, a number of html tests now use the "new" robust async framework. Most
of these tests were using the `testing.onload` (aka `testing.eventually`) which
had somewhat...unclear semantics. These tests passed more as a consequence of
how we processed a page and being very simple (e.g. just needing 1 micro or
macrotask tick). But `eventually` never worked for more complicated cases, and
the previous `testing.async` didn't work well. Now, the test runner waits for
.load (which, as per #3, can fire more aggressively), which caused many
`eventually` tests to fail. Moving these tests to the new `async` is more
robust and works with the new aggressive "load".
2026-04-17 11:20:27 +08:00
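The `ready_queue` idea from point 1 can be sketched in Python as a deferred-work queue: work requested while `perform` is running is queued and only executed once `perform` returns. The class `DeferringClient` and its method `submit` are hypothetical names for illustration and do not mirror the real `HttpClient` API.

```python
from collections import deque


class DeferringClient:
    """Hypothetical stand-in for a client that must not be re-entered."""

    def __init__(self):
        self.performing = False
        self.ready_queue = deque()

    def submit(self, action):
        # While perform() is running, running the action directly would be
        # a recursive call into the client, so it is queued instead.
        if self.performing:
            self.ready_queue.append(action)
        else:
            action()

    def perform(self, callbacks):
        self.performing = True
        try:
            for callback in callbacks:
                callback()
        finally:
            self.performing = False
        # Once perform has ended, process everything that was deferred.
        while self.ready_queue:
            self.ready_queue.popleft()()


if __name__ == "__main__":
    client = DeferringClient()
    events = []
    client.perform([
        lambda: client.submit(lambda: events.append("open websocket")),
        lambda: events.append("second callback"),
    ])
    print(events)
```

The deferred action runs after every callback of the current `perform` call has finished, which is the ordering the commit message describes.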
Karl Seguin
f2a2acc1aa Merge pull request #2134 from lightpanda-io/cache_public_default
Cache-Control is public by default
2026-04-14 12:22:25 +08:00
Karl Seguin
63104a7f82 Re-enable debug allocator in debug
Disabled this when looking at memory profiles, and must have accidentally
committed it.
2026-04-11 12:24:19 +08:00
Karl Seguin
3be913750e Cache-Control is public by default
- If private isn't specified, default to public.
- Add some tests
- Optimize parsing by lower-casing once and switch to std.mem
2026-04-11 07:23:56 +08:00
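A minimal Python sketch of the rule in this commit, assuming a simplified parser that only tracks `private` and `max-age`: the header is lower-cased once, and the response is treated as public unless `private` appears. The function name `parse_cache_control` and the result keys are hypothetical, not the project's actual types.

```python
from typing import Optional


def parse_cache_control(header: str) -> dict:
    """Parse a Cache-Control value, defaulting to public (illustrative only)."""
    lowered = header.lower()  # lower-case once, then compare plain strings
    public = True  # default: public unless `private` is specified
    max_age: Optional[int] = None
    for raw in lowered.split(","):
        directive = raw.strip()
        if directive == "private":
            public = False
        elif directive.startswith("max-age="):
            value = directive[len("max-age="):]
            if value.isascii() and value.isdigit():
                max_age = int(value)
    return {"public": public, "max_age": max_age}


if __name__ == "__main__":
    print(parse_cache_control("max-age=60"))
    print(parse_cache_control("Private, max-age=10"))
```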
Adrià Arrufat
070ee7df80 Merge branch 'main' into fix-telemetry-decoding 2026-04-10 09:42:21 +02:00
Pierre Tachoire
a4617390de Merge pull request #2104 from lightpanda-io/feat/add-ip-filter
Feat/add ip filter
2026-04-10 08:46:06 +02:00
Karl Seguin
8eaeafe16c Fix a lot of typos.
I used https://github.com/crate-ci/typos, it worked well.

Also, make sure cdp-initiated KeyboardEvent is freed when no element is in focus
2026-04-10 06:51:10 +08:00
Adrià Arrufat
d19e62ec3c http: add default write callback to prevent stdout pollution 2026-04-09 22:03:09 +02:00
Karl Seguin
0253092f20 Improvements to IpFilters
The main change is how CidrV4 and CidrV6 are stored: their mask is pre-calculated
and their address is stored as an integer.

This allows significant simplification of matchesCidrV4 and matchesCidrV6.
2026-04-09 15:40:16 +08:00
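To illustrate why pre-calculating the mask simplifies matching, here is a Python sketch for the IPv4 case. It reuses the name `CidrV4` from the commit message, but the fields `network` and `mask` and the methods `parse` and `matches` are assumptions made for this example, not the project's definitions.

```python
import ipaddress
from dataclasses import dataclass


@dataclass(frozen=True)
class CidrV4:
    network: int  # address as an integer, already ANDed with the mask
    mask: int     # pre-calculated once at parse time

    @classmethod
    def parse(cls, text: str) -> "CidrV4":
        addr_text, prefix_text = text.split("/")
        prefix = int(prefix_text)
        if not 0 <= prefix <= 32:
            raise ValueError(f"invalid prefix length: {prefix}")
        mask = (0xFFFFFFFF << (32 - prefix)) & 0xFFFFFFFF
        addr = int(ipaddress.IPv4Address(addr_text))
        return cls(network=addr & mask, mask=mask)

    def matches(self, ip: int) -> bool:
        # With the mask stored, a match is one AND plus one comparison.
        return (ip & self.mask) == self.network


if __name__ == "__main__":
    cidr = CidrV4.parse("10.0.0.0/8")
    print(cidr.matches(int(ipaddress.IPv4Address("10.1.2.3"))))
    print(cidr.matches(int(ipaddress.IPv4Address("11.0.0.1"))))
```

The IPv6 case follows the same shape with a 128-bit mask.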
Adrià Arrufat
182447c907 cache: add log filter to garbage file test 2026-04-08 19:36:29 +02:00
Pierre Tachoire
6ef518438b fix custom cidrs mem leak 2026-04-08 15:09:01 +02:00
Pierre Tachoire
efb2fa9c22 Send Sec-Ch-Ua http header 2026-04-08 12:11:09 +02:00
Lucien Coffe
7f5abfc9cf fix: use dashes in CLI flag names for consistency
Rename --block_private_networks to --block-private-networks and
--block_cidrs to --block-cidrs to match the existing flag naming
convention (e.g. --http-proxy, --proxy-bearer-token).
2026-04-08 12:10:46 +02:00
Lucien Coffe
fb6c4e4978 feat: add allow-list exclusions to --block_cidrs
CIDRs prefixed with '-' are treated as allow rules that exempt matching
IPs from blocking. Allow rules take precedence over both
--block_private_networks and custom block CIDRs.

Example: --block_private_networks --block_cidrs -10.0.0.42/32
blocks all private ranges except 10.0.0.42.

Adds 3 new tests for allow-list behavior.
2026-04-08 12:10:46 +02:00
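The precedence rule described in this commit can be sketched in Python as follows. The helpers `parse_rules` and `is_blocked` are hypothetical, and Python's `ipaddress` module stands in for the project's own CIDR matching; only the rule itself (entries prefixed with '-' are allow rules, and allow rules win) comes from the commit message.

```python
import ipaddress


def parse_rules(spec: str):
    """Split a comma-separated CIDR list into (block, allow) networks."""
    block, allow = [], []
    for item in spec.split(","):
        item = item.strip()
        if not item:
            continue
        if item.startswith("-"):
            # A leading '-' marks an allow rule.
            allow.append(ipaddress.ip_network(item[1:], strict=False))
        else:
            block.append(ipaddress.ip_network(item, strict=False))
    return block, allow


def is_blocked(ip_text: str, block, allow) -> bool:
    ip = ipaddress.ip_address(ip_text)
    # Allow rules take precedence over every block rule.
    if any(ip in net for net in allow):
        return False
    return any(ip in net for net in block)


if __name__ == "__main__":
    block, allow = parse_rules("10.0.0.0/8,-10.0.0.42/32")
    for candidate in ("10.0.0.42", "10.0.0.43", "8.8.8.8"):
        print(candidate, is_blocked(candidate, block, allow))
```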
Lucien Coffe
f5cfc4d315 feat: add --block_private_networks and --block_cidrs CLI flags
Block outbound HTTP requests to specified IP ranges before TCP handshake
using libcurl CURLOPT_OPENSOCKETFUNCTION callback. Fires after DNS
resolution, reads resolved IP directly from sockaddr, does bitwise CIDR
comparison. Fail-closed: unknown address families are blocked.

--block_private_networks blocks RFC1918, localhost, link-local, ULA.
--block_cidrs blocks additional comma-separated CIDRs.
IPv4-mapped IPv6 (::ffff:x.x.x.x) is unwrapped to prevent bypass.
2026-04-08 12:10:42 +02:00
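The unwrapping and fail-closed behavior can be illustrated with a short Python sketch. It is an analogy only: the commit works on a resolved sockaddr inside a libcurl callback, while this example works on address strings and treats an unparseable address as the "unknown" case. The names `PRIVATE_NETWORKS`, `normalize`, and `is_private` are hypothetical, and the range list is an assumed reading of "RFC1918, localhost, link-local, ULA".

```python
import ipaddress

# Assumed ranges for illustration; the project's actual list may differ.
PRIVATE_NETWORKS = [
    ipaddress.ip_network(n)
    for n in (
        "10.0.0.0/8", "172.16.0.0/12", "192.168.0.0/16",  # RFC 1918
        "127.0.0.0/8", "::1/128",                         # localhost
        "169.254.0.0/16", "fe80::/10",                    # link-local
        "fc00::/7",                                       # ULA
    )
]


def normalize(ip_text: str):
    """Unwrap IPv4-mapped IPv6 (::ffff:x.x.x.x) to the plain IPv4 address."""
    ip = ipaddress.ip_address(ip_text)
    if isinstance(ip, ipaddress.IPv6Address) and ip.ipv4_mapped is not None:
        return ip.ipv4_mapped
    return ip


def is_private(ip_text: str) -> bool:
    try:
        ip = normalize(ip_text)
    except ValueError:
        # Fail closed: anything that cannot be classified is blocked.
        return True
    return any(ip in net for net in PRIVATE_NETWORKS)


if __name__ == "__main__":
    for candidate in ("::ffff:10.0.0.1", "8.8.8.8", "not-an-ip"):
        print(candidate, is_private(candidate))
```

Without the unwrapping step, `::ffff:10.0.0.1` would be compared only against the IPv6 ranges and would slip past the `10.0.0.0/8` rule.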