The `fetch` command is very practical to render pages without needing to
have a long running browser instance.
However, it hides all details of the fetch, most importantly the HTTP status code.
This is a big caveat when leveraging `lightpanda fetch` in a pipeline.
This introduces a `--json` option to provide a structured output that
contains:
* url
* HTTP status code
* response headers
* rendered content as controlled by the `--dump` option
The proposal is to always output the same JSON format even when not
using `--dump` with an option.
* Prefer `--inject-*` prefix.
* Support injecting multiple scripts (also allows using both variants together).
* Instead of executing scripts in JS context, actually insert them to `<head>` for correct dump output.
The main.zig path for `fetch` now captures the *Browser so that
browser.env.terminate() can be called. This is a bit more complex than the serve
path because the Browser owns the Isolate and can't be moved from one thread to
another.
With main having access to the browser, two things are now possible:
1 - We can support a --terminate-ms flag (https://github.com/lightpanda-io/browser/issues/2206)
2 - ctrl-c can correctly stop blocked JavaScript processes
1 is implemented via setitimer to set a timer for SIGALRM, avoiding the need to
add another "watcher" thread, or putting a timer in Network.run.
- Configurable navigation timeouts and wait strategies in MCP tools.
- Default navigation timeout increased from 2s to 10s.
- Added navigate, eval, and screenshot MCP tools.
- Supported running a CDP server alongside MCP using --cdp-port.
- Fixed various startup crashes when running CDP in MCP mode.
- Hardened MCP server error handling.
These new optional parameters run AFTER --wait-until, allowing the (imo) useful
combination of `--wait-until load --wait-script "report.complete === true"`.
However, if `--wait-until` IS NOT specified but `--wait-selector/script` IS,
then there is no default wait and it'll just check the selector/script. If
neither `--wait-selector` nor `--wait-script/--wait-script-file` are specified
then `--wait-until` continues to default to `done`.
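The defaulting rules above can be sketched as a small decision function. This is only an illustration: `WaitUntil`, `WaitOpts` and `initialWait` are hypothetical names, not the actual types used by the CLI parser, and only the `load` and `done` states named above are modelled.

```zig
const std = @import("std");

// Hypothetical subset of wait states; only `load` and `done` appear in the text above.
const WaitUntil = enum { load, done };

// Hypothetical option struct; field names are assumptions, not the real CLI types.
const WaitOpts = struct {
    wait_until: ?WaitUntil = null,
    wait_selector: ?[]const u8 = null,
    wait_script: ?[]const u8 = null,
};

/// Returns the page state to wait for before the selector/script checks run,
/// or null when only the selector/script check should run.
fn initialWait(opts: WaitOpts) ?WaitUntil {
    // An explicit --wait-until always wins.
    if (opts.wait_until) |state| return state;
    // A selector or script without --wait-until means no default wait.
    if (opts.wait_selector != null or opts.wait_script != null) return null;
    // Nothing specified: keep the existing default.
    return .done;
}

test "wait defaults" {
    try std.testing.expectEqual(@as(?WaitUntil, .done), initialWait(.{}));
    try std.testing.expectEqual(
        @as(?WaitUntil, null),
        initialWait(.{ .wait_script = "report.complete === true" }),
    );
    try std.testing.expectEqual(
        @as(?WaitUntil, .load),
        initialWait(.{ .wait_until = .load, .wait_script = "report.complete === true" }),
    );
}
```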
These waiters were added to the Runner, and the existing Action.waitForSelector
now uses the runner's version. Selector querying has been split into distinct
parse and query functions, so that we can parse once, and query on every tick.
We could potentially optimize --wait-script to compile the script once and call
it on each tick, but we'd have to detect page navigation to recompile the script
in the new context. Something I'd rather optimize separately.
When running the MCP server, lp.mcp.Server was initialized in the main thread,
which also implicitly created the V8 isolate in the main thread.
When processing requests (like calling the goto tool) inside mcpThread,
V8 would assert that the isolate doesn't match the current thread.
Fixes #1938
Removes manual git flags from CI and build scripts.
Versioning is now automatically derived from git and build.zig.zon.
With this PR, we follow https://semver.org/
Logic:
1. Read the version from build.zig.zon
2. If it doesn't have a `.pre` field (i.e. dev/alpha/beta), that version is used as-is
3. Otherwise it will get the info from git: hash and number of commits since last `.0` version
4. Then build the version: `0.3.0-dev.1493+0896edc3`
Note that, since the latest stable version is `0.2.6`, the convention is to use
`0.3.0-dev`, as:
- `0.2.6` < `0.3.0-dev` < `0.3.0`
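A minimal sketch of the version logic and the ordering it relies on, using `std.SemanticVersion` from the Zig standard library. `buildVersion` is a hypothetical helper, not the code in build.zig, and the commit count and hash are passed in rather than read from git.

```zig
const std = @import("std");

/// Hypothetical helper: formats the final version string from the base version
/// in build.zig.zon plus git information supplied by the caller.
fn buildVersion(buf: []u8, base: std.SemanticVersion, commits: u32, hash: []const u8) ![]const u8 {
    // No pre-release field: the base version is used as-is.
    const pre = base.pre orelse {
        return try std.fmt.bufPrint(buf, "{d}.{d}.{d}", .{ base.major, base.minor, base.patch });
    };
    // Pre-release: append the commit count and the hash as build metadata.
    return try std.fmt.bufPrint(buf, "{d}.{d}.{d}-{s}.{d}+{s}", .{
        base.major, base.minor, base.patch, pre, commits, hash,
    });
}

test "dev build sorts between the last stable and the next release" {
    var buf: [64]u8 = undefined;
    const base = try std.SemanticVersion.parse("0.3.0-dev");
    const text = try buildVersion(&buf, base, 1493, "0896edc3");
    try std.testing.expectEqualStrings("0.3.0-dev.1493+0896edc3", text);

    const dev = try std.SemanticVersion.parse(text);
    const stable = try std.SemanticVersion.parse("0.2.6");
    const next = try std.SemanticVersion.parse("0.3.0");
    try std.testing.expect(stable.order(dev) == .lt);
    try std.testing.expect(dev.order(next) == .lt);
}
```

Under semver precedence rules, a version with a pre-release field sorts before the same version without one, which is why `0.3.0-dev` is used rather than `0.2.7-dev`.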
Add `--wait_until` and `--wait_ms` CLI arguments to configure session wait behavior. Updates `Session.wait` to evaluate specific page load states (`load`, `domcontentloaded`, `networkidle`, `fixed`) before completing the wait loop.
- Add git_version option to build.zig (similar to git_commit)
- Update version command to output git_version when available
- Falls back to git_commit when not on a tagged release
- CI can pass -Dgit_version=$(git describe --tags --exact-match) for releases
Fixes #1867
When set (defaults to not set/false), --dump will include iframe contents.
I was hoping I could add a mode to strip_mode for this, but since dump is used
extensively (e.g. innerHTML), this is something that has to be off by default
(for correctness).
Adds a new `mcp` run mode to start an MCP server over stdio.
Implements tools for navigation and JS evaluation, along with
resources for HTML and Markdown page content.
At a high level, this does for Events what was recently done for XHR, Fetch and
Observers. Events are self-contained in their own arena from the ArenaPool and
are registered with v8 to be finalized.
But events are more complicated than those other types. For one, events have
a prototype chain. (XHR also does, but it's always the top-level object that's
created, whereas it's valid to create a base Event or something that inherits
from Event). But the _real_ complication is that Events, unlike previous types,
can be created from Zig or from V8.
This is something that Fetch had to deal with too, because the Response is only
given to V8 on success. So in Fetch, there's a period of time where Zig is
solely responsible for the Response, until it's passed to v8. But with events
it's a lot more subtle.
There are 3 possibilities:
1 - An Event is created from v8. This is the simplest, and it simply becomes
a weak reference for us. When v8 is done with it, the finalizer is called.
2 - An Event is created in Zig (e.g. window.load) and dispatched to v8. Again
we can rely on the v8 finalizer.
3 - An Event is created in Zig, but not dispatched to v8 (e.g. there are no
listeners). Zig has to release the event.
(It's worth pointing out that one thing that still keeps this relatively
straightforward is that we never hold on to Events past some pretty clear point)
Now, it would seem that #3 is the only issue we have to deal with, and maybe
we can do something like:
```
if (event_manager.hasListener("load", capture)) {
try event_manager.dispatch(event);
} else {
event.deinit();
}
```
In fact, in many cases, we could use this to optimize not even creating the
event:
```
if (event_manager.hasListener("load", capture)) {
const event = try createEvent("load", capture);
try event_manager.dispatch(event);
}
```
And that's an optimization worth considering, but it isn't good enough to
properly manage memory. Do you see the issue? There could be a listener (so we
think v8 owns it), but we might never give the value to v8. Any failure between
hasListener and actually handing the value to v8 would result in a leak.
To solve this, the bridge will now set a _v8_handover flag (if present) once it
has created the finalizer_callback entry. So dispatching code now becomes:
```
const event = try createEvent("load", capture);
defer if (!event._v8_handover) event.deinit(false);
try event_manager.dispatch(event);
```
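To make the three ownership cases concrete, here is a self-contained sketch of the same `defer` pattern. `Event`, `Outcome`, `dispatch` and `fire` are simplified stand-ins written for this example (the real bridge sets the flag when it creates the finalizer_callback entry), and `deinit` takes no argument here.

```zig
const std = @import("std");

// Simplified stand-in for an Event; `released` lets the tests observe deinit.
const Event = struct {
    _v8_handover: bool = false,
    released: *bool,

    fn deinit(self: *Event) void {
        self.released.* = true;
    }
};

// Hypothetical outcomes of a dispatch, used only to drive the example.
const Outcome = enum { no_listener, bridge_fails, handed_over };

// Stand-in for event_manager.dispatch: the flag is set only once v8 owns the value.
fn dispatch(event: *Event, outcome: Outcome) !void {
    switch (outcome) {
        .no_listener => {},
        .bridge_fails => return error.BridgeFailed,
        .handed_over => event._v8_handover = true,
    }
}

fn fire(released: *bool, outcome: Outcome) !void {
    var event = Event{ .released = released };
    // Runs on every exit path, including the error return from dispatch.
    defer if (!event._v8_handover) event.deinit();
    try dispatch(&event, outcome);
}

test "Zig releases the event unless v8 took ownership" {
    var released = false;
    try fire(&released, .no_listener);
    try std.testing.expect(released);

    released = false;
    try std.testing.expectError(error.BridgeFailed, fire(&released, .bridge_fails));
    try std.testing.expect(released);

    released = false;
    try fire(&released, .handed_over);
    try std.testing.expect(!released);
}
```

The `bridge_fails` case is the one a plain `hasListener` check cannot cover: a listener exists, but the value never reaches v8.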
The v8 finalizer callback was also improved. Previously, we just embedded the
pointer to the zig object. In the v8 callback, we could cast that back to T
and call deinit. But, because of possible timing issues between when (if) v8
calls the finalizer, and our own cleanup, the code would check in the context to
see if the ptr was still valid. Wait, what? We're using the ptr to get the
context to see if the ptr is valid?
We now store a pointer to the FinalizerCallback which contains the context.
So instead of something stupid like:
```
// note, if the identity_map doesn't contain the value, then value is likely
// invalid, and value.page will segfault
value.page.js.identity_map.contains(@intFromPtr(value))
```
We do:
```
if (fc.ctx.finalizer_callbacks.contains(@intFromPtr(fc.value))) {
// fc.value is safe to use
}
```
Currently the sighandler is set up regardless of the running mode, but it only
does something in "serve" mode. In fetch mode, since there are no registered
listeners, it intercepts the signal and does nothing. On macOS at least, this
isn't a great experience as it can leave the process running in the background.
This adds a crash handler which reports a crash (if telemetry is enabled). On a
crash, this looks for `curl` (using the PATH env), and forks the process to then
call execve. This relies on a new endpoint being set up to accept the "report".
Also, we include very little data. I figured just knowing about crashes would
be a good place to start.
A panic handler is provided, which overrides Zig's default handler and hooks
into the crash handler.
An `assert` function is added and hooks into the crash handler. This is
currently only used in one place (Session.zig) to demonstrate its use. In
addition to reporting a failed assert, the assert aborts execution in
ReleaseFast (as opposed to an undefined behavior with std.debug.assert).
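A rough sketch of how such an assert differs from `std.debug.assert`: the failure is reported first, then the process aborts in every build mode. `reportCrash`, `check` and the `reported` variable are hypothetical placeholders for the crash handler, not the real implementation.

```zig
const std = @import("std");

// Placeholder for the crash handler hook; here it only records the reason.
var reported: ?[]const u8 = null;

fn reportCrash(reason: []const u8) void {
    reported = reason;
}

/// Reports a failed condition and returns whether the condition held.
/// Split out from `assert` so the reporting path can be tested without aborting.
fn check(ok: bool, reason: []const u8) bool {
    if (!ok) reportCrash(reason);
    return ok;
}

/// Unlike std.debug.assert, a failure here is well-defined in ReleaseFast:
/// the crash is reported and the process aborts.
fn assert(ok: bool, reason: []const u8) void {
    if (!check(ok, reason)) std.process.abort();
}

test "a passing assert reports nothing" {
    reported = null;
    assert(true, "never reported");
    try std.testing.expect(reported == null);
}
```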
I want to hook this into the v8 global error handler, but only after direct_v8
is merged.
Much of this is inspired by bun's code. They have their own assert (1) and
a [more sophisticated] crashHandler (2).
(1) beccd01647/src/bun.zig (L2987)
(2) beccd01647/src/crash_handler.zig (L198)