This adds an app.storage which is a union around configurable storage engine
(currently, only sqlite).
It is _not_ being used anywhere right now. The goal is to get feedback on
the implementation and then move cache to it.
This doesn't expose a generic query API. The goal is that the storage will
expose high level methods, e.g. `cacheGet(req: CacheGetRequest)` and every
storage engine will translate the `storage.CacheGetRequest` as needed.
A thin wrapper around the Sqlite C api is included, e.g. exec(SQL, .{args}) a
`rows` and `row` fetcher. A connection pool is included. By default, an
in-memory DB is currently created. And a `migrations` table with an id of `1`
is created/inserted. I don't imagine needing fancy migratations.
@import("lightpanda") where needed.
Would also like to do this for String, Page, Session and js which all stand out
as types that are use across the codebase.
I know that a few devs are doing this in new work and I haven't heard anyone
voice an objection.
ArenaPool previously maintained up to 512 16KB buckets. The 16KB retention is
small for things like XHR and scripts, but increasing it to something more
reasonably, like 128KB, would use up to 8x more memory.
This commit adds 4 buckets: 1KB, 4KB, 16KB and 128KB. Callers can request a
tiny, small, medium or large bucket. We end up using less memory peak memory
and less allocations.
Furthermore, callers can request a specific size. This is particularly useful
for WebSocket or Blob where the size could vary greatly (so we'd likely default
to a large bucket), but that could needlessly use up a large arena.
The bucket sizes were derived from analyzing allocations. A significant number
of allocations were very small. Things like ScheduleCallback and
FinalizerCallback are always less than 1K and can be generated in the thousands.
The 16KB retention was wasteful in these cases...better to have a large number
of 1K pools, so that we can have a handful of very large buffers.
When the snapshot isn't baked-into the binary, it requires more memory to
initialize and load. By changing the initializing order from: network, isolate,
snapshot to: isolate, snapshot, network we can reduce the peak startup memory.
This is because the short-live snapshot intermediary data and the long-lived
network data (certs) no longer overlap.
Same pattern as 3dea554e (mcp/Server.zig): allocator.create returns
undefined memory, and struct field defaults (shutdown: bool = false)
are not applied when fields are set individually. Use self.* = .{...}
to ensure all fields including defaults are initialized.
Any object we return from Zig to V8 becomes a v8::Global that we track in our
`ctx.identity_map`. V8 will not free such objects. On the flip side, on its own,
our Zig code never knows if the underlying v8::Object of a global can still be
used from JS. Imagine an XHR request where we fire the last readyStateChange
event..we might think we no longer need that XHR instance, but nothing stops
the JavaScript code from holding a reference to it and calling a property on it,
e.g. `xhr.status`.
What we can do is tell v8 that we're done with the global and register a callback.
We make our reference to the global weak. When v8 determines that this object
cannot be reached from JavaScript, it _may_ call our registered callback. We can
then clean things up on our side and free the global (we actually _have_ to
free the global).
v8 makes no guarantee that our callback will ever be called, so we need to track
these finalizable objects and free them ourselves on context shutdown. Furthermore
there appears to be some possible timing issues, especially during context shutdown,
so we need to be defensive and make sure we don't double-free (we can use the
existing identity_map for this).
An type like XMLHttpRequest can be re-used. After a request succeeds or fails,
it can be re-opened and a new request sent. So we also need a way to revert a
"weak" reference back into a "strong" reference. These are simple v8 calls on
the v8::Global, but it highlights how sensitive all this is. We need to mark
it as weak when we're 100% sure we're done with it, and we need to switch it to
strong under any circumstances where we might need it again on our side.
Finally, none of this makes sense if there isn't something to free. Of course,
the finalizer lets us release the v8::Global, and we can free the memory for the
object itself (i.e. the `*XMLHttpRequest`). This PR also adds an ArenaPool. This
allows the XMLHTTPRequest to be self-contained and not need the `page.arena`.
On init, the `XMLHTTPRequest` acquires an arena from the pool. On finalization
it releases it back to the pool. So we now have:
- page.call_arena: short, guaranteed for 1 v8 -> zig -> v8 flow
- page.arena long: lives for the duration of the entire page
- page.arena_pool: ideally lives for as long as needed by its instance (but no
guarantees from v8 about this, or the script might leak a lot of global, so worst
case, same as page.arena)
There are two layers here. The first is that, on startup, a v8 SnapshotCreator
is created, and a snapshot-specific isolate/context is setup with our browser
environment. This contains most of what was in Env.init and a good chunk of
what was in ExecutionWorld.createContext. From this, we create a v8.StartupData
which is used for the creation of all subsequent contexts. The snapshot sits
at the application level, above the Env - it's re-used for all envs/isolates, so
this gives a nice performance boost for both 1 connection opening multiple pages
or multiple connections opening 1 page.
The second layer is that the Snapshot data can be embedded into the binary, so
that it doesn't have to be created on startup, but rather created at build-time.
This improves the startup time (though, I'm not really sure how to measure that
accurately...).
The first layer is the big win (and just works as-is without any build / usage
changes).
with snapshot
total runs 1000
total duration (ms) 7527
avg run duration (ms) 7
min run duration (ms) 5
max run duration (ms) 41
without snapshot
total runs 1000
total duration (ms) 9350
avg run duration (ms) 9
min run duration (ms) 8
max run duration (ms) 42
To embed a snapshot into the binary, we first need to create the snapshot file:
zig build -Doptimize=ReleaseFast snapshot_creator -- src/snapshot.bin
And then build using the new snapshot_path argument:
zig build -Dsnapshot_path=../../snapshot.bin -Doptimize=ReleaseFast
The paths are weird, I know...since it's embedded, it needs to be inside the
project path, hence we put it in src/snapshot.bin. And since it's embedded
relative to the embedder (src/browser/js/Snapshot.zig) the path has to be
relative to that, hence ../../snapshot.bin. I'm open to suggestions on
improving this.