Commit Graph

10 Commits

Author SHA1 Message Date
Zoltan Kochan
83f06a6046 refactor(pacquet): trim comments that restate code or record history (#12482)
## What

A repository-wide sweep of the Rust source under `pacquet/` to bring comments in line with the comment policy in `AGENTS.md` / `pacquet/AGENTS.md`: **comments are for the non-obvious _why_, not a translation of the _what_, and behavior already proven by a test should not be re-narrated in code.**

Two focused commits:

**1. Trim comments that restate code or record history** (`refactor(pacquet): trim comments that restate code or record history`), removing:
- comments that merely restate the adjacent code;
- history / refactor-shape narration (`used to`, `before this PR`, `Copilot flagged`, "the old X", removed-type references);
- call-site comments duplicating a callee's own doc comment;
- test prose that just re-describes what a test's name and assertions demonstrate.

**2. Drop comments that narrate test-covered behavior** (`refactor(pacquet): drop comments that narrate test-covered behavior`) — enforcing the "tests are documentation" rule. Each implementation file was paired with its co-located test module so the duplication was visible, then comments enumerating tested scenarios, edge cases, failure modes, or worked examples were removed — **including when they carried a pnpm-parity reference/link**, since that context lives in the test names and git history.

Preserved throughout: genuine `///`/`//!` contracts (guarantees, preconditions, postconditions, panics), hidden invariants, workarounds, ordering/safety reasons, parity notes that aren't behavior-restatement, and `SAFETY:`/`TODO:` notes.

## Scope

Comments only — **no code, identifiers, or string literals were modified** in either commit. ~500 file edits in total; several thousand comment lines removed.

## How it was done

Multi-agent sweeps: one agent per batch applied the policy, and a second agent diff-checked each edited batch for over-deletion (lost contracts) or accidental code changes. Over-trimmed contracts were restored; zero code changes were introduced (independently re-verified: every added line is identical to a removed line modulo a stripped trailing comment).
2026-06-17 23:05:55 +02:00
Zoltan Kochan
4ca9247a9b fix: preserve node runtime version prefix (#12444) 2026-06-16 12:43:37 +02:00
Zoltan Kochan
657d322b15 perf(network): schedule tarball downloads by estimated pipeline work (#12309)
## Summary

When the download connection pool saturates, freed slots are granted by a two-class scheduling policy instead of FIFO:

- **Latency class** (packument/metadata fetches, which gate resolution progress): served FIFO.
- **Throughput class** (tarball downloads): ranked by **estimated total pipeline work** — `unpackedSize + 3000 × fileCount` — so the most expensive download+extract jobs start first (longest-processing-time-first; a large archive that starts last runs alone at single-connection throughput while every other slot idles, see [pnpm/pnpm#12230](https://github.com/pnpm/pnpm/issues/12230)). The per-file term prices the fixed CAS-write/hash overhead, so a many-small-files package ranks as the long job it actually is.
- **Neither class can starve the other**: downloads are guaranteed a reserved half of the pool (strict metadata-first was measured to serialize cold installs — no tarball got a slot until resolution drained, costing the whole resolve/fetch overlap), and metadata wins beyond that reserve (a download backlog can't stall resolution). Both directions are work-conserving.
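
The throughput-class ranking can be sketched as below. This is a minimal illustration, assuming both size hints are present for every job; `QueuedTarball` and `start_order` are hypothetical names rather than pacquet's actual types, and only the formula `unpackedSize + 3000 × fileCount` is taken from the description above.

```rust
use std::cmp::Reverse;

/// Per-file cost constant from the formula above (K = 3000).
const PER_FILE_COST: u64 = 3000;

/// A queued tarball download with its registry size hints (hypothetical type).
#[derive(Debug, Clone)]
pub struct QueuedTarball {
    pub name: String,
    pub unpacked_size: u64,
    pub file_count: u64,
}

impl QueuedTarball {
    /// Estimated total pipeline work: unpackedSize + 3000 * fileCount.
    pub fn estimated_work(&self) -> u64 {
        self.unpacked_size + PER_FILE_COST * self.file_count
    }
}

/// Returns job names in longest-processing-time-first order.
/// The sort is stable, so jobs with equal estimates keep their arrival order.
pub fn start_order(mut queue: Vec<QueuedTarball>) -> Vec<String> {
    queue.sort_by_key(|job| Reverse(job.estimated_work()));
    queue.into_iter().map(|job| job.name).collect()
}
```

With these definitions, a 1 MB package with 2000 files (estimate 7,000,000) starts before a 5 MB package with 10 files (estimate 5,030,000), which is the "many-small-files package ranks as the long job" effect.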

### How the size hints travel

- Local fresh installs read `dist.unpackedSize` / `dist.fileCount` off the resolver-fetched manifest (also fixes exact decompression-buffer preallocation on the prefetch path, previously hardcoded `None`).
- The pnpr `/v1/resolve` `package` frame carries both as optional `unpackedSize` / `fileCount` fields (omitted when the registry never published them; old clients and servers interoperate unchanged).
- pnpr frozen restores: the lockfile records no sizes, but the verification fan-out fetches each entry's metadata anyway — the npm verifier records both stats into an optional `ObservedDistStats` sink as a side product of the tarball-URL binding check, and the frozen fast path announces every verified tarball as a sized `package` frame before `done` (URLs derived by the same `tarball_url_and_integrity` the client materialization uses). Verdict-cache hits fetch no metadata and keep the bare `done` frame.
- pnpr's abbreviated metadata now **preserves** `unpackedSize`/`fileCount` instead of stripping them, since pacquet reads both.
- Resolve-time tarball fetches (tarball deps' manifests come from their archives) acquire in the latency class — they gate the resolver's walk.

### Benchmark tooling

- The integrated benchmark's latency proxy gained `--registry-slow-start`: per-connection TCP slow start (RFC 6928 initial window, doubling per delivered window toward the bandwidth cap), so scheduling effects that depend on per-connection ramp-up are measurable.
- Fixed a macOS bug where the proxy's accepted sockets inherited the listener's `O_NONBLOCK` and every proxied connection died on its first read — all shaped benchmark traffic silently failed before this.
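
To show why per-connection ramp-up matters for scheduling, here is a rough model of the slow-start shaping. It is a sketch under simplifying assumptions (one full window delivered per round trip, no loss, window doubling until a per-RTT byte cap); the function names are hypothetical and this is not the proxy's actual code. The initial-window formula is the one from RFC 6928.

```rust
/// RFC 6928 initial window in bytes: min(10 * MSS, max(2 * MSS, 14600)).
pub fn initial_window(mss: u64) -> u64 {
    (10 * mss).min((2 * mss).max(14_600))
}

/// Round trips needed to deliver `bytes` on a fresh connection when the
/// window doubles after each delivered window, up to `cap_per_rtt` bytes.
pub fn round_trips(bytes: u64, mss: u64, cap_per_rtt: u64) -> u32 {
    // Guard against a zero cap, which would otherwise never make progress.
    let cap = cap_per_rtt.max(1);
    let mut window = initial_window(mss).min(cap);
    let mut delivered = 0u64;
    let mut rtts = 0u32;
    while delivered < bytes {
        delivered += window;
        window = (window * 2).min(cap);
        rtts += 1;
    }
    rtts
}
```

For the shaped-proxy settings reported in the measurements (30 ms RTT, 80 Mbit/s per connection), the cap works out to 300,000 bytes per round trip. Under this model a 1 MB archive needs 7 round trips on a fresh connection, while a 10 kB one needs a single round trip.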

## Measurements

Fixture: ~110 direct deps / 1308 packages (~90 MB wire), `isolated-linker.fresh-install.cold-cache.cold-store`, local mirror of real npm behind the shaped proxy (30 ms RTT, 80 Mbit/s per-connection cap, TCP slow start).

**Drift-controlled interleaved comparison** (4 alternating blocks x 4 runs each; sequential multi-target sessions on this machine showed up to +75% session-order drift, so block-paired ABAB is the only design we trust):

| target | mean +/- sd (n=16) |
| --- | --- |
| baseline FIFO | 14.36 s +/- 0.54 |
| this PR | **14.06 s +/- 0.70** |

The PR wins **all 4 paired blocks** (-0.18 s to -0.50 s, mean -0.30 s, ~2%). A scheduler ablation (reserve+FIFO, smallest-first, unpackedSize-only, work with K=3k and K=10k per file) ordered as the pipeline model predicts, but the per-variant deltas sit inside the session-drift noise, so only the FIFO-vs-full-design pairing is claimed. K in [3000, 10000] is indistinguishable.

**The starvation fix is the load-bearing piece, established mechanistically rather than by wall clock:** with strict metadata-first priority (an intermediate design), cold-install event timelines showed 4-7 s windows at install start with zero tarball activity - downloads never won a slot during the resolution burst, serializing the resolve and fetch phases. The reserved share removes those gaps entirely and the worst observed cold-install runs with it are within ~1 s of the median, where unreserved variants showed multi-second stragglers.
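
The reserved-share rule can be sketched as a single decision made whenever a slot frees up. This is one possible reading of the policy described above, not the actual implementation: `PoolState`, `Class`, and `grant_next` are hypothetical names, and the reserve is assumed to be exactly half the pool capacity.

```rust
/// The two scheduling classes (hypothetical names).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Class {
    /// Packument/metadata fetches.
    Latency,
    /// Tarball downloads.
    Throughput,
}

/// Snapshot of the pool at the moment a slot is freed (hypothetical type).
pub struct PoolState {
    pub capacity: usize,
    pub active_downloads: usize,
    pub waiting_metadata: usize,
    pub waiting_downloads: usize,
}

/// Decides which class receives the freed slot.
pub fn grant_next(state: &PoolState) -> Option<Class> {
    let reserve = state.capacity / 2;
    match (state.waiting_metadata > 0, state.waiting_downloads > 0) {
        (false, false) => None,
        // Work-conserving: if only one class is waiting, it gets the slot.
        (true, false) => Some(Class::Latency),
        (false, true) => Some(Class::Throughput),
        // Both waiting: downloads win until their reserved half is in use,
        // metadata wins beyond that reserve.
        (true, true) if state.active_downloads < reserve => Some(Class::Throughput),
        (true, true) => Some(Class::Latency),
    }
}
```

Under strict metadata-first priority the last two arms would both return `Class::Latency`, which is the variant that produced the multi-second windows with zero tarball activity.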

Real-registry A/B (15 randomized cold-install pairs against npmjs) is noise-bound on a saturated ~100 Mbit link (+/-3 s registry variance), median -0.17 s in this PR's favor - consistent with "never slower."
2026-06-11 02:58:36 +02:00
Khải
ac367fce91 chore(rust/clippy): pedantic, nursery, and some (#12209)
* chore: enable clippy::pedantic lint group for pacquet workspace

* style(pacquet): comply with clippy::pedantic

Apply clippy's machine-applicable pedantic fixes across the workspace
(inlined format args, removed needless borrows/closures, added
must_use, etc.), fix a few doc-comment backtick nits, and drop
pointless #[inline(always)] on trivial accessors.

Opt specific pedantic lints back out in [workspace.lints.clippy] with
documented justifications, grouped into false positives, library-API
hygiene that doesn't fit an internal CLI, suggestions that conflict
with the cardinal rule of porting pnpm 1:1, and opinionated style.

* style: taplo-format Cargo.toml lint table

* style(pnpr): comply with clippy::pedantic in merged auth backend code

Re-apply pedantic compliance to the networked-SQLite auth backend that
landed on main (#12186, #12199/#12206): doc-comment backticks, #[must_use]
on constructors and status_code, i64::from over `as`, map_or, and a
method-reference closure.

* docs(clippy): trim and inline the pedantic allow-list comments

* docs(clippy): note perfectionist supersedes many_single_char_names

* docs(clippy): note pnpm-mirroring rationale on structure/naming lints

* docs(clippy): mark unused_async as deferred pending audit

* style: enable clippy::match_wildcard_for_single_variants

* refactor: enable clippy::unused_self

Convert two self-less private methods (overrides pick_most_specific,
tarball head_only_result) to associated functions.

* refactor: enable clippy::ref_option

Widen engine_json to Option<&str>; #[expect] the two serde
serialize_with helpers, which serde must call as f(&field, ser).
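
For readers unfamiliar with the lint, the shape of a `ref_option` fix looks roughly like this. The function below is a hypothetical stand-in, not the real `engine_json` code.

```rust
/// Before the fix the parameter would have been `&Option<String>`, which
/// forces callers to own a `String`. `Option<&str>` accepts any borrowed text.
pub fn describe_engine(engine: Option<&str>) -> String {
    engine.map_or_else(
        || "engine: any".to_owned(),
        |name| format!("engine: {name}"),
    )
}
```

A caller holding an `Option<String>` passes it with `as_deref()`, for example `describe_engine(owned.as_deref())`.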

* perf: enable clippy::trivially_copy_pass_by_ref

Pass the 1-byte Copy types NodeLinker and FilterWorkspaceProjectsOptions
by value; #[expect] the serde skip_serializing_if helper is_false.

* perf: enable clippy::assigning_clones

Use clone_from for seven field assignments to reuse allocations.

* style: enable clippy::manual_let_else

Convert 27 match/if-let guards to let-else; preserve the non-UTF-8
skip rationale comment in the directory walker.
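
A let-else guard of the kind this lint asks for might look like the following. The function is a hypothetical illustration loosely modelled on the non-UTF-8 skip mentioned above, not the actual directory walker.

```rust
use std::path::Path;

/// Collects the final path component of each entry as UTF-8 text.
pub fn utf8_file_names(paths: &[&Path]) -> Vec<String> {
    let mut names = Vec::new();
    for path in paths {
        // Entries without a file name, or with a non-UTF-8 one, are skipped:
        // they cannot be represented as `String` without lossy conversion.
        let Some(name) = path.file_name().and_then(|name| name.to_str()) else {
            continue;
        };
        names.push(name.to_owned());
    }
    names
}
```

The equivalent `match` or `if let` form would nest the `push` one level deeper; let-else keeps the happy path flat while leaving room for the rationale comment.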

* style: enable clippy::default_trait_access

Name the concrete type on Default::default() call sites; #[expect] two
struct-literal test fixtures where naming each field type would force
~20 imports.

* refactor: enable clippy::format_push_string

Replace push_str(&format!(...)) with write!/writeln! into the target
String (local 'use std::fmt::Write as _'); writeln! preserves the
exact LF/CRLF shell-shim output.
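
The rewrite pattern looks roughly like this. `shim_script` is a made-up example, not one of the real shell-shim templates.

```rust
use std::fmt::Write as _;

/// Builds a tiny shell script by writing straight into the target `String`.
pub fn shim_script(target: &str, args: &[&str]) -> String {
    let mut script = String::new();
    // Before: script.push_str(&format!("exec \"{target}\""));
    // `write!` formats into the existing buffer and skips the temporary String.
    writeln!(script, "#!/bin/sh").expect("writing to a String cannot fail");
    write!(script, "exec \"{target}\"").expect("writing to a String cannot fail");
    for arg in args {
        write!(script, " {arg}").expect("writing to a String cannot fail");
    }
    script.push('\n');
    script
}
```

`writeln!` appends a bare `\n`, so a template that needs CRLF has to spell out the `\r` itself.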

* refactor: enable clippy::needless_pass_by_value

Take by reference where the argument is only read (incl. dropping
some redundant clones in resolve_peers' recursion). Where converting
would cascade badly, #[expect] with a reason: functions that
destructure/consume the arg (build_resolve_result, PrefetchingResolver,
S3Store::new), the by-value `impl IntoIterator + Clone` in
build_direct_deps_by_importer, and the serde/test helpers whose owned
fixtures keep call sites clean.

* fix(perfectionist): satisfy dylint after format_push_string changes

Add trailing commas to the multi-line writeln! shell-shim templates
(macro_trailing_comma) and merge the new `fmt::Write as _` imports into
each file's existing `use std::{...}` block (import_granularity).

* docs(clippy): explain missing_errors_doc suppression; mark missing_panics_doc deferred

* fix(perfectionist): collapse fmt::{self, Write as _} in work_env imports

The format_push_string Write import landed as a sibling fmt:: path next
to the existing fmt import; merge them so import_granularity passes.

* style: enable clippy::return_self_not_must_use

Add #[must_use] to the WorkspaceTreeCtx builder methods, matching the
#[must_use] already on the parallel TreeCtx builders.

* perf: enable clippy::large_stack_arrays

Heap-allocate the 64 KiB read buffer in verify_file_integrity with a Vec
instead of placing it on the stack.
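
A minimal sketch of the same change on a simpler function: `count_bytes` is hypothetical and only counts bytes, whereas the real `verify_file_integrity` hashes them.

```rust
use std::io::{self, Read};

/// Reads `reader` to the end through a 64 KiB buffer and returns the byte count.
pub fn count_bytes(mut reader: impl Read) -> io::Result<u64> {
    // Before: let mut buffer = [0u8; 64 * 1024];  (64 KiB on the stack)
    let mut buffer = vec![0u8; 64 * 1024];
    let mut total = 0u64;
    loop {
        let read = match reader.read(&mut buffer) {
            Ok(0) => return Ok(total),
            Ok(read) => read,
            // Retry reads interrupted by a signal.
            Err(error) if error.kind() == io::ErrorKind::Interrupted => continue,
            Err(error) => return Err(error),
        };
        total += read as u64;
    }
}
```

The allocation happens once per call, so the cost is small next to the I/O it serves.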

* chore(clippy): enable clippy::nursery group

Enable the nursery lint group on the pacquet/pnpr workspace and bring the
code into compliance.

Fixed in code:
- iter_on_single_items: [x].into_iter()/.iter() -> std::iter::once
- equatable_if_let: pattern match -> equality check (the install_accelerator
  rewrite wraps in a multi-line matches!, which gets a trailing comma for
  perfectionist::macro_trailing_comma)
- needless_pass_by_ref_mut: load_pending_row/apply_write_msg take &StoreIndex
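
The first two fixes follow these shapes. The types and functions are hypothetical, chosen only to show the pattern.

```rust
/// Hypothetical enum standing in for an install-accelerator setting.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Accelerator {
    None,
    Pnpr,
}

/// iter_on_single_items: build a one-element iterator with `std::iter::once`.
pub fn with_root(root: &str, rest: &[&str]) -> Vec<String> {
    // Before: [root].into_iter().chain(rest.iter().copied())
    std::iter::once(root)
        .chain(rest.iter().copied())
        .map(str::to_owned)
        .collect()
}

/// equatable_if_let: compare with `==` when the type implements `PartialEq`.
pub fn uses_pnpr(accelerator: Accelerator) -> bool {
    // Before: if let Accelerator::Pnpr = accelerator { true } else { false }
    accelerator == Accelerator::Pnpr
}
```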

Opted back out in Cargo.toml, each with a documented justification: use_self,
too_long_first_doc_paragraph, missing_const_for_fn, option_if_let_else,
significant_drop_tightening, redundant_pub_crate, derive_partial_eq_without_eq,
branches_sharing_code, useless_let_if_seq, single_option_map, iter_with_drain,
literal_string_with_formatting_args, collection_is_never_read.

Dropped the now-redundant individual nursery warns (needless_collect,
or_fun_call, redundant_clone) the group now covers, plus the default-on
unnecessary_lazy_evaluations. Kept clone_on_ref_ptr and if_then_some_else_none
(restriction lints not enabled by any group).

* style: bring merged main code into clippy pedantic compliance

The 17 commits merged from main predate this branch's pedantic/nursery
lint config, so their new code tripped pedantic lints. Apply the
machine-applicable fixes (uninlined_format_args, if_not_else,
elidable_lifetime_names, must_use_candidate, single_match_else,
map_unwrap_or, default_trait_access, assigning_clones, doc_markdown, ...)
and re-add the documented #[expect(needless_pass_by_value)] on
S3Store::new that this branch had carried on the now-replaced file.

* style: bring merged main code into clippy pedantic compliance

The 28 commits merged from main predate this branch's lint config, so
their new code tripped pedantic lints. Apply the machine-applicable fixes
(uninlined_format_args, manual_let_else, needless_raw_string_hashes,
redundant_closure_for_method_calls, map_unwrap_or, elidable_lifetime_names,
doc_markdown, ...) plus a few by hand:
- derive Copy on LinkSlotsParallel (all fields are Copy/refs) to clear
  needless_pass_by_value without a signature change
- deduplicate_all takes &[Vec<DepPath>] (it only borrows the duplicates)
- pick_most_specific becomes an associated fn (it never used self)
- default_trait_access -> concrete types; assigning_clones -> clone_from;
  format_push_string -> write!
- #[expect] with reasons where a fix would churn main's feature code:
  needless_pass_by_value on the recursive resolve_node and a test helper,
  and float_cmp on two deterministic-fixture assertions

* style: enable clippy::allow_attributes and allow_attributes_without_reason

Both are restriction lints (not implied by any group), enabled alongside
the existing clone_on_ref_ptr / if_then_some_else_none. Convert every
#[allow(...)] (including one nested in cfg_attr) to #[expect(...)]; all
already carried a reason, so allow_attributes_without_reason is satisfied.

Drop two now-redundant suppressions surfaced by the conversion: a
duplicated #[expect(too_many_arguments)] on fetch_and_extract_zip_once
(a prior merge left both an allow and an expect), and the
#[expect(dead_code)] on MissingPeerInfo's fields (the #[derive(Debug,
Clone)] already reads them, so dead_code never fired).

clone_on_ref_ptr was already enabled. mod_module_files is intentionally
NOT enabled: it mandates mod.rs, the opposite of the flat module.rs
pattern this project requires (CODE_STYLE_GUIDE.md, enforced by
perfectionist::flat_module_pattern).

* style: enable clippy::mod_module_files to enforce the flat module layout

mod_module_files bans mod.rs files, enforcing the flat module.rs pattern
this project already uses (0 mod.rs in the tree, so no violations). Update
CODE_STYLE_GUIDE.md to cite it as the enforcer; perfectionist's
flat_module_pattern is being retired in favor of this Clippy rule.

* fix(perfectionist): trailing comma on wrapped assert_eq! in workspace_yaml tests

The default_trait_access fix lengthened the assert_eq! so fmt wrapped it
to multi-line, which perfectionist::macro_trailing_comma requires to end
with a trailing comma.

* fix(fs): use cfg_attr expect instead of allow for Windows-unused mode args

With clippy::allow_attributes enabled, the #[cfg_attr(windows, allow(unused))]
on make_file_executable and the ensure_file/write_atomic mode params fails
Windows CI. Switch to #[cfg_attr(windows, expect(unused, reason = ...))];
on Windows the lint fires (Unix mode unused there) so the expectation is
fulfilled, and the attribute stays inert on Unix.

* fix(fs): drop the Windows unused suppression on ensure_file's mode arg

ensure_file forwards mode to verify_or_rewrite unconditionally, so it is
used on Windows too; the #[cfg_attr(windows, expect(unused))] was therefore
unfulfilled and failed Windows CI under -D warnings. write_atomic and
make_file_executable keep their expect — they use mode/file only under
#[cfg(unix)], so the lint fires (and the expectation holds) on Windows.

* chore(git): revert "fix(fs): drop the Windows unused suppression on ensure_file's mode arg"

This reverts commit 1d617c3e1f.

* chore(git): revert "fix(fs): use cfg_attr expect instead of allow for Windows-unused mode args"

This reverts commit 155e4a3dde.

* chore(git): revert "style: enable clippy::allow_attributes and allow_attributes_without_reason"

This reverts commit a47d7926f2.

* style: bring merged main code into clippy compliance + fix merge mismatch

- Add & at the two run_postinstall_hooks / run_project_lifecycle_scripts
  call sites: this branch widened lifecycle.rs to take &RunPostinstallHooks,
  but main's by-value call sites came in via the conflict resolution.
- pedantic fixes on main's new code: must_use_candidate, unnested_or_patterns,
  manual_let_else, default_trait_access, iter_on_single_items, and
  trivially_copy_pass_by_ref (map_node_linker takes NodeLinker by value).

---------

Co-authored-by: Claude <noreply@anthropic.com>
2026-06-11 00:43:22 +02:00
Zoltan Kochan
3d50680eda fix(security): verify Node.js runtime SHASUMS OpenPGP signature (#12295)
Follow-up to #12292 (which verifies the **package-manager** binary). This closes the same class of gap for the **Node.js runtime**.

When a repository requests a Node.js runtime — `devEngines.runtime: node@X` (with `onFail: download`, the default) or `useNodeVersion` — pnpm downloads and then executes a Node binary (it's used to run lifecycle / `run` / `exec` scripts). The download **mirror is repository-configurable** via `node-mirror:<channel>` (`nodeDownloadMirrors`) in project `.npmrc`, and the integrity comes from `SHASUMS256.txt` fetched **from that same mirror**.

That's a circular check: a malicious mirror serves a tampered `node` tarball **and** a matching `SHASUMS256.txt`, the sha256 check passes, and pnpm runs the binary. The result is a drive-by attack triggered by a normal command in a cloned repo.

## Fix

pnpm now fetches `SHASUMS256.txt.sig` and verifies its **detached OpenPGP signature** against the **Node.js release team's public keys, embedded in the pnpm CLI**, before trusting the hashes. A mirror that serves a tampered binary cannot also produce a valid signature, so verification fails. Any faithful mirror (one that proxies the real signed SHASUMS) keeps working.

- `@pnpm/crypto.shasums-file`: new `fetchVerifiedNodeShasums` / `fetchVerifiedNodeShasumsFile` verify the signature via `openpgp` against the embedded keys.
- The keys live in a generated file (`src/nodeReleaseKeys.ts`, 28 keys) mirrored from the canonical `nodejs/release-keys` list. `crypto/shasums-file/scripts/update-node-release-keys.mjs` keeps them current (`pnpm check:node-release-keys` / `--update`), and the **create-release-pr** workflow runs the check as a gate so a new release signer can't silently break verification.
- `@pnpm/engine.runtime.node-resolver` verifies the **configurable-mirror** SHASUMS. The hardcoded `unofficial-builds.nodejs.org` musl mirror is **not** repo-configurable and is signed by a different key, so it stays trusted over TLS.
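
For orientation, the hash lookup that the signature check protects can be sketched as below. This covers only the step that runs after the signature has been verified; the OpenPGP verification itself is not shown because it needs a dedicated library. The sketch is in Rust to match the pacquet port, `pick_hash` is a hypothetical name, and the lowercase-only hex check follows the `/^[a-f0-9]{64}$/` pattern cited for `is_sha256_hex`.

```rust
/// Lowercase-only SHA-256 hex check, equivalent to /^[a-f0-9]{64}$/.
pub fn is_sha256_hex(value: &str) -> bool {
    value.len() == 64 && value.bytes().all(|byte| matches!(byte, b'0'..=b'9' | b'a'..=b'f'))
}

/// Finds the hash for `file_name` in a SHASUMS256.txt body, where each line
/// has the form "<hex digest>  <file name>". Returns `None` when the file is
/// not listed or its digest is malformed (the real code reports distinct errors).
pub fn pick_hash<'a>(shasums: &'a str, file_name: &str) -> Option<&'a str> {
    shasums.lines().find_map(|line| {
        let (hash, name) = line.split_once(char::is_whitespace)?;
        (name.trim_start() == file_name && is_sha256_hex(hash)).then_some(hash)
    })
}
```

Without the signature step, everything this lookup returns is controlled by whoever serves the file, which is the circularity the fix removes.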

## Scope

- **Pre-release channels (rc, nightly, …) are not verified** — Node only signs the `release` channel (no `SHASUMS256.txt.sig` exists for them, even on nodejs.org), so they remain unverifiable. Verification is gated on the `release` channel.
- **Bun / Deno are unaffected** — their download/SHASUMS URLs are hardcoded to canonical GitHub (`github.com/oven-sh/bun`, `api.github.com/repos/denoland/deno`), not mirror-configurable, so a repo can't redirect them.
- **Pacquet parity:** `pacquet/crates/engine-runtime-node-resolver` has the same mirror-configurable SHASUMS logic and needs the equivalent Rust port — tracked as a follow-up (per the repo's parity rule, opening the TS side first).
2026-06-10 00:33:31 +02:00
Zoltan Kochan
a06adee919 fix(lockfile): match pnpm's runtime dependency lockfile format (#12277)
A `runtime:` dependency (`node@runtime:<ver>`, a `Variations` resolution)
diverged from pnpm's `pnpm-lock.yaml` in three ways. Reconcile all three
toward pnpm so the runtime entry round-trips byte-for-byte:

- The importer recorded `node@runtime:<ver>` instead of the prefix-stripped
  `runtime:<ver>`. `real_name` returned `None` for `Variations` resolutions
  (the name lives only in the fetched manifest), so `importer_dep_version`
  could not strip the `node@` prefix. Read the manifest name for
  `Variations`, matching the existing http-tarball path and pnpm's
  `depPathToRef`.
- Each variant's `bin` was a bare string (`bin/node`) rather than pnpm's
  named map (`{ node: bin/node }`). `bin_spec_for_platform` now returns the
  `BinarySpec::Map` form.
- The `packages:` entry omitted `version: <ver>`. pnpm's
  `toLockfileDependency` emits it whenever the depPath carries a `:`, the
  manifest declares a version, and the resolution isn't a directory. Add a
  `version` field to `PackageMetadata` populated under the same condition.
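
The first divergence comes down to a prefix strip that needs the package name. A simplified sketch, with a hypothetical signature that is not the real `importer_dep_version`:

```rust
/// Strips the leading `<name>@` from a dep path when the real name is known.
/// With no name available the dep path is returned unchanged.
pub fn importer_dep_version<'a>(real_name: Option<&str>, dep_path: &'a str) -> &'a str {
    real_name
        .and_then(|name| dep_path.strip_prefix(name))
        .and_then(|rest| rest.strip_prefix('@'))
        .unwrap_or(dep_path)
}
```

With `real_name` returning `None` for `Variations` resolutions, the importer kept `node@runtime:<ver>`; once the manifest name is supplied, the same call yields `runtime:<ver>`.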

Verified byte-identical against pnpm 11.5.2 on a `node@runtime:26.3.0`
fixture; the whole-monorepo diff is unaffected (no runtime deps there).
Ports the lockfile-shape assertions of
`installing/deps-installer/test/install/nodeRuntime.ts:236-269`.
2026-06-09 07:18:59 +02:00
Khải
13b1b9aaa2 chore(rust/dylint): upgrade perfectionist to 0.0.0-rc.17 (#12070)
Co-authored-by: Claude <noreply@anthropic.com>
2026-05-29 20:58:31 +00:00
Khải
d4136eb6f6 chore(pacquet/lint): more clippy (#11839)
Co-authored-by: Claude <noreply@anthropic.com>
2026-05-22 06:48:50 +00:00
Zoltan Kochan
5353fcbf01 perf(pacquet): close the warm-cache resolve gap to pnpm CLI (#11837)
Closes #11832.

On the `alotta-files` benchmark (1362 nodes, warm cache, GVS on), pacquet was 3-5× behind the TypeScript pnpm CLI whenever resolution ran (`firstInstall`, `withWarmCache`, `withWarmModules`, `updatedDependencies`). Wall-clock dropped from ~11.83s to ~5.03s on this branch; pnpm sits at ~4.16s, and the remaining gap is concentrated in the resolver's per-node `pick_package` walk (3.1s of the 5.03s — see #11843 for the `peekManifestFromStore` follow-up that would close it).

The branch is a series of small wins rather than one big rewrite. The original `PrefetchingResolver` (commit f375c916) was replaced by a batched store-index prefetch (461a4c02) — same throughput, far less plumbing.

## What's in this PR

### Resolve-phase

- **Packument fetch dedup** (386a90b5) — `PackumentFetchLocker` (per-cache-key `DashMap<String, Arc<Semaphore>>`) so concurrent picks of the same `(registry, name)` coalesce into one HTTP GET. Mirrors pnpm's `runLimited(pkgMirror, …)` in `pickPackage.ts`. Pacquet was firing N parallel GETs for the same packument per cluster of cross-referencing deps; now it's one.
- **Conditional GET on upgrade fetch** (58f49c90) — forward `etag` / `modified` so the registry can answer `304 Not Modified` on the abbreviated-to-full re-fetch path.
- **Off-tokio mirror disk reads** (6cb50b4f) — the packument cache's mirror read moves to `spawn_blocking` instead of running on the tokio worker.
- **Picked-manifest serialisation dedup** (387b8721) — `PickedManifestCache` `Arc<DashMap<String, Arc<Value>>>` so duplicate picks of the same `name@version` reuse the already-serialised `Arc<Value>` instead of re-running `serde_json::to_value`.
- **Arc-shared resolver outputs** (743c718f, 53e3cde6, 5d6a4207) — `Package`, `ResolveResult.manifest`, and `ResolveResult` itself are now shared via `Arc` so the tree walk's per-occurrence clones become refcount bumps.
- **`std::sync::Mutex` on `TreeCtx`** (a7c94a90) — the per-package dedupe gate is a short `HashMap` insert with no `await` inside, so a sync mutex is the right tool. Tokio's async mutex was paying per-acquire overhead once per visit per ctx field on the resolve hot path.
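
The coalescing idea behind the packument fetch dedup can be illustrated with the standard library alone. Note the substitution: the PR uses a per-key `Semaphore` inside a `DashMap` on tokio, while this sketch swaps in a per-key `OnceLock` behind a `Mutex<HashMap>`, which also caches the result. `FetchDedup` is a hypothetical type.

```rust
use std::collections::HashMap;
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::{Arc, Mutex, OnceLock};

/// Coalesces concurrent fetches of the same key into a single call.
#[derive(Default)]
pub struct FetchDedup {
    slots: Mutex<HashMap<String, Arc<OnceLock<String>>>>,
    /// Number of times the underlying fetch actually ran.
    pub fetch_count: AtomicUsize,
}

impl FetchDedup {
    pub fn get(&self, key: &str, fetch: impl FnOnce() -> String) -> String {
        let slot = {
            let mut slots = self.slots.lock().expect("slot map poisoned");
            Arc::clone(slots.entry(key.to_owned()).or_default())
        }; // The map lock is released here, before any fetch work starts.
        slot.get_or_init(|| {
            self.fetch_count.fetch_add(1, Ordering::Relaxed);
            fetch()
        })
        .clone()
    }
}
```

Callers that arrive while a fetch for the same key is in flight block on that key's slot only; fetches for other keys proceed independently.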

### Install-phase

- **Batched store-index prefetch** (461a4c02) — one `SELECT … WHERE key IN (…)` against `index.db` at install start, rayon-parallel verify, drops the SQLite mutex before any fs work. Replaces the per-snapshot `spawn_blocking` fan-out that was serialising on `Arc<Mutex<StoreIndex>>` and queueing in tokio's blocking pool.
- **Single `pnpm:progress` per URL** (e54208e1) — `run_with_mem_cache` was emitting `fetched` twice when the in-memory cache hit; mirrors pnpm's `packageRequester` shape where the emit fires exactly once.
- **Retain prefetched manifests for bin linking** (b0bf5970) — the fresh-install bin linker now drives `LinkVirtualStoreBins` with the prefetched bundled-manifests map (and built lockfile snapshots), so per-child `package.json` disk reads on warm hits are gone. Skips the `read_dir` enumeration too. Updated snapshots reflect the now-present `<slot>/node_modules/<pkg>/node_modules/.bin/<pkg>` self-shim the lockfile-driven path writes per pnpm's `linkBinsOfDependencies`.
- **Skip per-snapshot deep clone on warm prefetch** (3f9c1bb5) — `run_with_mem_cache` returns the prefetched `Arc<HashMap>` straight through instead of going via `run_without_mem_cache`'s deep-clone path + redundant `Arc::new`. At 1k+ snapshots that's one per-file map allocation and one `Arc::new` saved per snapshot.

### Correctness

- **Registry-scoped `PickedManifestCache` key** (57c3094e) — the shared cache key was `{name}@{version}` only; two or more registries (default, JSR, a named registry) serving different artifacts under the same `name@version` (private + public collisions, forks) would hand one resolver the other's manifest. Now keyed `{registry}\x00{name}@{version}`, matching `PackageMetaCache`'s shape. Regression test included.
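
The key shape is simple enough to show directly. `picked_manifest_key` is a hypothetical helper name; only the `{registry}\x00{name}@{version}` layout comes from the description above.

```rust
/// Builds a cache key scoped by registry. The NUL separator cannot appear in
/// a registry URL or package name, so the three parts cannot run together.
pub fn picked_manifest_key(registry: &str, name: &str, version: &str) -> String {
    format!("{registry}\0{name}@{version}")
}
```

Two registries serving the same `name@version` now map to distinct keys, so neither resolver can receive the other's manifest from the shared cache.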

### Diagnostics

- **Per-phase timing logs** (57864d2c) — `pacquet::install::phase` `tracing` events with `elapsed_ms` for `resolve_importer`, `prefetch_cas_paths`, `build_fresh_lockfile`, `virtual_store_layout_new`, `install_subtree`. Made the profiling for this branch tractable and stays in for future work (the same per-phase trace is what motivated #11843).

### Review cleanup

- **Refactor + comment fixes** (e7b3e6ca) — `build_resolve_result` now takes a `BuildResolveResult` struct instead of 9 positional args (`#[allow(clippy::too_many_arguments)]` gone); resolver-side `Arc` bindings are dropped explicitly before the install pass so the packument cache actually frees; the `resolve_dependency_tree` doc comment was wrong about "skipping the recursion" on dedupe hits.

### Doc + Dylint fixes

- **CI compliance** (0343a472) — broken doc links resolved; single-letter generics, `Arc.clone()` direct, unicode ellipsis in doc comments fixed for Dylint.

## Scope

- **Fresh-lockfile install path only.** The frozen-lockfile path already had the batched store-index prefetch; nothing else here changes its behaviour.
- **No user-visible behavior change** — lockfile format, error codes, CLI surface, `MemCache` semantics unchanged. The cache-key bug fix doesn't change the on-disk lockfile; it only prevents an in-memory mix-up that would otherwise produce a wrong `ResolveResult.manifest` field.

## Follow-ups

- **#11843** — port pnpm's `peekManifestFromStore` fast path. The store-index row carries the bundled `package.json` (name, version, deps, bin, engines, etc.) but no publish-time, so the fast path is safe only when no `published_by` / `minimumReleaseAge` policy is in effect, no `--update`, and the wanted lockfile pins a tarball+integrity. With ~95% of nodes short-circuiting, the `resolve_importer` phase (currently 3.1s on warm cache) drops dramatically — this is the single biggest unimplemented win and the most likely path to parity with pnpm.

## Benchmark

Wall-clock progression on the `alotta-files` warm-cache + GVS-on scenario (this branch vs `main`):

| Stage                               | Wall    |
| ----------------------------------- | ------- |
| `main`                              | ~11.83s |
| + packument fetch dedup             | ~8.21s  |
| + batched store-index prefetch      | ~6.39s  |
| + Arc-shared resolver outputs       | ~5.67s  |
| + std Mutex + manifest serial dedup | ~5.03s  |
| pnpm CLI baseline                   | ~4.16s  |

The remaining ~0.87s gap is concentrated in `resolve_importer` and is what #11843 targets.
2026-05-22 02:15:53 +02:00
Zoltan Kochan
df990fdb51 feat(pacquet): port node/deno/bun runtime resolvers (#11783)
* feat(pacquet): port node/deno/bun runtime resolvers and wire them into the install chain

Ports the three `@pnpm/engine.runtime.*-resolver` packages and the shared
`@pnpm/crypto.shasums-file` helper into pacquet, and slots them into the
default-resolver chain so `node@runtime:<spec>`, `deno@runtime:<spec>`,
and `bun@runtime:<spec>` resolve through pacquet as they do in pnpm.

New crates under `pacquet/crates/`:

- `crypto-shasums-file` — downloads and decodes `SHASUMS256.txt`,
  shared by node and bun. Mirrors `FAILED_DOWNLOAD_SHASUM_FILE`,
  `NODE_INTEGRITY_HASH_NOT_FOUND`, `NODE_MALFORMED_INTEGRITY_HASH`.
- `engine-runtime-node-resolver` — `parse_node_specifier`,
  `get_node_mirror`, `get_node_artifact_address`, `normalize_arch`,
  `resolve_node_version[s]`, and the `Resolver`-impl entry point.
  Handles the unofficial-musl mirror fan-out, the `lts` / LTS-codename
  / channel / range selectors, and the `darwin/arm64 <16 → x64`,
  `win32/ia32 → x86`, `arm → armv7l` arch quirks. Error codes
  `NO_OFFLINE_NODEJS_RESOLUTION`, `NODEJS_VERSION_NOT_FOUND`,
  `INVALID_NODE_RELEASE_CHANNEL` match upstream.
- `engine-runtime-deno-resolver` — version selection delegates to the
  npm resolver; assets come from the GitHub Releases API + per-asset
  SHA256 sidecars. Windows x64 covers arm64 under emulation.
  Errors: `DENO_RESOLUTION_FAILURE`, `DENO_MISSING_ASSETS`,
  `DENO_GITHUB_FAILURE`, `DENO_PARSE_HASH`.
- `engine-runtime-bun-resolver` — version selection delegates to npm;
  assets come from the GitHub-release `SHASUMS256.txt`. `windows` /
  `aarch64` are normalised to `win32` / `arm64`. Error:
  `BUN_RESOLUTION_FAILURE`.
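
The node arch quirks listed above amount to a small lookup. The signature below is an assumption made for illustration; the real `normalize_arch` may take different parameters.

```rust
/// Maps a platform/arch pair to the arch name used in Node.js artifact names.
pub fn normalize_arch<'a>(platform: &str, arch: &'a str, node_major: u32) -> &'a str {
    match (platform, arch) {
        // darwin/arm64 below Node 16 falls back to the x64 build.
        ("darwin", "arm64") if node_major < 16 => "x64",
        ("win32", "ia32") => "x86",
        (_, "arm") => "armv7l",
        _ => arch,
    }
}
```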

Wiring (`install_without_lockfile.rs`): chain order is now
`npm → git → node → deno → bun`, matching upstream's
`resolving/default-resolver/src/index.ts` at 1627943d2a. The npm
resolver is shared via `Arc<dyn Resolver>` so the deno/bun resolvers
reuse the same metadata cache; a small `ArcResolver` adapter bridges
that to `DefaultResolver`'s `Vec<Box<dyn Resolver>>`.
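
The adapter idea can be sketched with a simplified, synchronous `Resolver` trait. The real trait is richer; everything below other than the names `ArcResolver` and `DefaultResolver` is hypothetical, and even those are reduced to their bare shape.

```rust
use std::sync::Arc;

/// Simplified resolver: returns `None` when the spec is not its concern.
pub trait Resolver {
    fn resolve(&self, spec: &str) -> Option<String>;
}

/// Lets a shared `Arc<dyn Resolver>` sit in a `Vec<Box<dyn Resolver>>`.
pub struct ArcResolver(pub Arc<dyn Resolver>);

impl Resolver for ArcResolver {
    fn resolve(&self, spec: &str) -> Option<String> {
        self.0.resolve(spec)
    }
}

/// Toy resolver that claims specs starting with a fixed prefix.
pub struct PrefixResolver {
    pub label: &'static str,
    pub prefix: &'static str,
}

impl Resolver for PrefixResolver {
    fn resolve(&self, spec: &str) -> Option<String> {
        spec.starts_with(self.prefix).then(|| format!("{}:{spec}", self.label))
    }
}

/// Tries each resolver in order and returns the first match.
pub struct DefaultResolver {
    pub chain: Vec<Box<dyn Resolver>>,
}

impl Resolver for DefaultResolver {
    fn resolve(&self, spec: &str) -> Option<String> {
        self.chain.iter().find_map(|resolver| resolver.resolve(spec))
    }
}
```

Because the wrapper holds a clone of the `Arc`, the same npm resolver instance (and its metadata cache) can be handed to the deno and bun resolvers while also occupying the first slot in the chain.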

Out of scope (called out in code):
- `currentPkg && !update` short-circuit isn't restored yet — needs
  `ResolveOptions::current_pkg` first. The resolver re-fetches the
  asset list on every install.
- `nodeDownloadMirrors` defaults to empty. Wiring it through the
  config layer is a follow-up.

* fix(pacquet): silence rustdoc errors on runtime-resolver doc comments

CI's `Doc` job runs rustdoc with `-D warnings`. Four intra-doc links
in this PR's new doc comments tripped it:

- `crypto-shasums-file` referenced `pacquet_lockfile::BinaryResolution`
  but the crate doesn't (and shouldn't) depend on `pacquet-lockfile`.
  Drop the link and leave the name as plain text.
- `engine-runtime-deno-resolver` linked `DefaultResolver` to
  `pacquet_resolving_default_resolver::DefaultResolver` — same crate-
  dependency story. Rewrite the prose to mention "the default-resolver
  chain" without a link.
- `engine-runtime-deno-resolver` doc comment on the public
  `ReadDenoAssetsError` linked to the private free function
  `read_deno_assets`. Point at the public re-export
  `DenoResolverError::ReadAssets` instead so the link is reachable
  from generated docs.
- `engine-runtime-bun-resolver` had a redundant explicit link target
  on `PlatformAssetResolution` (label and target resolve to the same
  item). Drop the redundant target and reword from `…s` to `… entries`
  so the link label doesn't carry a stray pluralisation `s`.

* fix(pacquet): drop redundant explicit target on PlatformAssetResolution link

The target resolves to the same item as the label (the type is imported
into scope further down in the same file), so rustdoc with
`-D rustdoc::redundant-explicit-links` rejects the form. Drop the
target and let intra-doc resolution pick it up via the existing use.

* fix(pacquet): satisfy CI Doc + Dylint on runtime-resolver crates

The pacquet CI runs rustdoc and `cargo dylint` (perfectionist lints)
with `-D warnings`, both of which catch issues `just ready` doesn't:

- rustdoc on `engine-runtime-node-resolver/lib.rs` reported
  `parse_node_specifier` / `get_node_mirror` / `get_node_artifact_address`
  as ambiguous between the module and the same-named function the
  module re-exports. Disambiguate by appending `()` to the link label
  so rustdoc resolves to the function.
- dylint's `perfectionist::single-letter-closure-param` flagged the
  `|c|` parameter in `parse_node_specifier::prerelease_channel`.
  Rename to `next` and break the chain so the body stays readable.
- dylint's `perfectionist::prefer-raw-string` flagged the regex literal
  on `NODE_EXTRAS_IGNORE_PATTERN`. Convert to a raw string so the
  backslash before `.` reads as the regex escape it is.
- dylint's `perfectionist::macro-trailing-comma` flagged the
  multi-line `matches!` invocation in the shasums-file `NotFound`
  test. Re-shape with the trailing comma and split across lines.
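
For illustration, the raw-string and trailing-comma shapes look
roughly like the sketch below. The pattern text, the `ShasumsError`
enum, and `is_not_found_for` are hypothetical; the real
`NODE_EXTRAS_IGNORE_PATTERN` and error types may differ:

```rust
/// Escaped form: the regex escape `\.` has to be written as `\\.`.
const ESCAPED_PATTERN: &str = "\\.(pkg|msi)$";

/// Raw form: the backslash reads as the regex escape it is.
const RAW_PATTERN: &str = r"\.(pkg|msi)$";

/// Hypothetical error type standing in for the shasums-file errors.
#[derive(Debug)]
enum ShasumsError {
    NotFound { file_name: String },
    Malformed,
}

/// Multi-line `matches!` with the trailing comma after the last argument.
fn is_not_found_for(error: &ShasumsError, expected: &str) -> bool {
    matches!(
        error,
        ShasumsError::NotFound { file_name } if file_name == expected,
    )
}
```

Both constants hold the same characters; only the source spelling
differs.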

* fix(pacquet): re-flow prerelease_channel closure to keep rustfmt happy

* fix(pacquet): address PR review on runtime-resolver port

Five behavioral / hygiene fixes the CodeRabbit review surfaced that
hold up against the upstream pnpm source:

- `crypto-shasums-file`: tighten `is_sha256_hex` from
  `is_ascii_hexdigit` (accepts `A-F`) to lowercase-only `0-9a-f`,
  matching upstream's `/^[a-f0-9]{64}$/`.
- `engine-runtime-node-resolver/resolve_node_version`: thread
  `error_for_status` into the `fetch_all_versions` GET so non-2xx
  responses from `index.json` surface as `FetchIndex` rather than
  being read into text and decoded as `DecodeIndex`. Matches the
  existing convention in `resolving-npm-resolver/fetch_full_metadata`.
- `engine-runtime-node-resolver/resolve_node_version`: introduce
  `satisfies_with_prereleases` mirroring the strategy already used in
  `resolving-deps-resolver/resolve_peers` so range selectors like
  `rc/18` pick up `18.0.0-rc.X` candidates. Upstream's
  `semver.maxSatisfying(...)` runs with `includePrerelease: true`;
  the Rust `node-semver` crate does not, so strip the prerelease
  suffix on a failed straight check and retry against the base version.
- `engine-runtime-deno-resolver/read_deno_assets`: same
  `error_for_status` fix on the GitHub Releases API call so a 404 or
  rate-limit response is a `FetchReleaseIndex` failure, not a JSON
  decode error.
- `package-manager/install_without_lockfile`: also drop the
  standalone `npm_resolver` Arc binding after the resolve pass.
  `drop(resolver)` only releases the `DefaultResolver` chain (one
  strong reference); the `npm_resolver` local kept a second strong
  reference because the deno- and bun-resolvers were handed clones
  of the same `Arc`. Without the explicit drop the packument cache
  stays alive through every fetch/import/link, which the comment
  above already says we want to avoid.
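
The last point can be sketched with plain `std::sync::Arc`. This is a
simplified model, not the actual `install_without_lockfile` code:
`PackumentCache` is a hypothetical stand-in for the cache, and the
chain is reduced to a `Vec` of clones:

```rust
use std::sync::{Arc, Weak};

/// Hypothetical stand-in for the cache owned by the npm resolver.
type PackumentCache = Vec<String>;

/// Reports whether any strong reference still keeps the cache alive.
fn cache_alive(observer: &Weak<PackumentCache>) -> bool {
    observer.upgrade().is_some()
}

/// Returns (alive after dropping only the chain,
///          alive after also dropping the local binding).
fn drop_sequence() -> (bool, bool) {
    let npm_resolver: Arc<PackumentCache> = Arc::new(vec!["left-pad".to_owned()]);
    let observer = Arc::downgrade(&npm_resolver);

    // The chain holds clones of the same `Arc`, as the runtime resolvers do.
    let resolver = vec![Arc::clone(&npm_resolver), Arc::clone(&npm_resolver)];

    // Dropping the chain releases its clones, but the local binding remains.
    drop(resolver);
    let after_chain_drop = cache_alive(&observer);

    // Only the explicit drop of the local binding frees the cache.
    drop(npm_resolver);
    let after_local_drop = cache_alive(&observer);

    (after_chain_drop, after_local_drop)
}
```

In this model the cache survives `drop(resolver)` and is freed only
once the local `npm_resolver` binding is dropped as well.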

Plus one test-only addition (a `darwin/arm` passthrough assertion in
`normalize_arch/tests.rs`) that pins the upstream behavior — pnpm
applies the `arm → armv7l` quirk unconditionally, including outside
Linux. Locking that in keeps a well-meaning future "Linux-only"
narrowing from silently diverging from pnpm.

Other CodeRabbit suggestions (propagate caller opts on the
npm-delegation calls, error on empty Deno variants, offline guard in
`resolve_latest_impl`, scope `arm` rewrite to Linux) all reflect
behaviors that *upstream pnpm doesn't have* — adopting any of them
would break the parity contract in
[`pacquet/AGENTS.md`](pacquet/AGENTS.md). The existing behavior is
left in place.

* fix(pacquet/deno-resolver): surface hex-decode failures as DENO_PARSE_HASH

`fetch_sha256` already guarantees the returned string is a 64-char
lower-case hex run, so `decode_hex` cannot fail in practice. Drop the
`unwrap_or_default()` fallback (which would silently feed an empty
byte slice into the integrity construction and then trip an opaque
`Integrity` parse error downstream) in favor of an explicit
`ParseHash` error, so a future change that loosens `extract_sha256`'s
validator surfaces with the right code instead of obscuring the
failure shape.
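
A rough sketch of the difference, using hand-rolled helpers rather
than the crate's real ones. `ParseHashError`, this `decode_hex`, and
`sha256_bytes` are hypothetical names for illustration, and the actual
error type and signatures may differ:

```rust
/// Hypothetical error carrying the input that failed to decode.
#[derive(Debug, PartialEq, Eq)]
struct ParseHashError {
    input: String,
}

/// Decodes one lowercase hex digit into its 4-bit value.
fn hex_nibble(byte: u8) -> Option<u8> {
    match byte {
        b'0'..=b'9' => Some(byte - b'0'),
        b'a'..=b'f' => Some(byte - b'a' + 10),
        _ => None,
    }
}

/// Decodes a lowercase hex string; `None` on odd length or a bad digit.
fn decode_hex(hex: &str) -> Option<Vec<u8>> {
    let bytes = hex.as_bytes();
    if bytes.len() % 2 != 0 {
        return None;
    }
    bytes
        .chunks_exact(2)
        .map(|pair| Some(hex_nibble(pair[0])? << 4 | hex_nibble(pair[1])?))
        .collect()
}

/// Explicit failure instead of `unwrap_or_default()`: a bad digest
/// becomes an error rather than an empty byte vector.
fn sha256_bytes(hex: &str) -> Result<Vec<u8>, ParseHashError> {
    decode_hex(hex).ok_or_else(|| ParseHashError { input: hex.to_owned() })
}
```

With the fallback form, `decode_hex("zz").unwrap_or_default()` yields
an empty vector that flows on silently; the `Result` form stops at the
point where decoding failed.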
2026-05-21 01:32:40 +02:00