pnpm/env/node.fetcher/test/node.test.ts
Zoltan Kochan b7f0f21582 feat: use SQLite for storing package index in the content-addressable store (#10827)
## Summary

Replace individual `.mpk` (MessagePack) files under `$STORE/index/` with a single SQLite database at `$STORE/index.db` using Node.js 22's built-in `node:sqlite` module. This reduces filesystem syscall overhead and improves space efficiency for small metadata entries.

Closes #10826

## Design

### New package: `@pnpm/store.index`

A new `StoreIndex` class wraps a SQLite database with a simple key-value API (`get`, `set`, `delete`, `has`, `entries`). Data is serialized with msgpackr and stored as BLOBs. The table uses `WITHOUT ROWID` for compact storage.

Key design decisions:

- **WAL mode** enables concurrent reads from workers while the main process writes.
- **`busy_timeout=5000`** plus a retry loop with `Atomics.wait`-based `sleepSync` handles `SQLITE_BUSY` errors from concurrent access.
- **Performance PRAGMAs**: `synchronous=NORMAL`, `mmap_size=512MB`, `cache_size=32MB`, `temp_store=MEMORY`, `wal_autocheckpoint=10000`.
- **Write batching**: `queueWrites()` batches pre-packed entries from tarball extraction and flushes them in a single transaction on `process.nextTick`. `setRawMany()` writes immediate batches (e.g. from `addFilesFromDir`).
- **Lifecycle**: `close()` auto-flushes pending writes, runs `PRAGMA optimize`, and closes the DB. A `process.on('exit')` handler ensures cleanup even on unexpected exits.
- **`VACUUM` after `deleteMany`** (used by `pnpm store prune`) to reclaim disk space.
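
The retry loop around `SQLITE_BUSY` mentioned in the list above could look roughly like the sketch below. The names `withBusyRetry` and `isBusyError`, the retry count, the delay, and the way a busy error is detected from the message are all assumptions made for illustration, not the code in `@pnpm/store.index`. Only `Atomics.wait` on a `SharedArrayBuffer` is taken from the description.

```typescript
// Block the current thread for `ms` milliseconds without busy-spinning.
// Atomics.wait times out because nothing ever notifies the cell.
function sleepSync (ms: number): void {
  const cell = new Int32Array(new SharedArrayBuffer(4))
  Atomics.wait(cell, 0, 0, ms)
}

// Assumption: a busy database is recognizable from the error message.
function isBusyError (err: unknown): boolean {
  return err instanceof Error && /SQLITE_BUSY|database is locked/.test(err.message)
}

// Run `fn`, retrying a bounded number of times when the database is busy.
// Any other error, or a busy error after the last retry, is rethrown.
function withBusyRetry<T> (fn: () => T, maxRetries = 5, delayMs = 10): T {
  for (let attempt = 0; ; attempt++) {
    try {
      return fn()
    } catch (err) {
      if (!isBusyError(err) || attempt >= maxRetries) throw err
      sleepSync(delayMs)
    }
  }
}
```

A synchronous sleep fits here because the `node:sqlite` API is itself synchronous, so the caller cannot simply `await` a timer between attempts.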

### Key format

Keys are `integrity\tpkgId` (tab-separated). Git-hosted packages use `pkgId\tbuilt` or `pkgId\tnot-built`.
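
As an illustration of the two key shapes, here is a minimal sketch. The helper names `storeIndexKey`, `gitHostedStoreIndexKey`, and `splitKey` are hypothetical and chosen for this example; only the tab-separated layout comes from the description above.

```typescript
// Key for a registry package: integrity and package id joined by a tab.
function storeIndexKey (integrity: string, pkgId: string): string {
  return `${integrity}\t${pkgId}`
}

// Key for a git-hosted package: package id plus a built / not-built marker.
function gitHostedStoreIndexKey (pkgId: string, built: boolean): string {
  return `${pkgId}\t${built ? 'built' : 'not-built'}`
}

// Split a key at its first tab into its two parts.
function splitKey (key: string): [string, string] {
  const tab = key.indexOf('\t')
  if (tab === -1) throw new Error(`Malformed store index key: ${key}`)
  return [key.slice(0, tab), key.slice(tab + 1)]
}
```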

### Shared StoreIndex instance

A single `StoreIndex` instance is threaded through the entire install lifecycle — from `createNewStoreController` through the fetcher chain, package requester, license scanner, SBOM collector, and dependencies hierarchy. This replaces the previous pattern of each component creating its own file-based index access.

### Worker architecture

Index writes are performed in the main process, not in worker threads. Workers send pre-packed `{ key, buffer }` pairs back to the main process via `postMessage`, where they are batched and flushed to SQLite. This avoids SQLite write contention between threads.
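
The main-process side of this flow can be sketched as a small queue that collects `{ key, buffer }` pairs and hands them to a single batch writer on the next tick. This is a simplified model under stated assumptions: the class name `WriteQueue` and the `writeBatch` callback are hypothetical stand-ins for the real `StoreIndex` internals, and the callback is where a single SQLite transaction would run.

```typescript
interface PackedEntry {
  key: string
  buffer: Uint8Array
}

class WriteQueue {
  private pending: PackedEntry[] = []
  private scheduled = false
  private readonly writeBatch: (entries: PackedEntry[]) => void

  constructor (writeBatch: (entries: PackedEntry[]) => void) {
    this.writeBatch = writeBatch
  }

  // Collect entries (for example, received from a worker via postMessage)
  // and schedule one flush for the next tick, however many calls arrive.
  queueWrites (entries: PackedEntry[]): void {
    this.pending.push(...entries)
    if (!this.scheduled) {
      this.scheduled = true
      process.nextTick(() => { this.flush() })
    }
  }

  // Write everything collected so far as one batch. Safe to call early,
  // for example from close(); an empty queue is a no-op.
  flush (): void {
    this.scheduled = false
    if (this.pending.length === 0) return
    const batch = this.pending
    this.pending = []
    this.writeBatch(batch)
  }
}
```

Because only the main process calls `writeBatch`, there is a single writer, which is the property the design relies on to avoid write contention between threads.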

### SQLite ExperimentalWarning suppression

`node:sqlite` emits an `ExperimentalWarning` on first load. This is suppressed via a `process.emitWarning` override injected through esbuild's `banner` option, which runs on line 1 of both `dist/pnpm.mjs` and `dist/worker.js` — before any module that loads `node:sqlite`.
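
One way such an override could work is sketched below. The wrapper name `suppressSqliteWarning`, the simplified `EmitWarning` type, and the message check are assumptions for illustration; the actual banner text injected by `pnpm/bundle.ts` is not shown in this description and may differ.

```typescript
// Simplified shape of process.emitWarning for this sketch.
type EmitWarning = (warning: string | Error, ...args: unknown[]) => void

// Wrap an emitWarning implementation so that the SQLite ExperimentalWarning
// is dropped and every other warning is forwarded unchanged.
function suppressSqliteWarning (original: EmitWarning): EmitWarning {
  return (warning, ...args) => {
    const first = args[0]
    let type: string | undefined
    if (typeof first === 'string') {
      type = first
    } else if (typeof first === 'object' && first !== null && 'type' in first) {
      type = String((first as { type: unknown }).type)
    } else if (warning instanceof Error) {
      type = warning.name
    }
    const message = typeof warning === 'string' ? warning : warning.message
    // Assumption: the warning text mentions SQLite.
    if (type === 'ExperimentalWarning' && message.includes('SQLite')) return
    original(warning, ...args)
  }
}

// Hypothetical installation, which must run before node:sqlite is loaded:
// process.emitWarning = suppressSqliteWarning(process.emitWarning.bind(process))
```

Filtering on both the warning type and the message keeps unrelated experimental warnings visible.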

### No migration from `.mpk` files

Old `.mpk` index files are not migrated. Packages missing from the new SQLite index are re-fetched on demand (the same behavior as a fresh store).

## Changed packages

121 files changed across these areas:

- **`store/index/`** — New `@pnpm/store.index` package
- **`worker/`** — Write batching moved from worker module into `StoreIndex` class; workers send pre-packed buffers to main process
- **`store/package-store/`** — StoreIndex creation and lifecycle management
- **`store/cafs/`** — Removed `getFilePathInCafs` index-file utilities (no longer needed)
- **`store/pkg-finder/`** — Reads from StoreIndex instead of `.mpk` files
- **`store/plugin-commands-store/`** — `store status` uses StoreIndex
- **`store/plugin-commands-store-inspecting/`** — `cat-index` and `find-hash` use StoreIndex
- **`fetching/tarball-fetcher/`** — Threads StoreIndex through fetchers; git-hosted fetcher flushes before reading
- **`fetching/git-fetcher/`, `binary-fetcher/`, `pick-fetcher/`** — Accept StoreIndex parameter
- **`pkg-manager/`** — `client`, `core`, `headless`, `package-requester` thread StoreIndex
- **`reviewing/`** — `license-scanner`, `sbom`, `dependencies-hierarchy` accept StoreIndex
- **`cache/api/`** — Cache view uses StoreIndex
- **`pnpm/bundle.ts`** — esbuild banner for ExperimentalWarning suppression

## Test plan

- [x] `pnpm --filter @pnpm/store.index test` — Unit tests for StoreIndex CRUD and batching
- [x] `pnpm --filter @pnpm/package-store test` — Store controller lifecycle
- [x] `pnpm --filter @pnpm/package-requester test` — Package requester reads from SQLite index
- [x] `pnpm --filter @pnpm/tarball-fetcher test` — Tarball and git-hosted fetcher writes
- [x] `pnpm --filter @pnpm/headless test` — Headless install
- [x] `pnpm --filter @pnpm/core test` — Core install, side effects, patching
- [x] `pnpm --filter @pnpm/plugin-commands-rebuild test` — Rebuild reads from index
- [x] `pnpm --filter @pnpm/license-scanner test` — License scanning
- [x] e2e tests pass

🤖 Generated with [Claude Code](https://claude.com/claude-code)
2026-03-06 12:59:04 +01:00


import AdmZip from 'adm-zip'
import { Response } from 'node-fetch'
import path from 'path'
import { Readable } from 'stream'
import type { FetchNodeOptionsToDir as FetchNodeOptions } from '@pnpm/node.fetcher'
import { StoreIndex } from '@pnpm/store.index'
import { tempDir } from '@pnpm/prepare'
import { jest } from '@jest/globals'
jest.unstable_mockModule('detect-libc', () => ({
  isNonGlibcLinux: jest.fn(),
}))
const { fetchNode } = await import('@pnpm/node.fetcher')
const { isNonGlibcLinux } = await import('detect-libc')
// A stable fake hex digest used as a placeholder sha256 in mock SHASUMS256.txt files.
// Any value that differs from the real digest works: the tarball content won't match,
// so the integrity check fails, but every fetch call is recorded before that happens,
// so the URL assertions still hold.
const FAKE_SHA256 = '5f70bf18a086007016e948b04aed3b82103a36bea41755b6cddfaf10ace3c6ef'
const fetchMock = jest.fn(async (url: string) => {
  if (url.endsWith('SHASUMS256.txt')) {
    // Return a minimal SHASUMS file covering the artifacts used in tests.
    return new Response(
      `${FAKE_SHA256} node-v22.0.0-linux-x64-musl.tar.gz\n`
    )
  }
  if (url.endsWith('.zip')) {
    // The Windows code path for pnpm's node bootstrapping expects a subdir
    // within the .zip file.
    const pkgName = path.basename(url, '.zip')
    const zip = new AdmZip()
    zip.addFile(`${pkgName}/dummy-file`, Buffer.from('test'))
    return new Response(Readable.from(zip.toBuffer()))
  }
  return new Response(Readable.from(Buffer.alloc(0)))
})
const storeIndexes: StoreIndex[] = []
afterAll(() => {
  for (const si of storeIndexes) si.close()
})
beforeEach(() => {
  jest.mocked(isNonGlibcLinux).mockReturnValue(Promise.resolve(false))
  fetchMock.mockClear()
})
test.skip('install Node using a custom node mirror', async () => {
  tempDir()
  const nodeMirrorBaseUrl = 'https://pnpm-node-mirror-test.localhost/download/release/'
  const storeDir = path.resolve('store')
  const storeIndex = new StoreIndex(storeDir)
  storeIndexes.push(storeIndex)
  const opts: FetchNodeOptions = {
    nodeMirrorBaseUrl,
    storeDir,
    storeIndex,
  }
  await fetchNode(fetchMock, '16.4.0', path.resolve('node'), opts)
  for (const call of fetchMock.mock.calls) {
    expect(call[0]).toMatch(nodeMirrorBaseUrl)
  }
})
test.skip('install Node using the default node mirror', async () => {
  tempDir()
  const storeDir = path.resolve('store')
  const storeIndex = new StoreIndex(storeDir)
  storeIndexes.push(storeIndex)
  const opts: FetchNodeOptions = {
    storeDir,
    storeIndex,
  }
  await fetchNode(fetchMock, '16.4.0', path.resolve('node'), opts)
  for (const call of fetchMock.mock.calls) {
    expect(call[0]).toMatch('https://nodejs.org/download/release/')
  }
})
test('auto-detects musl on non-glibc Linux and uses unofficial-builds mirror', async () => {
  jest.mocked(isNonGlibcLinux).mockReturnValue(Promise.resolve(true))
  tempDir()
  // The function will throw because the downloaded tarball content won't match
  // the fake sha256 we put in the SHASUMS256.txt mock, but all fetch calls are
  // recorded before the integrity check, so we can assert the correct URLs.
  const storeIndex = new StoreIndex(path.resolve('store'))
  storeIndexes.push(storeIndex)
  await expect(
    fetchNode(fetchMock, '22.0.0', path.resolve('node'), {
      storeDir: path.resolve('store'),
      storeIndex,
      platform: 'linux',
      arch: 'x64',
      retry: { retries: 0 },
    })
  ).rejects.toThrow()
  const shasumsUrl = fetchMock.mock.calls[0][0] as string
  expect(shasumsUrl).toContain('unofficial-builds.nodejs.org')
  const tarballUrl = fetchMock.mock.calls[1][0] as string
  expect(tarballUrl).toContain('unofficial-builds.nodejs.org')
  expect(tarballUrl).toContain('node-v22.0.0-linux-x64-musl.tar.gz')
})