* chore: enable clippy::pedantic lint group for pacquet workspace * style(pacquet): comply with clippy::pedantic Apply clippy's machine-applicable pedantic fixes across the workspace (inlined format args, removed needless borrows/closures, added must_use, etc.), fix a few doc-comment backtick nits, and drop pointless #[inline(always)] on trivial accessors. Opt specific pedantic lints back out in [workspace.lints.clippy] with documented justifications, grouped into false positives, library-API hygiene that doesn't fit an internal CLI, suggestions that conflict with the cardinal rule of porting pnpm 1:1, and opinionated style. * style: taplo-format Cargo.toml lint table * style(pnpr): comply with clippy::pedantic in merged auth backend code Re-apply pedantic compliance to the networked-SQLite auth backend that landed on main (#12186, #12199/#12206): doc-comment backticks, #[must_use] on constructors and status_code, i64::from over `as`, map_or, and a method-reference closure. * docs(clippy): trim and inline the pedantic allow-list comments * docs(clippy): note perfectionist supersedes many_single_char_names * docs(clippy): note pnpm-mirroring rationale on structure/naming lints * docs(clippy): mark unused_async as deferred pending audit * style: enable clippy::match_wildcard_for_single_variants * refactor: enable clippy::unused_self Convert two self-less private methods (overrides pick_most_specific, tarball head_only_result) to associated functions. * refactor: enable clippy::ref_option Widen engine_json to Option<&str>; #[expect] the two serde serialize_with helpers, which serde must call as f(&field, ser). * perf: enable clippy::trivially_copy_pass_by_ref Pass the 1-byte Copy types NodeLinker and FilterWorkspaceProjectsOptions by value; #[expect] the serde skip_serializing_if helper is_false. * perf: enable clippy::assigning_clones Use clone_from for seven field assignments to reuse allocations. * style: enable clippy::manual_let_else Convert 27 match/if-let guards to let-else; preserve the non-UTF-8 skip rationale comment in the directory walker. * style: enable clippy::default_trait_access Name the concrete type on Default::default() call sites; #[expect] two struct-literal test fixtures where naming each field type would force ~20 imports. * refactor: enable clippy::format_push_string Replace push_str(&format!(...)) with write!/writeln! into the target String (local 'use std::fmt::Write as _'); writeln! preserves the exact LF/CRLF shell-shim output. * refactor: enable clippy::needless_pass_by_value Take by reference where the argument is only read (incl. dropping some redundant clones in resolve_peers' recursion). Where converting would cascade badly, #[expect] with a reason: functions that destructure/consume the arg (build_resolve_result, PrefetchingResolver, S3Store::new), the by-value `impl IntoIterator + Clone` in build_direct_deps_by_importer, and the serde/test helpers whose owned fixtures keep call sites clean. * fix(perfectionist): satisfy dylint after format_push_string changes Add trailing commas to the multi-line writeln! shell-shim templates (macro_trailing_comma) and merge the new `fmt::Write as _` imports into each file's existing `use std::{...}` block (import_granularity). * docs(clippy): explain missing_errors_doc suppression; mark missing_panics_doc deferred * fix(perfectionist): collapse fmt::{self, Write as _} in work_env imports The format_push_string Write import landed as a sibling fmt:: path next to the existing fmt import; merge them so import_granularity passes. * style: enable clippy::return_self_not_must_use Add #[must_use] to the WorkspaceTreeCtx builder methods, matching the #[must_use] already on the parallel TreeCtx builders. * perf: enable clippy::large_stack_arrays Heap-allocate the 64 KiB read buffer in verify_file_integrity with a Vec instead of placing it on the stack. * chore(clippy): enable clippy::nursery group Enable the nursery lint group on the pacquet/pnpr workspace and bring the code into compliance. Fixed in code: - iter_on_single_items: [x].into_iter()/.iter() -> std::iter::once - equatable_if_let: pattern match -> equality check (the install_accelerator rewrite wraps in a multi-line matches!, which gets a trailing comma for perfectionist::macro_trailing_comma) - needless_pass_by_ref_mut: load_pending_row/apply_write_msg take &StoreIndex Opted back out in Cargo.toml, each with a documented justification: use_self, too_long_first_doc_paragraph, missing_const_for_fn, option_if_let_else, significant_drop_tightening, redundant_pub_crate, derive_partial_eq_without_eq, branches_sharing_code, useless_let_if_seq, single_option_map, iter_with_drain, literal_string_with_formatting_args, collection_is_never_read. Dropped the now-redundant individual nursery warns (needless_collect, or_fun_call, redundant_clone) the group now covers, plus the default-on unnecessary_lazy_evaluations. Kept clone_on_ref_ptr and if_then_some_else_none (restriction lints not enabled by any group). * style: bring merged main code into clippy pedantic compliance The 17 commits merged from main predate this branch's pedantic/nursery lint config, so their new code tripped pedantic lints. Apply the machine-applicable fixes (uninlined_format_args, if_not_else, elidable_lifetime_names, must_use_candidate, single_match_else, map_unwrap_or, default_trait_access, assigning_clones, doc_markdown, ...) and re-add the documented #[expect(needless_pass_by_value)] on S3Store::new that this branch had carried on the now-replaced file. * style: bring merged main code into clippy pedantic compliance The 28 commits merged from main predate this branch's lint config, so their new code tripped pedantic lints. Apply the machine-applicable fixes (uninlined_format_args, manual_let_else, needless_raw_string_hashes, redundant_closure_for_method_calls, map_unwrap_or, elidable_lifetime_names, doc_markdown, ...) plus a few by hand: - derive Copy on LinkSlotsParallel (all fields are Copy/refs) to clear needless_pass_by_value without a signature change - deduplicate_all takes &[Vec<DepPath>] (it only borrows the duplicates) - pick_most_specific becomes an associated fn (it never used self) - default_trait_access -> concrete types; assigning_clones -> clone_from; format_push_string -> write! - #[expect] with reasons where a fix would churn main's feature code: needless_pass_by_value on the recursive resolve_node and a test helper, and float_cmp on two deterministic-fixture assertions * style: enable clippy::allow_attributes and allow_attributes_without_reason Both are restriction lints (not implied by any group), enabled alongside the existing clone_on_ref_ptr / if_then_some_else_none. Convert every #[allow(...)] (including one nested in cfg_attr) to #[expect(...)]; all already carried a reason, so allow_attributes_without_reason is satisfied. Drop two now-redundant suppressions surfaced by the conversion: a duplicated #[expect(too_many_arguments)] on fetch_and_extract_zip_once (a prior merge left both an allow and an expect), and the #[expect(dead_code)] on MissingPeerInfo's fields (the #[derive(Debug, Clone)] already reads them, so dead_code never fired). clone_on_ref_ptr was already enabled. mod_module_files is intentionally NOT enabled: it mandates mod.rs, the opposite of the flat module.rs pattern this project requires (CODE_STYLE_GUIDE.md, enforced by perfectionist::flat_module_pattern). * style: enable clippy::mod_module_files to enforce the flat module layout mod_module_files bans mod.rs files, enforcing the flat module.rs pattern this project already uses (0 mod.rs in the tree, so no violations). Update CODE_STYLE_GUIDE.md to cite it as the enforcer; perfectionist's flat_module_pattern is being retired in favor of this Clippy rule. * fix(perfectionist): trailing comma on wrapped assert_eq! in workspace_yaml tests The default_trait_access fix lengthened the assert_eq! so fmt wrapped it to multi-line, which perfectionist::macro_trailing_comma requires to end with a trailing comma. * fix(fs): use cfg_attr expect instead of allow for Windows-unused mode args With clippy::allow_attributes enabled, the #[cfg_attr(windows, allow(unused))] on make_file_executable and the ensure_file/write_atomic mode params fails Windows CI. Switch to #[cfg_attr(windows, expect(unused, reason = ...))]; on Windows the lint fires (Unix mode unused there) so the expectation is fulfilled, and the attribute stays inert on Unix. * fix(fs): drop the Windows unused suppression on ensure_file's mode arg ensure_file forwards mode to verify_or_rewrite unconditionally, so it is used on Windows too; the #[cfg_attr(windows, expect(unused))] was therefore unfulfilled and failed Windows CI under -D warnings. write_atomic and make_file_executable keep their expect — they use mode/file only under #[cfg(unix)], so the lint fires (and the expectation holds) on Windows. * chore(git): revert "fix(fs): drop the Windows unused suppression on ensure_file's mode arg" This reverts commit1d617c3e1f. * chore(git): revert "fix(fs): use cfg_attr expect instead of allow for Windows-unused mode args" This reverts commit155e4a3dde. * chore(git): revert "style: enable clippy::allow_attributes and allow_attributes_without_reason" This reverts commita47d7926f2. * style: bring merged main code into clippy compliance + fix merge mismatch - Add & at the two run_postinstall_hooks / run_project_lifecycle_scripts call sites: this branch widened lifecycle.rs to take &RunPostinstallHooks, but main's by-value call sites came in via the conflict resolution. - pedantic fixes on main's new code: must_use_candidate, unnested_or_patterns, manual_let_else, default_trait_access, iter_on_single_items, and trivially_copy_pass_by_ref (map_node_linker takes NodeLinker by value). --------- Co-authored-by: Claude <noreply@anthropic.com>
43 KiB
Code Style Guide
Introduction
Clippy cannot yet detect all suboptimal code. This guide supplements that.
This guide is incomplete. More may be added as more pull requests are reviewed.
This is a guide, not a rule. Contributors may break them if they have a good reason to do so.
Terminology
Owned type
Doesn't have a lifetime, neither implicit nor explicit.
Examples: String, OsString, PathBuf, Vec<T>, etc.
Borrowed type
Has a lifetime, either implicit or explicit.
Examples: &str, &OsStr, &Path, &[T], etc.
Copying
The act of cloning or creating an owned data from another owned/borrowed data.
Examples:
owned_data.clone()borrowed_data.to_owned()OwnedType::from(borrowed_data)path.to_path_buf()str.to_string()- etc.
Guides
Naming convention
Follow the Rust API guidelines. Specific naming conventions for generics, variables, and closure parameters are covered in the sections below.
Module Organization
- Use the flat file pattern (
module.rs) rather thanmodule/mod.rsfor submodules. Enforced byclippy::mod_module_files, which bansmod.rsfiles. (perfectionist::flat_module_patterncovered this previously and is being retired in favor of the Clippy rule.) - List
pub moddeclarations first, thenpub usere-exports, then private imports and items. - Use
pub useto re-export key types at the module level for convenience.
pub mod install_package_by_snapshot;
pub mod install_package_from_registry;
pub mod install_without_lockfile;
pub use install_package_by_snapshot::InstallPackageBySnapshot;
pub use install_package_from_registry::InstallPackageFromRegistry;
Import Organization
Prefer merged imports. Combine multiple items from the same crate root into a single use statement with nested braces rather than separate use lines (the crate granularity). Import ordering is enforced by cargo fmt; the granularity is enforced by perfectionist::import_granularity (configured to crate in dylint.toml). Imports gated by a platform attribute such as #[cfg(unix)] go in a separate block after the main imports.
use crate::{
package_manager::PackageManager,
store::Store,
};
use pipe_trait::Pipe;
use std::{path::PathBuf, sync::Arc};
#[cfg(unix)]
use std::os::unix::fs::PermissionsExt;
No star imports
Avoid star (glob) imports inside the bodies of regular modules. Import items explicitly by name everywhere except the two cases noted below. The rule applies to production code, tests, integration tests, build scripts, and developer tooling under tasks/.
The two exceptions are:
- External-crate preludes, such as
use rayon::prelude::*;oruse assert_cmd::prelude::*;. The upstream crate has already curated which items are intended to be glob-imported, so listing them out by hand creates a maintenance burden the moment the upstream prelude changes. Use the prelude in the form the crate documents. - Re-exports at the root of a module or crate, such as
pub use submodule::*;inlib.rs. These are part of the public surface that the crate intentionally exposes, and listing the items individually duplicates information that already lives in the submodule.
Star imports inside a module body are the case worth banning. They make it hard to tell where a name comes from, they hide accidental shadowing, and use super::*; is especially harmful in tests. The form pulls every privately imported item from the outer module into scope, so an import the production code no longer uses can keep compiling indefinitely as long as some test still references it. Removing dead imports becomes guesswork.
// Bad
#[cfg(test)]
mod tests {
use super::*;
}
// Good
#[cfg(test)]
mod tests {
use super::{ParsedThing, parse_thing};
}
// Allowed (external-crate prelude)
use rayon::prelude::*;
use assert_cmd::prelude::*;
// Allowed (root-of-module re-export)
pub use comver::*;
pub use load_lockfile::*;
Generic Parameter Naming
Use descriptive names for type parameters (Size, Name, Manifest, Store, Reporter) and const generics instead of single letters. Enforced by perfectionist::single_letter_generic and perfectionist::single_letter_const_generic. The idiomatic T for a lone type parameter and N for a const-generic array length are accepted at the site with #[expect(perfectionist::single_letter_generic, …)] / #[expect(perfectionist::single_letter_const_generic, …)].
Variable and Closure Parameter Naming
Use descriptive names for variables and closure parameters, in test code as well as production code. Single letters are accepted only where the rules' default allowlists permit them: n/f/i/j/k for their conventional roles, the sort_by / sort_by_key / min_by / max_by / fold callback shape, and single-expression closure bodies. Multi-line closure bodies and every let binding (including in #[cfg(test)] code) are flagged. The same applies to single-letter const/static items. Enforced by:
perfectionist::single_letter_function_paramperfectionist::single_letter_closure_paramperfectionist::single_letter_let_bindingperfectionist::single_letter_const_itemperfectionist::single_letter_static_item
When to use owned parameter? When to use borrowed parameter?
This is a trade-off between API flexibility and performance.
If using an owned signature would reduce copying, one should use an owned signature.
Otherwise, use a borrowed signature to widen the API surface.
Example 1: Preferring owned signature.
fn push_path(list: &mut Vec<PathBuf>, item: &Path) {
list.push(item.to_path_buf());
}
push_path(&mut my_list, &my_path_buf);
push_path(&mut my_list, my_path_ref);
The above code is suboptimal because it forces the copying of my_path_buf even though the type of my_path_buf is already PathBuf.
Changing the signature of item to PathBuf would help remove .to_path_buf() inside the push_path function, eliminate the cloning of my_path_buf (the ownership of my_path_buf is transferred to push_path).
fn push_path(list: &mut Vec<PathBuf>, item: PathBuf) {
list.push(item);
}
push_path(&mut my_list, my_path_buf);
push_path(&mut my_list, my_path_ref.to_path_buf());
It does force my_path_ref to be explicitly copied, but since item is not copied, the total number of copying remains the same for my_path_ref.
Example 2: Preferring borrowed signature.
fn show_path(path: PathBuf) {
println!("The path is {path:?}");
}
show_path(my_path_buf);
show_path(my_path_ref.to_path_buf());
The above code is suboptimal because it forces the copying of my_path_ref even though a &Path is already compatible with the code inside the function.
Changing the signature of path to &Path would help remove .to_path_buf(), eliminating the unnecessary copying:
fn show_path(path: &Path) {
println!("The path is {path:?}");
}
show_path(my_path_buf);
show_path(my_path_ref);
Use the most encompassing type for function parameters
The goal is to allow the function to accept more types of parameters, reducing type conversion.
Example 1:
fn node_bin_dir(workspace: &PathBuf) -> PathBuf {
workspace.join("node_modules").join(".bin")
}
let a = node_bin_dir(&my_path_buf);
let b = node_bin_dir(&my_path_ref.to_path_buf());
The above code is suboptimal because it forces the copying of my_path_ref only to be used as a reference.
Changing the signature of workspace to &Path would help remove .to_path_buf(), eliminating the unnecessary copying:
fn node_bin_dir(workspace: &Path) -> PathBuf {
workspace.join("node_modules").join(".bin")
}
let a = node_bin_dir(&my_path_buf);
let b = node_bin_dir(my_path_ref);
Trait Bounds
Prefer where clauses over inline bounds when there are multiple constraints:
impl<Store, Manifest, Reporter> InstallPackage<Store, Manifest, Reporter>
where
Store: PackageStore + Send + Sync,
Manifest: AsRef<PackageManifest> + Send,
Reporter: ProgressReporter + Sync + ?Sized,
{
/* ... */
}
Pattern Matching
When mapping enum variants to values, prefer the concise wrapping style:
ExitCode::from(match self {
InstallError::NetworkFailure(_) => 2,
InstallError::ManifestParseFailure(_) => 3,
})
Using pipe-trait
This codebase uses the pipe-trait crate. The Pipe trait enables method-chaining through unary functions, keeping code in a natural left-to-right reading order. Import it as use pipe_trait::Pipe;.
Any callable that takes a single argument works with .pipe(). This includes free functions, closures, newtype constructors, enum variant constructors, Some, Ok, Err, and trait methods such as From::from. The guidance below applies equally to all of them.
When to use pipe
Chaining through a unary function at the end of an expression chain:
// Good: pipe keeps the chain flowing left-to-right
manifest.dependencies().pipe(DependencyMap)
entries.into_iter().collect::<HashMap<_, _>>().pipe(Store)
Avoiding deeply nested function calls:
// Nested calls are harder to read
let parsed = serde_json::from_slice::<Manifest>(&bytes)?;
// Prefer piping instead
let parsed = bytes.as_slice().pipe(serde_json::from_slice::<Manifest>)?;
Chaining through multiple unary functions:
name.pipe(InstallError::MissingPackage).pipe(Err)
Continuing a method chain through a free function and back to methods:
url
.pipe(normalize_registry_url)
.map(Cow::Borrowed)
Using .pipe_as_ref() to pass a reference mid-chain. This avoids introducing a temporary variable when a free function takes &T:
// Good: pipe_as_ref calls .as_ref() then passes to the function
"node_modules"
.pipe(Path::new)
.join(package_name)
.pipe_as_ref(is_within_store)
.then_some(package_name)
When NOT to use pipe
Simple standalone function calls. Pipe adds noise with no readability benefit:
// Bad: unnecessary pipe
let result = value.pipe(foo);
// Good: just call the function directly
let result = foo(value);
This applies to any unary callable, such as Some, Ok, or constructors, when there is no preceding chain to continue:
// Bad: pipe adds nothing here
let result = value.pipe(Some);
// Good: direct call is clearer
let result = Some(value);
However, piping through any unary function is preferred when it continues an existing chain:
// Good: continues a chain
manifest.summarize().pipe(Some)
Doc comment intra-links
When a doc comment (/// or //! ) mentions an identifier (type, trait, function, method, module, constant, macro, etc.) that is intra-linkable from the current scope, write the mention as a rustdoc intra-doc link rather than as bare prose. Intra-doc links give readers one-click navigation and let rustdoc warn when a referenced item is renamed or removed, so the docs stay in sync with the code.
// Good
/// Installs the package described by [`PackageManifest`] into [`Store`].
pub fn install(manifest: &PackageManifest, store: &Store) { /* ... */ }
// Bad: identifiers are mentioned as bare prose
/// Installs the package described by `PackageManifest` into `Store`.
pub fn install(manifest: &PackageManifest, store: &Store) { /* ... */ }
If the identifier is not directly in scope, use a path link ([`crate::store::Store`]) or a disambiguated link ([`Store`](crate::store::Store)) rather than dropping back to bare prose. Reserve plain backticks for things that genuinely cannot be linked, such as identifiers from external code that rustdoc cannot resolve, shell commands, file paths, or literal values.
Documentation comments
A doc comment (/// or //!) is rendered by rustdoc as the documentation of the item it attaches to, and it is visible to every reader who can see that item. The doc comment of a pub item therefore reaches every downstream user of the crate, including users who never read the source. Do not reference items more private than the item being documented. Disallowed references include a private function named in the doc comment of a pub item, a pub(crate) type named in the doc comment of a pub item, and a private constant named in the doc comment of a pub(crate) item. A reader who only sees the rendered docs cannot follow such a reference, and intra-doc links to inaccessible items become broken links in cargo doc output.
If the explanation genuinely depends on that more private item, choose one of two fixes. The first option is to widen the visibility of the referenced item, adding a re-export when one fits the API. The second option is to move the explanation into a regular // comment on the implementation, where readers of the source can see it. Reserve /// and //! for things a downstream user of the item needs to know. Use // for notes useful only to someone reading the source.
// Bad: public doc references a private helper
/// Builds the lockfile by walking dependencies.
///
/// Internally calls [`walk_deps_inner`] to handle cycles.
pub fn build_lockfile(/* ... */) { /* ... */ }
fn walk_deps_inner(/* ... */) { /* ... */ }
// Good: public doc describes observable behavior
/// Builds the lockfile by walking dependencies.
///
/// Cycles in the dependency graph are reported as
/// [`LockfileError::CycleDetected`].
pub fn build_lockfile(/* ... */) { /* ... */ }
fn walk_deps_inner(/* ... */) { /* ... */ }
Serde Cow<'de, str> vs String source types
When wiring a type into serde with #[serde(try_from = "...")] or #[serde(from = "...")], pick the source according to whether the deserialized value retains the string.
Do not use &'de str. Text formats such as JSON, YAML, and TOML accept escape sequences (for example, JSON's "\u0061") that the deserializer must decode into a fresh buffer. Borrowed &'de str deserialization rejects every input that requires decoding, so the type fails on values the format itself accepts.
Prefer Cow<'de, str> when the conversion discards the string or splits it into pieces, for example when it parses into a number, an enum discriminant, or a struct whose fields are substrings of the input. The deserializer borrows from the input when no decoding is needed and allocates only when escapes force it.
Prefer String when the entire input is moved into the resulting value verbatim. Taking String lets the conversion store the buffer directly without re-cloning.
use std::borrow::Cow;
#[derive(serde::Deserialize)]
#[serde(try_from = "Cow<'de, str>")]
struct Port(u16);
impl<'a> TryFrom<Cow<'a, str>> for Port {
type Error = std::num::ParseIntError;
fn try_from(value: Cow<'a, str>) -> Result<Self, Self::Error> {
value.parse().map(Port)
}
}
#[derive(serde::Deserialize)]
#[serde(try_from = "String")]
struct PackageName(String);
impl TryFrom<String> for PackageName {
type Error = InvalidPackageName;
fn try_from(value: String) -> Result<Self, Self::Error> {
validate(&value)?;
Ok(PackageName(value))
}
}
The same trade-off applies to the infallible #[serde(from = "...")] form.
Error Handling
- Use
derive_morefor error types. Only derive the traits that are actually used:Display: derive when the type needs to be displayed, such as when it is printed to stderr or used in format strings.Error: derive when the type is used as astd::error::Error, such as the error type inResultor the source of another error. Not all types withDisplayneedError.- A type that only needs formatting and not error handling should derive
DisplaywithoutError.
- Minimize
unwrap()in non-test code; use proper error propagation.unwrap()is acceptable in tests, and is also acceptable for provably infallible operations when accompanied by a comment explaining the invariant. When deliberately ignoring an error, use.ok()and document the rationale.
#[derive(Debug, Display, Error)]
#[non_exhaustive]
pub enum InstallError {
#[display("NetworkFailure: {_0}")]
NetworkFailure(reqwest::Error),
}
Conditional Test Skipping: #[cfg] vs #[cfg_attr(..., ignore)]
When a test cannot run under certain conditions, such as on the wrong platform, prefer #[cfg_attr(..., ignore)] over #[cfg(...)] to skip it. The test still compiles on every configuration and is only skipped at runtime. This approach catches type errors and regressions that a #[cfg] skip would hide.
Use #[cfg] on tests only when the code cannot compile under the condition. An example is a test that references types, functions, or trait methods gated behind #[cfg] that do not exist on other platforms or feature sets.
Prefer including a reason string in the ignore attribute to explain why the test is skipped.
// Good: test compiles everywhere, skipped at runtime on non-unix
#[test]
#[cfg_attr(not(unix), ignore = "only one path separator style is tested")]
fn unix_path_logic() { /* uses hardcoded unix paths but no unix-only types */ }
// Good: test CANNOT compile on non-unix (uses unix-only types)
#[cfg(unix)]
#[test]
fn unix_permissions() { /* uses PermissionsExt which only exists on unix */ }
When or when not to log during tests? What to log? How to log?
The goal is to enable the programmer to quickly inspect the test subject should a test fail.
Logging is almost always necessary when the assertion is not assert_eq!. For example: assert!, assert_ne!, etc.
Logging is sometimes necessary when the assertion is assert_eq!.
If the values being compared with assert_eq! are simple scalar or single line strings, logging is almost never necessary. It is because assert_eq! should already show both values when assertion fails.
If the values being compared with assert_eq! are strings that may have many lines, they should be logged with eprintln! and {} format.
If the values being compared with assert_eq! have complex structures (such as a struct or an array), they should be logged with dbg!.
Example 1: Logging before assertion is necessary
let message = my_func().unwrap_err().to_string();
eprintln!("MESSAGE:\n{message}\n");
assert!(message.contains("expected segment"));
let output = execute_my_command();
let received = output.stdout.to_string_lossy(); // could have multiple lines
eprintln!("STDOUT:\n{received}\n");
assert_eq!(received, expected);
let hash_map = create_map(my_argument);
dbg!(&hash_map);
assert!(hash_map.contains_key("foo"));
assert!(hash_map.contains_key("bar"));
Example 2: Logging is unnecessary
let received = add(2, 3);
assert_eq!(received, 5);
If the assertion fails, the value of received will appear alongside the error message.
Unit test file layout
Always place unit tests in a dedicated external tests submodule (#[cfg(test)] mod tests;) rather than inline in the parent file — for src/foo.rs the tests live in src/foo/tests.rs. This keeps production and test code in separate files and avoids churning git blame. Enforced by perfectionist::unit_test_file_layout (configured to inline_style = "external_only", nested layout), which flags inline test code, the flattened src/foo_tests.rs sibling, and the skipped-intermediate form.
Cloning Arc and Rc
Prefer Arc::clone(&x) / Rc::clone(&x) over x.clone() for reference-counted types. The qualified form makes the O(1) refcount bump visible at the call site and fails to compile if a refactor changes the binding's type to something whose Clone is an arbitrarily expensive deep copy. Enforced by clippy::clone_on_ref_ptr, wired into [workspace.lints.clippy] in the root Cargo.toml.
Reading process state
Library code should rarely call std::env::var, std::env::var_os, or std::env::current_dir directly. Process state belongs to the running process, not to the operation a function performs, so reaching for it couples the logic to whichever shell invoked pacquet and hides the dependency from the caller. Prefer one of:
- Take the value as a parameter. A function that needs a path declares
path: &Path; one that needs an env-var value takes it as a&strargument or anOption<String>. - Thread the read through the DI seam. Declare
Sys: EnvVar(orEnvVarOs,GetHomeDir,GetCurrentDir, …) and callSys::var(name)/Sys::var_os(name)/Sys::home_dir()/Sys::current_dir(). See Dependency injection for tests for the convention. This is howConfig::currentresolves the home dir and theNPM_CONFIG_WORKSPACE_DIRlookup, and howcrates/workspace'sfind_workspace_dir_from_env_withresolves its env-var lookup — both let tests drive the read without touching the process environment.
The narrow case where a direct call is acceptable is computing the default of a Config field that has no caller to take the value from — default_store_dir, default_modules_dir, and default_virtual_store_dir are the canonical examples. Even there, route the lookup through the DI seam: write the helper as default_store_dir<Sys: EnvVar + GetHomeDir + GetCurrentDir>() and inline the production composition at the SmartDefault site (#[default(_code = "default_store_dir::<Host>()")]), so the fake-Sys test path and the production Host path resolve through the same generic. The other accepted boundary reads are program-entry knobs (RAYON_NUM_THREADS in crates/cli, TRACE in crates/diagnostics) and the lifecycle-script env snapshot in crates/executor and crates/git-fetcher, which has to forward the parent environment to spawned children verbatim. If new library code seems to need env::var / env::current_dir for any other reason, the answer is almost always a &Path (or equivalent) parameter, not a process-state read.
Dependency injection for tests
Side-effecting code — filesystem access, environment variables, network calls, time, process state — has two testing routes. The default route is a real fixture: a tempfile::TempDir for filesystem work, the mocked registry (just registry-mock) for HTTP, an integration test that spawns the actual pacquet binary in a scratch directory for end-to-end flows. Real fixtures keep tests close to what users see and scale with the codebase without per-call-site plumbing; they are the right tool everywhere except the cases enumerated below.
The dependency-injection seam described below is the narrow second route. Reach for it only when one of the following applies:
- Filesystem error branches the host OS won't reproduce portably.
PermissionDenied,ENOSPC, a directory that disappears mid-walk, a chmod that fails after the file exists — provoking these on real disks is platform-specific, racy, or both. A fake that returns the exactio::ErrorKindis the only portable way to drive the branch. - Deterministic time. Asserting that
prunedAtequals a specific HTTP-date (RFC 7231 IMF-fixdate, whathttpdate::fmt_http_dateemits), or that a throttled emitter fires on the second sample, needs the clock to be a known value. The realSystemTime::nowmakes those assertions flaky. - Shared process-global state that tests would otherwise mutate. When a branch depends on a single per-process slot — environment variables, the current working directory, the umask, signal handlers, the global allocator — the only way to exercise it without DI is to write to that slot, and the write is observed by every other test in the same process. A serialisation lock (the
EnvGuardpattern that pnpm/pacquet#343 + pnpm/pnpm#11718 retired fromdefault_store_dir) restores correctness only by forcing the affected tests to run single-threaded, and leaves anunsafe { env::set_var(...) }block at the call site. A capability-trait fake keeps the read deterministic and the mutation contained to the test that needs it, so nextest's in-process parallelism stays useful andunsafestays out of test code. The reverse follows too: a real-fixture happy path should not be promoted to DI just because it crosses a process-global slot — only the mutation side of the slot is the problem; reading whatever the shell already has is fine. - External-service happy paths that can't be staged in CI. Upstream pnpm has features whose normal flow depends on real external systems —
pnpm login's 2FA prompt round-trip, the OIDC token exchange and provenance attestation inpnpm publish, and similar — where the happy path itself is what needs faking, not just the error path. When pacquet ports those features, DI is the right tool for their tests too. (As of writing, these are not yet ported; this exception is documented so the convention is in place when they land.) - Unreachable-by-design preconditions. When a function declares a capability bound but a specific test exercises a branch that never reaches that capability, the fake satisfies the bound with
unreachable!and documents the precondition. See the worked example below.
A function that takes a <Sys> generic but is only ever exercised via real fixtures is a smell — either the DI branches are missing coverage, or the generic is over-design. Either add the tests that justify the seam, or drop the generic and let the real fixture cover everything.
The rest of this section is the convention that applies when DI is the right tool.
Names
- The generic type parameter is named
Sys— short for "system seam," the slot in the function signature that selects between the real OS and the test fake. A single short name makes a generic call site instantly recognisable as the DI seam. - The production provider struct is named
Host— unqualified, because the production implementation is the default. Fakes carry behaviour-based names that describe what they do (FailingRead,EmptyRead,PermissionDenied,FakeHostName), not what category of thing they are. - Capability traits use the form
<Domain><Action>: filesystem capabilities areFs*(FsReadToString,FsCreateDirAll,FsWrite,FsReadDir,FsWalkFiles,FsSetExecutable,FsEnsureExecutableBits); environment-variable lookup isEnvVar; clock reads areClock; hostname lookup isGetHostName. The domain prefix lets a reader of a generic bound see which side effect the function reaches for without chasing definitions. Method names mirror theirstdequivalents so the trait is a thin seam overstd::fs::*/std::env::var/SystemTime::now, not a re-imagining.
Eight principles
-
Single-purpose traits. Each capability gets its own trait — one method per side effect, no umbrella trait that bundles
read,write,create_dir_allinto one bag. A function then binds only the capabilities it actually consumes, and a test fake implements only the methods the function under test exercises. -
One generic parameter with multiple bounds. Compose bounds on a single
Sys, never introduce a second type parameter per capability:// Good: one parameter, composed bounds pub fn read_modules_manifest<Sys>(modules_dir: &Path) -> Result<...> where Sys: FsReadToString + Clock, { /* ... */ } // Bad: a parameter per capability — every call site has to satisfy two slots pub fn read_modules_manifest<Fs, C>(modules_dir: &Path) -> Result<...> where Fs: FsReadToString, C: Clock { /* ... */ }One parameter keeps turbofish call sites short (
read_modules_manifest::<Host>(dir)) and makes the fake's job obvious: implement every trait in the bound list, no more. -
Static methods, not
&self. Capability methods are associated functions (no&selfreceiver). The provider is a unit struct that carries no data:pub trait FsReadToString { fn read_to_string(path: &Path) -> io::Result<String>; } pub struct Host; impl FsReadToString for Host { fn read_to_string(path: &Path) -> io::Result<String> { fs::read_to_string(path) } }Stateful fakes (a fake clock that returns a fixed
SystemTime, a recording fake that captures every call) store their state in an interior-mutablestaticdeclared inside the#[test]body, so the trait shape doesn't have to change to accommodate state. This keeps the production impl free of&selfplumbing the test-only fake would otherwise force on it. -
Associated types for data operations. When a capability operates over a domain data type, expose the data type as an associated type rather than threading an instance through every call. The provider chooses the concrete type, and fakes can pick a stub-friendly stand-in.
-
Capability traits on the implementor.
impl FsReadToString for Hostlives on the provider, not on the data type. The data types stay free of test-shim conditional impls; the seam is the provider. -
Domain-neutral provider, domain-scoped traits. The generic is
Sys, the production type isHost(or whatever your crate exports as its provider), and the trait names carry the domain prefix (Fs*,Env*,Clock,GetHostName, …). A reader ofSys: FsReadToString + Clock + EnvVarknows immediately which side effects the function reaches for. -
Explicit turbofish in production. Production call sites name the provider:
read_modules_manifest::<Host>(modules_dir) write_modules_manifest::<Host>(modules_dir, manifest) Config::current::<Host, _, _, _, _>(env::current_dir, home::home_dir, Default::default)The turbofish makes the production choice visible at the call site instead of relying on type inference; if a future caller wants to swap in a different provider (a test driver, a dry-run shim), the spot to change is obvious.
-
Capabilities are primitives, not algorithms. Each trait names a leaf-level effect that maps to a single
stdfunction (read_to_string,create_dir_all,write,var, …). Higher-level guarantees — atomic write, retry loops, walk-with-options — become free functions composed on top of those primitives, not new trait methods. That keeps the fake surface dead-simple: a fake declares one method per capability, never a knob-laden builder.
Worked example: modules-yaml
crates/modules-yaml reads and writes node_modules/.modules.yaml. The read path is generic over FsReadToString + Clock because it needs to read the file and stamp prunedAt from the wall clock; the write path is generic over FsCreateDirAll + FsWrite because it needs to ensure the parent directory exists before writing the serialized manifest:
pub trait FsReadToString {
fn read_to_string(path: &Path) -> io::Result<String>;
}
pub trait FsCreateDirAll {
fn create_dir_all(path: &Path) -> io::Result<()>;
}
pub trait FsWrite {
fn write(path: &Path, contents: &[u8]) -> io::Result<()>;
}
pub trait Clock {
fn now() -> SystemTime;
}
pub struct Host;
impl FsReadToString for Host {
fn read_to_string(path: &Path) -> io::Result<String> { fs::read_to_string(path) }
}
impl FsCreateDirAll for Host {
fn create_dir_all(path: &Path) -> io::Result<()> { fs::create_dir_all(path) }
}
impl FsWrite for Host {
fn write(path: &Path, contents: &[u8]) -> io::Result<()> { fs::write(path, contents) }
}
impl Clock for Host {
fn now() -> SystemTime { SystemTime::now() }
}
pub fn read_modules_manifest<Sys>(modules_dir: &Path) -> Result<Option<Modules>, ReadModulesError>
where
Sys: FsReadToString + Clock,
{
let content = match Sys::read_to_string(&manifest_path) { /* ... */ };
// ...
manifest.pruned_at = httpdate::fmt_http_date(Sys::now());
Ok(Some(manifest))
}
A test that wants to drive the PermissionDenied branch declares a unit-struct fake inside the #[test] body, implementing only the capability the function touches:
#[test]
fn read_propagates_non_not_found_io_error() {
struct FailingRead;
impl FsReadToString for FailingRead {
fn read_to_string(_: &Path) -> io::Result<String> {
Err(io::Error::new(io::ErrorKind::PermissionDenied, "mocked"))
}
}
// `read_modules_manifest`'s bound list is `FsReadToString + Clock`,
// so every fake must satisfy both bounds at the type level — Rust
// doesn't know that the `prunedAt` branch is unreachable for this
// input. The convention for capabilities the test won't exercise
// is a trivial impl whose body is `unreachable!`: the bound is
// satisfied, and the panic message documents the precondition the
// test relies on.
impl Clock for FailingRead {
fn now() -> SystemTime {
unreachable!("clock must not be called when read_to_string fails");
}
}
let err = read_modules_manifest::<FailingRead>(Path::new("/")).unwrap_err();
assert!(matches!(err, ReadModulesError::ReadFile { .. }));
}
Stateful fakes (deterministic clock, recording reads) hold their state in a static inside the test fn:
#[test]
fn read_fills_in_pruned_at_when_missing() {
static FAKE_NOW: SystemTime = SystemTime::UNIX_EPOCH;
struct FakeClock;
impl Clock for FakeClock {
fn now() -> SystemTime { FAKE_NOW }
}
// The `static` lives in this fn's scope, so other tests get
// independent storage and never race on it. The provider type
// stays an empty unit struct; the state lives next to the test
// that needs it. Use `SystemTime::UNIX_EPOCH` (a `const`) for a
// const-initialised static, or `LazyLock` when the desired value
// needs a runtime constructor.
// ...
}
Cross-domain composition
When a function needs capabilities from more than one domain, list them inline on Sys:
fn write_shim<Sys>(target_path: &Path, shim_path: &Path) -> Result<(), LinkBinsError>
where
Sys: FsReadToString + FsReadHead + FsWrite + FsSetExecutable + FsEnsureExecutableBits,
{ /* ... */ }
The provider implements each trait independently, so adding a domain to an existing Sys is one more impl X for Host block — no churn on the production type beyond the new line, and no churn on existing tests beyond the ones whose fakes now need the new method.
Reporter / log events
Pacquet's user-facing output mirrors pnpm's: every channel pnpm fires must fire from the corresponding pacquet site, with the same payload shape and the same firing cadence. The reporter lives in crates/reporter (the Reporter capability trait, the LogEvent enum, the NdjsonReporter and SilentReporter sinks); this section is the convention for porting emissions into ported functions.
Finding the upstream emit
In pnpm/pnpm, log events come from one of three call shapes:
globalLogger.<channel>.<level>(...)— the ergonomic helpers incore/core-loggers/src/<channel>Logger.ts.logger.<level>({ name: 'pnpm:<channel>', ... })— the raw logger with a name discriminant.- Direct
streamParser.write(...)calls — only in pnpm's reporter internals; when porting, you wouldn't write through this.
When porting a function, grep for those patterns in the upstream files you're porting from (not workspace-wide). The emit usually sits immediately before or after the side effect the event describes — e.g., progressLogger.debug({ status: 'fetched' }) after the tarball finishes downloading, before the import step starts.
Channel mapping
The channels pacquet currently emits live in crates/reporter/src/lib.rs's LogEvent enum. Each variant pins #[serde(rename = "pnpm:<channel>")] so the wire string matches upstream byte-for-byte. The enum is not an exhaustive list of pnpm's channels — variants are added as pacquet starts emitting them. When porting a function whose upstream emits a channel LogEvent doesn't yet have, add the variant first (see "To add a new channel" below). Read the doc comment on each existing variant for the upstream type permalink; channels that fire from a single canonical emit site link that too, while multi-site channels (Stage and Progress, which span the install lifecycle) only pin the type and let the porter grep the upstream file for the per-status emits.
To add a new channel: extend the enum with a #[serde(rename = "pnpm:<channel>")] variant whose payload mirrors the upstream TS shape field-for-field — camelCase via #[serde(rename_all = "camelCase")] where applicable, preserving status-tagged-union shapes (see ProgressMessage for the pattern). Add a wire-shape unit test to crates/reporter/src/tests.rs that asserts the JSON renders exactly what pnpm's TS emitter would.
Threading the reporter
A function that emits is generic over R: Reporter. Inside, it calls R::emit(...) with a LogEvent whose variant matches the channel — R::emit(&LogEvent::Stage(...)), R::emit(&LogEvent::Context(...)), etc.
fn install_step<R: Reporter>(prefix: String) {
R::emit(&LogEvent::Stage(StageLog {
level: LogLevel::Debug,
prefix,
stage: Stage::ImportingStarted,
}));
// ...
}
Production callers turbofish at the entry point:
Install { /* ... */ }.run::<NdjsonReporter>().await
Tests use the no-op SilentReporter when they don't care about emits, or a recording fake when they do (see Testing below).
The generic monomorphises away — there's no runtime cost. The ergonomic cost is one <R: Reporter> per intermediate fn and one turbofish at the production entry point. When threading reaches into a struct, add <R: Reporter> to the impl method or carry an install-scoped state field that the relevant emit depends on (see link_file::log_method_once's &AtomicU8 parameter for an example of the latter — the function dedupes per-install rather than per-process).
Where to put the emit
Match the upstream call site's position relative to side effects. pnpm's reporter expects events in a specific order (resolved before fetched, importing_started before any per-package events, etc.). Emitting in the wrong order makes the JS reporter render the "X/Y resolved" counter incorrectly or skip animations entirely.
Cite the upstream permalink (pinned SHA per the cardinal rule in AGENTS.md) in the code comment next to the emit:
// `pnpm:context` carries the directories pnpm's reporter prints
// in the install header. Mirrors the upstream emit at
// <https://github.com/pnpm/pnpm/blob/086c5e91e8/installing/context/src/index.ts#L196>.
R::emit(&LogEvent::Context(ContextLog {
level: LogLevel::Debug,
current_lockfile_exists: false,
store_dir: config.store_dir.display().to_string(),
virtual_store_dir: config.virtual_store_dir.to_string_lossy().into_owned(),
}));
Testing the emit
Use a recording-fake reporter: a unit-struct declared inside the #[test] body, recording into a static Mutex<Vec<LogEvent>> declared in the same body. Assert the captured sequence. The unit struct keeps the fake's reach narrow (one test fn) and the static mutex makes the recorded events available to the assertion without threading a handle through the fn under test.
#[tokio::test]
async fn install_emits_pnpm_event_sequence() {
static EVENTS: Mutex<Vec<LogEvent>> = Mutex::new(Vec::new());
EVENTS.lock().unwrap().clear();
struct RecordingReporter;
impl Reporter for RecordingReporter {
fn emit(event: &LogEvent) {
EVENTS.lock().unwrap().push(event.clone());
}
}
// ... drive the function under test with `::<RecordingReporter>()` ...
let captured = EVENTS.lock().unwrap();
assert!(matches!(
captured.as_slice(),
[
LogEvent::Context(_),
LogEvent::Stage(StageLog { stage: Stage::ImportingStarted, .. }),
LogEvent::Stage(StageLog { stage: Stage::ImportingDone, .. }),
LogEvent::Summary(_),
]
), "unexpected event sequence: {captured:?}");
}
The static lives in the test function's own scope, so other tests have independent buffers and never race on it. Reset it at the start of the test anyway, in case the same test is re-run within the same process (nextest does this on retry).
Verify the test catches a regression: temporarily comment out the emit, run the test, observe the sequence assertion fail, then restore the emit. Same discipline plans/TEST_PORTING.md calls for.
What not to do
- Don't reformat upstream messages. Field names and string values are part of the wire contract — change them and
@pnpm/cli.default-reportersilently drops the record. - Don't invent new channels. If pnpm doesn't have it, pacquet doesn't either. Channels expand only when upstream adds them.
- Don't emit at higher granularity than pnpm. Throttling and size gates exist for a reason — see
pacquet-tarball'sfetch_and_extract_once, which gatespnpm:fetching-progress in_progresson a knownContent-Lengthand>= 5 MB(BIG_TARBALL_SIZE), then throttles to 500ms with leading + trailing edges. That mirrorslodash.throttle(opts.onProgress, 500)in upstream'sremoteTarballFetcher.ts:143-144exactly. - Don't emit at lower granularity, either. Skipping events the consumer expects (
fetchedafter a download succeeds,importedaftercreate_cas_filesOk) breaks pnpm's reporter counters.
Worked example: pnpm:summary in Install::run
The Install::run function brackets the install with pnpm:stage events and emits pnpm:summary after import completes. Upstream emits summaryLogger.debug({ prefix }) after each importer's link phase finishes, at installing/deps-installer/src/install/index.ts:1663.
In pacquet, the same emit lives at the bottom of Install::run:
R::emit(&LogEvent::Stage(StageLog {
level: LogLevel::Debug,
prefix: prefix.clone(),
stage: Stage::ImportingDone,
}));
// `pnpm:summary` closes the install and lets the reporter render
// the accumulated `pnpm:root` events as a "+N -M" block. Must
// come after `importing_done`, matching pnpm's ordering at
// <https://github.com/pnpm/pnpm/blob/086c5e91e8/installing/deps-installer/src/install/index.ts#L1663>.
R::emit(&LogEvent::Summary(SummaryLog { level: LogLevel::Debug, prefix }));
The recording-fake test that pins the sequence lives in crates/package-manager/src/install/tests.rs::install_emits_pnpm_event_sequence. The same pattern carries over to every other channel — only the variant name and payload shape change.