
Deep Dive into JavaScript Package Manager Storage: From File Copying to Content Addressing Evolution
In modern front-end development, package managers are not just tools but infrastructure. When we run npm install or pnpm i, a sophisticated storage and organization operation happens behind the scenes. Understanding these mechanisms means:
- Insight into Build Determinism: Why might the same `package.json` produce different results on different machines?
- Optimizing Development Experience: How to reduce the hundreds of MBs occupied by `node_modules`?
- Diagnosing Dependency Issues: What are the root causes of phantom dependencies and version conflicts?
This article dives into the storage cores of three major package managers, revealing their design philosophies and technical implementations.
To understand package manager design, one must first grasp Node.js's module resolution rules. Node.js uses a recursive upward lookup algorithm:
```js
// Node.js module resolution pseudocode
function require(id, from) {
  // 1. If it's a core module, return directly
  if (isCoreModule(id)) return loadCoreModule(id);
  // 2. Otherwise walk upward, checking <dir>/node_modules/<id> at each level
  for (let dir = from; ; dir = parent(dir)) {
    const candidate = join(dir, 'node_modules', id);
    if (exists(candidate)) return loadModule(candidate);
    if (dir === parent(dir)) break; // filesystem root reached
  }
  throw new Error(`Cannot find module '${id}'`);
}
```
This simple algorithm has a key characteristic: a module's visibility depends on its location in the file system. If package B is hoisted to the root node_modules, all code in the project can access it, regardless of explicit declaration.
npm v2 used the most direct implementation—mapping the dependency tree completely to the file system:
```text
node_modules/
├── express@4.18.2/
│   ├── index.js
│   └── node_modules/
│       ├── body-parser@1.20.1/
│       │   └── node_modules/
│       │       ├── bytes@3.1.2/
│       │       └── content-type@1.0.4/
│       └── cookie@0.5.0/
└── lodash@4.17.21/
```

Algorithm Complexity:
- Space: O(N×D), where N is number of packages, D is average nesting depth.
- Install Time: Requires creating a separate `node_modules` directory for each package.
This structure perfectly maps the dependency graph but introduces path depth issues. The 260-character path limit in Windows systems is often triggered.
The flattening introduced in npm v3 is essentially a topological sorting problem of the dependency graph. Let's analyze its algorithm in detail:
Step 1: Build the Dependency Graph
```js
class DependencyGraph {
  constructor() {
    this.nodes = new Map(); // 'name@version' -> package metadata
    this.edges = new Map(); // 'name@version' -> Set of dependency keys
  }
  addPackage(name, version, dependencies = {}) {
    const key = `${name}@${version}`;
    this.nodes.set(key, { name, version, dependencies });
    this.edges.set(key, new Set(
      Object.entries(dependencies).map(([depName, range]) => `${depName}@${range}`)
    ));
  }
}
```
Step 2: Detect Version Conflicts
The key to conflict detection is finding the intersection of semantic version ranges:
```js
const semver = require('semver');

function hasVersionConflict(requestedVersions) {
  // requestedVersions: [{range: '^1.0.0', package: 'A'}, ...]
  const allRanges = requestedVersions.map((r) => r.range);
  // A conflict exists if any two requested ranges have an empty intersection
  for (let i = 0; i < allRanges.length; i++) {
    for (let j = i + 1; j < allRanges.length; j++) {
      if (!semver.intersects(allRanges[i], allRanges[j])) return true;
    }
  }
  return false;
}
```
Step 3: The Hoisting Decision Algorithm
The core of the hoisting algorithm is maximizing hoisting, minimizing conflicts:
```js
// Sketch of the greedy strategy: the first version of each name claims
// the root slot; later conflicting versions are nested under their dependents
class HoistingAlgorithm {
  constructor(graph) {
    this.graph = graph;
    this.rootPlacements = new Map(); // name -> version occupying root node_modules
    this.nestedPlacements = [];      // { parent, name, version } for conflicts
  }
  hoist() {
    for (const { name, version, parent } of this.graph.traverseBreadthFirst()) {
      if (!this.rootPlacements.has(name)) {
        this.rootPlacements.set(name, version);
      } else if (this.rootPlacements.get(name) !== version) {
        this.nestedPlacements.push({ parent, name, version });
      }
    }
    return { root: this.rootPlacements, nested: this.nestedPlacements };
  }
}
```
Step 4: Generate the Final Structure
```js
function generateNodeModules(placements) {
  const rootModules = new Set();
  const nestedStructure = new Map(); // parent package -> packages nested under it
  for (const { name, version, parent } of placements) {
    if (!parent) {
      rootModules.add(`${name}@${version}`); // hoisted to root node_modules
    } else {
      if (!nestedStructure.has(parent)) nestedStructure.set(parent, []);
      nestedStructure.get(parent).push(`${name}@${version}`);
    }
  }
  return { rootModules, nestedStructure };
}
```
Problem 1: Non-Deterministic Structure
```js
// Installation order affects the final structure
// Scenario: A depends on lodash@^4.0.0, B depends on lodash@^3.0.0

// Installation order 1: A then B
install(A) // lodash@4.17.21 hoisted to root
install(B) // ^3.0.0 conflicts with the root copy, so lodash@3.10.1 nests under B

// Installation order 2: B then A
install(B) // lodash@3.10.1 hoisted to root
install(A) // lodash@4.17.21 nests under A

// Same package.json, two different trees
```
Mathematical Expression: hoisting is first-come-first-served over the traversal order, so the resulting tree is a function of installation order, not of `package.json` alone; whichever version of a name is encountered first wins the root slot.
Problem 2: The Mathematical Nature of Phantom Dependencies
Phantom dependencies can be explained with graph theory: Dependency graph G = (V, E), where V is the set of packages, E is dependency relationships. The project's declared dependencies are set D ⊆ V.
In a flattened structure, the accessible package set is:
A = { v ∈ V | ∃ path from project root to v }

The phantom dependency problem is A ⊋ D: the accessible package set strictly contains the declared set.
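A minimal sketch of computing the phantom set A \ D, assuming we already know which packages were hoisted to the root (the package names are illustrative):

```js
// Packages physically hoisted to the root node_modules are accessible
// even when never declared
function phantomDependencies(declared, hoistedAtRoot) {
  const D = new Set(declared);
  return hoistedAtRoot.filter((pkg) => !D.has(pkg));
}

const declared = ['express'];                                  // D
const hoisted = ['express', 'body-parser', 'cookie', 'bytes']; // A
console.log(phantomDependencies(declared, hoisted));
// [ 'body-parser', 'cookie', 'bytes' ] — importable yet undeclared
```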
Problem 3: The Doppelgänger Problem
Even with flattening, different versions of the same package can appear in different locations:
```text
node_modules/
├── lodash@4.17.21/        # Used by A
└── some-package/
    └── node_modules/
        └── lodash@3.10.1/ # Used by some-package
```

This leads to disk space waste and increased memory usage.
Yarn PnP's core insight: Why organize dependencies into filesystem directories? If we can directly tell Node.js where each module is, we don't need complex directory structures.
The .pnp.cjs file is a self-contained module resolver that overrides Node.js's module loading system:
```js
// Simplified sketch of what .pnp.cjs does (the real file is generated by Yarn)
const fs = require('fs');
const Module = require('module');

// Static resolution table emitted at install time:
// locator -> where the package lives and what it may depend on
const packageRegistry = new Map([
  ['lodash@4.17.21', {
    packageLocation: './.yarn/cache/lodash-npm-4.17.21.zip/node_modules/lodash/',
    packageDependencies: new Map(),
  }],
]);

// Patch Node's resolver: answer from the table instead of walking node_modules
const originalResolve = Module._resolveFilename;
Module._resolveFilename = function (request, parent, ...args) {
  const hit = lookupInRegistry(packageRegistry, request, parent); // table lookup
  return hit ? hit : originalResolve.call(this, request, parent, ...args);
};
```
Advantage 1: Atomic Installation
```js
// Traditional install: multi-step file operations, might fail midway
async function traditionalInstall(packages) {
  for (const pkg of packages) {
    await download(pkg);           // network may fail here
    await extract(pkg);            // partial extraction leaves junk behind
    await copyToNodeModules(pkg);  // an interrupted copy corrupts node_modules
  }
}

// PnP install: fetch zips into the cache, then write one resolution file
async function pnpInstall(packages) {
  await Promise.all(packages.map(downloadZip)); // cache fills independently
  await atomicWrite('.pnp.cjs', generateResolutionTable(packages));
}
```
Advantage 2: Efficient Storage
npm packages often contain many small files (~4KB average), and filesystems are inefficient at storing small files:
Traditional Storage:
- Each file uses at least one disk block (usually 4KB)
- Each directory needs an inode
- Lots of metadata overhead

Zip Storage:
- Small files merged, reducing metadata
- Compression further reduces space
- Single file, less disk seeking

Advantage 3: Integrity Verification
```js
// Zip files can contain integrity checks: every entry stores a CRC32
class ZipPackage {
  constructor(zipPath) {
    this.zipPath = zipPath;
  }
  // Sketch: re-checksum each entry and compare against the stored value
  verify() {
    for (const entry of readZipEntries(this.zipPath)) {
      if (crc32(entry.data) !== entry.storedCrc32) {
        throw new Error(`Corrupted entry in ${this.zipPath}: ${entry.name}`);
      }
    }
    return true;
  }
}
```
PnP requires support from the entire toolchain, introducing adaptation costs:
```js
// Traditional tools assume plugins can be found on disk under node_modules
const babelConfig = {
  presets: ['@babel/preset-react']
};

// In a PnP environment, resolution must go through the PnP API,
// e.g. by resolving to an absolute path up front
const babelConfigPnp = {
  presets: [require.resolve('@babel/preset-react')] // answered by .pnp.cjs
};
```
Adaptation Layer: Yarn provides tools like @yarnpkg/pnpify to bridge traditional tools, but this adds complexity and maintenance cost.
pnpm's global store is based on a simple yet powerful idea: same content should have the same address.
Hash Generation Algorithm
```js
const crypto = require('crypto');

// Sketch of the idea; pnpm's actual store format differs in detail
class ContentHasher {
  // 1. Collect all files of a package
  // 2. Hash each file, then hash the sorted (path, hash) pairs
  async hashPackage(files) {
    const entries = await Promise.all(files.map(async (file) => {
      const digest = crypto.createHash('sha512')
        .update(await file.read())
        .digest('hex');
      return `${file.path}:${digest}`;
    }));
    // Sorting makes rootHash independent of filesystem traversal order
    const rootHash = crypto.createHash('sha512')
      .update(entries.sort().join('\n'))
      .digest('hex');
    return rootHash;
  }
}
```
Key Characteristics:
- Deterministic: Same content always generates same hash.
- Tamper-proof: Modifying any file changes the hash.
- Deduplication: Same hash points to same storage location.
How Hard Links Work
```c
// Understanding at the filesystem level (simplified inode layout)
struct inode {
    uint32_t mode;       // File permissions and type
    uint32_t uid;        // Owner user ID
    uint32_t gid;        // Owner group ID
    uint32_t nlink;      // Hard link count: data freed only when it reaches 0
    uint64_t size;       // File size in bytes
    uint64_t blocks[15]; // Pointers to the data blocks
};
```
Hard Link Application in pnpm
```js
const fs = require('fs');
const path = require('path');

// Sketch: hard-link every file of a stored package into the project
class PnpmLinker {
  constructor(storePath) {
    this.storePath = storePath; // the global content-addressed store
  }
  linkPackage(contentHash, targetDir) {
    const sourceDir = path.join(this.storePath, contentHash);
    for (const file of walkFiles(sourceDir)) {
      const dest = path.join(targetDir, path.relative(sourceDir, file));
      fs.mkdirSync(path.dirname(dest), { recursive: true });
      fs.linkSync(file, dest); // new directory entry, zero bytes copied
    }
  }
}
```
Symbolic Links vs Hard Links
```js
// Hard link: multiple directory entries for the same inode
const inode = 12345;
// /store/pkg          -> inode:12345
// /project/.pnpm/pkg  -> inode:12345 (hard link)

// Symbolic link: special file containing a target path
// /project/node_modules/pkg -> /project/.pnpm/pkg (symbolic link)
// A symbolic link has its own inode; its content is a reference to the target path
```

pnpm's Symbolic Link Structure Algorithm
```js
class SymbolicLinkBuilder {
  buildStructure(project) {
    // Top level: node_modules/<name> is a symlink into the .pnpm virtual store
    for (const [name, version] of project.directDependencies) {
      symlink(
        `node_modules/.pnpm/${name}@${version}/node_modules/${name}`,
        `node_modules/${name}`
      );
    }
    // Inside .pnpm, each package directory sits next to symlinks
    // for its own direct dependencies only
  }
}
```
Mathematical Proof of Dependency Isolation
Let:
- P = Set of project's direct dependencies.
- D(p) = Direct dependencies of package p.
- A(p) = Set of packages accessible to package p.
In pnpm's symbolic link structure:
A(p) = D(p) ∪ { package p itself }

Because symbolic links only point to direct dependencies, indirect dependencies cannot be accessed.
Comparing with the flattened structure:
A_flat(p) = D(p) ∪ { all packages hoisted to root node_modules }

Proof of Isolation: For any package q ∉ D(p), p cannot access q in pnpm's structure, because no symbolic link path exists from p to q.
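The two accessibility formulas can be contrasted with a toy model (the package names are illustrative; this is not a real resolver):

```js
// A(p) under pnpm: exactly D(p) ∪ {p}
function accessiblePnpm(p, directDeps) {
  return new Set([p, ...(directDeps.get(p) ?? [])]);
}
// A_flat(p) under hoisting: D(p) plus everything at the root
function accessibleFlat(p, directDeps, hoistedRoot) {
  return new Set([p, ...(directDeps.get(p) ?? []), ...hoistedRoot]);
}

const deps = new Map([['my-app', ['express']]]);
const hoistedRoot = ['express', 'body-parser', 'bytes'];

console.log(accessiblePnpm('my-app', deps).has('body-parser'));              // false: isolated
console.log(accessibleFlat('my-app', deps, hoistedRoot).has('body-parser')); // true: phantom access
```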
Performance Comparison for Installation
```js
// Performance analysis model
class PerformanceModel {
  constructor(N, S, D) {
    // N packages, average size S bytes, average D dependencies each
    this.N = N;
    this.S = S;
    this.D = D;
  }
  copyInstallBytes() {
    // npm/Yarn v1: every package's bytes are written into node_modules
    return this.N * this.S;
  }
  linkInstallBytes(cacheHitRate) {
    // pnpm: only cache misses transfer bytes; hits cost a hard link each
    return this.N * (1 - cacheHitRate) * this.S;
  }
}
```
Cache Hit Optimization
```js
class PnpmCache {
  constructor(storeDir) {
    this.storeDir = storeDir;
  }
  // Hit test: the content hash already exists in the global store
  async has(contentHash) {
    return pathExists(join(this.storeDir, contentHash));
  }
  async fetch(pkg) {
    const hash = await computeContentHash(pkg);
    if (await this.has(hash)) return hash;       // no download, no extraction
    return this.addToStore(await download(pkg)); // miss: fill the store once
  }
}
```
| Dimension | npm/Yarn v1 (Flattening) | Yarn PnP | pnpm |
|---|---|---|---|
| Storage Paradigm | File Copy + Move | Zip Archive + Memory Map | Hard Link + Symbolic Link |
| Deduplication Granularity | Partial deduplication within project | Global Zip deduplication | Global content deduplication |
| Installation Complexity | O(N²) hoisting algorithm | O(N) download + index generation | O(N) link creation |
| Determinism | Non-deterministic (depends on install order) | Fully deterministic | Fully deterministic |
| Compatibility | Fully compatible | Requires adaptation layer | Fully compatible |
| Space Efficiency | Low (duplicate storage) | Medium (Zip compression) | High (global deduplication) |
| Cross-project Sharing | None | Cache sharing | Physical file sharing |
```js
// Algorithmic complexity of various operations
const complexities = {
  // Number of packages: N, average dependencies: D, stored packages: M (M ≤ N)
  npm: {
    install: 'O(N^2) hoisting decisions + O(N) file copies',
    dedupe: 'per-project only',
  },
  yarnPnP: {
    install: 'O(N) zip downloads + O(N) resolution-table generation',
    dedupe: 'global zip cache',
  },
  pnpm: {
    install: 'O(N) link creation; store grows only by the M new packages',
    dedupe: 'global content-addressed store',
  },
};
```

Direction 1: Layered Storage & Incremental Updates
```js
class LayeredStore {
  constructor() {
    this.layers = new Map(); // layer id -> set of content hashes
  }
  // Sketch: store a new version as a delta on top of an existing base layer
  addIncremental(baseLayerId, changedFiles) {
    const layer = new Set(this.layers.get(baseLayerId) ?? []);
    for (const file of changedFiles) {
      layer.add(contentHash(file)); // only changed content enters the store
    }
    const layerId = hashOfLayer(layer);
    this.layers.set(layerId, layer);
    return layerId;
  }
}
```
Direction 2: Smart Cache Prefetching
```js
class PredictiveCache {
  // Predict future needed packages based on usage patterns
  async prefetchBasedOnPattern(projectType) {
    const patterns = this.usagePatterns.get(projectType) ?? [];
    // e.g. a React project will very likely need react-dom soon
    await Promise.all(patterns.map((pkg) => this.warmCache(pkg)));
  }
}
```
Direction 3: Security-Integrated Storage
```js
class SecureStore {
  async installWithSecurity(pkg) {
    // 1. Verify signature during download
    const isValid = await verifySignature(pkg.tarball, pkg.signature);
    if (!isValid) throw new Error(`Signature verification failed: ${pkg.name}`);
    // 2. Store under the content hash, so later tampering is detectable
    return this.addToStore(pkg);
  }
}
```
The evolution of package manager storage mechanisms reflects the relentless pursuit of determinism, efficiency, and reliability in software engineering:
- npm's flattening is an engineering compromise – finding balance between compatibility and efficiency.
- Yarn PnP is a paradigm revolution – challenging tradition, trading control for determinism.
- pnpm is engineering elegance – cleverly using system primitives to achieve theoretical optimum.
The essence of technology choice is not finding the "best" tool, but understanding the different trade-offs different solutions make within the impossible triangle of determinism, performance, and compatibility. Understanding these storage mechanisms helps us make wiser choices in specific contexts.
In the future, with the development of new technologies like WebAssembly and edge computing, package manager storage mechanisms may continue to evolve. However, the core principles of content addressing, deterministic builds, and efficient sharing will always guide the direction.