Performance Profiling and Optimization
Introduction
Node.js ships with everything you need to find CPU bottlenecks and memory leaks — no npm packages required. The perf_hooks, inspector, and v8 modules give you fine-grained control over profiling, and a set of CLI flags lets you capture detailed data without touching your code.
This tutorial walks through profiling a realistic Express.js app step by step.
Prerequisites
- Node.js 18 or later (some APIs vary by version; check with node --version)
- A small Express app to profile (we’ll build one below)
Create a file called server.js:
// server.js
const express = require('express');
const app = express();
app.get('/slow', (req, res) => {
// Simulate a compute-intensive operation
let total = 0;
for (let i = 0; i < 10_000_000; i++) {
total += Math.sqrt(i);
}
res.send({ result: total.toFixed(2) });
});
app.get('/fast', (req, res) => {
res.send({ status: 'ok' });
});
app.listen(3000, () => {
console.log('Server running on http://localhost:3000');
});
Install Express if you haven’t:
npm install express
Keep the server running in one terminal. You’ll hit it with a load tool while profiling runs in another.
Step 1: Measure Function Latency with perf_hooks
perf_hooks gives you sub-millisecond timing via performance.now(). The real power comes from performance.timerify — it wraps any function and records every call as a PerformanceEntry that you can observe.
Modify your server to wrap the slow handler:
// server.js
const express = require('express');
const { performance, PerformanceObserver } = require('node:perf_hooks');
const app = express();
// Wrap the expensive computation
function computeExpensive() {
let total = 0;
for (let i = 0; i < 10_000_000; i++) {
total += Math.sqrt(i);
}
return total;
}
const timedCompute = performance.timerify(computeExpensive);
// Observe function durations
const observer = new PerformanceObserver((list) => {
list.getEntries().forEach((entry) => {
console.log(`[perf] ${entry.name} took ${entry.duration.toFixed(3)}ms`);
});
});
observer.observe({ entryTypes: ['function'] });
app.get('/slow', (req, res) => {
const result = timedCompute();
res.send({ result: result.toFixed(2) });
});
app.get('/fast', (req, res) => {
res.send({ status: 'ok' });
});
app.listen(3000, () => {
console.log('Server running on http://localhost:3000');
});
Run it:
node server.js
In another terminal, hit the slow endpoint a few times:
curl http://localhost:3000/slow
You should see output like:
[perf] computeExpensive took 87.432ms
performance.timerify itself adds little overhead, so it is reasonable to leave in place during a production profiling window. Keep in mind that the observer callback runs for every timed call, so disconnect the observer once you are done measuring.
Custom Timeline Marks with performance.mark and performance.measure
For more control over what you measure, use performance.mark to place named markers on the timeline, then performance.measure to compute the duration between them:
const { performance, PerformanceObserver } = require('node:perf_hooks');
const measureObserver = new PerformanceObserver((list) => {
list.getEntries().forEach((entry) => {
console.log(`[measure] ${entry.name}: ${entry.duration.toFixed(3)}ms`);
});
});
measureObserver.observe({ entryTypes: ['measure'] });
performance.mark('db-query-start');
// async operation (e.g., a database call)
setTimeout(() => {
performance.mark('db-query-end');
performance.measure('DB query', 'db-query-start', 'db-query-end');
}, 10);
This is useful for timing async operations where performance.timerify alone isn’t enough.
Step 2: CPU Profiling with the --cpu-prof Flag
perf_hooks tells you how long individual functions run. To find the hottest call stacks across your whole process, use the V8 CPU profiler via the --cpu-prof flag.
Start your server with CPU profiling enabled:
node --cpu-prof --cpu-prof-dir ./profiles server.js
The --cpu-prof-dir flag tells Node.js where to write profile files. Without it, files land in the current working directory with auto-generated names.
Run some load against both endpoints:
# Run a quick load test
for i in {1..20}; do curl -s http://localhost:3000/slow > /dev/null & done
curl -s http://localhost:3000/fast > /dev/null
Stop the server with Ctrl+C. Node.js writes the profile to disk as part of a clean process exit.
Check what was generated:
ls profiles/
You should see a file following the default pattern CPU.${yyyymmdd}.${hhmmss}.${pid}.${tid}.${seq}.cpuprofile. All output files end in .cpuprofile.
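If no file appears, the process was likely terminated before Node.js could flush the profile; an abrupt kill skips the write. One hedge is to give the server an explicit shutdown path:

```javascript
// Give Node.js a clean exit so --cpu-prof can write its file.
// Without a handler, some environments terminate the process
// before the profile is flushed to disk.
process.on('SIGINT', () => {
  console.log('Shutting down; flushing CPU profile...');
  process.exit(0);
});
```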
Reading the Profile
The .cpuprofile file opens directly in Chrome DevTools as a flame graph:
- Open Chrome and navigate to chrome://inspect
- Click Open dedicated DevTools for Node
- Go to the Profiler tab
- Click Load profile and select your .cpuprofile file
The flame graph shows which functions consumed the most CPU time. In our example, computeExpensive and the Math.sqrt call inside it should dominate.
Step 3: Taking Heap Snapshots with the v8 Module
Memory leaks in Node.js are often caused by objects held in closures, global caches, or event listener registries. A heap snapshot shows you exactly what’s in memory at a given moment.
Add a memory-leak pattern to your server:
// server.js
const express = require('express');
const v8 = require('node:v8');
const app = express();
// Global cache that grows unboundedly — a memory leak
const cache = new Map();
let counter = 0;
app.get('/leaky', (req, res) => {
// Every request adds a new entry to the map
// but nothing ever removes it
cache.set(counter++, {
data: new Array(10_000).fill(Math.random()),
timestamp: Date.now(),
});
res.send({ cached: cache.size });
});
app.get('/snapshot', (req, res) => {
const snapPath = v8.writeHeapSnapshot();
res.send({ snapshot: snapPath });
});
app.get('/stats', (req, res) => {
const s = v8.getHeapStatistics();
res.send({
heapUsed: `${(s.used_heap_size / 1024 / 1024).toFixed(1)} MB`,
heapTotal: `${(s.total_heap_size / 1024 / 1024).toFixed(1)} MB`,
heapLimit: `${(s.heap_size_limit / 1024 / 1024).toFixed(1)} MB`,
});
});
app.listen(3000, () => {
console.log('Server running on http://localhost:3000');
});
Start the server:
node server.js
Generate the leak:
for i in {1..100}; do curl -s http://localhost:3000/leaky > /dev/null; done
Check heap stats:
curl http://localhost:3000/stats
You might see something like:
{"heapUsed":"78.4 MB","heapTotal":"92.1 MB","heapLimit":"4096.0 MB"}
Take a snapshot:
curl http://localhost:3000/snapshot
This returns the generated filename (the file is written to the process’s working directory), for example:
{"snapshot":"Heap-20260320-120302.12345.heapsnapshot"}
Open this file in Chrome DevTools:
- Go to chrome://inspect and click Open dedicated DevTools for Node
- Navigate to the Memory tab
- Click Load snapshot and select the .heapsnapshot file
In the Summary view, sort by Retained Size (the memory that would be freed if the object were collected; Shallow Size counts only the object itself, not what it references). The cache Map and its entries will stand out. Switch to the Containment view to explore the retainer chain and see why each cache entry is still live in memory.
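Once the retainer chain confirms the cache is the culprit, the usual fix is to bound it. Here is a minimal sketch (BoundedCache is illustrative, not a library API) that relies on Map iterating keys in insertion order, so the first key is always the oldest:

```javascript
// Size-capped cache: evicts the oldest entry once `max` is reached.
class BoundedCache {
  constructor(max = 1000) {
    this.max = max;
    this.map = new Map();
  }
  set(key, value) {
    if (!this.map.has(key) && this.map.size >= this.max) {
      // Map preserves insertion order: the first key is the oldest
      const oldest = this.map.keys().next().value;
      this.map.delete(oldest);
    }
    this.map.set(key, value);
  }
  get(key) {
    return this.map.get(key);
  }
  get size() {
    return this.map.size;
  }
}
```

For production use you would likely want LRU semantics (re-insert on get) or TTL expiry, but even a hard cap stops unbounded growth.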
Streaming Snapshots with v8.getHeapSnapshot() (Node.js 18+)
Instead of writing to disk directly, v8.getHeapSnapshot() returns a Readable stream that you can pipe anywhere, which is useful for sending snapshots to remote storage:
const v8 = require('node:v8');
const fs = require('node:fs');
const stream = v8.getHeapSnapshot();
stream.pipe(fs.createWriteStream('live.heapsnapshot'));
Continuous Heap Profiling with --heap-prof
For unattended allocation profiling, use the --heap-prof flag:
node --heap-prof --heap-prof-dir ./heap-profiles server.js
This starts V8’s sampling heap profiler and writes a .heapprofile file when the process exits, which you can load in Chrome DevTools. Unlike v8.writeHeapSnapshot(), which freezes the heap at one moment, --heap-prof shows allocation patterns across the entire run. Use --heap-prof-interval to adjust the average sampling interval (in bytes; the default is 512 KiB).
When to Call writeHeapSnapshot
Taking a snapshot requires memory roughly twice the size of the heap at the moment it is taken, so calling it on a near-OOM process risks a crash. A rough guard before snapshotting in production:
const { used_heap_size, heap_size_limit } = v8.getHeapStatistics();
const headroom = heap_size_limit - used_heap_size;
// Heuristic: require headroom comparable to the live heap itself
if (headroom < used_heap_size) {
console.error('Not enough headroom for heap snapshot; skipping');
} else {
v8.writeHeapSnapshot();
}
Step 4: Programmatic Profiling with the inspector Module
The inspector module lets you start and stop CPU profiling from within your code — useful when you want to capture a profile triggered by a specific event, rather than profiling from process start.
Create a separate profiling script:
// profile-session.js
const inspector = require('node:inspector');
const fs = require('node:fs');
const path = require('node:path');
const session = new inspector.Session();
session.connect();
console.log('Starting CPU profile...');
session.post('Profiler.enable', () => {
session.post('Profiler.start', () => {
// Simulate a burst of work
const workload = (n) => {
if (n <= 1) return n;
return workload(n - 1) + workload(n - 2);
};
console.log('fibonacci(36) =', workload(36));
// Stop after 5 seconds
setTimeout(() => {
session.post('Profiler.stop', (err, result) => {
if (err) {
// Don't destructure in the parameter list: on error the second
// argument may be undefined, which would throw before this runs
console.error('Profiler.stop error:', err);
return;
}
const { profile } = result;
const outFile = path.join(__dirname, `profile-${Date.now()}.cpuprofile`);
fs.writeFileSync(outFile, JSON.stringify(profile));
console.log(`Profile saved to ${outFile}`);
session.disconnect();
process.exit(0);
});
}, 5000);
});
});
Run it:
node profile-session.js
After 5 seconds, a .cpuprofile file is written. Load it in Chrome DevTools the same way as before.
Step 5: Interactive Debugging with --inspect
For live step-through debugging and real-time profiling, use the --inspect flag:
node --inspect server.js
This starts the Node.js inspector listening on 127.0.0.1:9229. Chrome DevTools can attach to it via chrome://inspect.
node --inspect=0.0.0.0:9229 server.js # listen on all interfaces (for remote debugging)
node --inspect-brk server.js # pause immediately on the first line
With --inspect-brk, the process waits for a DevTools client to connect before running any JavaScript — useful when the slow behavior happens at startup.
Common Pitfalls
Microbenchmarks Can Lie
console.time and even performance.now() don’t account for V8’s JIT warmup. A function that looks slow in the first 100 calls may be 10× faster after 10,000 calls because the JIT has compiled the hot path.
Always warm up before measuring:
const warmup = (fn, n = 10000) => {
for (let i = 0; i < n; i++) fn();
};
warmup(computeExpensive);
const start = performance.now();
const result = computeExpensive();
const elapsed = performance.now() - start;
console.log(`Warm run: ${elapsed.toFixed(3)}ms, result: ${result.toFixed(2)}`);
GC Pauses Distort Short Profiles
Garbage collection pauses can spike dramatically at any point during execution, especially during heavy allocation phases. A short profile may capture one or two GC pauses that dominate the view entirely.
Run profiles for at least 30–60 seconds of normal traffic. Compare warm profiles against cold ones to separate GC noise from actual bottlenecks.
Profiling With --inspect Changes Performance
Running with --inspect (even without an attached client) alters V8’s execution engine — it may JIT-optimize less aggressively, making profiles less representative of production behavior.
Use --inspect only when you need live debugging. For unattended profiling captures, prefer --cpu-prof and --heap-prof flags.
async_hooks Has Significant Overhead
async_hooks tracks asynchronous causality chains across promises and callbacks — useful for understanding where async operations originate. However, it adds 30–100%+ CPU overhead in promise-heavy code. It is not suitable for production profiling of hot paths without careful benchmarking. Profile with async_hooks disabled first to establish a baseline.
Which Tool to Use
| Situation | Tool |
|---|---|
| Time a specific function | performance.timerify + PerformanceObserver |
| Create explicit timeline markers | performance.mark + performance.measure |
| Find hot call stacks across the whole process | --cpu-prof + Chrome DevTools |
| Continuous heap allocation sampling | --heap-prof + Chrome DevTools |
| Investigate a memory leak | v8.writeHeapSnapshot() + Chrome DevTools |
| Stream a heap snapshot to remote storage | v8.getHeapSnapshot() (Node.js 18+) |
| Trigger profiling from inside the code | inspector module |
| Live step-through debugging | --inspect + Chrome DevTools |
| Set a hard memory limit | --max-old-space-size |
Summary
Node.js gives you three built-in profiling systems that cover most performance investigation needs:
- perf_hooks: precise timing for functions and custom marks, with minimal overhead
- --cpu-prof and the inspector module: CPU profiling that outputs files you can read as flame graphs in Chrome DevTools
- the v8 module and --heap-prof: memory snapshots showing exactly what objects are live and who retains them
Start with perf_hooks to isolate slow functions, then use --cpu-prof to understand why they’re slow at the call-stack level. When memory is the issue, a heap snapshot in Chrome DevTools makes retainer chains visible.
See Also
- Node.js Modules and npm: understand how require() and the module system work under the hood
- Node.js Logging and Monitoring: observability patterns that complement profiling data
- Node.js Worker Threads — parallelize CPU-intensive work using the same V8 engine