Performance Profiling and Optimization

· 9 min read · Updated March 20, 2026 · intermediate
node performance profiling v8 memory-leaks cpu debugging

Introduction

Node.js ships with everything you need to find CPU bottlenecks and memory leaks — no npm packages required. The perf_hooks, inspector, and v8 modules give you fine-grained control over profiling, and a set of CLI flags lets you capture detailed data without touching your code.

This tutorial walks through profiling a realistic Express.js app step by step.

Prerequisites

  • Node.js 18 or later (some APIs vary by version — check with node --version)
  • A small Express app to profile (we’ll build one below)

Create a file called server.js:

// server.js
const express = require('express');
const app = express();

app.get('/slow', (req, res) => {
  // Simulate a compute-intensive operation
  let total = 0;
  for (let i = 0; i < 10_000_000; i++) {
    total += Math.sqrt(i);
  }
  res.send({ result: total.toFixed(2) });
});

app.get('/fast', (req, res) => {
  res.send({ status: 'ok' });
});

app.listen(3000, () => {
  console.log('Server running on http://localhost:3000');
});

Install Express if you haven’t:

npm install express

Keep the server running in one terminal. You’ll hit it with a load tool while profiling runs in another.

Step 1: Measure Function Latency with perf_hooks

perf_hooks gives you sub-millisecond timing via performance.now(). The real power comes from performance.timerify — it wraps any function and records every call as a PerformanceEntry that you can observe.

Modify your server to wrap the slow handler:

// server.js
const express = require('express');
const { performance, PerformanceObserver } = require('node:perf_hooks');

const app = express();

// Wrap the expensive computation
function computeExpensive() {
  let total = 0;
  for (let i = 0; i < 10_000_000; i++) {
    total += Math.sqrt(i);
  }
  return total;
}

const timedCompute = performance.timerify(computeExpensive);

// Observe function durations
const observer = new PerformanceObserver((list) => {
  list.getEntries().forEach((entry) => {
    console.log(`[perf] ${entry.name} took ${entry.duration.toFixed(3)}ms`);
  });
});
observer.observe({ entryTypes: ['function'], buffered: true });

app.get('/slow', (req, res) => {
  const result = timedCompute();
  res.send({ result: result.toFixed(2) });
});

app.get('/fast', (req, res) => {
  res.send({ status: 'ok' });
});

app.listen(3000, () => {
  console.log('Server running on http://localhost:3000');
});

Run it:

node server.js

In another terminal, hit the slow endpoint a few times:

curl http://localhost:3000/slow

You should see output like:

[perf] computeExpensive took 87.432ms

performance.timerify adds negligible overhead — it’s safe to leave in production code during a profiling window.

Custom Timeline Marks with performance.mark and performance.measure

For more control over what you measure, use performance.mark to place named markers on the timeline, then performance.measure to compute the duration between them:

const { performance, PerformanceObserver } = require('node:perf_hooks');

const measureObserver = new PerformanceObserver((list) => {
  list.getEntries().forEach((entry) => {
    console.log(`[measure] ${entry.name}: ${entry.duration.toFixed(3)}ms`);
  });
});
measureObserver.observe({ entryTypes: ['measure'] });

performance.mark('db-query-start');

// async operation (e.g., a database call)
setTimeout(() => {
  performance.mark('db-query-end');
  performance.measure('DB query', 'db-query-start', 'db-query-end');
}, 10);

This is useful for timing async operations where performance.timerify alone isn’t enough.

Step 2: CPU Profiling with the --cpu-prof Flag

perf_hooks tells you how long individual functions run. To find the hottest call stacks across your whole process, use the V8 CPU profiler via the --cpu-prof flag.

Start your server with CPU profiling enabled:

node --cpu-prof --cpu-prof-dir ./profiles server.js

The --cpu-prof-dir flag tells Node.js where to write profile files. Without it, files land in the current working directory with auto-generated names.
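Two related flags are worth knowing: --cpu-prof-name fixes the output file name, and --cpu-prof-interval sets the sampling interval in microseconds (the default is 1000). A quick way to try them without the server, using an inline script instead of server.js:

```shell
mkdir -p profiles
# Sample every 500µs and write to a predictable file name
node --cpu-prof --cpu-prof-dir ./profiles --cpu-prof-name app.cpuprofile \
     --cpu-prof-interval 500 \
     -e 'let t = 0; for (let i = 0; i < 5_000_000; i++) t += Math.sqrt(i); console.log(t.toFixed(0));'
ls profiles/
```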

Run some load against both endpoints:

# Run a quick load test
for i in {1..20}; do curl -s http://localhost:3000/slow > /dev/null & done
curl -s http://localhost:3000/fast > /dev/null

Stop the server with Ctrl+C. Node.js writes the profile when the process exits cleanly; if no .cpuprofile file appears, the process was likely killed before the exit hook could run, and registering a SIGINT handler that calls process.exit(0) forces a clean shutdown.

Check what was generated:

ls profiles/

You should see a file like CPU.<yyyymmdd>.<hhmmss>.<pid>.<tid>.<seq>.cpuprofile; the name encodes the date, time, process id, thread id, and a sequence number. All output files end in .cpuprofile.

Reading the Profile

The .cpuprofile file opens directly in Chrome DevTools as a flame graph:

  1. Open Chrome and navigate to chrome://inspect
  2. Click Open dedicated DevTools for Node
  3. Go to the Profiler tab
  4. Click Load profile and select your .cpuprofile file

The flame graph shows which functions consumed the most CPU time. In our example, computeExpensive and the Math.sqrt call inside it should dominate.

Step 3: Taking Heap Snapshots with the v8 Module

Memory leaks in Node.js are often caused by objects held in closures, global caches, or event listener registries. A heap snapshot shows you exactly what’s in memory at a given moment.

Add a memory-leak pattern to your server:

// server.js
const express = require('express');
const v8 = require('node:v8');

const app = express();

// Global cache that grows unboundedly — a memory leak
const cache = new Map();
let counter = 0;

app.get('/leaky', (req, res) => {
  // Every request adds a new entry to the map
  // but nothing ever removes it
  cache.set(counter++, {
    data: new Array(10_000).fill(Math.random()),
    timestamp: Date.now(),
  });
  res.send({ cached: cache.size });
});

app.get('/snapshot', (req, res) => {
  const snapPath = v8.writeHeapSnapshot();
  res.send({ snapshot: snapPath });
});

app.get('/stats', (req, res) => {
  const s = v8.getHeapStatistics();
  res.send({
    heapUsed: `${(s.used_heap_size / 1024 / 1024).toFixed(1)} MB`,
    heapTotal: `${(s.total_heap_size / 1024 / 1024).toFixed(1)} MB`,
    heapLimit: `${(s.heap_size_limit / 1024 / 1024).toFixed(1)} MB`,
  });
});

app.listen(3000, () => {
  console.log('Server running on http://localhost:3000');
});

Start the server:

node server.js

Generate the leak:

for i in {1..100}; do curl -s http://localhost:3000/leaky > /dev/null; done

Check heap stats:

curl http://localhost:3000/stats

You might see something like:

{"heapUsed":"78.4 MB","heapTotal":"92.1 MB","heapLimit":"4096.0 MB"}

Take a snapshot:

curl http://localhost:3000/snapshot

This returns the path where the snapshot was written, for example:

{"snapshot":"/path/to/project/Heap-20260320-120302-12345.heapsnapshot"}

Open this file in Chrome DevTools:

  1. Go to chrome://inspect and open Open dedicated DevTools for Node
  2. Navigate to the Memory tab
  3. Click Load snapshot and select the .heapsnapshot file

In the Summary view, sort by Retained Size (the memory that would be freed if the object were collected) or Shallow Size (the size of the object itself, not what it references). The cache Map and its entries will stand out. Select an entry and inspect the Retainers pane to trace the retainer chain: why each cache entry is still live in memory.

Streaming Snapshots with v8.getHeapSnapshot() (Node.js 18+)

Instead of writing to disk directly, v8.getHeapSnapshot() returns a ReadableStream that you can pipe anywhere — useful for sending snapshots to remote storage:

const v8 = require('node:v8');
const fs = require('node:fs');

const stream = v8.getHeapSnapshot();
stream.pipe(fs.createWriteStream('live.heapsnapshot'));

Continuous Heap Profiling with --heap-prof

For unattended heap profiling that writes continuous data to disk, use the --heap-prof flag:

node --heap-prof --heap-prof-dir ./heap-profiles server.js

This samples heap allocations over time and writes a .heapprofile file on exit that you can load in Chrome DevTools' Memory panel. Unlike v8.writeHeapSnapshot(), which freezes the heap at one moment, --heap-prof gives you allocation patterns across the entire run. Use --heap-prof-interval to adjust the average sampling interval (in bytes allocated).

When to Call writeHeapSnapshot

A snapshot requires roughly 2× the current heap size in free memory. Calling it on a near-OOM process risks a crash. Always check before taking a snapshot in production:

const v8 = require('node:v8');

const { used_heap_size, heap_size_limit } = v8.getHeapStatistics();
const freeMemory = heap_size_limit - used_heap_size;
if (freeMemory < used_heap_size * 0.5) {
  console.error('Not enough headroom for heap snapshot — skipping');
} else {
  v8.writeHeapSnapshot();
}

Step 4: Programmatic Profiling with the inspector Module

The inspector module lets you start and stop CPU profiling from within your code — useful when you want to capture a profile triggered by a specific event, rather than profiling from process start.

Create a separate profiling script:

// profile-session.js
const inspector = require('node:inspector');
const fs = require('node:fs');
const path = require('node:path');

const session = new inspector.Session();
session.connect();

console.log('Starting CPU profile...');
session.post('Profiler.enable', () => {
  session.post('Profiler.start', () => {
    // Simulate a burst of work
    const workload = (n) => {
      if (n <= 1) return n;
      return workload(n - 1) + workload(n - 2);
    };

    console.log('fibonacci(36) =', workload(36));

    // Stop after 5 seconds
    setTimeout(() => {
      session.post('Profiler.stop', (err, result) => {
        if (err) {
          console.error('Profiler.stop error:', err);
          return;
        }
        const { profile } = result;
        const outFile = path.join(__dirname, `profile-${Date.now()}.cpuprofile`);
        fs.writeFileSync(outFile, JSON.stringify(profile));
        console.log(`Profile saved to ${outFile}`);
        session.disconnect();
        process.exit(0);
      });
    }, 5000);
  });
});

Run it:

node profile-session.js

After 5 seconds, a .cpuprofile file is written. Load it in Chrome DevTools the same way as before.

Step 5: Interactive Debugging with --inspect

For live step-through debugging and real-time profiling, use the --inspect flag:

node --inspect server.js

This starts the Node.js inspector on 127.0.0.1:9229. Chrome DevTools can attach to it via chrome://inspect; by default only local connections are accepted.

node --inspect=0.0.0.0:9229 server.js   # listen on all interfaces (for remote debugging)
node --inspect-brk server.js             # pause immediately on the first line

With --inspect-brk, the process waits for a DevTools client to connect before running any JavaScript — useful when the slow behavior happens at startup.

Common Pitfalls

Microbenchmarks Can Lie

console.time and even performance.now() don’t account for V8’s JIT warmup. A function that looks slow in the first 100 calls may be 10× faster after 10,000 calls because the JIT has compiled the hot path.

Always warm up before measuring:

const warmup = (fn, n = 10000) => {
  for (let i = 0; i < n; i++) fn();
};

warmup(computeExpensive);

const start = performance.now();
const result = computeExpensive();
const elapsed = performance.now() - start;
console.log(`Warm run: ${elapsed.toFixed(3)}ms, result: ${result.toFixed(2)}`);

GC Pauses Distort Short Profiles

Garbage collection pauses can spike dramatically at any point during execution, especially during heavy allocation phases. A short profile may capture one or two GC pauses that dominate the view entirely.

Run profiles for at least 30–60 seconds of normal traffic. Compare warm profiles against cold ones to separate GC noise from actual bottlenecks.

Profiling With --inspect Changes Performance

Once a DevTools client attaches via --inspect, V8 may deoptimize functions that have breakpoints set, and the debug protocol itself adds overhead, making profiles less representative of production behavior.

Use --inspect only when you need live debugging. For unattended profiling captures, prefer --cpu-prof and --heap-prof flags.

async_hooks Has Significant Overhead

async_hooks tracks asynchronous causality chains across promises and callbacks — useful for understanding where async operations originate. However, it adds 30–100%+ CPU overhead in promise-heavy code. It is not suitable for production profiling of hot paths without careful benchmarking. Profile with async_hooks disabled first to establish a baseline.

Which Tool to Use

Situation                                        Tool
Time a specific function                         performance.timerify + PerformanceObserver
Create explicit timeline markers                 performance.mark + performance.measure
Find hot call stacks across the whole process    --cpu-prof + Chrome DevTools
Continuous heap allocation sampling              --heap-prof + Chrome DevTools
Investigate a memory leak                        v8.writeHeapSnapshot() + Chrome DevTools
Stream a heap snapshot to remote storage         v8.getHeapSnapshot() (Node.js 18+)
Trigger profiling from inside the code           inspector module
Live step-through debugging                      --inspect + Chrome DevTools
Set a hard memory limit                          --max-old-space-size

Summary

Node.js gives you three built-in profiling systems that cover most performance investigation needs:

  • perf_hooks — precise timing for functions and custom marks, with minimal overhead
  • --cpu-prof and the inspector module — CPU profiling that outputs files you can read as flame graphs in Chrome DevTools
  • v8 module and --heap-prof — memory snapshots showing exactly what objects are live and who retains them

Start with perf_hooks to isolate slow functions, then use --cpu-prof to understand why they’re slow at the call-stack level. When memory is the issue, a heap snapshot in Chrome DevTools makes retainer chains visible.

See Also