Streams in Node.js
Streams are one of Node.js's most powerful features. They allow you to process data piece by piece without loading everything into memory, making them essential for handling large files, network requests, and real-time data. Whether you are building file processing utilities, HTTP servers, or data pipelines, streams provide an elegant solution for handling data efficiently.
What Are Streams?
A stream is an abstract interface for working with streaming data in Node.js. Instead of reading or writing an entire file at once, streams process data in chunks: small pieces that flow through the system. This approach dramatically reduces memory usage when dealing with large files.
Streams are everywhere in Node.js:
- fs.createReadStream() — read files piece by piece
- process.stdin — read from terminal input
- http.ServerResponse — send data to clients
- fs.createWriteStream() — write files incrementally
Understanding streams is fundamental to writing efficient Node.js applications.
Types of Streams
Node.js has four fundamental stream types:
- Readable — source of data (e.g., file read stream)
- Writable — destination for data (e.g., file write stream)
- Duplex — both readable and writable (e.g., TCP socket)
- Transform — modifies data as it passes through (e.g., zlib.createGzip())
Each type serves a specific purpose and can be combined with others to build powerful data processing pipelines.
Reading Streams
Create a readable stream to read files efficiently:
const fs = require('fs');
const readStream = fs.createReadStream('./large-file.txt', {
  encoding: 'utf8',
  highWaterMark: 1024 * 64 // 64KB chunks (the default for fs streams)
});
readStream.on('data', (chunk) => {
  console.log('Received ' + chunk.length + ' bytes:');
  console.log(chunk);
});
readStream.on('end', () => {
  console.log('Finished reading file');
});
readStream.on('error', (err) => {
  console.error('Error:', err);
});
The data event fires each time a chunk is available, and the end event fires when there is no more data to read. The highWaterMark option controls the buffer size: smaller values use less memory but require more frequent processing.
Writing Streams
Create a writable stream to write data efficiently:
const fs = require('fs');
const writeStream = fs.createWriteStream('./output.txt');
const data = 'Hello, streams!';
writeStream.write(data);
writeStream.write('\nAnother line');
writeStream.end('Final chunk');
writeStream.on('finish', () => {
  console.log('Finished writing file');
});
writeStream.on('error', (err) => {
  console.error('Error:', err);
});
The finish event fires after all data has been flushed to the underlying system. Always listen for error events to catch write failures.
Piping Streams
The pipe() method connects a readable stream to a writable stream:
const fs = require('fs');
const readStream = fs.createReadStream('./input.txt');
const writeStream = fs.createWriteStream('./output.txt');
readStream.pipe(writeStream);
writeStream.on('finish', () => {
  console.log('Copy complete');
});
This is the most common pattern for file operations. The pipe() method automatically handles backpressure, pausing the readable stream when the writable stream's buffer is full.
You can also pipe through transform streams for on-the-fly processing:
const fs = require('fs');
const zlib = require('zlib');
fs.createReadStream('./input.txt')
  .pipe(zlib.createGzip())
  .pipe(fs.createWriteStream('./input.txt.gz'));
This reads input.txt, compresses it with gzip, and writes the compressed result—all without loading the entire file into memory.
Transform Streams
Transform streams are duplex streams that can modify data as it passes through:
const { Transform } = require('stream');
const upperCaseStream = new Transform({
  transform(chunk, encoding, callback) {
    this.push(chunk.toString().toUpperCase());
    callback();
  }
});
process.stdin.pipe(upperCaseStream).pipe(process.stdout);
This simple transform converts all input to uppercase. More complex examples include compressing/decompressing data with zlib, parsing JSON chunks, and encrypting/decrypting data.
Backpressure
When writing to a slower destination, you must handle backpressure, the situation where the writable stream cannot keep up with the readable stream:
const fs = require('fs');
const readStream = fs.createReadStream('./large-file.txt');
const writeStream = fs.createWriteStream('./output.txt');
readStream.on('data', (chunk) => {
  const canContinue = writeStream.write(chunk);
  if (!canContinue) {
    readStream.pause();
    writeStream.once('drain', () => {
      readStream.resume();
    });
  }
});
When write() returns false, stop reading and wait for the drain event. The pipe() method handles this automatically, which is why it is the recommended approach for most use cases.
Working with Buffers
Under the hood, streams use buffers to temporarily hold data. When reading from a stream without specifying an encoding, you get Buffer objects instead of strings:
const fs = require('fs');
const readStream = fs.createReadStream('./input.txt');
readStream.on('data', (chunk) => {
  console.log(chunk instanceof Buffer); // true
  console.log(chunk.length); // chunk size in bytes
});
The buffer size is controlled by the highWaterMark option. For fs streams the default is 64KB (most other streams default to 16KB), but you can adjust it based on your use case: larger chunks mean fewer system calls but higher memory usage.
Practical Example: Processing Large JSON Files
Streams shine when processing large newline-delimited JSON (NDJSON) files, where each line is a complete JSON object. Note that calling JSON.parse directly on raw chunks does not work, because chunk boundaries usually fall in the middle of an object; the transform below buffers partial lines across chunks instead:
const fs = require('fs');
const { Transform } = require('stream');
const parseJsonStream = new Transform({
  readableObjectMode: true,
  transform(chunk, encoding, callback) {
    // Accumulate text and split on newlines; the last element may be a
    // partial line, so carry it over to the next chunk
    this.leftover = (this.leftover || '') + chunk.toString();
    const lines = this.leftover.split('\n');
    this.leftover = lines.pop();
    for (const line of lines) {
      if (!line.trim()) continue;
      try {
        this.push(JSON.parse(line));
      } catch (e) {
        // Skip lines that are not valid JSON
      }
    }
    callback();
  },
  flush(callback) {
    // Parse whatever remains after the final chunk
    if (this.leftover && this.leftover.trim()) {
      try {
        this.push(JSON.parse(this.leftover));
      } catch (e) {}
    }
    callback();
  }
});
fs.createReadStream('./large-data.json')
  .pipe(parseJsonStream)
  .on('data', (obj) => {
    console.log('Processing:', obj);
  });
This approach lets you process gigabyte-sized files one record at a time without loading them into memory.
Error Handling
Always handle errors on both readable and writable streams:
const fs = require('fs');
const readStream = fs.createReadStream('./nonexistent.txt');
const writeStream = fs.createWriteStream('./output.txt');
readStream.on('error', (err) => {
console.error('Read error:', err.message);
});
writeStream.on('error', (err) => {
console.error('Write error:', err.message);
});
Unhandled stream errors can crash your application, so always attach error listeners.
Conclusion
Streams are essential for building efficient Node.js applications. They enable memory-efficient processing of large files, real-time data handling, and elegant data transformation pipelines. Remember to handle backpressure when piping streams manually, and always listen for error events to prevent silent failures.