How to Work with ZIP Files in Node.js

ZIP files are everywhere in the web development world, from package distributions to user uploads and data archiving. For Node.js developers, mastering ZIP file operations is essential for building robust applications that can handle file compression, extraction, and manipulation efficiently. This guide covers everything from basic ZIP creation and extraction to advanced streaming operations, performance optimization, and real-world implementation patterns that you’ll actually use in production environments.

How ZIP File Operations Work in Node.js

Node.js doesn’t include native ZIP support in its core modules, so you’ll need third-party libraries. The ecosystem offers several options, but the most popular are yauzl and yazl for low-level operations, adm-zip for simplicity, and archiver combined with unzipper for streaming operations.

ZIP files use the DEFLATE compression algorithm by default and store file metadata including paths, timestamps, and permissions. When working with ZIP files in Node.js, you’re essentially reading/writing binary data while managing the ZIP file structure and compression.

Here’s how the main libraries compare:

Library             | Best For                | Memory Usage             | Streaming Support | Learning Curve
adm-zip             | Simple operations       | High (loads entire file) | No                | Easy
archiver + unzipper | Production applications | Low                      | Yes               | Medium
yauzl + yazl        | Fine-grained control    | Low                      | Yes               | Steep

Setting Up Your ZIP File Environment

First, let’s install the necessary packages. For most use cases, I recommend starting with archiver and unzipper as they provide the best balance of features and performance:

npm install archiver unzipper
npm install --save-dev @types/archiver @types/unzipper  # If using TypeScript

For simpler use cases where memory usage isn’t critical:

npm install adm-zip

Create a basic project structure:

mkdir zip-operations
cd zip-operations
npm init -y
mkdir test-files output
echo "Hello World" > test-files/hello.txt
echo "Node.js ZIP operations" > test-files/readme.md

Creating ZIP Files

Let’s start with creating ZIP files using different approaches. Here’s a comprehensive example using archiver for streaming operations:

const fs = require('fs');
const archiver = require('archiver');
const path = require('path');

async function createZipFile(sourcePath, outputPath) {
  return new Promise((resolve, reject) => {
    const output = fs.createWriteStream(outputPath);
    const archive = archiver('zip', {
      zlib: { level: 9 } // Maximum compression
    });

    output.on('close', () => {
      console.log(`ZIP created: ${archive.pointer()} total bytes`);
      resolve();
    });

    archive.on('error', (err) => {
      reject(err);
    });

    archive.pipe(output);

    // Add files and directories
    if (fs.statSync(sourcePath).isDirectory()) {
      archive.directory(sourcePath, false);
    } else {
      archive.file(sourcePath, { name: path.basename(sourcePath) });
    }

    archive.finalize();
  });
}

// Usage
createZipFile('./test-files', './output/archive.zip')
  .then(() => console.log('Archive created successfully'))
  .catch(err => console.error('Error:', err));

For simple operations where you don’t need streaming, adm-zip is more straightforward:

const AdmZip = require('adm-zip');
const fs = require('fs');

function createSimpleZip(files, outputPath) {
  const zip = new AdmZip();
  
  files.forEach(file => {
    if (fs.statSync(file).isDirectory()) {
      zip.addLocalFolder(file);
    } else {
      zip.addLocalFile(file);
    }
  });
  
  zip.writeZip(outputPath);
}

// Usage
createSimpleZip(['./test-files/hello.txt', './test-files/readme.md'], './output/simple.zip');

Extracting ZIP Files

Extraction is where things get interesting, especially when dealing with security concerns and large files. Here’s a robust extraction function using unzipper:

const fs = require('fs');
const path = require('path');
const unzipper = require('unzipper');

async function extractZipFile(zipPath, extractPath, options = {}) {
  const { maxFileSize = 100 * 1024 * 1024, allowedExtensions = null } = options;
  
  return new Promise((resolve, reject) => {
    const extractedFiles = [];
    const pendingWrites = [];
    
    fs.createReadStream(zipPath)
      .pipe(unzipper.Parse())
      .on('entry', (entry) => {
        const fileName = entry.path;
        const type = entry.type;
        const size = entry.vars.uncompressedSize;
        
        // Security checks
        if (fileName.includes('..')) {
          console.warn(`Skipping potentially dangerous path: ${fileName}`);
          entry.autodrain();
          return;
        }
        
        if (size > maxFileSize) {
          console.warn(`Skipping large file: ${fileName} (${size} bytes)`);
          entry.autodrain();
          return;
        }
        
        if (allowedExtensions && !allowedExtensions.includes(path.extname(fileName))) {
          console.warn(`Skipping file with disallowed extension: ${fileName}`);
          entry.autodrain();
          return;
        }
        
        const fullPath = path.join(extractPath, fileName);
        
        if (type === 'File') {
          // Ensure directory exists
          fs.mkdirSync(path.dirname(fullPath), { recursive: true });
          const writeStream = fs.createWriteStream(fullPath);
          writeStream.on('error', reject);
          // Track each write so we only resolve after every file has flushed
          pendingWrites.push(new Promise((done) => writeStream.on('finish', done)));
          entry.pipe(writeStream);
          extractedFiles.push(fullPath);
        } else {
          entry.autodrain();
        }
      })
      .on('error', reject)
      .on('close', () => Promise.all(pendingWrites).then(() => resolve(extractedFiles)));
  });
}

// Usage with security options
extractZipFile('./output/archive.zip', './extracted', {
  maxFileSize: 50 * 1024 * 1024, // 50MB limit
  allowedExtensions: ['.txt', '.md', '.json']
})
.then(files => console.log('Extracted files:', files))
.catch(err => console.error('Extraction error:', err));

For simpler scenarios, adm-zip provides a more direct approach:

const AdmZip = require('adm-zip');

function extractSimpleZip(zipPath, extractPath) {
  const zip = new AdmZip(zipPath);
  const entries = zip.getEntries();
  
  entries.forEach(entry => {
    // Basic security check
    if (entry.entryName.includes('..')) {
      console.warn(`Skipping dangerous path: ${entry.entryName}`);
      return;
    }
    
    console.log(`Extracting: ${entry.entryName}`);
  });
  
  zip.extractAllTo(extractPath, true);
}

extractSimpleZip('./output/simple.zip', './extracted-simple');

Real-World Use Cases and Examples

Let’s explore practical scenarios you’ll encounter in production applications.

File Upload Processing

A common use case is processing ZIP files uploaded by users in web applications:

const express = require('express');
const multer = require('multer');
const unzipper = require('unzipper');
const fs = require('fs');
const path = require('path');

const upload = multer({ 
  dest: 'uploads/',
  limits: { fileSize: 10 * 1024 * 1024 } // 10MB limit
});

async function processUploadedZip(filePath, userId) {
  const extractPath = `./processed/${userId}/${Date.now()}`;
  const results = {
    totalFiles: 0,
    processedFiles: [],
    errors: []
  };
  
  try {
    await fs.promises.mkdir(extractPath, { recursive: true });
    
    const files = await new Promise((resolve, reject) => {
      const filesList = [];
      const pendingWrites = [];
      
      fs.createReadStream(filePath)
        .pipe(unzipper.Parse())
        .on('entry', (entry) => {
          const fileName = entry.path;
          const size = entry.vars.uncompressedSize;
          
          if (entry.type !== 'File') {
            entry.autodrain();
            return;
          }
          
          // Reject path traversal attempts before any other checks
          if (fileName.includes('..')) {
            results.errors.push(`Dangerous path rejected: ${fileName}`);
            entry.autodrain();
            return;
          }
          
          // Implement business logic
          if (size > 5 * 1024 * 1024) { // 5MB per file limit
            results.errors.push(`File too large: ${fileName}`);
            entry.autodrain();
            return;
          }
          
          if (fileName.match(/\.(exe|bat|sh|scr)$/i)) {
            results.errors.push(`Executable files not allowed: ${fileName}`);
            entry.autodrain();
            return;
          }
          
          const outputPath = path.join(extractPath, fileName);
          fs.mkdirSync(path.dirname(outputPath), { recursive: true });
          
          const writeStream = fs.createWriteStream(outputPath);
          // Track each write so we only resolve after every file has flushed
          pendingWrites.push(new Promise((done) => writeStream.on('finish', done)));
          entry.pipe(writeStream);
          
          filesList.push({ name: fileName, size, path: outputPath });
          results.totalFiles++;
        })
        .on('close', () => Promise.all(pendingWrites).then(() => resolve(filesList)))
        .on('error', reject);
    });
    
    results.processedFiles = files;
    return results;
    
  } catch (error) {
    results.errors.push(error.message);
    return results;
  } finally {
    // Clean up uploaded file
    fs.unlink(filePath, () => {});
  }
}

const app = express();

app.post('/upload-zip', upload.single('zipfile'), async (req, res) => {
  if (!req.file) {
    return res.status(400).json({ error: 'No file uploaded' });
  }
  
  const results = await processUploadedZip(req.file.path, req.user?.id || 'anonymous');
  res.json(results);
});

app.listen(3000, () => console.log('Server running on port 3000'));

Backup System Implementation

Here’s a robust backup system that creates incremental ZIP archives:

const fs = require('fs').promises;
const path = require('path');
const archiver = require('archiver');
const crypto = require('crypto');

class BackupManager {
  constructor(baseDir) {
    this.baseDir = baseDir;
    this.backupDir = path.join(baseDir, 'backups');
  }
  
  async initialize() {
    await fs.mkdir(this.backupDir, { recursive: true });
  }
  
  async createBackup(sourcePaths, backupName) {
    const timestamp = new Date().toISOString().replace(/[:.]/g, '-');
    const backupFileName = `${backupName}-${timestamp}.zip`;
    const backupPath = path.join(this.backupDir, backupFileName);
    
    const manifest = {
      created: new Date().toISOString(),
      files: [],
      totalSize: 0
    };
    
    const archive = archiver('zip', {
      zlib: { level: 6 } // Balanced compression
    });
    // Plain (non-promise) fs API for the output stream
    const output = require('fs').createWriteStream(backupPath);
    archive.pipe(output);
    
    // Resolves once the output stream has fully flushed to disk
    const done = new Promise((resolve, reject) => {
      output.on('close', resolve);
      output.on('error', reject);
      archive.on('error', reject);
    });
    
    // Add files with metadata
    for (const sourcePath of sourcePaths) {
      try {
        const stats = await fs.stat(sourcePath);
        
        if (stats.isDirectory()) {
          archive.directory(sourcePath, path.basename(sourcePath));
          const files = await this.getDirectoryFiles(sourcePath);
          manifest.files.push(...files);
          manifest.totalSize += files.reduce((sum, f) => sum + f.size, 0);
        } else {
          const content = await fs.readFile(sourcePath);
          const hash = crypto.createHash('sha256').update(content).digest('hex');
          
          archive.file(sourcePath, { name: path.basename(sourcePath) });
          manifest.files.push({
            path: sourcePath,
            size: stats.size,
            hash,
            modified: stats.mtime.toISOString()
          });
          manifest.totalSize += stats.size;
        }
      } catch (error) {
        console.warn(`Skipping ${sourcePath}: ${error.message}`);
      }
    }
    
    await archive.finalize();
    await done;
    
    manifest.compressedSize = archive.pointer();
    
    // Save manifest
    const manifestPath = backupPath.replace('.zip', '.manifest.json');
    await fs.writeFile(manifestPath, JSON.stringify(manifest, null, 2));
    
    return {
      backupPath,
      manifestPath,
      stats: manifest
    };
  }
  
  async getDirectoryFiles(dirPath) {
    const files = [];
    const entries = await fs.readdir(dirPath, { withFileTypes: true });
    
    for (const entry of entries) {
      const fullPath = path.join(dirPath, entry.name);
      
      if (entry.isDirectory()) {
        files.push(...await this.getDirectoryFiles(fullPath));
      } else {
        const stats = await fs.stat(fullPath);
        const content = await fs.readFile(fullPath);
        const hash = crypto.createHash('sha256').update(content).digest('hex');
        
        files.push({
          path: fullPath,
          size: stats.size,
          hash,
          modified: stats.mtime.toISOString()
        });
      }
    }
    
    return files;
  }
  
  async listBackups() {
    const files = await fs.readdir(this.backupDir);
    const backups = [];
    
    for (const file of files) {
      if (file.endsWith('.manifest.json')) {
        const manifestPath = path.join(this.backupDir, file);
        const manifest = JSON.parse(await fs.readFile(manifestPath, 'utf8'));
        const zipPath = manifestPath.replace('.manifest.json', '.zip');
        
        backups.push({
          name: file.replace('.manifest.json', ''),
          created: manifest.created,
          files: manifest.files.length,
          totalSize: manifest.totalSize,
          compressedSize: manifest.compressedSize,
          compressionRatio: (1 - manifest.compressedSize / manifest.totalSize) * 100,
          zipPath,
          manifestPath
        });
      }
    }
    
    return backups.sort((a, b) => new Date(b.created) - new Date(a.created));
  }
}

// Usage example
async function runBackupExample() {
  const backupManager = new BackupManager('./backup-system');
  await backupManager.initialize();
  
  const result = await backupManager.createBackup([
    './test-files',
    './package.json'
  ], 'daily-backup');
  
  console.log('Backup created:', result);
  
  const backups = await backupManager.listBackups();
  console.log('Available backups:', backups);
}

runBackupExample().catch(console.error);

Performance Optimization and Best Practices

When working with ZIP files in production, performance and memory usage are critical considerations. Here are the key optimization strategies:

  • Use streaming operations – Always prefer streaming over loading entire files into memory
  • Implement compression level tuning – Level 6 provides the best speed/compression balance for most use cases
  • Add proper error handling – ZIP operations can fail in many ways, especially with user-generated content
  • Implement security measures – Always validate file paths and sizes to prevent zip bomb attacks
  • Use worker threads for CPU-intensive operations – Compression can block the event loop
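To make the zip-bomb point concrete: one simple defense is to reject entries whose claimed uncompressed size is implausibly larger than their compressed size. The helper below is illustrative, not a library API, and the 100:1 threshold is an assumption you should tune for your data:

```javascript
// Reject entries with an extreme expansion ratio — a classic zip-bomb signature.
// Legitimate content rarely exceeds ~100:1 under DEFLATE.
function checkCompressionRatio(compressedSize, uncompressedSize, maxRatio = 100) {
  if (compressedSize === 0) {
    // Zero compressed bytes claiming nonzero output is itself suspicious
    return uncompressedSize === 0;
  }
  return uncompressedSize / compressedSize <= maxRatio;
}

console.log(checkCompressionRatio(1000, 50000));    // true  — 50:1 is plausible
console.log(checkCompressionRatio(1000, 10000000)); // false — 10000:1 looks like a bomb
```

In an unzipper entry handler you would feed it entry.vars.compressedSize and entry.vars.uncompressedSize and autodrain entries that fail the check.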

Here’s a performance comparison of different compression levels:

Compression Level | Speed     | Size Reduction | CPU Usage | Best For
1 (Fastest)       | Very Fast | ~40%           | Low       | Real-time operations
6 (Balanced)      | Fast      | ~60%           | Medium    | General use
9 (Best)          | Slow      | ~65%           | High      | Archival storage

Here’s an advanced example that implements worker threads for heavy compression tasks:

const { Worker, isMainThread, parentPort, workerData } = require('worker_threads');
const fs = require('fs');
const archiver = require('archiver');

if (isMainThread) {
  // Main thread - API handler
  async function compressWithWorker(sourcePath, outputPath, options = {}) {
    return new Promise((resolve, reject) => {
      const worker = new Worker(__filename, {
        workerData: { sourcePath, outputPath, options }
      });
      
      // Set timeout for long-running operations; cleared once the worker finishes
      const timeout = setTimeout(() => {
        worker.terminate();
        reject(new Error('Compression timeout'));
      }, options.timeout || 300000); // 5 minutes default
      
      worker.on('message', (msg) => {
        if (msg.type === 'complete') {
          clearTimeout(timeout);
          resolve(msg);
        } else if (msg.type === 'error') {
          clearTimeout(timeout);
          reject(new Error(msg.error));
        }
        // 'progress' messages are informational and intentionally ignored here
      });
      worker.on('error', (err) => {
        clearTimeout(timeout);
        reject(err);
      });
      worker.on('exit', (code) => {
        clearTimeout(timeout);
        if (code !== 0) {
          reject(new Error(`Worker stopped with exit code ${code}`));
        }
      });
    });
  }
  
  // Export for use in your application
  module.exports = { compressWithWorker };
  
} else {
  // Worker thread - actual compression
  const { sourcePath, outputPath, options } = workerData;
  
  async function performCompression() {
    const output = fs.createWriteStream(outputPath);
    const archive = archiver('zip', {
      zlib: { 
        level: options.compressionLevel || 6,
        memLevel: options.memLevel || 8
      }
    });
    
    let totalFiles = 0;
    let totalSize = 0;
    
    archive.on('entry', (entry) => {
      totalFiles++;
      totalSize += entry.stats ? entry.stats.size : 0; // stats can be absent for some entries
      
      // Report progress every 100 files
      if (totalFiles % 100 === 0) {
        parentPort.postMessage({
          type: 'progress',
          files: totalFiles,
          size: totalSize
        });
      }
    });
    
    archive.on('error', (err) => {
      parentPort.postMessage({ type: 'error', error: err.message });
    });
    
    output.on('close', () => {
      parentPort.postMessage({
        type: 'complete',
        totalFiles,
        totalSize,
        compressedSize: archive.pointer(),
        compressionRatio: (1 - archive.pointer() / totalSize) * 100
      });
    });
    
    archive.pipe(output);
    
    if (fs.statSync(sourcePath).isDirectory()) {
      archive.directory(sourcePath, false);
    } else {
      archive.file(sourcePath, { name: require('path').basename(sourcePath) });
    }
    
    archive.finalize();
  }
  
  performCompression().catch(err => {
    parentPort.postMessage({ type: 'error', error: err.message });
  });
}

Common Pitfalls and Troubleshooting

Working with ZIP files can be tricky. Here are the most common issues and their solutions:

Memory Issues with Large Files

Problem: Your application crashes with “JavaScript heap out of memory” when processing large ZIP files.

Solution: Always use streaming operations and avoid loading entire files into memory:

// Bad - loads entire file into memory
const zip = new AdmZip(largeZipFile);
const entries = zip.getEntries(); // Memory explosion here

// Good - uses streaming
fs.createReadStream(largeZipFile)
  .pipe(unzipper.Parse())
  .on('entry', (entry) => {
    // Process one entry at a time (validate entry.path before writing,
    // as covered in the extraction section above)
    entry.pipe(fs.createWriteStream(entry.path));
  });

Path Traversal Security Issues

Problem: Malicious ZIP files can extract files outside the intended directory using “../” in file paths.

Solution: Always validate and sanitize file paths:

function sanitizePath(filePath, extractDir) {
  const resolvedPath = path.resolve(extractDir, filePath);
  const resolvedExtractDir = path.resolve(extractDir);
  
  if (!resolvedPath.startsWith(resolvedExtractDir)) {
    throw new Error(`Path traversal attempt detected: ${filePath}`);
  }
  
  return resolvedPath;
}

Encoding Issues

Problem: File names with special characters or non-ASCII characters are corrupted.

Solution: Modern libraries such as archiver and unzipper treat entry names as UTF-8, so corruption usually comes from archives created by older tools that used a legacy code page. Set entry names explicitly and normalize them to a consistent Unicode form when creating archives:

const archive = archiver('zip', {
  zlib: { level: 9 }
});

// Set the entry name explicitly, normalized to NFC form
archive.file('./data/notes.txt', { name: 'notes.txt'.normalize('NFC') });

// For extraction, many libraries auto-detect or honor the ZIP UTF-8 flag,
// but archives from legacy tools may need a dedicated decoding step

Incomplete Archives

Problem: ZIP files are created but some files are missing or corrupted.

Solution: Handle warnings and errors, call finalize(), and wait for the output stream’s close event before using the archive:

archive.on('warning', (err) => {
  if (err.code === 'ENOENT') {
    console.warn('Warning:', err);
  } else {
    throw err;
  }
});

archive.on('error', (err) => {
  throw err;
});

// Wait for finalize to complete
await archive.finalize();
await new Promise((resolve) => output.on('close', resolve));

For more detailed information about ZIP file handling and Node.js streaming, check out the official Node.js Streams documentation and the Archiver.js documentation.

When deploying applications that handle ZIP files at scale, consider using robust hosting solutions like VPS hosting for better control over system resources, or dedicated servers for high-volume file processing operations that require maximum performance and memory allocation.

ZIP file operations in Node.js are powerful when implemented correctly, but they require careful attention to memory management, security, and error handling. The streaming approach with proper validation will serve you well in production environments, whether you’re building file upload systems, backup solutions, or data processing pipelines.


