atop Command in Linux – System and Process Monitoring

The atop command is one of those Swiss Army knife utilities that experienced Linux admins swear by but newcomers often overlook. It’s an advanced system monitor that provides detailed insights into system resources and process activity, going far beyond what basic tools like top or htop offer. While top shows you current snapshots, atop gives you historical data, resource accounting, and the ability to track down exactly what happened during performance incidents. In this post, we’ll dive into atop’s capabilities, walk through practical monitoring scenarios, and explore how it compares to other monitoring tools in your arsenal.

How atop Works – The Technical Foundation

Unlike traditional process monitors, atop operates as both a real-time monitor and a historical data collector. It leverages kernel accounting features and /proc filesystem data to gather comprehensive system metrics. For logging, atop itself runs in the background as a service (started with the -w option) and continuously writes system statistics to binary files, typically stored in /var/log/atop/. On recent versions a companion daemon, atopacctd, handles the process accounting records.

The magic happens through atop’s dual architecture:

  • Real-time collection: Reads from /proc/stat, /proc/meminfo, /proc/diskstats, and individual process entries
  • Historical logging: Compresses and stores snapshots at configurable intervals (default 10 minutes)
  • Process accounting: Tracks resource usage even for short-lived processes that other tools miss
  • Network monitoring: Integrates with netatop module for per-process network statistics

What makes atop particularly powerful is its ability to show cumulative resource usage over time intervals, not just instantaneous values. This means you can catch CPU spikes, memory leaks, or I/O bottlenecks that happened hours or days ago.
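
To make the idea of interval-based accounting concrete, here is a minimal shell sketch that derives CPU utilization from two readings of the first line of /proc/stat. It is only an illustration of the principle, not atop's actual implementation, and cpu_busy_pct is a hypothetical helper name introduced for this example.

```shell
# Hypothetical helper: busy CPU percentage between two "cpu ..." lines
# taken from /proc/stat ($1 = earlier reading, $2 = later reading).
cpu_busy_pct() {
    printf '%s\n%s\n' "$1" "$2" | awk '
        {
            # Fields 2-9: user nice system idle iowait irq softirq steal
            total = 0
            for (i = 2; i <= 9 && i <= NF; i++) total += $i
            idle = $5 + $6    # idle + iowait count as "not busy"
            if (NR == 1) { t0 = total; i0 = idle } else { t1 = total; i1 = idle }
        }
        END {
            dt = t1 - t0
            if (dt <= 0) { print "0.0"; exit }
            printf "%.1f\n", 100 * (dt - (i1 - i0)) / dt
        }'
}

# Example on a live system:
#   a=$(head -n 1 /proc/stat); sleep 5; b=$(head -n 1 /proc/stat)
#   cpu_busy_pct "$a" "$b"
```

The counters in /proc/stat only ever increase, so the difference between two readings describes everything that happened in between. That is why an interval-based tool can report a spike that a single instantaneous reading would miss.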

Installation and Initial Setup

Getting atop running is straightforward on most distributions, but the setup varies slightly depending on your system.

Ubuntu/Debian installation:

sudo apt update
sudo apt install atop  # atopsar is included in the atop package

RHEL/CentOS/Fedora installation:

# On RHEL/CentOS, atop is provided by the EPEL repository, so enable EPEL first
sudo dnf install atop
# or for older versions
sudo yum install atop

Enable automatic logging:

sudo systemctl enable atop
sudo systemctl start atop

The service will start collecting data immediately. You can verify it’s working by checking:

sudo systemctl status atop
ls -la /var/log/atop/

Optional network monitoring setup:

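For per-process network statistics, atop needs the netatop kernel module. It is often missing from the standard distribution repositories, in which case it has to be built from the source published at atoptool.nl (kernel headers and a compiler are required). Once the module and its service are installed:

sudo modprobe netatop
sudo systemctl enable netatop
sudo systemctl start netatop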

Real-time Monitoring with atop

The basic atop command launches an interactive interface that refreshes every 10 seconds by default. Here’s how to navigate and customize the display:

atop

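The default view shows system-level statistics at the top and process details below. Lowercase keys switch the process view, uppercase keys change the sort order or filter the list. Key interactive commands include:

  • g – Show generic process info (the default view)
  • m – Show memory-related process details
  • d – Show disk-related process details
  • n – Show network-related process details (requires netatop)
  • v – Show various process info
  • c – Show the full command line of each process
  • C, M, D, N – Sort by CPU, memory, disk, or network activity
  • A – Sort automatically by the most active resource
  • U – Filter by user name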

Customizing the refresh interval:

atop 5  # Update every 5 seconds
atop 2 10  # 2-second intervals, exit after 10 samples

Monitoring specific processes:

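atop -u  # Show activity accumulated per user
atop -p  # Show activity accumulated per program name
# These flags take no argument. To narrow the list, use the interactive filters:
# U (user name), P (process name) and, on recent versions, I (PID list such as 1234,5678)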

Historical Data Analysis

This is where atop really shines compared to other monitoring tools. You can analyze system behavior from any point in time where data was collected.

View yesterday’s data:

atop -r /var/log/atop/atop_$(date -d yesterday +%Y%m%d)

Analyze a specific time period:

atop -r /var/log/atop/atop_20231201 -b 14:00 -e 16:00
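
Because the daily logs follow the atop_YYYYMMDD naming used above, a small helper can build the path for any day. This is a sketch that assumes GNU date; atop_log_path and the ATOP_LOGDIR override are hypothetical names introduced here, not part of atop.

```shell
# Hypothetical helper: print the atop log path for a date expression
# understood by GNU date (for example "yesterday" or "2023-12-01").
# ATOP_LOGDIR is an assumed override for non-default log locations.
atop_log_path() {
    day=$(date -d "${1:-today}" +%Y%m%d) || return 1
    echo "${ATOP_LOGDIR:-/var/log/atop}/atop_$day"
}

# Example (needs atop and an existing log for that day):
#   atop -r "$(atop_log_path yesterday)" -b 14:00 -e 16:00
```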

Navigate through historical data:

  • t – Jump to next sample
  • T – Jump to previous sample
  • b – Jump to specific time
  • r – Reset to beginning

Generate reports with atopsar:

# CPU utilization report
atopsar -c -r /var/log/atop/atop_20231201 -b 09:00 -e 17:00

# Memory usage report
atopsar -m -r /var/log/atop/atop_20231201

# Disk activity report
atopsar -d -r /var/log/atop/atop_20231201

Real-world Use Cases and Examples

Troubleshooting performance incidents:

Suppose users reported slowness around 2 PM a week ago. Here’s how to investigate:

# Find the log file for the day of the incident
DAY=$(date -d '7 days ago' +%Y%m%d)
ls -l /var/log/atop/atop_$DAY

# Open atop for that timeframe
atop -r /var/log/atop/atop_$DAY -b 13:45 -e 14:15

# Press 'A' to sort by most active resource
# Navigate with 't' to see progression over time
Identifying memory leaks:

Generate a memory usage trend report and check whether the memfree and swpfree columns shrink steadily from sample to sample:

atopsar -m -r /var/log/atop/atop_$(date +%Y%m%d)
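
System-wide numbers show that memory is disappearing, but not which process is taking it. One way to follow a single process is atop's parseable output (-P PRM), sketched below. The field positions are an assumption based on the layout described in the atop man page (six generic fields, then PID, name, state, page size, virtual size and resident size in KB), so verify them against your version. rss_trend is a hypothetical helper name.

```shell
# Hypothetical helper: read `atop -P PRM` output on stdin and print
# "time rss_kb" for one PID, followed by the growth between the first
# and the last sample.
# Assumed field layout: $5 = time, $7 = PID, $12 = resident size in KB.
rss_trend() {
    awk -v pid="$1" '
        $1 == "PRM" && $7 == pid {
            if (first == "") first = $12
            last = $12
            print $5, $12
        }
        END { if (first != "") print "growth_kb", last - first }'
}

# Example (needs atop and an existing log):
#   atop -r /var/log/atop/atop_20231201 -P PRM | rss_trend 1234
```

A resident size that climbs in every sample and never drops back is the usual signature of a leak. A process whose name contains spaces can shift the later fields, so treat the output as a starting point.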

Database performance correlation:

When database queries slow down, check I/O patterns:

# Replay the window in which queries were slow
atop -r /var/log/atop/atop_$(date +%Y%m%d) -b 10:00 -e 11:00
# Press 'd' for the disk view and 'D' to sort by disk activity
# Look for the mysqld process and its read/write patterns

Capacity planning:

Generate weekly resource utilization summaries:

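#!/bin/bash
# Print the first lines of the atopsar reports for each of the past 7 days
for day in {1..7}; do
    date_str=$(date -d "$day days ago" +%Y%m%d)
    logfile="/var/log/atop/atop_$date_str"
    [ -r "$logfile" ] || continue  # skip days without a log
    echo "=== $(date -d "$day days ago" +%Y-%m-%d) ==="
    atopsar -A -r "$logfile" | head -20
    echo
done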

Comparison with Alternative Monitoring Tools

Feature                atop                htop    iotop    vmstat     sar
Real-time monitoring   ✓                   ✓       ✓        ✓          ✓
Historical data        ✓                   ✗       ✗        ✗          ✓
Process accounting     ✓                   ✗       ✗        ✗          ✗
Memory details         ✓                   ✓       ✗        ✓          ✓
Disk I/O per process   ✓                   ✓       ✓        ✗          ✗
Network per process    ✓ (with netatop)    ✗       ✗        ✗          ✗
Resource overhead      Low-Medium          Low     Medium   Very Low   Low
Learning curve         Medium              Easy    Easy     Easy       Medium

Performance comparison on a typical web server:

Metric                  atop         htop        System Impact
CPU usage               0.1-0.3%     0.1-0.2%    Negligible
Memory usage            15-25 MB     8-12 MB     Minimal
Disk space (logs/day)   50-200 MB    0 MB        Plan for rotation
Network overhead        None         None        Local only

Advanced Configuration and Best Practices

Customize logging intervals:

Edit /etc/default/atop (on RHEL-family systems, /etc/sysconfig/atop) to change the collection frequency, then restart the atop service:

LOGOPTS="-R"     # Extra flags passed to the logging atop process
LOGINTERVAL=300  # Seconds between samples; the default is 600 (10 minutes)
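
Shorter intervals mean more samples and larger log files, so it helps to know how many samples a setting produces per day. The sketch below reads LOGINTERVAL from a configuration file and does the arithmetic; samples_per_day is a hypothetical helper name, and falling back to 600 seconds when the variable is absent is an assumption based on the default mentioned above.

```shell
# Hypothetical helper: number of samples per day implied by LOGINTERVAL
# in the given config file (default path: /etc/default/atop).
samples_per_day() {
    conf=${1:-/etc/default/atop}
    # Take the last uncommented LOGINTERVAL assignment, quoted or not
    interval=$(sed -n 's/^LOGINTERVAL="\{0,1\}\([0-9][0-9]*\).*/\1/p' "$conf" 2>/dev/null | tail -n 1)
    # Fall back to the 600-second default when unset or zero
    [ "${interval:-0}" -gt 0 ] || interval=600
    echo $((86400 / interval))
}

# Example:
#   samples_per_day                      # uses /etc/default/atop
#   samples_per_day /etc/sysconfig/atop  # RHEL-family location
```

With LOGINTERVAL=300 this gives 288 samples per day, twice the 144 produced by the default, so expect roughly double the daily log size.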

Log rotation configuration:

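Atop logs can grow large. On recent versions, retention is controlled by LOGGENERATIONS in the same configuration file as LOGINTERVAL; older packages hard-code the number of days in the atop.daily script instead:

# Keep logs for 28 days
LOGGENERATIONS=28

The raw log files are already compressed internally, so compressing them again saves little.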

Custom monitoring scripts:

Create automated alerts based on atop data:

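#!/bin/bash
# Alert when overall CPU usage exceeded 80% in any sample of the last 30 minutes.
# Assumes atopsar -c prints a header row containing "%idle" and one "all" row
# per sample whose values are summed over all CPUs; verify on your version.
LOGFILE="/var/log/atop/atop_$(date +%Y%m%d)"
START=$(date -d '30 minutes ago' +%H:%M)
HIGH_CPU=$(atopsar -c -r "$LOGFILE" -b "$START" | awk -v ncpu="$(nproc)" '
    /%idle/ { for (i = 1; i <= NF; i++) if ($i == "%idle") col = i }
    col && $2 == "all" { busy = 100 - $col / ncpu; if (busy > max) max = busy }
    END { if (max > 80) printf "%.0f", max }')

if [ -n "$HIGH_CPU" ]; then
    echo "High CPU detected: ${HIGH_CPU}%" | mail -s "CPU Alert" admin@company.com
fi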

Integration with monitoring systems:

Export atop data to external systems:

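# Export parseable output: one labeled line per process for the PRC (CPU) label
atop -P PRC 5 1 > /tmp/atop_snapshot.txt

# Pick out one process; in -P output the PID is the 7th field
awk '$1 == "PRC" && $7 == 1234' /tmp/atop_snapshot.txt

# Recent releases can also emit JSON through the -J flag; check atop(1) for your version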
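
As a slightly larger illustration of feeding atop data into other tooling, the sketch below ranks processes by CPU time using the parseable PRC lines produced by atop -P PRC. The field positions are an assumption taken from the layout described in the atop man page (six generic fields, then PID, name, state, clock ticks per second, user-mode ticks, system-mode ticks), and top_cpu is a hypothetical helper name.

```shell
# Hypothetical helper: read `atop -P PRC` output on stdin and print the
# N processes that used the most CPU seconds (default N = 5).
# Assumed field layout: $7 = PID, $8 = (name), $10 = ticks per second,
# $11 = user-mode ticks, $12 = system-mode ticks.
top_cpu() {
    awk '$1 == "PRC" && $10 > 0 {
             secs[$7 " " $8] += ($11 + $12) / $10   # sum over all samples
         }
         END { for (k in secs) printf "%.2f %s\n", secs[k], k }' |
        sort -rn | head -n "${1:-5}"
}

# Example (needs atop):
#   atop -P PRC 5 3 | top_cpu 10
```

Each output line has the form "seconds PID (name)", which is easy to forward to a metrics pipeline or a log shipper.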

Security considerations:

  • Atop logs contain sensitive system information – restrict access with proper file permissions
  • Consider log encryption for compliance requirements
  • Monitor disk space usage to prevent log partition filling
  • Manage historical data retention with atop’s own rotation settings or a scheduled cleanup job

Common Pitfalls and Troubleshooting

Problem: atop daemon not collecting data

Check if the service is running and has proper permissions:

sudo systemctl status atop
sudo journalctl -u atop -f
ls -la /var/log/atop/

Problem: Missing network statistics

Ensure netatop module is loaded:

lsmod | grep netatop
sudo modprobe netatop
# Load the module at every boot
echo netatop | sudo tee /etc/modules-load.d/netatop.conf

Problem: High disk usage from atop logs

Check the current usage and prune old logs:

# Check current usage
du -sh /var/log/atop/

# Manual cleanup of old logs
sudo find /var/log/atop/ -name "atop_*" -mtime +30 -delete

# If your package ships a logrotate rule for atop, do a dry run of it
sudo logrotate -d /etc/logrotate.d/atop
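
Before deleting anything, it is worth listing what a cleanup would remove. The following sketch is a dry run around find; old_atop_logs is a hypothetical helper name, and the 28-day default is only an example value.

```shell
# Hypothetical helper: list (without deleting) atop logs older than N days.
#   $1 = log directory (default /var/log/atop), $2 = age in days (default 28)
old_atop_logs() {
    find "${1:-/var/log/atop}" -name 'atop_*' -type f -mtime +"${2:-28}" -print | sort
}

# Example:
#   old_atop_logs /var/log/atop 30      # review the list first
```

Once the list looks right, the same find expression with -delete performs the actual cleanup.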

Problem: Performance impact on busy systems

Adjust collection frequency and monitoring scope:

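# Reduce logging frequency for busy systems (30 minutes), then restart the service
sudo sed -i 's/^LOGINTERVAL=.*/LOGINTERVAL=1800/' /etc/default/atop
sudo systemctl restart atop

# Limit the interactive view to one service:
# press 'P' in atop and enter a name pattern such as important_service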

For comprehensive documentation and advanced features, check the official atop resources at atoptool.nl and the detailed man pages. The tool’s flexibility makes it invaluable for both reactive troubleshooting and proactive system monitoring, especially when you need that historical context that other monitoring tools simply can’t provide.


