
Use nvtop to Monitor NVIDIA GPU Performance in Linux
Table of Contents
- What’s This Post About?
- A Real-World GPU Headache
- Why GPU Monitoring on Linux Matters
- How Does nvtop Actually Work?
- Use Cases: When and Why to Use nvtop
- Fast and Easy Setup: Your Step-By-Step Guide
- Mini Glossary: Real-Talk Definitions
- Case Studies, Comic Comparisons & Pro Tips
- Beginner Mistakes, Myths, and Alternatives
- nvtop or Not? The Decision Flowchart
- Automation, Scripting & Unusual Tricks
- Short Story: The Drowning Admin
- Wrap-up & Recommendations
What’s This Post About?
If you’ve ever run GPU workloads on a Linux server—whether it’s a slick VPS, a Dockerized deep learning rig, or a gnarly rackmount behemoth—you’ve probably wondered: “How busy are my GPUs? Who’s chewing up all that sweet VRAM? Why is everything so slow?”
This post is your fast lane to mastering nvtop, a real-time, terminal-based NVIDIA GPU monitor for Linux. We’ll dig into what makes nvtop tick, how to get it running in minutes, and why every coder, sysadmin, and ML enthusiast should have this tool in their arsenal. Get ready for GIFs-in-your-head, comic metaphors, and the kind of practical advice you won’t find in the man pages.
A Real-World GPU Headache
Picture this: It’s 3am. You’re on call. The production AI recommender has ground to a halt. Slack is blowing up. Your cloud bill ticks up by the minute. You SSH into your dedicated server…
nvidia-smi shows something, but it’s clunky. You want to see which process is eating GPU, how much VRAM you have left, and live stats—not just a snapshot. And you want it now.
That’s where nvtop comes in. It’s like htop, but for NVIDIA GPUs—colorful, interactive, and perfect for the terminal crowd.
Why GPU Monitoring on Linux Matters
- GPUs are expensive. You want to squeeze every flop out of them—and not let idle VRAM go to waste.
- Multi-user servers are chaos. Who is running what, and why is my training job crawling?
- Docker, Kubernetes, and cloud? Dynamic, ephemeral workloads make GPU monitoring a moving target.
- nvidia-smi is cool, but static. Sometimes you want streaming, interactive updates. Especially when debugging runaway jobs.
Whether you’re renting a VPS, deploying on a dedicated server, or spinning up containers, nvtop is your new best friend for real-time GPU insights.
How Does nvtop Actually Work?
Under the hood, nvtop is a C-based, ncurses-powered terminal app. It queries NVIDIA’s libnvidia-ml (the same library nvidia-smi uses), but presents the info in a dynamic, interactive TUI (text user interface). What does that mean for you?
- Live stats. Watch GPU utilization, memory usage, fan speed, and temperature update in real time.
- Process list. See which PIDs are using the GPU, how much VRAM each process is eating, and their command lines.
- Multiple GPUs? nvtop shows them all, side-by-side. No more guessing which card is melting.
- Low overhead. It’s lightweight—perfect for SSH, tmux, or screen sessions.
- Keyboard controls. Sort, filter, and zoom without leaving your terminal.
Algorithmically, nvtop polls the NVIDIA driver via the Management Library (NVML) every second or so, parses the per-process tables, and renders pretty graphs using ASCII art and colors. It’s like htop and nvidia-smi had a beautiful, geeky baby.
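Want to peek at the same raw numbers nvtop is drawing from? Here is a rough, minimal approximation using nvidia-smi’s query mode (nvtop itself calls NVML directly rather than shelling out like this, but the underlying data is the same):
watch -n 1 "nvidia-smi --query-gpu=index,utilization.gpu,memory.used,memory.total,temperature.gpu,fan.speed --format=csv,noheader"
Handy when nvtop isn’t installed yet and you need a quick live view right now.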
Use Cases: When and Why to Use nvtop
- Debugging slow jobs. Instantly see if your code is bottlenecked by GPU or CPU.
- Multi-user environments. Track who is using what (and when it’s time to send “please kill your job” messages).
- Cloud cost optimization. Identify idle GPUs and right-size your fleet.
- Docker/K8s visibility. See inside containers (as long as they have access to the NVIDIA device files).
- Home lab bragging rights. Show off your RTX 4090’s utilization in glorious ASCII at your next meetup.
- Automated alerts/scripts. Use nvtop’s output (or the underlying NVML API) to trigger alerts when thermals spike or VRAM runs out.
- Remote monitoring. Combine with SSH, tmux, and even web-based terminal dashboards.
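That last one is easier than it sounds: nvtop is just an ncurses app, so an interactive SSH session is all you need. A minimal example (the hostname is a placeholder; -t forces a TTY so the ncurses UI renders properly):
ssh -t admin@gpu-server-01 nvtop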
Fast and Easy Setup: Your Step-By-Step Guide
Let’s get you running in five minutes or less. This assumes you already have the NVIDIA driver installed (the full CUDA toolkit isn’t required, as we’ll see later). A quick sanity check follows the prerequisites below.
Step 1: Prerequisites
- Linux (Ubuntu, Debian, CentOS, Fedora, Arch, etc.)
- NVIDIA GPU with driver installed
- libnvidia-ml (part of NVIDIA drivers)
- ncurses-dev / development tools (for source install)
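Before installing, a ten-second sanity check that the driver and NVML library are actually present (both come with any working NVIDIA driver install):
nvidia-smi                        # should list your GPU(s) and the driver version
ldconfig -p | grep libnvidia-ml   # should show libnvidia-ml.so if NVML is available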
Step 2: Installing nvtop
- Ubuntu/Debian (20.04+):
sudo apt update
sudo apt install nvtop
- Fedora:
sudo dnf install nvtop
- Arch Linux:
sudo pacman -S nvtop
- Other distros / Building from source:
Clone the repo: https://github.com/Syllo/nvtop
Build with cmake:
sudo apt install cmake libncurses5-dev libncursesw5-dev git
git clone https://github.com/Syllo/nvtop.git
mkdir -p nvtop/build && cd nvtop/build
cmake ..
make
sudo make install
Detailed instructions: nvtop GitHub
Step 3: Run It!
nvtop
- Use the up/down arrows to navigate, or h for help.
- Sort by VRAM, utilization, PID, or command line.
- Press q to quit.
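A few command-line flags are worth knowing, too. Options vary a little between releases, so treat these as a sketch and trust nvtop --help on your box:
nvtop -d 50     # refresh every 5 seconds instead of the default (delay is in tenths of a second)
nvtop -C        # no colors, for terminals that mangle ANSI escapes
nvtop --help    # the authoritative option list for your installed version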
Step 4: (Optional) Docker or Headless Use
- Make sure your container has access to the /dev/nvidia* devices and the right drivers. Use the NVIDIA Container Toolkit.
- Install nvtop inside your container, as above.
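A minimal Docker sketch, assuming the NVIDIA Container Toolkit is already set up on the host (the CUDA base image tag is just an example; pick a recent tag that matches your driver):
docker run --rm -it --gpus all nvidia/cuda:12.2.0-base-ubuntu22.04 bash
# then, inside the container:
apt update && apt install -y nvtop
nvtop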
Mini Glossary: Real-Talk Definitions
- VRAM: The GPU’s working memory, like RAM on your PC but faster and shinier.
- Utilization: How busy is your GPU? 0% = sleeping, 100% = melting.
- Process: A running program using the GPU, e.g., your PyTorch model or a random cryptominer (uh oh).
- Fan Speed: Self-explanatory. If it’s at 100%, your server may soon become a jet engine.
- ncurses: A library for making terminal apps look not-ugly.
- NVML: NVIDIA’s Management Library, the API for querying GPU stats.
Case Studies, Comic Comparisons & Pro Tips
Comic Metaphor Table: “GPU Monitoring Tools Fight Club”
| Tool | Personality | Strengths | Weaknesses | Best Use Case |
|---|---|---|---|---|
| nvidia-smi | The Stoic Librarian | Always available, precise | Static, no live updates, no colors | Scripted logs, one-off checks |
| nvtop | The Rave DJ | Colorful, real-time, interactive | Terminal only, NVIDIA-only | Debugging live workloads, daily ops |
| gpustat | The Twitter Addict | Compact summaries, JSON output | No interactive UI | Quick terminal checks, pretty output |
| htop | The Old Guard | Process-centric, system-wide | No GPU stats | CPU/memory troubleshooting |
| DCGM | The Corporate Overlord | Enterprise features, telemetry | Complex, overkill for most | Fleet-wide GPU monitoring |
Pro Tips:
- Combine nvtop with htop in split tmux panes for a full picture of CPU and GPU chaos.
- Use gpustat -i for a quick text-based alternative if you only need summary info (gpustat GitHub).
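Never scripted a tmux layout before? Here’s one way to build that split-pane view (the session name is arbitrary):
tmux new-session -d -s gpuwatch nvtop
tmux split-window -v -t gpuwatch htop
tmux attach -t gpuwatch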
Positive Case:
“I used nvtop to spot a rogue Jupyter notebook that was leaking VRAM. Saved my team three hours and a reboot!”
Negative Case:
“Tried running nvtop on an AMD GPU. Got a blank screen. Oops—NVIDIA only, folks.”
Beginner Mistakes, Myths, and Alternatives
- Myth: “I need to install CUDA to use nvtop.” Fact: You just need the driver and NVML.
- Myth: “nvtop works for all GPUs.” Fact: Mainline builds are NVIDIA-only; AMD support exists only in experimental builds.
- Mistake: Not running as root when your user lacks GPU access. Solution: Add your user to the video group or use sudo (see the one-liner below).
- Mistake: Running inside Docker without --gpus all or with missing device files. See the NVIDIA Docker docs.
- Alternative: For AMD cards, try nvtop’s experimental AMD support or rocm-smi.
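The group fix is a one-liner (log out and back in for it to take effect; on some distros the relevant group is render rather than video):
sudo usermod -aG video $USER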
Common Error Messages and Fixes
- “No NVML device found” — Check the NVIDIA driver install; run nvidia-smi first.
- “No GPUs detected” — Is your server virtualized without GPU passthrough? Is the kernel module loaded?
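A quick triage sequence for either error (all standard tools, nothing nvtop-specific):
nvidia-smi               # does the driver see the GPU at all?
lsmod | grep nvidia      # is the kernel module loaded?
lspci | grep -i nvidia   # does the OS even see the PCI device?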
nvtop or Not? The Decision Flowchart
🖥️
 |
 ├──> Do you have an NVIDIA GPU?
 |        |
 |        No --> Try rocm-smi (AMD) or intel_gpu_top (Intel)
 |
 └──> Need real-time, interactive stats?
          |
          Yes --> Use nvtop!
          |
          No
          |
     ┌────────────────┬────────────────┬──────────────────────────┐
     |                |                |
  Need JSON?      Want summary?    Fleet metrics?
     |                |                |
  Use DCGM        Use gpustat      Use DCGM or Prometheus
Automation, Scripting & Unusual Tricks
Cool Things You Can Do With nvtop & Friends
- Monitor GPU health over SSH in tmux—leave it running in the background for remote troubleshooting.
- Scripted alerts: Use nvidia-smi --query-gpu=utilization.gpu,memory.used --format=csv in a cron job or with a small Python script to email/Slack/page you if the GPU stays above 95% for too long.
- Combine with Prometheus/Grafana: Pull metrics via NVIDIA’s dcgm-exporter for pretty dashboards, but use nvtop for hands-on debugging (a quick exporter sketch follows this list).
- Embed in Jupyter Notebooks: Run !nvtop in a cell (though the interactive ncurses output renders poorly in web UIs).
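If you do go the Prometheus route, the usual starting point is NVIDIA’s dcgm-exporter container. A minimal sketch follows; the image tag is a placeholder (check NVIDIA’s NGC catalog for a current one), and 9400 is the exporter’s default metrics port:
docker run -d --rm --gpus all -p 9400:9400 nvcr.io/nvidia/k8s/dcgm-exporter:<tag>
curl localhost:9400/metrics   # Prometheus-format GPU metrics, ready to scrape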
Sample Script: Alert When GPU VRAM >90%
#!/bin/bash
THRESHOLD=90
# Sum used and total VRAM across all GPUs, then compute an overall percentage
CURRENT=$(nvidia-smi --query-gpu=memory.used,memory.total --format=csv,noheader,nounits | awk -F',' '{used+=$1; total+=$2} END {print int(used/total*100)}')
if [ "$CURRENT" -gt "$THRESHOLD" ]; then
echo "Warning: GPU memory usage is at ${CURRENT}%!"
# Here you could send an email, Slack, etc.
fi
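To run this on a schedule, save it somewhere on disk, make it executable, and add a crontab entry. The script path and log file below are placeholders; adjust them to your setup:
chmod +x /usr/local/bin/gpu_vram_alert.sh
# crontab -e, then add:
*/5 * * * * /usr/local/bin/gpu_vram_alert.sh >> /var/log/gpu_vram_alert.log 2>&1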
Short Story: The Drowning Admin
Once upon a Tuesday, Alex, a sleep-deprived admin, gets an urgent ping: “The deep learning server is down!” SSH-ing in, Alex runs nvidia-smi and sees a wall of numbers. Confused, Alex tries nvtop—and instantly, a rainbow of activity, sorted by PID, reveals that Bob from accounting left a training job running over lunch.
Lesson: If Bob can use the GPU, you need nvtop. Save your sanity.
Wrap-up & Recommendations
- nvtop is a must-have if you run NVIDIA GPUs on Linux, especially in shared, cloud, or containerized environments.
- It’s fast, lightweight, and gives you the live feedback you need to debug, optimize, and show off.
- Perfect for DevOps, ML engineers, researchers, and even curious hobbyists.
- Not for AMD/Intel (yet). For those, check out rocm-smi or intel_gpu_top.
- For full-stack ops, combine with htop, gpustat, and Prometheus/Grafana for all the monitoring you’ll ever need.
- If you need a rock-solid VPS or dedicated box with GPUs, check out VPS or dedicated server options at mangohost.
Don’t let your GPUs run wild in the server farm—tame them with nvtop, and sleep better tonight.
