
How to Transform JSON Data with jq – CLI Guide
JSON (JavaScript Object Notation) has become the ubiquitous data interchange format across APIs, configuration files, and web services, but parsing and transforming JSON from the command line can be a nightmare without the right tools. That’s where jq comes in – a lightweight, flexible command-line JSON processor that lets you slice, filter, map, and transform JSON data with surgical precision. In this guide, you’ll learn how to harness jq’s powerful query language to manipulate JSON data efficiently, from basic field extraction to complex transformations that would take dozens of lines in traditional scripting languages.
What is jq and How It Works
jq is a command-line JSON processor written in C that treats JSON data as a stream of values. Unlike traditional text processing tools like grep or sed, jq understands JSON structure natively, allowing you to navigate nested objects and arrays without string manipulation gymnastics.
The core concept behind jq is its filter-based approach. Every jq operation is essentially a filter that takes JSON input and produces JSON output. These filters can be chained together using pipes, similar to Unix command-line tools, creating powerful data transformation pipelines.
Here’s how jq processes data:
- Parses input JSON into an internal representation
- Applies the specified filter expression
- Outputs the result as formatted JSON
- Handles streaming for large datasets efficiently
The beauty of jq lies in its composability – simple filters can be combined to create complex transformations that would require significant programming effort in other languages.
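As a small illustration of that composability, three simple filters (array iteration, field access, and a string transformation) chain into one pipeline. The sample data here is purely illustrative:

```shell
# Three simple filters chained with | inside one jq program:
# .items[] iterates the array, .name extracts a field,
# ascii_upcase transforms each resulting string
result=$(echo '{"items":[{"name":"apple"},{"name":"pear"}]}' \
  | jq -r '.items[] | .name | ascii_upcase')
echo "$result"
```

Each stage is trivial on its own; the pipe operator is what turns them into a transformation.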
Installation and Basic Setup
Getting jq installed is straightforward across different platforms:
# Ubuntu/Debian
sudo apt-get update && sudo apt-get install jq
# CentOS/RHEL/Rocky Linux
sudo yum install jq
# or for newer versions
sudo dnf install jq
# macOS
brew install jq
# Windows (using Chocolatey)
choco install jq
# Prebuilt binary from the releases page (the project now lives at github.com/jqlang/jq)
wget https://github.com/jqlang/jq/releases/download/jq-1.7.1/jq-linux-amd64
chmod +x jq-linux-amd64
sudo mv jq-linux-amd64 /usr/local/bin/jq
Verify your installation:
jq --version
# Output: jq-1.7.1 (or whichever version you installed)
For production environments, especially when running on VPS or dedicated servers, you might want to install the latest release (or build from source) rather than rely on distribution packages, which often lag behind on features and security patches.
Essential jq Syntax and Filters
jq uses a domain-specific language for filtering JSON. Here are the fundamental building blocks:
Basic Filters
# Identity filter - returns input unchanged
echo '{"name": "john", "age": 30}' | jq '.'
# Field access
echo '{"name": "john", "age": 30}' | jq '.name'
# Output: "john"
# Nested field access
echo '{"user": {"name": "john", "age": 30}}' | jq '.user.name'
# Output: "john"
# Array indexing
echo '["apple", "banana", "cherry"]' | jq '.[1]'
# Output: "banana"
# Array slicing
echo '[1,2,3,4,5]' | jq '.[1:3]'
# Output: [2,3]
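Two more basics worth knowing alongside indexing and slicing: negative indices count from the end of an array, and `length` reports the size of arrays (and the character count of strings):

```shell
# Negative indices count from the end of the array
last=$(echo '["apple", "banana", "cherry"]' | jq -r '.[-1]')
# length returns the number of elements
count=$(echo '["apple", "banana", "cherry"]' | jq 'length')
echo "$last $count"
```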
Handling Missing Fields
# Missing keys on an object already yield null rather than an error
echo '{"name": "john"}' | jq '.age'
# Output: null
# The optional operator (?) suppresses the error when the input is not an object at all
echo '[1, 2]' | jq '.age?'
# Output: (nothing - the error is swallowed)
# Providing default values
echo '{"name": "john"}' | jq '.age // 0'
# Output: 0
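One caution about the `//` operator: it falls back on *both* null and false, so it can silently clobber a legitimate `false` stored in a boolean field:

```shell
# // treats false the same as null, so a stored false is replaced
v=$(echo '{"active": false}' | jq '.active // true')
echo "$v"
```

Here the output is `true` even though the field was explicitly set to `false`. For boolean fields, test for null explicitly (`if .active == null then ... end`) instead.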
Data Transformation Techniques
Filtering Arrays
One of jq’s most powerful features is array manipulation:
# Sample data
cat > users.json << EOF
{
  "users": [
    {"name": "alice", "age": 25, "active": true},
    {"name": "bob", "age": 35, "active": false},
    {"name": "charlie", "age": 28, "active": true}
  ]
}
EOF
# Filter active users
jq '.users[] | select(.active == true)' users.json
# Get names of users over 30
jq '.users[] | select(.age > 30) | .name' users.json
# Map transformation - add full_name field
jq '.users | map(. + {"full_name": (.name | ascii_upcase)})' users.json
Grouping and Aggregation
# Group users by active status
jq '.users | group_by(.active)' users.json
# Count users by status
jq '.users | group_by(.active) | map({status: .[0].active, count: length})' users.json
# Calculate average age
jq '.users | map(.age) | add / length' users.json
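Alongside `add`/`length` averages, the built-ins `min_by` and `max_by` pick out extreme elements directly. The inline data below mirrors the users.json example:

```shell
# max_by returns the element with the largest value of the given expression
# (min_by is the counterpart for the smallest)
oldest=$(echo '{"users":[{"name":"alice","age":25},{"name":"bob","age":35},{"name":"charlie","age":28}]}' \
  | jq -r '.users | max_by(.age) | .name')
echo "$oldest"
```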
Complex Object Construction
# Build new object structure
jq '{
  summary: {
    total_users: (.users | length),
    active_users: (.users | map(select(.active)) | length),
    average_age: (.users | map(.age) | add / length)
  },
  user_names: [.users[].name]
}' users.json
Real-World Use Cases and Examples
API Response Processing
Processing API responses is where jq truly shines:
# GitHub API example - get repository names
curl -s "https://api.github.com/users/torvalds/repos" | \
jq -r '.[].name' | head -5
# Extract specific fields from API response
curl -s "https://api.github.com/users/torvalds/repos" | \
jq '.[] | {name: .name, stars: .stargazers_count, language: .language}' | \
jq -s 'sort_by(-.stars) | .[0:5]'
Log File Analysis
# Process JSON logs
jq -r 'select(.status >= 400) | "\(.timestamp) \(.ip) \(.status) \(.path)"' access.log
# Aggregate error counts by status code
jq -s 'group_by(.status) | map({status: .[0].status, count: length})' access.log
Configuration File Management
# Update configuration values
jq '.database.host = "new-host.example.com" | .database.port = 5432' config.json > config.new.json
# Merge configuration files
jq -s '.[0] * .[1]' base-config.json env-config.json > merged-config.json
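The distinction between `*` and `+` matters when merging configs: `*` merges nested objects recursively, while `+` replaces each top-level key wholesale. A quick comparison on illustrative data:

```shell
# '*' merges nested objects key by key
deep=$(echo '{"db":{"host":"a","port":5432}} {"db":{"host":"b"}}' \
  | jq -cs '.[0] * .[1]')
# '+' replaces the entire value of each colliding top-level key
shallow=$(echo '{"db":{"host":"a","port":5432}} {"db":{"host":"b"}}' \
  | jq -cs '.[0] + .[1]')
echo "$deep"
echo "$shallow"
```

The deep merge keeps `port` from the base config; the shallow merge loses it.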
CSV to JSON Conversion
# Convert CSV to JSON (with headers; naive comma split - quoted fields containing commas are not handled)
jq -R -s '
  split("\n")[:-1] |
  map(split(",")) |
  .[0] as $headers |
  .[1:] |
  map(. as $row | reduce range(0; $headers|length) as $i ({}; .[$headers[$i]] = $row[$i]))
' data.csv
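The reverse direction is simpler, because jq ships a `@csv` format string that quotes string fields and joins rows with commas. The `-r` flag emits the raw line instead of a JSON-encoded string:

```shell
# JSON to CSV: build an array per record, then format it with @csv
csv=$(echo '[{"name":"alice","age":25},{"name":"bob","age":35}]' \
  | jq -r '.[] | [.name, .age] | @csv')
echo "$csv"
```

Prepend a header row with something like `(["name","age"] | @csv), (.[] | ...)` if you need one.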
Performance and Streaming
jq handles large datasets efficiently through streaming and optimized memory usage:
| File Size | Memory Usage | Processing Time | Streaming Mode |
|---|---|---|---|
| 1MB | ~5MB RAM | 0.1s | No |
| 100MB | ~150MB RAM | 2.5s | Recommended |
| 1GB+ | ~50MB RAM | 15s+ | Required |
Streaming Large Files
# Stream processing for large files
jq -c '.[]' large-file.json | while read -r line; do
  echo "$line" | jq -r '.field_name'
done
# Using --stream for very large files
jq --stream 'select(length == 2 and .[0][1] == "target_field") | .[1]' huge-file.json
Comparison with Alternatives
| Tool | Learning Curve | Performance | Features | Best Use Case |
|---|---|---|---|---|
| jq | Medium | Fast | Comprehensive | Complex JSON transformations |
| Python json module | Easy | Medium | Full programming language | Integration with larger scripts |
| grep/sed/awk | Easy | Very Fast | Limited | Simple text extraction |
| yq (YAML processor) | Medium | Fast | YAML/JSON/XML | Multi-format processing |
Advanced Techniques and Best Practices
Error Handling
# Handle missing fields gracefully
jq '.users[]? | select(.email != null) | .email' data.json
# Try-catch equivalent
jq '.users[] | (.email // "no-email")' data.json
# Validate JSON structure
jq 'if type == "object" and has("required_field") then . else error("Invalid structure") end' data.json
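Beyond the `//` fallback idiom, jq has a real `try ... catch` construct for expressions that can raise errors, such as `tonumber` on a non-numeric string:

```shell
# try runs the expression; catch substitutes a value when it errors
safe=$(echo '{"name": "john"}' | jq -r 'try (.name | tonumber) catch "not a number"')
echo "$safe"
```

`try EXPR` without a `catch` clause simply suppresses the error and produces no output, similar to the `?` operator.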
Custom Functions
# Define reusable functions
jq '
  def is_adult: .age >= 18;
  def format_user: "\(.name) (\(.age))";
  .users[] | select(is_adult) | format_user
' users.json
Working with Dates
# Convert Unix timestamp to a formatted date (gmtime first - strftime expects broken-down time)
echo '{"timestamp": 1640995200}' | jq '.timestamp | gmtime | strftime("%Y-%m-%d %H:%M:%S")'
# Output: "2022-01-01 00:00:00"
# Parse ISO date to timestamp
echo '{"date": "2022-01-01T00:00:00Z"}' | jq '.date | strptime("%Y-%m-%dT%H:%M:%SZ") | mktime'
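For the common ISO 8601 round trip, the built-in shortcuts `todate` and `fromdate` cover both directions without any format string:

```shell
# todate: seconds since epoch -> ISO 8601 string
iso=$(echo '{"timestamp": 1640995200}' | jq -r '.timestamp | todate')
# fromdate: ISO 8601 string -> seconds since epoch
ts=$(echo '{"date": "2022-01-01T00:00:00Z"}' | jq '.date | fromdate')
echo "$iso $ts"
```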
Common Pitfalls and Troubleshooting
Typical Issues
- Null handling: Always use the optional operator (?) or provide defaults (//)
- Array vs object confusion: Use .[] for arrays, .field for objects
- Quoting issues: Use single quotes for jq expressions, double quotes for JSON strings
- Memory issues with large files: Use streaming mode or process in chunks
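The quoting pitfall above is best avoided by never interpolating shell variables into the filter string at all: pass them with `--arg` (as a string) or `--argjson` (as typed JSON) and reference them as jq variables. The sample data and values here are illustrative:

```shell
# $name and $min become jq variables; no shell-quoting gymnastics needed
user="alice"
min=30
out=$(echo '[{"name":"alice","age":25},{"name":"bob","age":35}]' \
  | jq -c --arg name "$user" --argjson min "$min" \
      '[.[] | select(.name == $name or .age >= $min)]')
echo "$out"
```

This also prevents malformed input from being parsed as part of your filter.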
Debugging Techniques
# Debug with intermediate steps
jq '. | debug | .users[] | debug | select(.active)' users.json
# Use length and type for inspection
jq '. | length, type' data.json
# Pretty print for readability
jq '.' messy.json > formatted.json
Performance Optimization
# Avoid recomputing the same subexpression - bind it to a variable once
jq '.users as $users | $users | length, ($users | map(.age) | add / length)' data.json
# These two forms are equivalent - map(f) is defined as [.[] | f] -
# so prefer whichever reads better; map is usually clearer
jq '[.users[] | select(.active)]' data.json
jq '.users | map(select(.active))' data.json
Integration with Shell Scripts and Automation
#!/bin/bash
# Example: Monitor API health and extract metrics
API_URL="https://api.example.com/health"
RESPONSE=$(curl -s "$API_URL")
# Extract metrics
STATUS=$(echo "$RESPONSE" | jq -r '.status')
RESPONSE_TIME=$(echo "$RESPONSE" | jq -r '.metrics.response_time')
ERROR_RATE=$(echo "$RESPONSE" | jq -r '.metrics.error_rate')
# Alert if thresholds exceeded
if (( $(echo "$ERROR_RATE > 0.05" | bc -l) )); then
  echo "High error rate detected: $ERROR_RATE"
  # Send alert
fi
# Generate report
jq -n \
--arg status "$STATUS" \
--arg response_time "$RESPONSE_TIME" \
--arg error_rate "$ERROR_RATE" \
--arg timestamp "$(date -u +%Y-%m-%dT%H:%M:%SZ)" \
'{
  timestamp: $timestamp,
  status: $status,
  metrics: {
    response_time: ($response_time | tonumber),
    error_rate: ($error_rate | tonumber)
  }
}' >> health_log.json
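One more script-friendly flag worth knowing: `jq -e` sets the exit status from the last output (0 for anything truthy, 1 for `false` or `null`), which lets jq expressions drive shell conditionals directly:

```shell
# jq -e exits 0 only when the expression yields a truthy value
if echo '{"status":"healthy"}' | jq -e '.status == "healthy"' > /dev/null; then
  health="ok"
else
  health="bad"
fi
echo "$health"
```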
jq transforms JSON data manipulation from a tedious programming task into an elegant command-line operation. Its filter-based approach, combined with powerful built-in functions, makes it indispensable for developers working with APIs, processing logs, or managing configuration files. The learning curve pays dividends in productivity gains, especially when dealing with complex nested JSON structures that would be painful to parse with traditional tools.
For comprehensive documentation and advanced features, check the official jq manual. The jq playground is also an excellent resource for testing expressions before using them in production scripts.
