
How to Use Nested Queries in SQL
Nested queries, also known as subqueries, are essentially queries within queries that allow you to perform complex data retrieval operations in SQL. They’re incredibly useful when you need to filter, aggregate, or join data based on the results of another query, and mastering them will seriously level up your database skills. In this post, we’ll break down how nested queries work, walk through practical examples, cover performance considerations, and share some battle-tested best practices that’ll help you avoid common pitfalls.
How Nested Queries Work
A nested query is a SELECT statement that’s embedded inside another SQL statement, typically within the WHERE, FROM, or HAVING clauses. The inner query executes first, and its results are used by the outer query to complete the operation. Think of it like calling a function within another function – the inner function needs to return its result before the outer function can proceed.
There are three main types of nested queries:
- Scalar subqueries – Return a single value
- Row subqueries – Return a single row with multiple columns
- Table subqueries – Return multiple rows and columns
Here’s a basic example to illustrate the concept:
SELECT employee_name, salary
FROM employees
WHERE salary > (
SELECT AVG(salary)
FROM employees
);
In this case, the inner query calculates the average salary, and the outer query uses that value to find all employees earning above average.
Step-by-Step Implementation Guide
Let’s start with simple examples and work our way up to more complex scenarios. We’ll use a typical e-commerce database with customers, orders, and products tables.
Step 1: Basic WHERE Clause Subquery
Find all customers who have placed orders:
SELECT customer_id, customer_name, email
FROM customers
WHERE customer_id IN (
SELECT DISTINCT customer_id
FROM orders
);
Step 2: Correlated Subqueries
These are more advanced – the inner query references columns from the outer query:
SELECT c.customer_name, c.email
FROM customers c
WHERE EXISTS (
SELECT 1
FROM orders o
WHERE o.customer_id = c.customer_id
AND o.order_date >= '2023-01-01'
);
Step 3: Subqueries in FROM Clause
You can treat subquery results as temporary tables:
SELECT avg_order_value.customer_id, avg_order_value.avg_value
FROM (
SELECT customer_id, AVG(total_amount) as avg_value
FROM orders
GROUP BY customer_id
HAVING AVG(total_amount) > 100
) as avg_order_value;
Step 4: Multiple Level Nesting
Find products that were ordered by customers from specific cities:
SELECT product_name, price
FROM products
WHERE product_id IN (
SELECT product_id
FROM order_items
WHERE order_id IN (
SELECT order_id
FROM orders
WHERE customer_id IN (
SELECT customer_id
FROM customers
WHERE city = 'New York'
)
)
);
Real-World Examples and Use Cases
Here are some practical scenarios where nested queries shine:
Finding Top Performers
Get employees whose sales are in the top 10% of their department:
SELECT e.employee_name, e.department, s.sales_amount
FROM employees e
JOIN sales s ON e.employee_id = s.employee_id
WHERE s.sales_amount > (
SELECT PERCENTILE_CONT(0.9) WITHIN GROUP (ORDER BY sales_amount)
FROM sales s2
JOIN employees e2 ON s2.employee_id = e2.employee_id
WHERE e2.department = e.department
);
Data Cleanup Operations
Delete duplicate records keeping only the most recent:
DELETE FROM user_logs
WHERE log_id NOT IN (
SELECT MAX(log_id)
FROM user_logs
GROUP BY user_id, DATE(created_at)
);
Dynamic Reporting
Compare current month performance against historical average:
SELECT
product_id,
current_month_sales,
historical_avg,
ROUND((current_month_sales - historical_avg) / historical_avg * 100, 2) as growth_percent
FROM (
SELECT
product_id,
SUM(quantity) as current_month_sales,
(SELECT AVG(monthly_sales)
FROM (
SELECT product_id, SUM(quantity) as monthly_sales
FROM order_items oi2
JOIN orders o2 ON oi2.order_id = o2.order_id
WHERE o2.order_date < DATE_TRUNC('month', CURRENT_DATE)
GROUP BY product_id, DATE_TRUNC('month', o2.order_date)
) historical
WHERE historical.product_id = oi.product_id
) as historical_avg
FROM order_items oi
JOIN orders o ON oi.order_id = o.order_id
WHERE o.order_date >= DATE_TRUNC('month', CURRENT_DATE)
GROUP BY product_id
) performance_comparison;
Comparison with Alternatives
While nested queries are powerful, they’re not always the best solution. Here’s how they stack up against alternatives:
Approach | Performance | Readability | Maintainability | Best Use Case |
---|---|---|---|---|
Nested Queries | Moderate | Can be complex | Medium | Complex filtering, one-off queries |
JOINs | Generally faster | Usually clearer | High | Combining related data |
CTEs (Common Table Expressions) | Similar to subqueries | Very clear | High | Complex queries, recursive operations |
Window Functions | Excellent | Moderate | Medium | Analytical queries, rankings |
For example, this nested query:
SELECT customer_name
FROM customers
WHERE customer_id IN (
SELECT customer_id
FROM orders
WHERE order_date > '2023-01-01'
);
Could be rewritten as a more efficient JOIN:
SELECT DISTINCT c.customer_name
FROM customers c
INNER JOIN orders o ON c.customer_id = o.customer_id
WHERE o.order_date > '2023-01-01';
Performance Considerations and Optimization
Nested queries can be performance killers if not used carefully. Here are some key points to remember:
Correlated vs Non-Correlated Subqueries
- Non-correlated subqueries execute once and cache the result
- Correlated subqueries execute once for each row in the outer query
- Always try to convert correlated subqueries to JOINs when possible
Indexing Strategy
Make sure columns used in subquery WHERE clauses are properly indexed:
-- If you frequently filter by customer_id in subqueries
CREATE INDEX idx_orders_customer_id ON orders(customer_id);
CREATE INDEX idx_orders_date_customer ON orders(order_date, customer_id);
Query Execution Plans
Always check your execution plans. In PostgreSQL:
EXPLAIN ANALYZE
SELECT customer_name
FROM customers
WHERE customer_id IN (
SELECT customer_id
FROM orders
WHERE order_date > '2023-01-01'
);
Look for operations like “Nested Loop” or “Seq Scan” that might indicate performance issues.
Best Practices and Common Pitfalls
Do’s:
- Use EXISTS instead of IN when checking for existence – it’s often faster
- Limit subquery results when possible using TOP/LIMIT
- Consider using CTEs for complex nested queries to improve readability
- Test performance with realistic data volumes
- Use proper indentation and aliasing for maintainability
Don’ts:
- Don’t nest more than 3-4 levels deep – it becomes unmaintainable
- Avoid using SELECT * in subqueries – be specific about columns needed
- Don’t use subqueries in SELECT clauses that return multiple rows
- Avoid correlated subqueries in large datasets without proper indexing
Common Gotchas:
NULL values can cause unexpected results with IN/NOT IN:
-- This might not work as expected if subquery returns NULLs
SELECT * FROM customers WHERE customer_id NOT IN (
SELECT customer_id FROM blacklisted_customers
);
-- Better approach
SELECT * FROM customers WHERE customer_id NOT IN (
SELECT customer_id FROM blacklisted_customers WHERE customer_id IS NOT NULL
);
Debugging Complex Nested Queries
Break them down and test each part separately:
-- Test the innermost query first
SELECT customer_id FROM customers WHERE city = 'New York';
-- Then the next level
SELECT order_id FROM orders WHERE customer_id IN (1, 2, 3);
-- Finally combine them
SELECT product_id FROM order_items WHERE order_id IN (
SELECT order_id FROM orders WHERE customer_id IN (
SELECT customer_id FROM customers WHERE city = 'New York'
)
);
For more detailed information about SQL subqueries and optimization techniques, check out the PostgreSQL official documentation on subquery expressions and the MySQL subquery documentation.
Nested queries are a powerful tool in your SQL arsenal, but like any powerful tool, they need to be used thoughtfully. Start with simple examples, always consider performance implications, and don’t be afraid to refactor complex nested queries into more readable alternatives like CTEs or JOINs when it makes sense. With practice, you’ll develop an intuition for when nested queries are the right choice and when to reach for other solutions.

This article incorporates information and material from various online sources. We acknowledge and appreciate the work of all original authors, publishers, and websites. While every effort has been made to appropriately credit the source material, any unintentional oversight or omission does not constitute a copyright infringement. All trademarks, logos, and images mentioned are the property of their respective owners. If you believe that any content used in this article infringes upon your copyright, please contact us immediately for review and prompt action.
This article is intended for informational and educational purposes only and does not infringe on the rights of the copyright owners. If any copyrighted material has been used without proper credit or in violation of copyright laws, it is unintentional and we will rectify it promptly upon notification. Please note that the republishing, redistribution, or reproduction of part or all of the contents in any form is prohibited without express written permission from the author and website owner. For permissions or further inquiries, please contact us.