Advanced SQL: Joins, Window Functions, and Subqueries with Complex Examples
Joins, window functions, and subqueries are the core tools for writing complex SQL queries and extracting useful insights from data. This post walks through each of these concepts with practical examples built on small sample tables.
1. Joins
A join is used to combine rows from two or more tables based on a related column between them. There are various types of joins, each serving different purposes.
a. INNER JOIN: Fetch matching records from both tables.
Input Data: employees table
employee_id | first_name | department_id |
---|---|---|
101 | John | 1 |
102 | Sarah | 2 |
103 | Mike | 1 |
departments table:
department_id | department_name |
---|---|
1 | Sales |
2 | Marketing |
3 | IT |
Query:
SELECT e.first_name, d.department_name
FROM employees e
INNER JOIN departments d
ON e.department_id = d.department_id;
Output:
first_name | department_name |
---|---|
John | Sales |
Sarah | Marketing |
Mike | Sales |
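
To try this query without a database server, the sketch below loads the two sample tables into an in-memory SQLite database through Python's built-in sqlite3 module. SQLite is only a stand-in engine chosen for convenience, and the ORDER BY clause is an addition that makes the row order deterministic.

```python
import sqlite3

# In-memory SQLite database holding the two sample tables from this section.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE employees (employee_id INTEGER, first_name TEXT, department_id INTEGER);
    CREATE TABLE departments (department_id INTEGER, department_name TEXT);
    INSERT INTO employees VALUES (101, 'John', 1), (102, 'Sarah', 2), (103, 'Mike', 1);
    INSERT INTO departments VALUES (1, 'Sales'), (2, 'Marketing'), (3, 'IT');
""")

# ORDER BY is an addition so that the row order is deterministic.
inner_rows = conn.execute("""
    SELECT e.first_name, d.department_name
    FROM employees e
    INNER JOIN departments d ON e.department_id = d.department_id
    ORDER BY e.employee_id
""").fetchall()

for first_name, department_name in inner_rows:
    print(first_name, department_name)
```

The IT department never appears in the result because no employee row references department_id 3.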
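b. LEFT JOIN: Returns every row from the left table along with any matching rows from the right table. Where a left row has no match, the right table's columns are returned as NULL. In this sample data every employee belongs to a department, so the result is identical to the INNER JOIN.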
Query:
SELECT e.first_name, d.department_name
FROM employees e
LEFT JOIN departments d
ON e.department_id = d.department_id;
Output:
first_name | department_name |
---|---|
John | Sales |
Sarah | Marketing |
Mike | Sales |
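
Because every sample employee has a department, the output above contains no NULL values. The sketch below adds one hypothetical employee (199, Alex, department 9), which is not part of the original data, to show what an unmatched left row looks like. It runs on an in-memory SQLite database via Python's sqlite3 module, used here purely as a convenient stand-in engine.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE employees (employee_id INTEGER, first_name TEXT, department_id INTEGER);
    CREATE TABLE departments (department_id INTEGER, department_name TEXT);
    INSERT INTO employees VALUES (101, 'John', 1), (102, 'Sarah', 2), (103, 'Mike', 1);
    INSERT INTO departments VALUES (1, 'Sales'), (2, 'Marketing'), (3, 'IT');
    -- Hypothetical row (not in the article's data): department 9 does not exist.
    INSERT INTO employees VALUES (199, 'Alex', 9);
""")

left_rows = conn.execute("""
    SELECT e.first_name, d.department_name
    FROM employees e
    LEFT JOIN departments d ON e.department_id = d.department_id
    ORDER BY e.employee_id
""").fetchall()

for row in left_rows:
    print(row)  # SQL NULL comes back as Python None
```

Alex is kept in the result with department_name set to NULL, whereas an INNER JOIN would drop that row entirely.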
c. RIGHT JOIN: Returns every row from the right table along with any matching rows from the left table. Where a right row has no match, the left table's columns are returned as NULL. Here the IT department has no employees, so its first_name is NULL.
Query:
SELECT e.first_name, d.department_name
FROM employees e
RIGHT JOIN departments d
ON e.department_id = d.department_id;
Output:
first_name | department_name |
---|---|
John | Sales |
Sarah | Marketing |
Mike | Sales |
NULL | IT |
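
A RIGHT JOIN can always be rewritten as a LEFT JOIN with the two tables swapped, which helps on engines or versions that lack RIGHT JOIN. The sketch below demonstrates the rewrite on an in-memory SQLite database via Python's sqlite3 module (a stand-in engine, chosen as an assumption for this example). The ORDER BY is added for a deterministic row order.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE employees (employee_id INTEGER, first_name TEXT, department_id INTEGER);
    CREATE TABLE departments (department_id INTEGER, department_name TEXT);
    INSERT INTO employees VALUES (101, 'John', 1), (102, 'Sarah', 2), (103, 'Mike', 1);
    INSERT INTO departments VALUES (1, 'Sales'), (2, 'Marketing'), (3, 'IT');
""")

# Same result as "employees e RIGHT JOIN departments d":
# departments is now the left (preserved) table.
swapped_rows = conn.execute("""
    SELECT e.first_name, d.department_name
    FROM departments d
    LEFT JOIN employees e ON e.department_id = d.department_id
    ORDER BY d.department_id, e.employee_id
""").fetchall()

for row in swapped_rows:
    print(row)
```

The result contains the same four rows as the RIGHT JOIN output, including the IT row with a NULL first_name.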
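d. FULL OUTER JOIN: Returns all rows from both tables, pairing them where the join condition matches. Rows without a match on the other side have that side's columns filled with NULL. Some engines, MySQL among them, do not support FULL OUTER JOIN directly.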
Query:
SELECT e.first_name, d.department_name
FROM employees e
FULL OUTER JOIN departments d
ON e.department_id = d.department_id;
Output:
first_name | department_name |
---|---|
John | Sales |
Sarah | Marketing |
Mike | Sales |
NULL | IT |
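
Where FULL OUTER JOIN is unavailable, one common workaround is a LEFT JOIN combined through UNION ALL with the right-table rows that have no match. The sketch below shows that emulation on an in-memory SQLite database via Python's sqlite3 module (a stand-in engine). The employee Alex in department 9 is a hypothetical row added so that both tables contribute an unmatched row.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE employees (employee_id INTEGER, first_name TEXT, department_id INTEGER);
    CREATE TABLE departments (department_id INTEGER, department_name TEXT);
    INSERT INTO employees VALUES (101, 'John', 1), (102, 'Sarah', 2), (103, 'Mike', 1);
    INSERT INTO departments VALUES (1, 'Sales'), (2, 'Marketing'), (3, 'IT');
    -- Hypothetical row (not in the article's data): department 9 does not exist.
    INSERT INTO employees VALUES (199, 'Alex', 9);
""")

full_rows = conn.execute("""
    -- Part 1: every employee, with the department when one matches.
    SELECT e.first_name, d.department_name
    FROM employees e
    LEFT JOIN departments d ON e.department_id = d.department_id
    UNION ALL
    -- Part 2: departments that no employee references.
    SELECT NULL, d.department_name
    FROM departments d
    WHERE NOT EXISTS (
        SELECT 1 FROM employees e WHERE e.department_id = d.department_id
    )
""").fetchall()

for row in full_rows:
    print(row)
```

The five resulting rows cover the three matched pairs, Alex with a NULL department, and IT with a NULL first_name. No ORDER BY is applied, so the row order is not guaranteed.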
2. Window Functions
Window functions perform calculations across a set of rows that are related to the current row. Unlike aggregate functions used with GROUP BY, they do not collapse the result set: every input row still appears in the output.
a. ROW_NUMBER(): Assigns a unique sequential number to each row, starting from 1. With PARTITION BY, the numbering restarts at 1 in each partition.
Input Data: sales table
sale_id | employee_id | amount |
---|---|---|
1 | 101 | 500 |
2 | 102 | 700 |
3 | 101 | 450 |
4 | 103 | 600 |
5 | 102 | 900 |
Query:
SELECT employee_id, amount,
ROW_NUMBER() OVER (PARTITION BY employee_id ORDER BY amount DESC) AS row_num
FROM sales;
Output:
employee_id | amount | row_num |
---|---|---|
101 | 500 | 1 |
101 | 450 | 2 |
102 | 900 | 1 |
102 | 700 | 2 |
103 | 600 | 1 |
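
A frequent use of ROW_NUMBER() is picking the top row per group: number the rows inside each partition, then keep only row 1. The sketch below applies that pattern to the sales table using an in-memory SQLite database via Python's sqlite3 module (a stand-in engine; window functions need SQLite 3.25 or newer). The alias names row_num and numbered are illustrative choices.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE sales (sale_id INTEGER, employee_id INTEGER, amount INTEGER);
    INSERT INTO sales VALUES (1, 101, 500), (2, 102, 700), (3, 101, 450),
                             (4, 103, 600), (5, 102, 900);
""")

# The window function cannot be filtered in the same SELECT's WHERE clause,
# so it is computed in a derived table and filtered one level up.
top_sales = conn.execute("""
    SELECT employee_id, amount
    FROM (
        SELECT employee_id, amount,
               ROW_NUMBER() OVER (PARTITION BY employee_id ORDER BY amount DESC) AS row_num
        FROM sales
    ) AS numbered
    WHERE row_num = 1
    ORDER BY employee_id
""").fetchall()

print(top_sales)
```

Each employee appears once with their largest sale: 500 for 101, 900 for 102, and 600 for 103.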
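b. RANK(): Similar to ROW_NUMBER(), but rows with equal values receive the same rank, and the ranks that would follow a tie are skipped (for example 1, 2, 2, 4). The sample data has no tied amounts, so the ranks below are consecutive.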
Query:
SELECT employee_id, amount,
RANK() OVER (ORDER BY amount DESC) AS sales_rank
FROM sales;
Output:
employee_id | amount | sales_rank |
---|---|---|
102 | 900 | 1 |
102 | 700 | 2 |
103 | 600 | 3 |
101 | 500 | 4 |
101 | 450 | 5 |
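
The sample amounts are all distinct, so the tie behaviour is not visible in the output above. The sketch below adds one hypothetical sale (sale_id 6, employee 103, amount 700), which is not part of the original data, to create a tie, and compares RANK() with ROW_NUMBER(). It uses an in-memory SQLite database via Python's sqlite3 module as a stand-in engine (SQLite 3.25 or newer).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE sales (sale_id INTEGER, employee_id INTEGER, amount INTEGER);
    INSERT INTO sales VALUES (1, 101, 500), (2, 102, 700), (3, 101, 450),
                             (4, 103, 600), (5, 102, 900);
    -- Hypothetical row (not in the article's data) that ties with sale 2.
    INSERT INTO sales VALUES (6, 103, 700);
""")

ranked = conn.execute("""
    SELECT sale_id, amount,
           RANK()       OVER (ORDER BY amount DESC)          AS sales_rank,
           ROW_NUMBER() OVER (ORDER BY amount DESC, sale_id) AS row_num
    FROM sales
    ORDER BY amount DESC, sale_id
""").fetchall()

for sale_id, amount, sales_rank, row_num in ranked:
    print(sale_id, amount, sales_rank, row_num)
```

Both 700 sales share rank 2 and rank 3 is skipped, so the ranks run 1, 2, 2, 4, 5, 6, while ROW_NUMBER() still counts 1 through 6.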
c. LEAD(): Fetches the value from the next row in the result set, based on a defined ordering. The last row has no successor, so LEAD() returns NULL for it.
Query:
SELECT employee_id, amount,
LEAD(amount) OVER (ORDER BY amount DESC) AS next_amount
FROM sales;
Output:
employee_id | amount | next_amount |
---|---|---|
102 | 900 | 700 |
102 | 700 | 600 |
103 | 600 | 500 |
101 | 500 | 450 |
101 | 450 | NULL |
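
LEAD() is often combined with arithmetic to compare a row with its successor. The sketch below computes the gap between each sale and the next lower one. It runs on an in-memory SQLite database via Python's sqlite3 module (a stand-in engine, SQLite 3.25 or newer), and the column name gap_to_next is an illustrative choice.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE sales (sale_id INTEGER, employee_id INTEGER, amount INTEGER);
    INSERT INTO sales VALUES (1, 101, 500), (2, 102, 700), (3, 101, 450),
                             (4, 103, 600), (5, 102, 900);
""")

# amount minus the next (smaller) amount; NULL on the last row
# because LEAD() has nothing to return there.
gaps = conn.execute("""
    SELECT employee_id, amount,
           amount - LEAD(amount) OVER (ORDER BY amount DESC) AS gap_to_next
    FROM sales
    ORDER BY amount DESC
""").fetchall()

for row in gaps:
    print(row)
```

The gaps come out as 200, 100, 100, and 50, followed by NULL for the smallest sale.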
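d. SUM() OVER: Computes an aggregate, such as a running total, across the window without collapsing rows. With an ORDER BY inside the OVER clause, the sum accumulates row by row in that order.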
Query:
SELECT employee_id, amount,
SUM(amount) OVER (ORDER BY amount) AS running_total
FROM sales;
Output:
employee_id | amount | running_total |
---|---|---|
101 | 450 | 450 |
101 | 500 | 950 |
103 | 600 | 1550 |
102 | 700 | 2250 |
102 | 900 | 3150 |
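
Adding PARTITION BY restarts the running total for each group, and an explicit ROWS frame makes the accumulation rule visible instead of relying on the default frame. The sketch below computes a per-employee running total ordered by sale_id. It uses an in-memory SQLite database via Python's sqlite3 module as a stand-in engine (SQLite 3.25 or newer). Ordering by sale_id is an assumption made for this example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE sales (sale_id INTEGER, employee_id INTEGER, amount INTEGER);
    INSERT INTO sales VALUES (1, 101, 500), (2, 102, 700), (3, 101, 450),
                             (4, 103, 600), (5, 102, 900);
""")

running = conn.execute("""
    SELECT employee_id, sale_id, amount,
           SUM(amount) OVER (
               PARTITION BY employee_id
               ORDER BY sale_id
               ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW
           ) AS running_total
    FROM sales
    ORDER BY employee_id, sale_id
""").fetchall()

for row in running:
    print(row)
```

Employee 101 accumulates 500 then 950, employee 102 accumulates 700 then 1600, and employee 103 has a single total of 600.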
3. Subqueries
A subquery is a query nested inside another query. It can return a single value, multiple values, or even a table of values.
a. Subquery Returning a Single Value: A scalar subquery returns exactly one value, which can then be used in a comparison. Here it supplies the average salary for the WHERE clause.
Input Data: employees table
employee_id | first_name | salary |
---|---|---|
101 | John | 55000 |
102 | Sarah | 70000 |
103 | Mike | 62000 |
Query:
SELECT first_name, salary
FROM employees
WHERE salary > (SELECT AVG(salary) FROM employees);
Output (the average salary is about 62,333, so Mike's 62,000 does not qualify):
first_name | salary |
---|---|
Sarah | 70000 |
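
A useful habit with scalar subqueries is to run the inner query on its own first and check the value it returns. The sketch below does that for the three sample salaries and then runs the full query. It uses an in-memory SQLite database via Python's sqlite3 module, a stand-in engine chosen for convenience.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE employees (employee_id INTEGER, first_name TEXT, salary INTEGER);
    INSERT INTO employees VALUES (101, 'John', 55000), (102, 'Sarah', 70000),
                                 (103, 'Mike', 62000);
""")

# Step 1: the inner query alone returns a single value.
avg_salary = conn.execute("SELECT AVG(salary) FROM employees").fetchone()[0]
print(round(avg_salary, 2))  # (55000 + 70000 + 62000) / 3

# Step 2: the outer query keeps salaries strictly above that value.
above_avg = conn.execute("""
    SELECT first_name, salary
    FROM employees
    WHERE salary > (SELECT AVG(salary) FROM employees)
    ORDER BY employee_id
""").fetchall()
print(above_avg)
```

The average works out to roughly 62,333, so only Sarah's 70,000 passes the strict greater-than comparison.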
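b. Subquery Returning Multiple Values: A subquery that returns a list of values can be used with IN in a WHERE clause to filter results. This example reuses the sales table from the window functions section.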
Query:
SELECT first_name
FROM employees
WHERE employee_id IN (SELECT employee_id FROM sales WHERE amount > 600);
Output (only employee 102 has a sale above 600; Mike's single sale is exactly 600):
first_name |
---|
Sarah |
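
As with scalar subqueries, it helps to inspect the list that the inner query produces before reading the outer result. The sketch below runs both steps against the sample employees and sales tables, using an in-memory SQLite database via Python's sqlite3 module as a stand-in engine. DISTINCT is added to the inner query only to make the printed list easier to read.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE employees (employee_id INTEGER, first_name TEXT, salary INTEGER);
    CREATE TABLE sales (sale_id INTEGER, employee_id INTEGER, amount INTEGER);
    INSERT INTO employees VALUES (101, 'John', 55000), (102, 'Sarah', 70000),
                                 (103, 'Mike', 62000);
    INSERT INTO sales VALUES (1, 101, 500), (2, 102, 700), (3, 101, 450),
                             (4, 103, 600), (5, 102, 900);
""")

# Step 1: which employees have at least one sale strictly above 600?
big_sellers = conn.execute(
    "SELECT DISTINCT employee_id FROM sales WHERE amount > 600"
).fetchall()
print(big_sellers)

# Step 2: the outer query filters employees against that list.
names = conn.execute("""
    SELECT first_name
    FROM employees
    WHERE employee_id IN (SELECT employee_id FROM sales WHERE amount > 600)
    ORDER BY employee_id
""").fetchall()
print(names)
```

Only employee 102 (Sarah) qualifies, because the comparison is strict and the 600 sale by employee 103 is not greater than 600.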
c. Correlated Subquery: A subquery that references a column from the outer query, so it is logically evaluated once for each outer row.
Query:
SELECT e.first_name, e.salary
FROM employees e
WHERE e.salary > (SELECT AVG(salary) FROM employees WHERE department_id = e.department_id);
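This query returns employees whose salary is higher than the average salary in their department. It assumes the employees table carries both a department_id and a salary column, as in the table shown in section 4.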
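
To see the correlated subquery at work, the sketch below runs it against a five-row employees table with department_id and salary columns (the same rows used in the nested-query section). It first prints each department's average as a separate check, then runs the query itself. An in-memory SQLite database via Python's sqlite3 module serves as a stand-in engine.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE employees (employee_id INTEGER, first_name TEXT,
                            department_id INTEGER, salary INTEGER);
    INSERT INTO employees VALUES
        (101, 'John', 1, 55000), (102, 'Sarah', 2, 70000), (103, 'Mike', 1, 62000),
        (104, 'Emma', 3, 48000), (105, 'David', 2, 64000);
""")

# Per-department averages, shown only to make the comparison easy to verify.
dept_avgs = conn.execute("""
    SELECT department_id, AVG(salary)
    FROM employees
    GROUP BY department_id
    ORDER BY department_id
""").fetchall()
print(dept_avgs)

# The inner query references e.department_id from the outer row.
above_dept_avg = conn.execute("""
    SELECT e.first_name, e.salary
    FROM employees e
    WHERE e.salary > (
        SELECT AVG(salary) FROM employees WHERE department_id = e.department_id
    )
    ORDER BY e.employee_id
""").fetchall()
print(above_dept_avg)
```

The department averages are 58,500, 67,000, and 48,000, so Sarah and Mike are returned. Emma equals her one-person department's average and is excluded by the strict comparison.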
4. Nested Queries
A nested query involves multiple layers of subqueries and can be used to handle complex business logic.
a. Example of Nested Queries: Finding employees who earn more than their own department's average salary, considering only departments that have more than 5 employees.
Input Data: employees table
employee_id | first_name | department_id | salary |
---|---|---|---|
101 | John | 1 | 55000 |
102 | Sarah | 2 | 70000 |
103 | Mike | 1 | 62000 |
104 | Emma | 3 | 48000 |
105 | David | 2 | 64000 |
Query:
SELECT first_name, salary
FROM employees
WHERE salary > (
    SELECT AVG(e2.salary)
    FROM employees e2
    WHERE e2.department_id = employees.department_id
      AND e2.department_id IN (
          SELECT department_id
          FROM employees
          GROUP BY department_id
          HAVING COUNT(*) > 5
      )
);
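This query first identifies departments with more than 5 employees and then compares the salary of each employee in those departments against the department's average salary. For any other department the inner query finds no rows, AVG returns NULL, and the comparison filters the employee out. With the five-row sample table above, no department has more than 5 employees, so the query returns no rows.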
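
The sketch below runs the nested query against the sample table with the department-size threshold turned into a parameter, so the effect of the HAVING clause can be observed. The helper name above_department_average and the alternative threshold of 1 are hypothetical and are used only to get a non-empty result from five rows. An in-memory SQLite database via Python's sqlite3 module serves as a stand-in engine.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE employees (employee_id INTEGER, first_name TEXT,
                            department_id INTEGER, salary INTEGER);
    INSERT INTO employees VALUES
        (101, 'John', 1, 55000), (102, 'Sarah', 2, 70000), (103, 'Mike', 1, 62000),
        (104, 'Emma', 3, 48000), (105, 'David', 2, 64000);
""")

# Same query as in the text; the literal 5 is replaced by a bound parameter.
NESTED_QUERY = """
    SELECT first_name, salary
    FROM employees
    WHERE salary > (
        SELECT AVG(e2.salary)
        FROM employees e2
        WHERE e2.department_id = employees.department_id
          AND e2.department_id IN (
              SELECT department_id
              FROM employees
              GROUP BY department_id
              HAVING COUNT(*) > ?
          )
    )
    ORDER BY employee_id
"""

def above_department_average(min_size):
    """Run the nested query for departments with more than min_size employees."""
    return conn.execute(NESTED_QUERY, (min_size,)).fetchall()

rows_threshold_5 = above_department_average(5)  # threshold from the text
rows_threshold_1 = above_department_average(1)  # hypothetical smaller threshold
print(rows_threshold_5)
print(rows_threshold_1)
```

With the threshold of 5 from the text the result is empty. Lowering it to 1 keeps departments 1 and 2 (two employees each) and returns Sarah and Mike.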
Conclusion
SQL joins, window functions, and subqueries are powerful tools that can simplify complex queries and make your data analysis more effective. By mastering these techniques, you’ll be well-equipped to tackle real-world problems and perform