Brushing Up on Advanced Concepts: Joins, Window Functions, and Subqueries in SQL

Advanced SQL: Joins, Window Functions, and Subqueries with Complex Examples

In SQL, joins, window functions, and subqueries are essential for writing complex queries and extracting useful insights from data. This post walks through these advanced concepts with practical examples.

1. Joins

A join is used to combine rows from two or more tables based on a related column between them. There are various types of joins, each serving different purposes.

a. INNER JOIN: Fetch matching records from both tables.

Input Data:

employees table:

| employee_id | first_name | department_id |
|-------------|------------|---------------|
| 101         | John       | 1             |
| 102         | Sarah      | 2             |
| 103         | Mike       | 1             |

departments table:

| department_id | department_name |
|---------------|-----------------|
| 1             | Sales           |
| 2             | Marketing       |
| 3             | IT              |

Query:

SELECT e.first_name, d.department_name
FROM employees e
INNER JOIN departments d
ON e.department_id = d.department_id;

Output:

| first_name | department_name |
|------------|-----------------|
| John       | Sales           |
| Sarah      | Marketing       |
| Mike       | Sales           |

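b. LEFT JOIN: Fetch all records from the left table and the matching records from the right table. Where a left-table row has no match, the right table's columns are returned as NULL. In this sample data every employee has a matching department, so the output is the same as for the INNER JOIN.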

Query:

SELECT e.first_name, d.department_name
FROM employees e
LEFT JOIN departments d
ON e.department_id = d.department_id;

Output:

| first_name | department_name |
|------------|-----------------|
| John       | Sales           |
| Sarah      | Marketing       |
| Mike       | Sales           |
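Because every employee in the sample data has a department, the LEFT JOIN output above is indistinguishable from the INNER JOIN output. The following is a minimal sketch of a case where they differ. It assumes one extra employee, Priya (id 106), who is hypothetical and not part of the original data, and it assumes a database that accepts multi-row INSERT ... VALUES (for example PostgreSQL, MySQL, or SQLite).

```sql
-- Sample tables from the article, plus one hypothetical employee
-- with no department, so LEFT JOIN and INNER JOIN give different results.
CREATE TABLE departments (
    department_id   INTEGER PRIMARY KEY,
    department_name VARCHAR(50)
);
CREATE TABLE employees (
    employee_id   INTEGER PRIMARY KEY,
    first_name    VARCHAR(50),
    department_id INTEGER
);

INSERT INTO departments VALUES (1, 'Sales'), (2, 'Marketing'), (3, 'IT');
INSERT INTO employees VALUES
    (101, 'John', 1), (102, 'Sarah', 2), (103, 'Mike', 1),
    (106, 'Priya', NULL);  -- hypothetical row, not in the original data

-- LEFT JOIN keeps Priya and returns NULL for her department_name.
SELECT e.first_name, d.department_name
FROM employees e
LEFT JOIN departments d
    ON e.department_id = d.department_id
ORDER BY e.employee_id;
-- Rows: (John, Sales), (Sarah, Marketing), (Mike, Sales), (Priya, NULL)
-- An INNER JOIN with the same ON clause would return only the first three.
```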

c. RIGHT JOIN: Fetch all records from the right table and the matching records from the left table. Where a right-table row has no match, the left table's columns are returned as NULL. Here the IT department has no employees, so its first_name is NULL.

Query:

SELECT e.first_name, d.department_name
FROM employees e
RIGHT JOIN departments d
ON e.department_id = d.department_id;

Output:

| first_name | department_name |
|------------|-----------------|
| John       | Sales           |
| Sarah      | Marketing       |
| Mike       | Sales           |
| NULL       | IT              |

d. FULL OUTER JOIN: Fetch all records from both tables, matched where possible. Where a row has no match on the other side, that side's columns are filled with NULL.

Query:

SELECT e.first_name, d.department_name
FROM employees e
FULL OUTER JOIN departments d
ON e.department_id = d.department_id;

Output:

| first_name | department_name |
|------------|-----------------|
| John       | Sales           |
| Sarah      | Marketing       |
| Mike       | Sales           |
| NULL       | IT              |
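Some databases, MySQL among them, do not support FULL OUTER JOIN. One common workaround, sketched here under the assumption that the tables look like the sample data above, is to combine a LEFT JOIN with the unmatched rows from the other table.

```sql
CREATE TABLE departments (
    department_id   INTEGER PRIMARY KEY,
    department_name VARCHAR(50)
);
CREATE TABLE employees (
    employee_id   INTEGER PRIMARY KEY,
    first_name    VARCHAR(50),
    department_id INTEGER
);
INSERT INTO departments VALUES (1, 'Sales'), (2, 'Marketing'), (3, 'IT');
INSERT INTO employees VALUES (101, 'John', 1), (102, 'Sarah', 2), (103, 'Mike', 1);

-- Part 1: every employee, with the department if there is one.
SELECT e.first_name, d.department_name
FROM employees e
LEFT JOIN departments d
    ON e.department_id = d.department_id
UNION ALL
-- Part 2: only the departments that matched no employee at all.
SELECT NULL AS first_name, d.department_name
FROM departments d
LEFT JOIN employees e
    ON e.department_id = d.department_id
WHERE e.employee_id IS NULL;
-- Returns the same four rows as the FULL OUTER JOIN output
-- (row order is not guaranteed without an ORDER BY).
```

UNION ALL is used rather than UNION so that genuinely duplicate rows in the first part are not silently removed; the WHERE clause in the second part prevents matched rows from appearing twice.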

2. Window Functions

Window functions allow you to perform calculations across a set of table rows that are related to the current row, but they don’t collapse the result set like aggregate functions do with GROUP BY.

a. ROW_NUMBER(): Assigns a unique sequential number to each row, starting from 1. With PARTITION BY, the numbering restarts at 1 within each partition.

Input Data:

sales table:

| sale_id | employee_id | amount |
|---------|-------------|--------|
| 1       | 101         | 500    |
| 2       | 102         | 700    |
| 3       | 101         | 450    |
| 4       | 103         | 600    |
| 5       | 102         | 900    |

Query:

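SELECT employee_id, amount,
       ROW_NUMBER() OVER (PARTITION BY employee_id ORDER BY amount DESC) AS row_num
FROM sales;

Output:

| employee_id | amount | row_num |
|-------------|--------|---------|
| 101         | 500    | 1       |
| 101         | 450    | 2       |
| 102         | 900    | 1       |
| 102         | 700    | 2       |
| 103         | 600    | 1       |

The alias is row_num rather than rank because RANK is a reserved word in some databases (MySQL 8.0, for example), where an unquoted rank alias fails.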
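A frequent use of ROW_NUMBER() with PARTITION BY is picking the top row per group. The sketch below keeps each employee's largest sale from the sample sales table; the derived-table alias ranked and the column alias row_num are illustrative names.

```sql
CREATE TABLE sales (
    sale_id     INTEGER PRIMARY KEY,
    employee_id INTEGER,
    amount      INTEGER
);
INSERT INTO sales VALUES
    (1, 101, 500), (2, 102, 700), (3, 101, 450), (4, 103, 600), (5, 102, 900);

-- Number each employee's sales from largest to smallest, then keep row 1.
SELECT employee_id, amount
FROM (
    SELECT employee_id, amount,
           ROW_NUMBER() OVER (PARTITION BY employee_id ORDER BY amount DESC) AS row_num
    FROM sales
) ranked
WHERE row_num = 1
ORDER BY employee_id;
-- Rows: (101, 500), (102, 900), (103, 600)
```

The window function has to live in a derived table (or CTE) because window functions are evaluated after WHERE, so row_num cannot be filtered in the same query level that defines it.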

b. RANK(): Similar to ROW_NUMBER(), but it assigns the same rank to rows with equal values. Gaps appear in ranks after ties.

Query:

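SELECT employee_id, amount,
       RANK() OVER (ORDER BY amount DESC) AS sales_rank
FROM sales;

Output:

| employee_id | amount | sales_rank |
|-------------|--------|------------|
| 102         | 900    | 1          |
| 102         | 700    | 2          |
| 103         | 600    | 3          |
| 101         | 500    | 4          |
| 101         | 450    | 5          |

The sample amounts are all distinct, so no ties or gaps appear here. The alias is sales_rank because RANK is a reserved word in some databases (MySQL 8.0, for example).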
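The sample amounts contain no ties, so the gap behaviour of RANK() is not visible in the output above. The sketch below adds one hypothetical sale (sale_id 6, amount 700), which is not in the original data, and compares RANK() with DENSE_RANK(), which does not leave gaps.

```sql
CREATE TABLE sales (
    sale_id     INTEGER PRIMARY KEY,
    employee_id INTEGER,
    amount      INTEGER
);
INSERT INTO sales VALUES
    (1, 101, 500), (2, 102, 700), (3, 101, 450), (4, 103, 600), (5, 102, 900),
    (6, 103, 700);  -- hypothetical row that creates a tie at 700

SELECT employee_id, amount,
       RANK()       OVER (ORDER BY amount DESC) AS sales_rank,
       DENSE_RANK() OVER (ORDER BY amount DESC) AS dense_sales_rank
FROM sales
ORDER BY amount DESC, sale_id;
-- amount 900      -> sales_rank 1, dense_sales_rank 1
-- amount 700 (x2) -> sales_rank 2, dense_sales_rank 2
-- amount 600      -> sales_rank 4, dense_sales_rank 3  (RANK skips 3)
-- amount 500      -> sales_rank 5, dense_sales_rank 4
-- amount 450      -> sales_rank 6, dense_sales_rank 5
```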

c. LEAD(): Fetches the value from the next row in the result set, based on a defined ordering.

Query:

SELECT employee_id, amount, 
       LEAD(amount) OVER (ORDER BY amount DESC) AS next_amount
FROM sales;

Output:

| employee_id | amount | next_amount |
|-------------|--------|-------------|
| 102         | 900    | 700         |
| 102         | 700    | 600         |
| 103         | 600    | 500         |
| 101         | 500    | 450         |
| 101         | 450    | NULL        |
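LEAD() has a counterpart, LAG(), which reads the previous row instead of the next one, and both can be used inside expressions. As a small sketch on the same sales data, the query below shows the previous amount and the gap to the next amount; the aliases prev_amount and gap_to_next are illustrative.

```sql
CREATE TABLE sales (
    sale_id     INTEGER PRIMARY KEY,
    employee_id INTEGER,
    amount      INTEGER
);
INSERT INTO sales VALUES
    (1, 101, 500), (2, 102, 700), (3, 101, 450), (4, 103, 600), (5, 102, 900);

SELECT employee_id, amount,
       LAG(amount) OVER (ORDER BY amount DESC)           AS prev_amount,
       amount - LEAD(amount) OVER (ORDER BY amount DESC) AS gap_to_next
FROM sales
ORDER BY amount DESC;
-- amount 900 -> prev_amount NULL, gap_to_next 200
-- amount 700 -> prev_amount 900,  gap_to_next 100
-- amount 600 -> prev_amount 700,  gap_to_next 100
-- amount 500 -> prev_amount 600,  gap_to_next 50
-- amount 450 -> prev_amount 500,  gap_to_next NULL (no next row)
```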

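d. SUM() OVER: Calculates a running total (or, with other aggregate functions, any running aggregate) without collapsing rows. With ORDER BY inside OVER, each row's sum covers all rows up to and including the current row, plus any rows tied with it.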

Query:

SELECT employee_id, amount, 
       SUM(amount) OVER (ORDER BY amount) AS running_total
FROM sales;

Output:

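| employee_id | amount | running_total |
|-------------|--------|---------------|
| 101         | 450    | 450           |
| 101         | 500    | 950           |
| 103         | 600    | 1550          |
| 102         | 700    | 2250          |
| 102         | 900    | 3150          |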
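Running totals are often wanted per group rather than across the whole table. The sketch below restarts the total for each employee and orders by sale_id, on the assumption that sale_id reflects the order in which sales happened (the source does not say so). The explicit ROWS frame makes each row's total depend only on the rows before it, even if two rows share the same ordering value.

```sql
CREATE TABLE sales (
    sale_id     INTEGER PRIMARY KEY,
    employee_id INTEGER,
    amount      INTEGER
);
INSERT INTO sales VALUES
    (1, 101, 500), (2, 102, 700), (3, 101, 450), (4, 103, 600), (5, 102, 900);

SELECT employee_id, sale_id, amount,
       SUM(amount) OVER (
           PARTITION BY employee_id
           ORDER BY sale_id
           ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW
       ) AS running_total
FROM sales
ORDER BY employee_id, sale_id;
-- (101, 1, 500,  500)
-- (101, 3, 450,  950)
-- (102, 2, 700,  700)
-- (102, 5, 900, 1600)
-- (103, 4, 600,  600)
```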

3. Subqueries

A subquery is a query nested inside another query. It can return a single value, multiple values, or even a table of values.

a. Subquery Returning a Single Value: A scalar subquery returns exactly one value, which can be used wherever a single value is expected, such as in a comparison in the WHERE clause.

Input Data:

employees table:

| employee_id | first_name | salary |
|-------------|------------|--------|
| 101         | John       | 55000  |
| 102         | Sarah      | 70000  |
| 103         | Mike       | 62000  |

Query:

SELECT first_name, salary
FROM employees
WHERE salary > (SELECT AVG(salary) FROM employees);

Output:

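| first_name | salary |
|------------|--------|
| Sarah      | 70000  |

The average salary is about 62,333, so only Sarah qualifies; Mike's 62000 falls just below it.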
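A scalar subquery can also appear in the SELECT list. As a minimal sketch on the same three-row employees table, the query below shows how far each salary is from the average; the alias diff_from_avg is illustrative.

```sql
CREATE TABLE employees (
    employee_id INTEGER PRIMARY KEY,
    first_name  VARCHAR(50),
    salary      INTEGER
);
INSERT INTO employees VALUES
    (101, 'John', 55000), (102, 'Sarah', 70000), (103, 'Mike', 62000);

-- The subquery is evaluated to a single value: the overall average salary.
SELECT first_name, salary,
       salary - (SELECT AVG(salary) FROM employees) AS diff_from_avg
FROM employees
ORDER BY employee_id;
-- The average is about 62333, so John and Mike get a negative
-- diff_from_avg and only Sarah gets a positive one.
-- Exact decimals depend on how the database types AVG() over integers.
```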

b. Subquery Returning Multiple Values: Use a subquery with IN in a WHERE clause to filter against a list of values. This example uses the sales table from the window functions section.

Query:

SELECT first_name
FROM employees
WHERE employee_id IN (SELECT employee_id FROM sales WHERE amount > 600);

Output:

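| first_name |
|------------|
| Sarah      |

Only employee 102 (Sarah) has sales above 600. Mike's single sale is exactly 600, which does not satisfy amount > 600.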
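The same filter can be written with EXISTS instead of IN. This is a sketch of an equivalent formulation, not a claim that one is faster; which performs better depends on the database and its optimizer.

```sql
CREATE TABLE employees (
    employee_id INTEGER PRIMARY KEY,
    first_name  VARCHAR(50),
    salary      INTEGER
);
CREATE TABLE sales (
    sale_id     INTEGER PRIMARY KEY,
    employee_id INTEGER,
    amount      INTEGER
);
INSERT INTO employees VALUES
    (101, 'John', 55000), (102, 'Sarah', 70000), (103, 'Mike', 62000);
INSERT INTO sales VALUES
    (1, 101, 500), (2, 102, 700), (3, 101, 450), (4, 103, 600), (5, 102, 900);

-- Keep an employee if at least one of their sales exceeds 600.
SELECT e.first_name
FROM employees e
WHERE EXISTS (
    SELECT 1
    FROM sales s
    WHERE s.employee_id = e.employee_id
      AND s.amount > 600
);
-- Returns one row: Sarah
```

One practical difference shows up with the negated forms: NOT IN returns no rows if the subquery produces a NULL, whereas NOT EXISTS is not affected by NULLs in the same way.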

c. Correlated Subquery: A subquery that refers to columns of the outer query, so it is logically evaluated once for each outer row.

Query:

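SELECT e.first_name, e.salary
FROM employees e
WHERE e.salary > (
    SELECT AVG(e2.salary)
    FROM employees e2
    WHERE e2.department_id = e.department_id
);

This query returns employees whose salary is higher than the average salary in their department. It requires an employees table with both salary and department_id columns, such as the one shown in the Nested Queries section.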
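The same question can be answered with a window function instead of a correlated subquery. The sketch below uses the five-row employees table from the Nested Queries section, since that version has both salary and department_id; the aliases dept_avg and with_avg are illustrative.

```sql
CREATE TABLE employees (
    employee_id   INTEGER PRIMARY KEY,
    first_name    VARCHAR(50),
    department_id INTEGER,
    salary        INTEGER
);
INSERT INTO employees VALUES
    (101, 'John', 1, 55000), (102, 'Sarah', 2, 70000), (103, 'Mike', 1, 62000),
    (104, 'Emma', 3, 48000), (105, 'David', 2, 64000);

-- Attach each department's average to every row, then filter on it.
SELECT first_name, salary
FROM (
    SELECT employee_id, first_name, salary,
           AVG(salary) OVER (PARTITION BY department_id) AS dept_avg
    FROM employees
) with_avg
WHERE salary > dept_avg
ORDER BY employee_id;
-- Department averages: 1 -> 58500, 2 -> 67000, 3 -> 48000
-- Rows: (Sarah, 70000), (Mike, 62000)
```

Emma is excluded because she is the only employee in her department, so her salary equals the department average rather than exceeding it.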


4. Nested Queries

A nested query involves multiple layers of subqueries and can be used to handle complex business logic.

a. Example of Nested Queries: Finding employees who earn more than their department's average salary, considering only departments that have more than 5 employees.

Input Data:

employees table:

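| employee_id | first_name | department_id | salary |
|-------------|------------|---------------|--------|
| 101         | John       | 1             | 55000  |
| 102         | Sarah      | 2             | 70000  |
| 103         | Mike       | 1             | 62000  |
| 104         | Emma       | 3             | 48000  |
| 105         | David      | 2             | 64000  |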

Query:

SELECT e.first_name, e.salary
FROM employees e
WHERE e.salary > (
    SELECT AVG(e2.salary)
    FROM employees e2
    WHERE e2.department_id = e.department_id
    AND e2.department_id IN (
        SELECT e3.department_id
        FROM employees e3
        GROUP BY e3.department_id
        HAVING COUNT(*) > 5
    )
);

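This query first finds the departments with more than 5 employees, then compares each employee’s salary against the average salary of their own department, provided it is one of those. For employees in smaller departments the inner query returns NULL, the comparison is not true, and the row is dropped. With the five-row sample table above, no department has more than 5 employees, so the query returns no rows.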
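Deeply nested subqueries can be hard to read, and a common table expression (CTE) is one way to flatten them. The sketch below expresses the same logic with a CTE named dept_stats (an illustrative name). It lowers the threshold to more than 1 employee, which is an assumption made only so that the small sample table produces rows; the query in this section uses more than 5.

```sql
CREATE TABLE employees (
    employee_id   INTEGER PRIMARY KEY,
    first_name    VARCHAR(50),
    department_id INTEGER,
    salary        INTEGER
);
INSERT INTO employees VALUES
    (101, 'John', 1, 55000), (102, 'Sarah', 2, 70000), (103, 'Mike', 1, 62000),
    (104, 'Emma', 3, 48000), (105, 'David', 2, 64000);

-- One row per department: its average salary and its headcount.
WITH dept_stats AS (
    SELECT department_id,
           AVG(salary) AS avg_salary,
           COUNT(*)    AS headcount
    FROM employees
    GROUP BY department_id
)
SELECT e.first_name, e.salary
FROM employees e
JOIN dept_stats s
    ON s.department_id = e.department_id
WHERE s.headcount > 1            -- threshold lowered from 5 for the sample data
  AND e.salary > s.avg_salary
ORDER BY e.employee_id;
-- Departments 1 and 2 have two employees each; department 3 has one.
-- Rows: (Sarah, 70000), (Mike, 62000)
```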


Conclusion

SQL joins, window functions, and subqueries are powerful tools that can simplify complex queries and make your data analysis more effective. By mastering these techniques, you’ll be well-equipped to tackle real-world problems and perform