In this tutorial, we will solve a problem that involves identifying the top cities with the highest number of completed trade orders. We’ll use two tables from a Robinhood trading system: trades
and users
. Our goal is to find the top three cities with the highest number of completed trades and list them in descending order based on the number of completed orders.
Problem Overview
You have the following two tables:
trades
Table
order_id
: Unique identifier for each trade order.user_id
: Identifier for the user who placed the trade.quantity
: Number of units traded.status
: Status of the trade (either ‘Completed’ or ‘Cancelled’).date
: Date and time when the trade was placed.price
: Price per unit of the traded item.
users
Table
user_id
: Unique identifier for each user.city
: City where the user is located.email
: User’s email address.signup_date
: Date and time when the user signed up.
Objective: Write a query to retrieve the top three cities with the highest number of completed trade orders. The results should be sorted in descending order based on the number of completed orders. Output the city name and the corresponding number of completed orders.
Example Data:
trades
Table:
order_id | user_id | quantity | status | date | price |
---|---|---|---|---|---|
100101 | 111 | 10 | Cancelled | 08/17/2022 12:00:00 | 9.80 |
100102 | 111 | 10 | Completed | 08/17/2022 12:00:00 | 10.00 |
100259 | 148 | 35 | Completed | 08/25/2022 12:00:00 | 5.10 |
100264 | 148 | 40 | Completed | 08/26/2022 12:00:00 | 4.80 |
100305 | 300 | 15 | Completed | 09/05/2022 12:00:00 | 10.00 |
100400 | 178 | 32 | Completed | 09/17/2022 12:00:00 | 12.00 |
100565 | 265 | 2 | Completed | 09/27/2022 12:00:00 | 8.70 |
users
Table:
user_id | city | signup_date | |
---|---|---|---|
111 | San Francisco | rrok10@gmail.com | 08/03/2021 12:00:00 |
148 | Boston | sailor9820@gmail.com | 08/20/2021 12:00:00 |
178 | San Francisco | harrypotterfan182@gmail.com | 01/05/2022 12:00:00 |
265 | Denver | shadower_@hotmail.com | 02/26/2022 12:00:00 |
300 | San Francisco | houstoncowboy1122@hotmail.com | 06/30/2022 12:00:00 |
Desired Output:
city | total_orders |
---|---|
San Francisco | 3 |
Boston | 2 |
Denver | 1 |
Explanation:
- San Francisco has the highest number of completed trade orders with a total of 3.
- Boston has the second highest number with 2 completed orders.
- Denver ranks third with 1 completed order.
SQL Query Breakdown
To solve this problem, follow these steps:
- Join the
trades
table with theusers
table onuser_id
to get the city information for each trade. - Filter the trades to include only those with a ‘Completed’ status.
- Count the number of completed trades per city.
- Sort the results in descending order based on the count of completed trades.
- Limit the results to the top three cities.
Here is the SQL query to achieve this:
SELECT
u.city,
COUNT(t.order_id) AS total_orders
FROM
trades t
JOIN
users u ON t.user_id = u.user_id
WHERE
t.status = 'Completed'
GROUP BY
u.city
ORDER BY
total_orders DESC
LIMIT 3;
Query Explanation:
- Joining Tables:
JOIN users u ON t.user_id = u.user_id
: Joins thetrades
table with theusers
table to include city information.
- Filtering Completed Trades:
WHERE t.status = 'Completed'
: Ensures only completed trades are counted.
- Counting Completed Orders:
COUNT(t.order_id)
: Counts the number of completed trade orders per city.
- Grouping and Sorting:
GROUP BY u.city
: Groups the results by city.ORDER BY total_orders DESC
: Sorts the results in descending order of completed orders.
- Limiting Results:
LIMIT 3
: Restricts the output to the top three cities with the highest number of completed orders.
Conclusion
This SQL query efficiently identifies the top cities with the highest number of completed trade orders by leveraging joins, aggregations, and sorting techniques. Running this query with your dataset will help you gain valuable insights into the trading activity across different cities.