Breakdown of Time Spent on Snapchat Activities by Age Group: An SQL Tutorial

In this SQL tutorial, we’ll learn how to calculate and compare the time spent sending versus opening snaps on Snapchat, grouped by age buckets. This problem tests your ability to use SQL aggregation, conditional sums, and joins to generate insightful metrics. By the end of this tutorial, you’ll know how to break down activity metrics and calculate percentages efficiently in SQL.

Learning Objectives

  • Learn how to join tables to bring together different data points.
  • Use SUM() with CASE statements to conditionally aggregate data.
  • Calculate percentages and format them to two decimal places using SQL.
  • Apply these techniques to group and analyze user activity data.

Problem Statement

You have two tables:

  1. activities: Contains records of Snapchat user activities, including time spent sending, opening, and chatting.
  2. age_breakdown: Maps users to age groups.

Task: Write an SQL query to obtain a breakdown of the time spent sending versus opening snaps as a percentage of the total time spent on these activities, grouped by age group. Round the percentage to 2 decimal places.

Schema

activities Table

Column NameTypeDescription
activity_idintegerThe unique ID of each activity.
user_idintegerThe ID of the user performing the activity.
activity_typestringThe type of activity (‘send’, ‘open’, ‘chat’).
time_spentfloatThe time spent on the activity.
activity_datedatetimeThe date of the activity.

Example Input for activities

activity_iduser_idactivity_typetime_spentactivity_date
7274123open4.5006/22/2022 12:00:00
2425123send3.5006/22/2022 12:00:00
1413456send5.6706/23/2022 12:00:00
1414789chat11.0006/25/2022 12:00:00
2536456open3.0006/25/2022 12:00:00

age_breakdown Table

Column NameTypeDescription
user_idintegerThe ID of the user.
age_bucketstringThe age group of the user.

Example Input for age_breakdown

user_idage_bucket
12331-35
45626-30
78921-25

Expected Output

age_bucketsend_percopen_perc
26-3065.4034.60
31-3543.7556.25

Step-by-Step Solution

Let’s break this problem into a series of steps to make it more manageable.


Step 1: Join the Tables

To get a complete dataset for analysis, we need to join the activities table with the age_breakdown table. This will help us group the activity data by age buckets.

Interactive Exercise 1: Join Tables to Combine Data

Task: Write a query to join the activities and age_breakdown tables.

SELECT 
    a.user_id,
    ab.age_bucket,
    a.activity_type,
    a.time_spent
FROM activities a
JOIN age_breakdown ab ON a.user_id = ab.user_id;
  • Try It: Run this query to see a combined dataset of user activities along with their age buckets.

Explanation

  • JOIN: Combines rows from the activities table with the age_breakdown table where the user_id matches.

Step 2: Calculate Total Time Spent on Sending and Opening Snaps

Now, let’s calculate the total time spent on “send” and “open” activities for each age bucket. We use SUM() with a CASE statement to conditionally sum the time spent on each activity type.

Interactive Exercise 2: Calculate Time Spent on Each Activity

Task: Write a query to calculate the total time spent sending and opening snaps, grouped by age buckets.

SELECT 
    ab.age_bucket,
    SUM(CASE WHEN a.activity_type = 'send' THEN a.time_spent ELSE 0 END) AS total_send_time,
    SUM(CASE WHEN a.activity_type = 'open' THEN a.time_spent ELSE 0 END) AS total_open_time
FROM activities a
JOIN age_breakdown ab ON a.user_id = ab.user_id
GROUP BY ab.age_bucket;
  • Try It: Run this query to see the total time spent on sending and opening snaps for each age bucket.

Explanation

  • CASE Statements: Conditionals used to selectively sum time_spent based on activity_type.
  • SUM(): Aggregates the total time for each activity type.
  • GROUP BY ab.age_bucket: Groups the results by age bucket.

Step 3: Calculate Percentages and Round Them

Now that we have the total time spent on sending and opening snaps, we need to calculate the percentages of total time spent for each activity type and round them to two decimal places.

Interactive Exercise 3: Calculate Percentages

Task: Write a query to calculate the percentages of time spent sending and opening snaps.

SELECT 
    ab.age_bucket,
    ROUND(
        SUM(CASE WHEN a.activity_type = 'send' THEN a.time_spent ELSE 0 END) / 
        SUM(CASE WHEN a.activity_type IN ('send', 'open') THEN a.time_spent ELSE 0 END) * 100.0, 2
    ) AS send_perc,
    ROUND(
        SUM(CASE WHEN a.activity_type = 'open' THEN a.time_spent ELSE 0 END) / 
        SUM(CASE WHEN a.activity_type IN ('send', 'open') THEN a.time_spent ELSE 0 END) * 100.0, 2
    ) AS open_perc
FROM activities a
JOIN age_breakdown ab ON a.user_id = ab.user_id
GROUP BY ab.age_bucket;
  • Try It: Execute this query to see the percentage breakdown of time spent on sending and opening snaps, grouped by age bucket.

Explanation

  • Percentage Calculation: time_spent / (time_spent send + time_spent open) * 100.0 ensures a floating-point division to avoid integer division issues.
  • ROUND(): Rounds the calculated percentages to two decimal places for better readability.

Summary

In this tutorial, you learned how to:

  • Join tables to enrich your dataset for analysis.
  • Use SUM() with CASE statements to conditionally aggregate data.
  • Calculate percentages of total time spent and round them to two decimal places.

Final Query Recap:

SELECT 
    ab.age_bucket,
    ROUND(
        SUM(CASE WHEN a.activity_type = 'send' THEN a.time_spent ELSE 0 END) / 
        SUM(CASE WHEN a.activity_type IN ('send', 'open') THEN a.time_spent ELSE 0 END) * 100.0, 2
    ) AS send_perc,
    ROUND(
        SUM(CASE WHEN a.activity_type = 'open' THEN a.time_spent ELSE 0 END) / 
        SUM(CASE WHEN a.activity_type IN ('send', 'open') THEN a.time_spent ELSE 0 END) * 100.0, 2
    ) AS open_perc
FROM activities a
JOIN age_breakdown ab ON a.user_id = ab.user_id
GROUP BY ab.age_bucket;

Additional Exercises

Exercise 1: Include Other Activity Types

  • Task: Modify the query to include other activity types such as ‘chat’. How does this change the breakdown?

Conclusion

By following this tutorial, you now know how to use SQL to analyze user behavior on social media platforms. You can apply these techniques to other problems involving conditional aggregation, percentage calculations, and grouping by categories.