LEVEL 1: SQL-ONLY CASE STUDIES (Easy — Warm Up)

🧠 Pehle SQL se shuru karo — interviewer ek scenario dega aur bolega "write a query for this." Yeh building blocks hain — agar yeh sahi kiye toh baaki sab easy lagega.

SQL Case Study 1: Employee Performance Analysis

The Scenario

"An HR team has an employees table and a performance_reviews table. They want to identify top performers, understand department-level patterns, and flag underperformers."

Tables Given

-- Table: employees
-- employee_id | name | department | hire_date | salary | manager_id

-- Table: performance_reviews
-- review_id | employee_id | review_date | score (1-10) | reviewer_id

Questions & Solutions

Q1: Find the average performance score per department, ordered highest to lowest.

SELECT 
    e.department,
    ROUND(AVG(pr.score), 2) AS avg_score,
    COUNT(DISTINCT e.employee_id) AS total_employees
FROM employees e
JOIN performance_reviews pr ON e.employee_id = pr.employee_id
GROUP BY e.department
ORDER BY avg_score DESC;

🧠 Sochne ka process: "Department-wise chahiye → GROUP BY department. Average score chahiye → AVG(). Order highest first → ORDER BY DESC."

Q2: Find employees who scored below the company average.

WITH company_avg AS (
    SELECT AVG(score) AS avg_score
    FROM performance_reviews
)
SELECT 
    e.name, 
    e.department,
    AVG(pr.score) AS emp_avg_score,
    ca.avg_score AS company_avg
FROM employees e
JOIN performance_reviews pr ON e.employee_id = pr.employee_id
CROSS JOIN company_avg ca
GROUP BY e.name, e.department, ca.avg_score
HAVING AVG(pr.score) < ca.avg_score
ORDER BY emp_avg_score ASC;

🧠 Why CTE? Company average ek hi number hai — CTE mein nikaal lo, phir compare karo. Subquery se bhi ho sakta hai but CTE readable hota hai. Interview mein bolo: "I prefer CTEs for readability."

Q3: Rank employees within each department by their latest review score.

SELECT 
    e.name,
    e.department,
    pr.score AS latest_score,
    DENSE_RANK() OVER (
        PARTITION BY e.department 
        ORDER BY pr.score DESC
    ) AS dept_rank
FROM employees e
JOIN performance_reviews pr ON e.employee_id = pr.employee_id
WHERE pr.review_date = (
    SELECT MAX(review_date) 
    FROM performance_reviews pr2 
    WHERE pr2.employee_id = e.employee_id
);

🧠 DENSE_RANK yaad hai? PARTITION BY department = "har department ke andar separately rank karo." Ties same rank deta hai, next rank skip nahi karta (2,2,3 not 2,2,4).

Q4: Find managers whose team's average performance dropped compared to last year.

WITH current_year AS (
    SELECT 
        e.manager_id,
        AVG(pr.score) AS avg_score
    FROM employees e
    JOIN performance_reviews pr ON e.employee_id = pr.employee_id
    WHERE pr.review_date >= '2024-01-01'
    GROUP BY e.manager_id
),
previous_year AS (
    SELECT 
        e.manager_id,
        AVG(pr.score) AS avg_score
    FROM employees e
    JOIN performance_reviews pr ON e.employee_id = pr.employee_id
    WHERE pr.review_date >= '2023-01-01' AND pr.review_date < '2024-01-01'
    GROUP BY e.manager_id
)
SELECT 
    m.name AS manager_name,
    py.avg_score AS last_year_avg,
    cy.avg_score AS this_year_avg,
    ROUND(cy.avg_score - py.avg_score, 2) AS change
FROM current_year cy
JOIN previous_year py ON cy.manager_id = py.manager_id
JOIN employees m ON cy.manager_id = m.employee_id
WHERE cy.avg_score < py.avg_score
ORDER BY change ASC;

🧠 Yeh query complex lagti hai but logic simple hai: "Previous year ka average nikalo, current year ka nikalo, compare karo, jahan drop hua woh dikhao." Multiple CTEs use karo — interviewer impressed hota hai.

SQL Case Study 2: E-Commerce Sales Analysis

The Scenario

"An online retailer wants to understand their sales performance, customer behavior, and product trends using SQL."

Tables Given

-- Table: orders
-- order_id | customer_id | order_date | total_amount | status

-- Table: order_items
-- item_id | order_id | product_id | quantity | unit_price

-- Table: products
-- product_id | product_name | category | brand

-- Table: customers
-- customer_id | name | city | signup_date

Questions & Solutions

Q1: What is the total revenue per month for the last 12 months?

SELECT 
    DATE_TRUNC('month', order_date) AS month,
    SUM(total_amount) AS monthly_revenue,
    COUNT(DISTINCT order_id) AS total_orders,
    COUNT(DISTINCT customer_id) AS unique_customers
FROM orders
WHERE status = 'completed'
  AND order_date >= CURRENT_DATE - INTERVAL '12 months'
GROUP BY DATE_TRUNC('month', order_date)
ORDER BY month;

🧠 Always filter for completed orders. Cancelled/returned orders ko count karna galat hoga — interviewer yeh check karega ki tumne yeh socha ya nahi.

Q2: Find the top 5 products by revenue, with their category.

SELECT 
    p.product_name,
    p.category,
    SUM(oi.quantity * oi.unit_price) AS total_revenue,
    SUM(oi.quantity) AS units_sold
FROM order_items oi
JOIN products p ON oi.product_id = p.product_id
JOIN orders o ON oi.order_id = o.order_id
WHERE o.status = 'completed'
GROUP BY p.product_name, p.category
ORDER BY total_revenue DESC
LIMIT 5;

Q3: Calculate month-over-month revenue growth rate.

WITH monthly_revenue AS (
    SELECT 
        DATE_TRUNC('month', order_date) AS month,
        SUM(total_amount) AS revenue
    FROM orders
    WHERE status = 'completed'
    GROUP BY DATE_TRUNC('month', order_date)
)
SELECT 
    month,
    revenue,
    LAG(revenue) OVER (ORDER BY month) AS prev_month_revenue,
    ROUND(
        (revenue - LAG(revenue) OVER (ORDER BY month)) * 100.0 / 
        NULLIF(LAG(revenue) OVER (ORDER BY month), 0),
        2
    ) AS growth_rate_pct
FROM monthly_revenue
ORDER BY month;

🧠 LAG() yaad hai? Previous row ki value deta hai. NULLIF se divide-by-zero error se bachte hain — yeh small detail interview mein brownie points deta hai.

Q4: Identify customers who haven't ordered in the last 90 days (churn candidates).

SELECT 
    c.customer_id,
    c.name,
    c.city,
    MAX(o.order_date) AS last_order_date,
    CURRENT_DATE - MAX(o.order_date) AS days_since_last_order
FROM customers c
JOIN orders o ON c.customer_id = o.customer_id
WHERE o.status = 'completed'
GROUP BY c.customer_id, c.name, c.city
HAVING MAX(o.order_date) < CURRENT_DATE - INTERVAL '90 days'
ORDER BY days_since_last_order DESC;

🧠 Yeh DecisionTree ka actual project se related hai! Unka churn prediction case study isi concept pe based tha. Interview mein connect karo: "This SQL identifies potential churn candidates, which aligns with your packaging distributor case study."

Q5: Find the average order value (AOV) by customer city, and flag cities that are below the overall average.

WITH city_aov AS (
    SELECT 
        c.city,
        AVG(o.total_amount) AS avg_order_value,
        COUNT(DISTINCT o.order_id) AS order_count
    FROM customers c
    JOIN orders o ON c.customer_id = o.customer_id
    WHERE o.status = 'completed'
    GROUP BY c.city
),
overall AS (
    SELECT AVG(total_amount) AS overall_aov
    FROM orders
    WHERE status = 'completed'
)
SELECT 
    ca.city,
    ca.avg_order_value,
    ca.order_count,
    ov.overall_aov,
    CASE 
        WHEN ca.avg_order_value < ov.overall_aov THEN '⚠️ Below Average'
        ELSE '✅ On Track'
    END AS status_flag
FROM city_aov ca
CROSS JOIN overall ov
ORDER BY ca.avg_order_value DESC;

🧠 CASE WHEN = SQL ka if-else. Business mein flags lagana bahut common hai — "kaunsi city mein focus karna hai?" bol ke actionable insight de rahe ho.

SQL Case Study 3: Customer Retention Analysis

The Scenario

"A subscription-based SaaS company wants to understand their customer retention patterns. They have monthly subscription data."

Table Given

-- Table: subscriptions
-- subscription_id | customer_id | start_date | end_date | plan_type | monthly_price
-- (end_date IS NULL means still active)

Q1: What is the overall churn rate?

SELECT 
    COUNT(CASE WHEN end_date IS NOT NULL THEN 1 END) AS churned,
    COUNT(*) AS total,
    ROUND(
        COUNT(CASE WHEN end_date IS NOT NULL THEN 1 END) * 100.0 / COUNT(*),
        2
    ) AS churn_rate_pct
FROM subscriptions;

Q2: What is the churn rate by plan type?

SELECT 
    plan_type,
    COUNT(*) AS total_customers,
    COUNT(CASE WHEN end_date IS NOT NULL THEN 1 END) AS churned,
    ROUND(
        COUNT(CASE WHEN end_date IS NOT NULL THEN 1 END) * 100.0 / COUNT(*),
        2
    ) AS churn_rate_pct
FROM subscriptions
GROUP BY plan_type
ORDER BY churn_rate_pct DESC;

🧠 COUNT(CASE WHEN ...) is a very powerful pattern. Conditional counting bina extra filter ke — ek hi query mein total aur churned dono nikal lo. Ratt lo yeh pattern.

Q3: Calculate customer lifetime (in months) for churned customers.

SELECT 
    plan_type,
    ROUND(AVG(
        EXTRACT(MONTH FROM AGE(end_date, start_date)) + 
        EXTRACT(YEAR FROM AGE(end_date, start_date)) * 12
    ), 1) AS avg_lifetime_months,
    ROUND(AVG(monthly_price * (
        EXTRACT(MONTH FROM AGE(end_date, start_date)) + 
        EXTRACT(YEAR FROM AGE(end_date, start_date)) * 12
    )), 0) AS avg_lifetime_value
FROM subscriptions
WHERE end_date IS NOT NULL
GROUP BY plan_type;

🧠 Lifetime Value (LTV) = monthly price × months active. Very common interview question. Interviewers love when you calculate LTV because it connects data to business revenue.

SQL Case Study 1: Employee Performance Analysis​

The Scenario​

Tables Given​

Questions & Solutions​

SQL Case Study 2: E-Commerce Sales Analysis​

The Scenario​

Tables Given​

Questions & Solutions​

SQL Case Study 3: Customer Retention Analysis​

The Scenario​

Table Given​

SQL Case Study 1: Employee Performance Analysis

The Scenario

Tables Given

Questions & Solutions

SQL Case Study 2: E-Commerce Sales Analysis

The Scenario

Tables Given

Questions & Solutions

SQL Case Study 3: Customer Retention Analysis

The Scenario

Table Given