Skip to main content

LEVEL 1: SQL-ONLY CASE STUDIES (Easy — Warm Up)

🧠 Pehle SQL se shuru karo — interviewer ek scenario dega aur bolega "write a query for this." Yeh building blocks hain — agar yeh sahi kiye toh baaki sab easy lagega.


SQL Case Study 1: Employee Performance Analysis

The Scenario

"An HR team has an employees table and a performance_reviews table. They want to identify top performers, understand department-level patterns, and flag underperformers."

Tables Given

-- Table: employees
-- employee_id | name | department | hire_date | salary | manager_id

-- Table: performance_reviews
-- review_id | employee_id | review_date | score (1-10) | reviewer_id

Questions & Solutions

Q1: Find the average performance score per department, ordered highest to lowest.

SELECT 
e.department,
ROUND(AVG(pr.score), 2) AS avg_score,
COUNT(DISTINCT e.employee_id) AS total_employees
FROM employees e
JOIN performance_reviews pr ON e.employee_id = pr.employee_id
GROUP BY e.department
ORDER BY avg_score DESC;

🧠 Sochne ka process: "Department-wise chahiye → GROUP BY department. Average score chahiye → AVG(). Order highest first → ORDER BY DESC."

Q2: Find employees who scored below the company average.

WITH company_avg AS (
SELECT AVG(score) AS avg_score
FROM performance_reviews
)
SELECT
e.name,
e.department,
AVG(pr.score) AS emp_avg_score,
ca.avg_score AS company_avg
FROM employees e
JOIN performance_reviews pr ON e.employee_id = pr.employee_id
CROSS JOIN company_avg ca
GROUP BY e.name, e.department, ca.avg_score
HAVING AVG(pr.score) < ca.avg_score
ORDER BY emp_avg_score ASC;

🧠 Why CTE? Company average ek hi number hai — CTE mein nikaal lo, phir compare karo. Subquery se bhi ho sakta hai but CTE readable hota hai. Interview mein bolo: "I prefer CTEs for readability."

Q3: Rank employees within each department by their latest review score.

SELECT 
e.name,
e.department,
pr.score AS latest_score,
DENSE_RANK() OVER (
PARTITION BY e.department
ORDER BY pr.score DESC
) AS dept_rank
FROM employees e
JOIN performance_reviews pr ON e.employee_id = pr.employee_id
WHERE pr.review_date = (
SELECT MAX(review_date)
FROM performance_reviews pr2
WHERE pr2.employee_id = e.employee_id
);

🧠 DENSE_RANK yaad hai? PARTITION BY department = "har department ke andar separately rank karo." Ties same rank deta hai, next rank skip nahi karta (2,2,3 not 2,2,4).

Q4: Find managers whose team's average performance dropped compared to last year.

WITH current_year AS (
SELECT
e.manager_id,
AVG(pr.score) AS avg_score
FROM employees e
JOIN performance_reviews pr ON e.employee_id = pr.employee_id
WHERE pr.review_date >= '2024-01-01'
GROUP BY e.manager_id
),
previous_year AS (
SELECT
e.manager_id,
AVG(pr.score) AS avg_score
FROM employees e
JOIN performance_reviews pr ON e.employee_id = pr.employee_id
WHERE pr.review_date >= '2023-01-01' AND pr.review_date < '2024-01-01'
GROUP BY e.manager_id
)
SELECT
m.name AS manager_name,
py.avg_score AS last_year_avg,
cy.avg_score AS this_year_avg,
ROUND(cy.avg_score - py.avg_score, 2) AS change
FROM current_year cy
JOIN previous_year py ON cy.manager_id = py.manager_id
JOIN employees m ON cy.manager_id = m.employee_id
WHERE cy.avg_score < py.avg_score
ORDER BY change ASC;

🧠 Yeh query complex lagti hai but logic simple hai: "Previous year ka average nikalo, current year ka nikalo, compare karo, jahan drop hua woh dikhao." Multiple CTEs use karo — interviewer impressed hota hai.


SQL Case Study 2: E-Commerce Sales Analysis

The Scenario

"An online retailer wants to understand their sales performance, customer behavior, and product trends using SQL."

Tables Given

-- Table: orders
-- order_id | customer_id | order_date | total_amount | status

-- Table: order_items
-- item_id | order_id | product_id | quantity | unit_price

-- Table: products
-- product_id | product_name | category | brand

-- Table: customers
-- customer_id | name | city | signup_date

Questions & Solutions

Q1: What is the total revenue per month for the last 12 months?

SELECT 
DATE_TRUNC('month', order_date) AS month,
SUM(total_amount) AS monthly_revenue,
COUNT(DISTINCT order_id) AS total_orders,
COUNT(DISTINCT customer_id) AS unique_customers
FROM orders
WHERE status = 'completed'
AND order_date >= CURRENT_DATE - INTERVAL '12 months'
GROUP BY DATE_TRUNC('month', order_date)
ORDER BY month;

🧠 Always filter for completed orders. Cancelled/returned orders ko count karna galat hoga — interviewer yeh check karega ki tumne yeh socha ya nahi.

Q2: Find the top 5 products by revenue, with their category.

SELECT 
p.product_name,
p.category,
SUM(oi.quantity * oi.unit_price) AS total_revenue,
SUM(oi.quantity) AS units_sold
FROM order_items oi
JOIN products p ON oi.product_id = p.product_id
JOIN orders o ON oi.order_id = o.order_id
WHERE o.status = 'completed'
GROUP BY p.product_name, p.category
ORDER BY total_revenue DESC
LIMIT 5;

Q3: Calculate month-over-month revenue growth rate.

WITH monthly_revenue AS (
SELECT
DATE_TRUNC('month', order_date) AS month,
SUM(total_amount) AS revenue
FROM orders
WHERE status = 'completed'
GROUP BY DATE_TRUNC('month', order_date)
)
SELECT
month,
revenue,
LAG(revenue) OVER (ORDER BY month) AS prev_month_revenue,
ROUND(
(revenue - LAG(revenue) OVER (ORDER BY month)) * 100.0 /
NULLIF(LAG(revenue) OVER (ORDER BY month), 0),
2
) AS growth_rate_pct
FROM monthly_revenue
ORDER BY month;

🧠 LAG() yaad hai? Previous row ki value deta hai. NULLIF se divide-by-zero error se bachte hain — yeh small detail interview mein brownie points deta hai.

Q4: Identify customers who haven't ordered in the last 90 days (churn candidates).

SELECT 
c.customer_id,
c.name,
c.city,
MAX(o.order_date) AS last_order_date,
CURRENT_DATE - MAX(o.order_date) AS days_since_last_order
FROM customers c
JOIN orders o ON c.customer_id = o.customer_id
WHERE o.status = 'completed'
GROUP BY c.customer_id, c.name, c.city
HAVING MAX(o.order_date) < CURRENT_DATE - INTERVAL '90 days'
ORDER BY days_since_last_order DESC;

🧠 Yeh DecisionTree ka actual project se related hai! Unka churn prediction case study isi concept pe based tha. Interview mein connect karo: "This SQL identifies potential churn candidates, which aligns with your packaging distributor case study."

Q5: Find the average order value (AOV) by customer city, and flag cities that are below the overall average.

WITH city_aov AS (
SELECT
c.city,
AVG(o.total_amount) AS avg_order_value,
COUNT(DISTINCT o.order_id) AS order_count
FROM customers c
JOIN orders o ON c.customer_id = o.customer_id
WHERE o.status = 'completed'
GROUP BY c.city
),
overall AS (
SELECT AVG(total_amount) AS overall_aov
FROM orders
WHERE status = 'completed'
)
SELECT
ca.city,
ca.avg_order_value,
ca.order_count,
ov.overall_aov,
CASE
WHEN ca.avg_order_value < ov.overall_aov THEN '⚠️ Below Average'
ELSE '✅ On Track'
END AS status_flag
FROM city_aov ca
CROSS JOIN overall ov
ORDER BY ca.avg_order_value DESC;

🧠 CASE WHEN = SQL ka if-else. Business mein flags lagana bahut common hai — "kaunsi city mein focus karna hai?" bol ke actionable insight de rahe ho.


SQL Case Study 3: Customer Retention Analysis

The Scenario

"A subscription-based SaaS company wants to understand their customer retention patterns. They have monthly subscription data."

Table Given

-- Table: subscriptions
-- subscription_id | customer_id | start_date | end_date | plan_type | monthly_price
-- (end_date IS NULL means still active)

Q1: What is the overall churn rate?

SELECT 
COUNT(CASE WHEN end_date IS NOT NULL THEN 1 END) AS churned,
COUNT(*) AS total,
ROUND(
COUNT(CASE WHEN end_date IS NOT NULL THEN 1 END) * 100.0 / COUNT(*),
2
) AS churn_rate_pct
FROM subscriptions;

Q2: What is the churn rate by plan type?

SELECT 
plan_type,
COUNT(*) AS total_customers,
COUNT(CASE WHEN end_date IS NOT NULL THEN 1 END) AS churned,
ROUND(
COUNT(CASE WHEN end_date IS NOT NULL THEN 1 END) * 100.0 / COUNT(*),
2
) AS churn_rate_pct
FROM subscriptions
GROUP BY plan_type
ORDER BY churn_rate_pct DESC;

🧠 COUNT(CASE WHEN ...) is a very powerful pattern. Conditional counting bina extra filter ke — ek hi query mein total aur churned dono nikal lo. Ratt lo yeh pattern.

Q3: Calculate customer lifetime (in months) for churned customers.

SELECT 
plan_type,
ROUND(AVG(
EXTRACT(MONTH FROM AGE(end_date, start_date)) +
EXTRACT(YEAR FROM AGE(end_date, start_date)) * 12
), 1) AS avg_lifetime_months,
ROUND(AVG(monthly_price * (
EXTRACT(MONTH FROM AGE(end_date, start_date)) +
EXTRACT(YEAR FROM AGE(end_date, start_date)) * 12
)), 0) AS avg_lifetime_value
FROM subscriptions
WHERE end_date IS NOT NULL
GROUP BY plan_type;

🧠 Lifetime Value (LTV) = monthly price × months active. Very common interview question. Interviewers love when you calculate LTV because it connects data to business revenue.