Analyzing Business Data with T-SQL

Kevin Feasel (@feaselkl)
https://csmore.info/on/business

Who Am I? What Am I Doing Here?

Motivation

This talk is for developers, DBAs, and analysts who are comfortable with SELECT, JOIN, and GROUP BY and want to level up their T-SQL analytical skills.

You will leave with:

  1. A reusable calendar table you can drop into any SQL Server environment.
  2. Five window function patterns for running totals, period comparison, and ranking.
  3. The APPLY operator pattern for correlated subqueries that JOINs can't handle.

T-SQL Techniques We'll Cover

  • Common Table Expressions (CTEs)
  • Window Functions (OVER, LAG, LEAD)
  • Ranking Functions (ROW_NUMBER, RANK, DENSE_RANK)
  • The APPLY Operator (CROSS APPLY, OUTER APPLY)
  • Calendar Tables and CROSS JOIN Matrices
  • DATETRUNC and Modern Date Functions

What We'll Learn

What We'll Learn

What We'll Learn

Agenda

  1. What is a KPI?
  2. Financial KPIs
  3. Customer KPIs
  4. Sales KPIs
  5. Avidity KPIs

What is a Key Performance Indicator?

A key performance indicator (KPI) is a performance measurement of how well an organization is doing at achieving a specific goal.

KPIs typically follow a pattern:

  • Numeric measurements
  • Tied to an asserted organizational goal
  • Expected to identify whether the organization is meeting its goal
  • Usually has a target--a minimum (or maximum) threshold for identifying success
  • Has a specific grain

The Grain of a Measure

In the data warehousing world, we have the notion of a grain. The grain of a fact (or measurement) is the maximum level of specificity for that measurement. We usually define grain in terms of dimensions, explanatory information which helps provide relevant context to the fact.

Ex: what is the grain for orders at a grocery store?

Grains and Granularity

Once we know the grain of our measure, we know we can't dig any deeper (without making certain potentially-scary assumptions). We can, however, aggregate results and move up in our grain.

We also cannot move "orthogonal" to an existing grain. If we collect order data by customer and date, we cannot later aggregate this data by "missing" features like store location or register number.

We can, however, aggregate if there is a mapping function from our initial grain to the new grain, such as from customer to customer's favorite color.

Aggregations and Additivity

We can aggregate any number, but some aggregations don't make sense. There are three levels of additivity:

  • Additive -- We can sum up data rows without a problem. Ex: order revenue
  • Semi-additive -- We can sum up data rows along certain dimensions but not all dimensions. Ex: bank account balance
  • Non-additive -- We cannot sum up rows along any dimension. Ex: completion percentage, P/E ratio

Facts, Measures, and KPIs

A fact is a record in a dataset which tells us about something which has happened. A measure is some computation or explanation about the fact. A KPI is a measure which ties back to business need.

  • Fact -- We made a sale to Alice on April 9, 2023. We sold her 6 bags of red mulch at a price of $2.99 per bag.
  • Measure -- The total sale price, before tax, was $17.94. We sold this at 1:28 PM. Alice checked out at register 3. The SKU of the red mulch is M123-A.
  • KPI -- Our store has sold $1,800 in red mulch during the month of April. Our target for red mulch sales in April is 3,100.

Agenda

  1. What is a KPI?
  2. Financial KPIs
  3. Customer KPIs
  4. Sales KPIs
  5. Avidity KPIs

Financial KPIs

Revenue, cost, and profit are additive measures -- we can sum them freely across any dimension.

  • Revenue -- Total sales from order lines (aggregation, GROUP BY, HAVING)
  • Cost -- Estimated COGS using Common Table Expressions (CTEs)
  • Profit -- Revenue minus cost, with window functions (OVER clause) for grand totals

Demo Time

Agenda

  1. What is a KPI?
  2. Financial KPIs
  3. Customer KPIs
  4. Sales KPIs
  5. Avidity KPIs

Customer KPIs

Counting customers correctly means classifying their status over time.

  • Customer Counts -- New, retained, inactive, resurrected, or churned (DATETRUNC, CASE)
  • Calendar Tables -- Fill gaps in time series with CROSS JOIN matrices
  • Geography -- LAG/LEAD for period comparison; CROSS JOIN for month-to-month analysis

Demo Time

Agenda

  1. What is a KPI?
  2. Financial KPIs
  3. Customer KPIs
  4. Sales KPIs
  5. Avidity KPIs

Sales KPIs

How quickly do prospects convert to customers?

  • Time to First Conversion -- The APPLY operator (CROSS APPLY, OUTER APPLY)
  • APPLY runs a subquery once per row from the outer table
  • Chain multiple APPLY operations for readable calculations

Demo Time

Agenda

  1. What is a KPI?
  2. Financial KPIs
  3. Customer KPIs
  4. Sales KPIs
  5. Avidity KPIs

Avidity KPIs

Who are our most engaged customers?

  • Top Customers by Month -- TOP WITH TIES and ranking window functions
  • ROW_NUMBER vs RANK vs DENSE_RANK -- three ways to handle ties
  • PARTITION BY for per-group ranking within window functions

Review the code repository for several bonus KPIs and techniques!

Demo Time

Wrapping Up

Over the course of this talk, we have seen how to calculate a variety of KPIs for a retail company using T-SQL. Along the way, we covered:

  • Common Table Expressions for building queries in layers
  • Window functions for running totals and period-over-period comparison
  • The APPLY operator for correlated subqueries
  • Ranking functions for top-N analysis with proper tie handling
  • Calendar tables for filling gaps in time series data

Wrapping Up

To learn more, go here:
https://csmore.info/on/business


And for help, contact me:
feasel@catallaxyservices.com | @feaselkl


Catallaxy Services consulting:
https://CSmore.info/on/contact