My goals in this talk:
A window function is a function applied to a set of rows.
Window functions operate over windows of data.
There are four key elements to a window function:
OVER()
clause: defines that this is a window function and the window we will usePARTITION BY
): filter rowsORDER BY
): sorting when we need an ordered set, but also provides meaning to frames when aggregating dataROWS
, RANGE
): filter within a partition
There are five classes of window function in SQL Server:
MIN()
, MAX()
, AVG()
, COUNT()
, SUM()
, etc.ROW_NUMBER()
, RANK()
, DENSE_RANK()
, NTILE()
LAG()
, LEAD()
, FIRST_VALUE()
, LAST_VALUE()
PERCENT_RANK()
, CUME_DIST()
, PERCENTILE_CONT()
, PERCENTILE_DISC()
STRING_AGG()
Some of the use cases we will cover:
If we define a window function, we can only reference it in the SELECT and ORDER BY clauses:
SQL Server (typically) processes a SELECT statement in the following order (simplifying for space purposes):
Window functions are processed in the SELECT
section, so anything prior to that in a statement can't use the results of the window function.
In other words, WHERE
, GROUP BY
, HAVING
, et al, shape the window, so the window cannot subsequently shape them!
The easiest workaround, and something we've seen already, is to use a common table expression (CTE):
The RANGE
window operator is a lot more powerful in the SQL standards than what we have in T-SQL.
The RANGE
window operator in SQL Server is missing some great support around intervals:
By contrast, the ROWS
window operator has more flexibility.
Some additional things to keep in mind:
APPLY
The rule of thumb around making window functions as fast as possible is to provide an index in the following order:
Remember that the WHERE
clause operates before the window function, so if you have a great filter, put it before the POC.
SQL Server generally operates in row mode, in which one row is processed by an operator at a time.
Introduced in SQL Server 2012, but improved greatly in SQL Server 2016 and even more in 2019.
Batch mode provides:
For batch mode, you need:
If you have SQL Server 2016 or 2017 but don't want to work with a columnstore index, you can take advantage of a trick: create a filtered columnstore index, particularly one on an impossible condition.
Over the course of this talk, we have looked at the concept of window functions, as well as a variety of use cases for them and how to optimize window function queries.
To learn more, go here:
https://csmore.info/on/windows
And for help, contact me:
feasel@catallaxyservices.com | @feaselkl
Catallaxy Services consulting:
https://CSmore.info/on/contact