At its simplest, APPLY
is similar to JOIN
but allows for a dependency between the data sets being merged together.
Microsoft introduced the APPLY
operator in SQL Server 2005, with two variants: CROSS APPLY
and OUTER APPLY
. CROSS APPLY
is semantically similar to INNER JOIN
, whereas OUTER APPLY
is akin to OUTER JOIN
.
“If you don't understand the APPLY operator, your skills are somewhere around the 50th percentile at best.” -- Adam Machanic
APPLY
is not always the best solution, but solves a number of problems in an elegant manner.
We can use sys.dm_exec_connections
to get basic information for each session, but this doesn't give us query text. There is a table-valued function named sys.dm_exec_sql_text([handle] varbinary)
but we need to pass in the handle for each row in our connections DMV.
Unfortunately, you cannot use INNER JOIN
to join to a function which expects a parameter.
APPLY
to execute table-valued functions as though they were tables.OUTER APPLY
can show rows that CROSS APPLY
would filter out.We want to figure out how many times somebody has ordered a water bottle from the Adventure Works store. We'd also like to see which product category and subcategory this belongs to.
APPLY
to return data from ad hoc (derived) functions.APPLY
worked better because early aggregation allowed us to reduce the number of rows going through a nested loop join.We want to see each customer's latest order, PO numbers associated with those orders, and the total due on each order.
There are three set-based options that we can use to perform this operation:
APPLY
There are Row By Agonizing Row (RBAR) solutions available like using a cursor/WHILE loop, but these will not perform well. We might also put something together with temporary tables which could scale, but start with simple.
APPLY
.APPLY
-based solution is inefficient for small loads but scales extremely well.With small data sets, pick the one you prefer most, as they will all perform well enough.
This is a generalization of the "specific child record" problem.
We want to see the average, minimum, maximum, and total prices of each customer's last 5 orders.
There are still three set-based options that we can use to perform this operation:
APPLY
With that said, however, the nested sub-query quickly becomes nasty to write, and we already know from the prior example that it doesn't scale well.
Similar to the specific case, the APPLY
version looks worse early on but scales much better.
Use either for small problem sets, but shift to APPLY
as the sets get larger.
APPLY
More Efficient?NOT ALWAYS!
APPLY
runs once for each element in the "parent" (or left-hand) side. Because each iteration is unique, you typically want to see APPLY
generate nested loop joins.
APPLY
and EfficiencyThe ideal scenario for APPLY
has:
In other words:
This also worked for aggregation because we had one "parent" row.
"Relatively few interesting rows out of a huge total number of rows" is a verbose way of saying "when it is more efficient to seek rather than scan an index." APPLY
does much better when it makes sense to seek.
The APPLY
operator will consequently perform poorly when:
Use other techniques in these cases or see if you can reduce data requirements.
SQL Server calculations tend to be repetitious. The APPLY
operator can help us with that by feeding the columns from an earlier table or derived function into a later derived function. In other words, named columns you create in one APPLY
statement can be used in the next APPLY
statement.
Use the APPLY
operator to remove repetition from calculations.
The APPLY
operator excels at a few activities:
This is not a cure-all operator, though. Be sure to try writing your code a few different ways to see what performs best.
To learn more, go here:
https://csmore.info/on/apply
And for help, contact me:
feasel@catallaxyservices.com | @feaselkl
Catallaxy Services consulting:
https://CSmore.info/on/contact