SQL Server 2012 Analysis Services RC0 introduced a new SQL Server Profiler event class, DAX Query Plan, under the Query Processing event category. This
is an advanced and rich new event class, but there has been no official document
yet. Nonetheless, it has already attracted the attention of some users who are
pushing us to release more information as soon as possible. While waiting for an
official document to come out, I’ll try to find some spare time to temporarily
fill the gap by providing some background information on this event class in a
series of blog posts. As always, my goal is to provide accurate information with
sufficient technical details. There are plenty of other BI professionals who are
eager to help average users to learn this feature through intuitive, practical
examples. I’ll be using the tabular model AdventureWorks for SQL Server 2012 RC0
when I need to demonstrate different aspects of DAX query plans through
examples.
The DAX Query Plan event class has four event subclasses:
- DAX VertiPaq Logical Plan
- DAX VertiPaq Physical Plan
- DAX DirectQuery Algebrizer Tree
- DAX DirectQuery Logical Plan
Trace events of subclasses 1 and 2 are fired when a tabular
database is in VertiPaq mode. Trace events of
subclasses 3 and 4 are fired when a database is in DirectQuery mode. Since most tabular databases are likely to
run in VertiPaq mode, I’ll focus my discussions on the
first two types of events.
Logical Plans and Physical Plans
DAX Formula Engine evaluates a DAX expression in multiple stages and generates several tree data structures along the way, see Figure 1. The new trace event outputs two of the trees to help users investigate logic or performance issues. This is a great leap forward from the dark days of debugging MDX expressions. Logical plan trees show the primitive operations that make up the higher level user functions. The powerful yet sometimes mysterious automatic cross-table filtering becomes explicit in logical plan trees. Properties related to the sparsity of a scalar subtree tell you why DAX Formula Engine chooses one execution plan over another. If poor performance is caused by the Formula Engine, physical plan trees can help you locate the expensive sub-expressions that caused the problem.
Format of
the Query Plans
Let’s first study the general structure and format common to both types of plan trees. Send the following DAX query to the tabular AdventureWorks database:
define measure 'Internet Sales'[Total Sales Amount] = Sum([Sales Amount]) evaluate filter( addcolumns( crossjoin( values('Date'[Calendar Year]), values('Product Category'[Product Category Name]) ), "Total Sales Amount", [Total Sales Amount] ), not isblank([Total Sales Amount]) )
When you execute this query, the logical plan tree and the physical plan tree are shown in Figures 2 and 3 respectively.
As you can see, each plan tree is output as a multi-line
text. Each line represents a single operator node in the tree. The hierarchical
structure of a tree is maintained by indentation. Child nodes show up indented
below their parent nodes. Sibling nodes have the same level of indentation under
their parent. Each line begins with the name of the operator followed by a colon
and properties of the operator starting with the operator type.
Types of Operators
There are two types of logical plan nodes and two types of
physical plan nodes as shown in the table below. We’ll spend a lot more time in
future posts drilling into the details of various operators and their
properties.
Plan
Type
|
Operator
Type
|
Description
|
Logical Plan
|
ScaLogOp
|
Scalar Logical Operator
Outputs a scalar value of type numeric, string, Boolean,
etc.
|
RelLogOp
|
Relational Logical Operator
Outputs a table of columns and rows.
| |
Physical Plan
|
LookupPhyOp
|
Lookup Physical Operator
Given a current row as input, calculates and returns a scalar
value.
|
IterPhyOp
|
Iterator Physical Operator
Given a current row as an optional input, returns a sequence
of rows.
|
Number of Trace Events per Query
Each time the DAX Formula Engine is called to evaluate a DAX expression, a pair of DAX Query Plan events are generated. Therefore, a DAX query (Evaluate statement) triggers exactly two events: a logical plan event and a physical plan event. But an MDX query may produce any number of pairs of events depending on how many times the MDX Formula Engine has to call into the DAX Formula Engine. At the time of this writing, DAX Formula Engine cannot call back into MDX Formula Engine, although this may change in the future.
Event Trigger Points
Ideally, the DAX Formula Engine should generate both the logical plan and the physical plan before any query execution happens so users can capture the plans without being blocked by potentially long-running operations. But this is not the case in the current implementation. Logical plans are built in two stages. The first stage is quick and light-weight, but the second stage may need to execute a portion of the tree therefore potentially expensive. Unfortunately the logical plan event is fired after the second stage is completed, so sometimes users may have to wait for certain time-consuming operations to finish before they can capture the logical plan event. But in most cases, constructing and simplifying a logical plan is a quick process. On the other hand, building a physical plan can often involve expensive operations. Although currently the trace event only shows two types of physical plan nodes: lookup and iterator, there is actually a third type of plan node: spool. A spool plan is when an operator materializes its result in memory by executing its entire subtree. A physical plan tree may contain many nodes built from spools, each of which requires partial execution of a subtree before the entire physical plan tree is fully constructed. In particular, all leaf level nodes which require fetching data from the VertiPaq Engine currently always build spools to store VertiPaq results, therefore, users can see the physical plan event only after all VertiPaq queries have completed.
The new DAX Query
Plan trace event can assist you in writing efficient DAX expressions and
troubleshooting problematic DAX behavior. How you use them is up to you, but
first you need to understand the information contained within the plans and how
to interpret it. Today we have gone over the basics such as types of plans,
format of text, types of plan nodes, and when and how frequently the events are
fired. Next time we are going one step further to examine the various properties
of plan nodes.