Monday, March 21, 2011

The Logic behind the Magic of DAX Cross Table Filtering

Automatic cross filtering between columns of the same table or related tables is a very powerful feature of DAX. It allows a measure to evaluate to different values for different cells in a pivot table even though the DAX expression for the measure does not change. Filter context is the underlying mechanism that enables this magic behavior. But it is also a very tricky concept that even befuddles some DAX experts. Marco Russo and Alberto Ferrari have introduced DAX filter context in Chapter 6 of their book Microsoft PowerPivot for Excel 2010. Marco has also blogged about how Calculate function works. Recently I have run into many questions from advanced DAX users which tell me that people are still confused about how filter context works exactly. And this will be the subject of today’s post.
This post assumes that you already have basic knowledge about measures, row context, filter context, and DAX functions Calculate, Values, All, etc.
A level 200 pop quiz on DAX
If you think you already know how filter context works, let me ask you a couple of level 200 questions on DAX to see if you can explain the nuances of some DAX expressions. If you don’t feel like being challenged now, it is still beneficial to read the questions so you have some examples to better understand the following sections. The questions are based on the data model inside the publicly available sample PowerPivot workbook Contoso Samples DAX Formulas.xlsx. You can download the sample workbook to try out the formulas yourself if you want to, but it is not required to answer the questions.
Question #1.
People have heard that fact tables are automatically filtered by slices on dimension tables, but not the other way around, or in more general terms, if there is a relationship from table A to table B, A is automatically filtered by any slices on columns of B but B is not automatically filtered by any slices on columns of A. So if you select
DimProductSubcategory[ProductSubcategoryName] = “Air Conditioners”
on a pivot table slicer, measure
CountRows(DimProduct)
returns 62 as DimProduct is limited to air conditioners. On the other hand, if you select
DimProduct[ProductLabel] = “0101001”,
measure
CountRows(DimProductSubcategory)
returns 44 instead of just 1 although only a single product is selected. To filter DimProductSubcategory by the selected product label, you can define a measure as
Calculate(CountRows(DimProductSubcategory), DimProduct)
which returns 1. So it seems like when you explicitly add DimProduct as a setfilter argument of Calculate, DimProductSubcategory will be filtered by DimProduct. But if I define a measure as
Calculate(CountRows(DimProductSubcategory), Values(DimProduct[ProductLabel]))  
to explicitly add the column that I know having a slice from the pivot table to the Calculate function , the measure formula returns 44 again. So what makes setfilter expression DimProduct work but Values(DimProduct[ProductLabel]) not work even though the filter only comes from [ProductLabel] column? If you think you have to add foreign key DimProduct[ProductSubcategoryKey] to the filter context in order for DimProductSubcategory to be filtered by DimProduct, you can try
Calculate(CountRows(DimProductSubcategory), Values(DimProduct[ProductSubcategoryKey]))
but it still returns 44. If you have enough patience, you can use Values function to explicitly add all 33 columns in DimProduct one by one as setfilter arguments to Calculate function and you still will get 44 back. So what is the difference between table expression DimProduct and the enumeration of all 33 columns in that table?
Question #2.
There are 2556 records in DimDate table, therefore if you add a measure with expression
CountRows(DimDate)
to a pivot table without any filters, the measure value would be 2556. Now if you add a second measure with expression
Calculate(CountRows(DimDate), FactSales)
to the same pivot table, the measure value would be 1096 since DimDate table is filtered by FactSales table and only dates with sales records are included. But if you add a third measure with expression
Calculate(CountRows(DimDate), All(FactSales))
to the pivot table, the measure value becomes 2556 again. Since this pivot table has no filters anywhere, shouldn’t FactSales and All(FactSales) return the same table? Now add a fourth measure with expression
Calculate(CountRows(DimDate), Filter(All(FactSales), true))
to the pivot table, the measure value becomes 1096 again. All three setfilter arguments return exactly the same table, why would we get back different results?
With these questions in mind, let’s examine the logic foundation upon which the magic world of DAX is built. At the end of the post, you will be able to find a logical explanation to all these seemingly inconsistent results.
The expanded view of a DAX base table
The best way to understand DAX cross table filtering is to think of each base table as extended by its related tables. When a relationship is created from table A to table B, the new A, which is really A left outer join B, includes both columns of A and columns of B. So in DAX, a table reference FactSales really refers to
FastSales
LOJ DimProduct LOJ DimProductSubcategory LOJ DimProductCategory
LOJ DimStore LOJ DimGeography LOJ DimDate LOJ DimChannel LOJ DimPromotion,
where LOJ means left outer join. This interpretation makes it easy to understand some other DAX syntax. For example, in DAX expression
Filter(FactSales, Related(DimProduct[ProductLabel])  = “0101001”),
Related(DimProduct[ProductLabel])  refers to the value of column DimProduct[ProductLabel]  in the extended FactSales table. As a second example, DAX expression
AllExcept(FactSales, DimProduct[ProductLabel])
returns a table with all columns of extended FactSales table except for column DimProduct[ProductLabel].


Build initial filter context
DAX filter context is a stack of tables. At the beginning, the stack is empty. Given a pivot table, a filter context is initially populated by adding slicers and page filters. For each cell in a pivot table, current members of row labels and column labels also add filters to filter context. Other pivot table operations like visual totals add to initial filter context as well but I will keep things simple here. At this point, we have set up an initial filter context in which the measure expression of the current cell is to be evaluated.
Measure invocation
If SumOfSales is the name of a measure and Sum(Sales[Amount]) is its DAX formula, DAX expression
[SumOfSales]
is equivalent to
Calculate(Sum(Sales[Amount]))
 and DAX expression
[SumOfSales](Date[Year] = 2001, Store[Country] = “USA”)
is equivalent to
Calculate(Sum(Sales[Amount]), Date[Year] = 2001, Store[Country] = “USA”).
So the syntax sugar which makes a measure name look like a function name is just a clever way to add tables to filter context before evaluating the expression associated with the measure. Since invoking a measure implicitly calls Calculate, from now on I’ll just focus on Calculate function as the same rules apply equally to measures.
Add tables to filter context
Calculate function performs the following operations:
1.       Create a new filter context by cloning the existing one.
2.       Move current rows in the row context to the new filter context one by one and apply blocking semantics against all previous tables.
3.       Evaluate each setfilter argument in the old filter context and then add setfilter tables to the new filter context one by one and apply blocking semantics against all tables that exist in the new filter context before the first setfilter table is added.
4.       Evaluate the first argument in the newly constructed filter context.
If a new table is added to filter context and it has blocking semantics against some tables already in the filter context, the affected tables are checked one by one, all common columns with the new table are marked as blocked on the existing table.
Let’s look at an example. Assume the current filter context has two filters: one filter is Date[Year] = 2011, the other filter is Store[Country] = “Canada”. We want to evaluate the following expression in the context
AverageX(Distinct(Date[Month]), Calculate(Sum(Sales[Amount]), Store[Country] = “USA”)).
The first argument of AverageX sets a month in row context.  When it comes to Calculate, it first removes the month from row context and adds it to filter context, it does not block anything since there is no [Month] column in existing filters. Next Calculate adds Store[Country] = “USA” to filter context which blocks existing filter Store[Country] = “Canada”. When Sum(Sales[Amount]) is evaluated, Sales table is filtered by the current month in 2011 and stores in USA.

Targets of filter context
After so much effort populating and modifying a filter context, when will the filters be applied? In DAX, the filters in a filter context apply to following DAX table expressions:
1.       A table expression that is simply a table reference, such as FactSales.
2.       Values(Table[Column]).
3.       Distinct(Table[Column]).
In cases of 2 and 3, the Table is filtered by filter context and then distinct values of [Column] are extracted from the filtered table.
So if your expression is
Calculate(SumX(Filter(FactSales, [SalesQuantity] > 1000), [SalesAmount]), Date[Year] = 2011),
the filter context only restricts FactSales and has no effect whatsoever on other parts of the formula. If you image every DAX formula is represented as a tree of parent and child function calls, a filter context is built at the top or in the middle of the tree but takes effect at leaf level table nodes.


Note that DAX function Sum(T[C]) is just a shorthand for SumX(T, [C]), the same is true for other aggregation functions which take a single column reference as argument. Therefore the table in those aggregation functions is filtered by filter context.
Apply filters to a target table
Finally we have identified a target table and are ready to apply filters from filter context. For each filter table in the filter context, we check to see if there are any common columns between the target table and the unblocked columns of the filter table. If there is at least one common column, the target table is semi-joined with the filter table, or in SQL-like terms
SELECT *
FROM TargetTable AS t
WHERE EXISTS
(
SELECT *
FROM FilterTable AS f
WHERE t.CommonColumns = f.CommonColumns
)
Each filter table is applied to the target table independently, so the target table is filtered by all relevant filters.
All, AllExcept, AllNoBlankRow
So far I have said that each setfilter argument of Calculate function returns a table which is added to filter context. Well, that is true as long as the setfilter is not one of the All functions. The All functions should really be renamed as BlockColumns when they are used as setfilter arguments. If one of the All functions is used as the top-level function of setfilter, it only blocks common columns of earlier tables but does not add itself to filter context.
In all other places, including as a sub-expression of a setfilter but not at the top level, All functions behave like any other DAX table expressions and always return a table. One special feature of All functions is that the Table argument inside All(Table), All(Table[Column]), AllExcept(Table, …), AllNoBlankRow(Table), etc. is not filtered by the current filter context.
Pop quiz answers
Answer to question #1.
When the initial filter context contains column DimProduct[ProductLabel], table DimProductSubcategory is not filtered as it does not have that column.
Now look at the next formula
Calculate(CountRows(DimProductSubcategory), DimProduct).
The setfilter argument DimProduct is filtered by [ProductLabel], and then table DimProductSubcategory is filtered by table DimProduct since they both share the columns from table DimProductSubcategory and table DimProductCategory.
Move onto the next two formulas
Calculate(CountRows(DimProductSubcategory), Values(DimProduct[ProductLabel]))
Calculate(CountRows(DimProductSubcategory), Values(DimProduct[ProductSubcategoryKey]))
Both setfilter arguments are a single column table and the column comes from table DimProduct. Since table DimProductSubcategory does not have any column from DimProduct, it is not filtered by filter context. For the same reason, you can add any columns from DimProduct to the filter context and none of them would impact DimProductSubcategory.
Answer to question #2.
In the first formula
Calculate(CountRows(DimDate), FactSales)
Both table DimDate and table FactSales share columns from DimDate, so DimDate is filtered by FactSales.
In the second formula
Calculate(CountRows(DimDate), All(FactSales))
All(FactSales) blocks any columns from FactSales, but since the filter context is empty, it has no effect. When DimDate is evaluated, filter context is still empty.
In the third formula
Calculate(CountRows(DimDate), Filter(All(FactSales), true))
The All function is not at the top level of setfilter argument, table Filter(All(FactSales), true) is added to filter context, table DimDate is filtered by filter context for the same reason as in the first formula.


56 comments:

  1. I really enjoyed this post. You write about this topic very well. There are many cherished moments in life, why not wear a beautiful dress! When looking back on special memories of your child wearing a gorgeous dress, it will make a fond memory.

    online pharmacy

    ReplyDelete
  2. Jeffrey, this is a very interesting post - thank you so much for sharing this knowledge on DAX internals

    I have a question regarding the first example.

    For the following DAX expression

    Calculate(CountRows(DimProductSubcategory), Values(DimProduct[ProductLabel]))

    You said that

    [Since table DimProductSubcategory does not have any column from DimProduct, it is not filtered by filter cont]

    Does that mean that because VALUES return a 1 column table, it does not have the same base table + extended table configuration as a regular table reference would have? In other words, VALUES would only have one column, hence it won't block any other table column [except the one that is returning]?

    Thanks!

    - Javier Guillen

    ReplyDelete
  3. The statement is a bit confusing. Let me clarify. The extended version of DimProductSubcategory does not have any common columns with the non-extended part of DimProduct. Column DimProduct[ProductLabel] comes from the non-extended part of DimProduct, therefore does not filter DimProductSubcategory directly.
    You are right that Values() function returns a single column table therefore can only block one column in the filter context.

    ReplyDelete
    Replies
    1. Can you please tell me why Calculate(CountRows(DimProductSubcategory), Values(DimProduct[ProductSubcategoryKey])) doesn't have common shared column? I think they are both contain ProductSubcategoryKey. I guess the reason why it doesn't filter is because DimProduct[ProductLabel] has no common column with Values(DimProduct[ProductSubcategoryKey]), so DimProduct[ProductLabel] = “0101001” will not filter Values(DimProduct[ProductSubcategoryKey]).

      Delete
    2. never mind, I think I understand now.

      Delete
  4. Hi Jeffrey,
    Firstly amazing post !
    I have a question WRT to Filter functions in a filter context of pivot table.
    Why is it an error if we write Calculate(Sum(Quota),Filter(Budget,Budget[Month]=Date[Month])) When i have Date[Month] on my rows ? Is the filter unable to equate Budget[Month]=Date[Month]to the existing Date[Month] on Rows ? Secondly, If I replace Date[Month] with Max(Date[Month]) it works . Can you share your thoughts ?

    ReplyDelete
  5. A column added to Rows on a pivot table is in the filter context, but row context. A column reference like Date[Month] in DAX is used to get a value in the current row context, you have to use Values(Date[Month]) to retrieve values from the current filter context.
    Max(Date[Month]) is equivalent to MaxX(Date, Date[Month]), here MaxX function starts a new scan over the Date table, hence establishing a new row context from which the second argument Date[Month] can retrieve a value.

    ReplyDelete
  6. This comment has been removed by the author.

    ReplyDelete
  7. Hi Jeffrey,
    Could you please explain what the term "blocking semantic" means ?

    ReplyDelete
  8. The more recent terminology should be overwrite semantics. That means an inner filter will overwrite an outer filter on the same column. For example, in the following DAX expression with nested Calculate's:

    Calculate(Calculate(Sum(Sales[Amount]), Customer[Name]="Ian"), Customer[Name]="Oliver")

    the outer filter on Oliver is overwritten by the inner filter therefore the expression returns sum of sales amount by Ian. The outer filter on Oliver has no effect (because it's overwritten or blocked) when DAX engine calculates Sum(Sales[Amount]).

    ReplyDelete
  9. I have a question regarding this sentences.

    "DAX filter context is a stack of tables."

    Is a filter like

    "Store[Country] = “USA”

    either as a pivot filter or as a filter in a calculate function, "converted" into the full table from where the column is taken and than added to the filter context?

    ReplyDelete
  10. Store[Country] = "USA" is equivalent to Filter(All(Store[Country]), Store[Country] = "USA"). Therefore, the filter added to the filter context is a table of one column Store[Country] and one row "USA". This is different from Filter(All(Store), Store[Country] = "USA") which is a table containing all columns fore the underlying physical table 'Store'.

    ReplyDelete
  11. This comment has been removed by the author.

    ReplyDelete
  12. This comment has been removed by the author.

    ReplyDelete
  13. This comment has been removed by the author.

    ReplyDelete
    Replies
    1. Sorry the new comment notifications have been routed to my Junk Mail box. If you have a question, please post it under my new blog site: pbidax.wordpress.com

      Delete
  14. Hello. Thank you for the article. I am becoming mad about the Calculate function, and I am a bit desperated. In your example:
    AverageX(Distinct(Date[Month]), Calculate(Sum(Sales[Amount]), Store[Country] = “USA”)).
    You say that, after setting a month in the row context as a result of the iterator AverageX, Calculate turns it into the filter context, but why is it that if that expression is not surrounded by the Calculate function? As far as I understand it, it should do filter transition with the row context created by its own first argument. Could you tell me what is the logic of operators and why calculate does context transition with row contexts generated otherwhere? Many thanks.

    ReplyDelete
  15. Hi there. Thanks for the effort to explain this difficult subject. However, I think that Marco Russo's and Alberto Ferrari's attempts at explanation are much more readable for the average user as presented in their ultimate book on DAX (can't remember the full title right now). I find the explanations given by them more user-friendly, even though very detailed (which is a plus as well). The 'blocking semantics' is certainly something that does not make things clearer, unfortunately, as it does not immediately translate well into intuition. The calculate function applies two types of interaction between filters---either filters are put in an AND state or one OVERWRITES the other. OVERWRITING is the right name and immediately forms the right intuition. I hope you'll agree.

    ReplyDelete
  16. You can also utilize the Darkness to stab straight through an enemy. This will result in you ripping the heart scr888 free credit in 2019 straight out of one of your countless foes; consequently, your darkness creature will ultimately eat the heart.I guarantee that a bevy of gamers will flounder their first thirty minutes in the game.

    ReplyDelete

  17. Hi my loved one! I want Slims supplement to say that this article is amazing, nice written and come with almost all important infos. I'd like to look extra posts like this . The Gaming Club bears a license from the government of Gibraltar, and claims to be one of a select few casinos that have a license from the Gibraltar government. A devotee of the Interactive Gaming Council (IGC), The Gaming Club follows every the guidelines laid alongside by the organization, something that has taking into consideration a long artifice in it brute ascribed as a great area to gamble online.

    Everything not quite The Gaming Club feels good; be it the promotions, the huge number of games, the combination banking options on offer, the unbiased security measures, or the fair and blamed gaming practices the casino adopts.

    The Gaming Club motors along on software developed by one of the giants of online gaming software press forward Microgaming. The software it uses is protester and has a range of features designed to enhance your online gambling experience and make you desire to arrive support after all round of gambling you do here.

    Another hallmark of a fine casino is the atmosphere of its customer support team, and The Gaming Club does not disappoint upon this front.
    https://slimssupplement.com

    ReplyDelete
  18. This comment has been removed by the author.

    ReplyDelete
  19. I cannot thank Mr Benjamin service enough and letting people know how grateful I am for all the assistance that you and your team staff have provided and I look forward to recommending friends and family should they need financial advice or assistance @ 1,9% Rate for Business Loan .Via Contact : .  247officedept@gmail.com. WhatsApp...+ 19893943740. Keep up the great work.
    Thanks, Busarakham.

    ReplyDelete
  20. The files:
    Contoso Sample DAX Formulas.xlsx
    DAX in the BI Tabular Model.docx
    can be downloaded from here:
    https://www.microsoft.com/en-us/download/details.aspx?id=28572

    ReplyDelete
  21. Some more concepts understood in filter context like top level All filters in Calculate. Very well described.
    I have a question. I have exports data of a country for one year. Columns are three: Country, Product, and Amount. I have calculated top10 products and their total amount for each country. I want to filter country column also to top10.

    ReplyDelete
  22. Great article,This blog looks great content with Excellent Information.
    Thank You...

    Power BI Online Training

    ReplyDelete
  23. Wish to master the use of SUMX DAX function in Power BI? At Enterprise DNA we have got you covered with our comprehensive Power BI & DAX Master Class. Sign up today and improve productivity with Power BI.

    ReplyDelete
  24. This comment has been removed by the author.

    ReplyDelete
  25. You can do very creative work in a particular field. Exceptional concept That was incredible share. Roy Batty Coat

    ReplyDelete
  26. Great information about wilderness for beginners giving the opportunity for new people.
    Mr Robot Jacket

    ReplyDelete
  27. Hello, could you please tell me which webhost you're using? I've loaded your blog in three different browsers, and I must tell that it loads far faster than the majority. Can you recommend a reputable internet hosting company at a reasonable price? Thank you very much; I really appreciate it! Womens Aviator Jacket

    ReplyDelete
  28. Excellent post. I was checking continuously this blog and I’m impressed!
    Very helpful info specially the last part �� I care for such info a lot.
    I was seeking this certain information for a long time.
    Thank you and best of luck.Click Me Here오피월드


    4CHHE

    ReplyDelete
  29. Hi There,
    Thank you for sharing the knowledgeable blog with us I hope that you will post many more blog with us:-
    All over the world, Buy SGT-78 Online you can connect with our online chemical shop named SmartResearchChemicals at any hour of the day.
    Email:info@onlineresearchchemlab.com
    Click here for more information:- more info

    ReplyDelete
  30. Of course, your article is good enough, 카지노사이트 but I thought it would be much better to see professional photos and videos together. There are articles and photos on these topics on my homepage, so please visit and share your opinions.


    ReplyDelete
  31. Great information about wilderness for beginners giving the opportunity for new people. 4th Hokage Cloak

    ReplyDelete
  32. Buy DNP Online | DNP 2 4 Dinitrophenol for sale
    DNP (2,4-Dinitrophenol), an industrial chemical with various applications, has gained steady popularity as a fat loss tool. Boasting an astounding 50% increase in metabolic rate, it is able to contribute to reported fat losses of 10-12 pounds in 8 days of use.

    visit: https://bestdnpshop.com/product/buy-dnp-online/
    Tel / WhatsApp: +1 (937) 601 - 3401

    ReplyDelete
  33. Thanks for the best blog. it was very useful for me.keep sharing such ideas in the future as well. bella swan jacket

    ReplyDelete
  34. Media Foster is the best SEO company who is known for providing result-oriented white hat & best SEO services at Mohali for all businesses. if you need more information visit our website.출장샵
    출장샵
    출장샵
    출장샵
    출장샵
    출장샵

    ReplyDelete
  35. I really like this article, thank you for sharing this informative article. Can I expect an article in detail for this topic- Ruby On Rails course

    ReplyDelete
  36. "Your blog has given me such insightful information on the subject that it has deepened my understanding. Thank you for sharing!"
    Golang Training

    ReplyDelete
  37. "I really enjoyed reading your blog post; it was very well-written and extremely informative."
    Golang Certification

    ReplyDelete
  38. "I appreciate you sharing such a useful topic.
    Golang Course

    ReplyDelete
  39. I adored your blog post very much! Your observations are so insightful and energizing. It's obvious that you spent a lot of time and effort writing and researching this essay. Your writing is interesting and simple to read, making even difficult subjects seem understandable. Readers like myself who are looking for insightful and well-informed information appreciate you sharing your expertise and viewpoint on this subject. Hopefully you'll write more informative stuff in the future. Continue your excellent job!
    Mulesoft Certification

    ReplyDelete
  40. We appreciate your commitment to producing informative and thought-provoking content. I now consider your blog to be a wonderful resource, and I anxiously anticipate your upcoming posts. Continue your excellent work!
    Mulesoft Training

    ReplyDelete
  41. Great great post; it definitely improved my understanding of the subject.
    CCSP Course

    ReplyDelete
  42. This blog definitely opened my eyes to new perspectives, and I couldn't agree more with what you said.
    CCSP Certification

    ReplyDelete
  43. "I couldn't agree more with your points; this blog really opened my eyes to new ideas!"
    SAP Analytics Cloud Certification

    ReplyDelete
  44. This blog piece was well-written and really interesting, and it made me hungry to check out more of your stuff.
    SAP Analytics Cloud Course

    ReplyDelete
  45. "Great post, I found it to be really insightful and stimulating!
    SAP Analytics Cloud Training

    ReplyDelete
  46. Students who are pursuing BSc from various universities can check their bsc time table online and prepare for their exams accordingly.

    ReplyDelete