How the DAX Calculation Engine Works

No comments

Breaking down the way DAX works

The DAX engine is a complex beast. How it stores the data, then works through it to calculate answers is quite spectacular. Let’s step back for a moment though and try to distill it down into some simple steps, that we can memorise and intuitively start iterating over when we are building out Power BI models and attempting to understand what each formula is calculating for us in tables and charts.

One thing you can forget relatively quickly if diving straight into Power BI is that DAX calculates results in the format of a pivot table. When you’re building out chart and visuals, you can sometimes miss this key point. Every result in a chart is the same as the result stemming from one result, in one row of a pivot table.

DAX calculates every single result in each line of a pivot table individually! This is important to imagine because if you are struggling with a returned result you have to think exactly like this. ‘What is DAX calculating for each individual result?’.

Why is it this way? Well, every single result returned from a DAX formula will have a different ‘environment’ in which it is calculated in. Put simply SUM(Sales[Sales Amount]), could be thrown into a pivot or chart and filtered by either; dates, customers, products etc. All of these are different environments, they filter different tables or the same tables in different ways. In any new environment (or in more correct terminology, ‘context’) DAX will complete slightly different filtering and return different results accordingly.

Even in the same pivot or chart, we can have different contexts operating. Using dates as an example, ‘2012’ is a different context to ‘2013’ so a different result will be returned.

Hopefully, it’s clear why DAX has to calculate every result on its own and why it’s best to think exactly like this when you’re writing DAX or reviewing DAX formulas.

Can we break DAX calculations down into some simple steps? We definitely can, and let’s visualise it at the same time! As you are learning DAX, work through every single step here, and do it all the time. You will speed up your learning curve immensely.

Break it into 3 key steps

1. The Initial Filter Context

What is the environment in which the calculation is operating in? What is the context? What are the initial filters being placed on the underlying tables before any calculation takes place?

Before any calculation happens with a DAX formula you have entered, filters are placed on the underlying tables. This filter can come from numerous places.

Initial Filter Context

All highlighted parts of the above Power BI report are all filters. Some, like the slicers at the top of the page and the page level filter to the right automatically filter EVERY visual and piece of information on the report first up.

Individual visuals like the ‘Sales by Channel’ donut chart or the ‘Sales by Region’ map have an initial filter context of either channel or US state accordingly.

What’s amazing with these visuals, is that the separate sections within each of these individual visuals CAN, and DO filter the other charts as well dynamically when you click on any sub-section. It’s the drill down effect that allows you to get to answers far quicker than traditional reporting and analysis.

The best example of this is if you wanted to see all of these calculations for a particular day. You can click into a section of the ‘Sales Completed by Date’ column chart and then the initial filter context will change for EVERYTHING on the report page. Now that is seriously cool stuff.

Report page filtering

At this point, what is this dynamic filtering actually doing to the DAX calculation engine? It is altering the INITIAL FILTER context. Remember this!

2. Filters flow across tables through relationships

If you have built an efficient data model, you would have branched your data out into separate tables with different granular detail, like customers, products, sales etc.

When any filter it applied to a lookup table (customer or product table), then this will automatically filter through to the fact table (sales table) if you have set up your relationships correctly.

Remember, still no calculation has taken place, we are still only filtering the underlying tables to THEN perform a calculation over it.

Filtering Tables

Using the ‘Date’ table for an example. The column chart showing ‘Sales Completed by Date’ initially place just a filter on the dates table, but because we have linked this to the sales table the filter AUTOMATICALLY flows through to the ‘Sales Orders’ table. This table is totally filtered for each different date prior to any calculation being performed.

3. Evaluate the formula

Now, we know our environment, we understand the context we are in, we have filtered the underlying tables, that have flowed down through relationships. We are now ready to run a calculation.

The formula we have entered now triggers a calculation, like SUM, COUNTROWS, AVERAGEX. All the aggregation functions and iterators have their own way of calculating the result, but they now have the right context to evaluate in.

Formula evaluation

I’ve included a CALCULATE function here (which I know we haven’t really touched on yet). Let’s just focus in on the [Total Sales] measure. [Total Sales] runs separate calculation for EVERY individual calculation across the entire report page.  It sums up the correct column in the ‘Sales Orders’ table multiple times based on the initial filter contexts within which it is operating.

Those include ‘Channel’, ‘Regions’, ‘Warehouse’, ‘Dates’ and others. The card visual is also the result from NO additional filters other that the one coming from the slicer (‘2016’) and page level filter.

This is a lot to take in, but I re-iterate, as things get a little more complex you have to look at each calculation individually to understand the results you are receiving within Power BI.

Breaking down the formula evaluation piece

How the formula is actually evaluated can also depend on some additional context. These concepts are broken out into FILTER CONTEXT and ROW CONTEXT.

Filter context calculates results in a different way to Row context. They actually use two different calculation engines within DAX itself. The engine has been built this way to optimise performance. Running calculations one way versus another can make a huge difference to the performance of your models as they grow or due to the varying sizes of you underlying datasets.

Sometimes also though you have no choice. You must use either or context to work to the calculation you want. This is where your mastery of data modeling comes into play!

Filter context is used when you call functions such as SUM, COUNT, AVERAGE and many others. It likes to see rows of data remaining after the filters have been applied so it can complete one big calculation across the all the rows that remain unfiltered.

Row context is created in a couple of places or scenarios in Power BI. These include; calculated columns and iterators, like SUMX, AVERAGEX and with FILTER. Instead of running bulk calculations over a large set of rows. Row context means that the calculation engine iterates through every single row in a table. The engine then remembers the value calculated for each row calculated during the iteration and then runs the aggregation and calculation over all of those individual values.

How CALCULATE can change the filter context

Now there are some inserts required here that can change things up a bit.

The CALCULATE function…CALCULATE is the ONLY function in DAX that can change the environment (the context) for the filtering of underlying tables.

You will use CALCULATE a lot so this is of utmost importance. CALCULATE can alter (or change) the filter context so that the underlying tables are filtered differently for each individual resulting calculation.

CALCULATE will either ‘override’ an existing filter context, or add to the filter context if it doesn’t already exist.

We will go into CALCULATE in more depth as it is a deep topic, has many uses, and is maybe the most important formula in DAX itself. For now just imagine that when DAX uses the initial filter context to filter the underlying tables, before any calculation is performed, it can be slightly changed or adjusted with the CALCULATE function.

To finish up

Remember absolutely every single individual result you are getting in Power BI, through a chart, table, card, map, matrix etc. is running through these three key steps. You truly have to think a bit like the DAX engine at times to really understand results.

You have to think, what is my context? what tables are being filtered? what calculation am I performing? what results will I be returning? And you have to complete this thought process for each calculated result individually.

Do this many times over, practice A LOT, and you’ll get the hang of it.

We will branch out into these key steps in more detail soon.

 

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s