In this blog, we’ll go over the CALCULATETABLE function and its query plan in DAX studio, and understand how it differs from the FILTER function. By the end, you’ll have a better understanding of how to choose the right function for your data analysis needs.
To demonstrate, we’ll use the DAX code below.
When you run this code, you’ll get a table showing Brand and Color.
In the Query Plans tab, you’ll get a code corresponding to the CrossApply operator in the Formula Engine.
At the same time, you can see that the code is only retrieving the product color and brand in the Server Timings tab.
DAX Studio Server Timings Result For CALCULATETABLE
As an example, let’s filter the product color to only show Red and/or Black.
You can do so by using the FILTER function. When you run it, the Server Timings will show that the FILTER argument isn’t applied to the xmSQL code.
But if you use the CALCULATETABLE function, the query in the Server Timings tab will show that the results have been filtered by product color.
CALCULATETABLE Query Plan In DAX Studio
Let’s look at the Query Plan for each function.
The CALCULATETABLE is on the first line of the Logical Query Plan because it’s a top-level function in the code and therefore doesn’t depend on any requirements.
When you read the query plan of a CALCULATETABLE function, you don’t start with the first argument. You first need to understand the filter context used. So let’s go to line 5.
You can see that the VertiPaq operator requires the product color column because of the applied filter. It then scans this column and uses the Not IsEmpty Filter combination to check the product color against each row.
After the filter context has been prepared, the CROSSJOIN argument is performed.
On hindsight, if you use the FILTER function, you’ll see a different query plan, so the CROSSJOIN argument is performed prior to identifying the filter context.
The downside of this approach is that it doesn’t push the filter condition to the Storage Engine. You won’t be able to use the WHERE clause inside xmSQL which will place a heavy burden on the Formula Engine.
And when you’re working with multiple tables, the filter combination of the CROSSJOIN function will be huge. This slows down the performance of the DAX code. Thus, in most cases, it’s recommended to use the CALCULATETABLE function.
So instead of performing all the computations in the Formula Engine, you can inject the filter condition inside the WHERE class of the XMSQL code.
Context Transition For Similar DAX Functions
This concept is also applicable to the CALCULATE function. Let’s use the DAX code below as an example.
A measure automatically has a CALCULATE function around it.
When you run this code, you’ll get a table showing the Total Sales amount for each color.
In the Server Timings tab, you can see that the code is executing two queries. The first query is calculating the sum of the Sales amount from the Sales table with respect to the Products table.
On the other hand, the second query retrieves the product color again from the Products table. It corresponds to the VALUES function in the DAX code.
The Formula Engine places the data caches of the two queries side by side. It starts by retrieving the values from the second data cache and then performs a lookup in the first data cache.
In the Query Plan tab, the Logical Query Plan scans the product color because of the ADDCOLUMNS argument.
Then, VertiPaq sums up the Total Sales amount depending on the product color. It performs the context transition that converts the product color value into an equivalent filter context.
The filter created by the context transition is translated into a DependOnCols property in the Query Plan. After VertiPaq scans the columns required to compute the Total Sales, it multiplies the Sales Quantity by the Net Price.
The Total Sales are then returned as a result in the form of a currency data type. And finally, ADDCOLUMNS returns a table with two columns containing the Color and Total Sales amount.
The query plan can have a significant impact on the performance of a query, as different query plans may have different levels of efficiency.
When using the CALCULATETABLE and FILTER functions in a DAX query, the query plan may differ depending on the specific scenario and a variety of factors, such as the specific formula expression being used and the capabilities of the query engine.
In general, it’s a good idea to test and compare the performance of different query plans in order to determine the most efficient approach for a given scenario.
All the best,
Enterprise DNA Experts