working with expanded tables

Expanded Tables In Power BI

No comments

In this tutorial, we are going to take a look at a very specific scenario that I’m sure all of you have already encountered while working with DAX – the expanded table. Expanded tables can mess up your calculation if you don’t know how they actually work. You can watch the full video of this tutorial at the bottom of this blog.

For this example, we are using the Contoso database, which contains the sales table, customer table, dates table, products table, products category table, and products subcategory table.

On the left end side, we have one-to-many relationship between the products, category, subcategory and sales tables. We also have one-to-many relationship between customer and sales, as well as dates and sales.

expanded tables

Example #1

Let’s say we want to create a report from this category column in the category table, and use a measure which calculates how many customers there are in the customer table. We want to slice the number for the customer’s table by the category.

expanded tables

If I put a filter on the category table, that filter will go to the subcategory table, then it will reach the products table, and then finally, it will reach and filter the sales table. But that filter will not be able to filter the customer’s table unless we turn on the bidirectional filtering.

Let’s go back to the report to fix this calculation. We can wrap this calculation inside the CALCULATE function, and then write Sales. We can see that this time we are not repeating the same number for each cell of the visual.

expanded tables

To verify if this calculation is correct, we can bring in the customer key from the sales table to aggregate, and then do a Count (Distinct) over that.

expanded tables
expanded tables

You can see that we are returning the same value for each row. By using the sales table inside the CALCULATE function, we were able to fix the calculation.

expanded tables

Example #2

Let’s move on to the next example. For this one, we want to identify the sales amount for the red products in 2007, 2008, or 2009. Depending on the selection of the slicer, I also want to go back one year.

For example, if I select 2009, I want to report the red sales for 2008. If I select 2008, I also want to report the red sales for 2007.

expanded tables

Red Sales #1

The calculation of our Total Sales is basically a sum of the sales data. In the row context, we are going to multiply the quantity by the net price.

expanded tables

Let’s create a new measure and name it as Red Sales. We are going to write CALCULATE, then Total Sales. We are going to FILTER the Sales table to show that the RELATED products’ color is equal to red.

Then we will write SAMEPERIODLASTYEAR over the dates in the date column.

expanded tables

When we bring that calculation in a card visual, we get a blank.

expanded tables

Red Sales #2

What is happening here? Let’s try to rewrite that calculation and see if we are able to get a result. We’re going to create a new measure and name it as Red Sales 2 and use the first part of the calculation for Red Sales 1.

expanded tables

We are going to initiate another CALCULATE over the first CALCULATE.

expanded tables

Let’s bring this measure in the matrix and see the results. If we select 2008, we get 51,947. If we select 2009, we get 24,343. Lastly, if we select 2010, we get 39,724.

expanded tables

Red Sales #3

There is another way to write this calculation. We’ll write a new measure and call it Red Sales 3, then use the CALCULATE function.

We are going to compute the Total Sales and write that for the products, the color is equal to red. Then use the SAMEPERIODLASTYEAR function on the dates.

expanded tables

If we drag measure #3 into the card visual, you can see that these two card values are returning the same value, which is correct.

But there is something wrong with our first calculation, because we are returning a blank and not the correct value.

expanded tables

Let’s see what is actually happening here. Now that we’ve seen a couple of calculations on expanded tables, let’s understand the theory behind them.

Before understanding what an expanded table is, you need to understand that all of the tables that we have here are called base tables.

expanded tables

So when do these tables become expanded tables? Once you create a many-to-one relationship between one table and another table, the base table becomes the expanded table.

Verifying Expanded Tables

But how can we verify that a table expansion is actually happening? Well, you can use the related keyword on any table. If you’re able to access the column from the one side, then you’ll know that a table expansion can happen.

Let’s go to the sales table and create a new calculated column.

Let’s say we want to retrieve the product color from the product table for this particular product key. We’ll use RELATED, which will only give the list of columns in the IntelliSense that can actually be expanded from the sales table.

We can see that the customer table and the sales table have a many-to-one relationship. We can also see the list of customer table columns, date table columns, and product columns.

When we select Products[Color], we can create a new column in the sales table with the help of the RELATED keyword. RELATED only gives us access to the columns of the table to which the base table can actually expand.

If we change the nature of that relationship from many-to-one to many-to-many, that calculation will stop working.

Let’s change the nature of this relationship to many-to-many. We can see at the bottom that we get a warning symbol.

When we go back to the sales table, we see the error message that says the column Products[Color] either doesn’t exist, or doesn’t have a relationship to any table.

In the current context, the warning that we get is not very user-friendly. Basically, it means that the sales table cannot expand to the product table because we are not able to access only one single value for that particular row item.

And since we are using the many-to-many relationship, the table expansion doesn’t happen and RELATED doesn’t work.

Let’s go back to the diagram view and fix this calculation. We’ll change the nature of the relationship to many-to-one and activate that relationship to make the calculation work.

Defining Expanded Tables

Before we start looking at the calculations that we have already done in the report, let’s reiterate the definition of the expanded table.

If you have a data model with a star schema, the fact table will expand to all the tables in the data model if there is a many-to-one relationship between the dimensions and the fact table.

If you have a snowflake schema, then the product subcategory table and the category table will expand to the base table, which in this case is the products table. The sales table is the base table, which expands to all the other tables.

Behind the scene, the sales table will have all the columns in one single table. Note that table expansion is just a logical concept, so it is not going to expand and increase the size of your data model.

The expanded table only comes into the picture when you are referring to a table (which is the base table), and when you have a many-to-one relationship to other tables.

Slicing The Category Column And Using Expanded Tables

Let’s try to fix the calculations that we have already done in the report view. In this example, we are slicing by the category column from the products category table, and we are trying to count how many customers there are.

So what are we actually doing here? We are using the expanded sales table as a reference. Once the filter context contains a value from the category table, that filter will reach the sales table from the subcategory products and directly to the sales.

Since sales is an expanded table and we are using that reference inside the CALCULATE function, then the sales table will contain the column of the customer table as well. When we are applying a filter over the sales table, indirectly, we are also filtering the customer table.

Let’s go back to our red sales calculations and try to understand what is actually happening. We’ll start with the blank calculation. If we select that measure, you can see that we are writing a nested code. We have CALCULATE, then the Total Sales, FILTER over the sales, and SAMEPERIODLASTYEAR.

Let’s break that calculation step by step. First, we need to identify the external filter context that exists out of the calculate.

We have selected 2008 in the Calendar Year Number slicer.

From this filter context, the sales table will be evaluated. The sales table will only contain the row for the calendar year number of 2008 and the product color will be red. We have two filters here: one that is created by the filter context and one that is getting created by the filter.

The SAMEPERIODLASTYEAR is getting evaluated in the filter context where the year is 2008. It is going to receive a list of dates in 2008 and it will shift those dates in 2008 to 2007. The table that is going to be returned by this will only contain the dates for 2007.

Once these two operations are complete, the CALCULATE function prepares the filter context and applies these two filters into the filter context. When we apply that, we have a filter context of the product color being equal to red, and on the year column, we have a filter over 2007 and 2008.

So for this data model, there is no single transaction that exists in two different years. When CALCULATE tries to merge these two filters in an and condition, it will say that the year should be 2008 and 2007, and that the product color should be red.

When we are filtering the sales table with the filter context of year 2008, we are also indirectly filtering the dates table as well. The sales stable has a relationship of many-to-one to the date stable.

Let’s create a new calculation to identify how many rows there are in the dates table for the related dates. We are going to write CALCULATE, then COUNTROWS of the dates table. Then we will FILTER ALL of the sales table, and we are going to say that the RELATED dates in the calendar year number should be equal to 2008.

Inside CALCULATE, we have not referenced any filter over the dates table. We are simply checking that the dates in the calendar year number should be from 2008. Ideally, this filter should not be able to filter the dates table. We are also using the ALL function, which will ignore the filter context that will come from the slicer.

Let’s create a new card for this calculation. you can see that we are returning 348 rows.

So how is it possible that from a dates table of 2500 rows, we are only returning 348 rows? If we’re going to use an expanded table, then we are going to indirectly filter the other table as well, which is connected through a many-to-one relationship.

While we do not have any filter over the current year, we are still limiting the number of rows that are visible for the dimension table that we have on the one side.

When we are returning the sales table, we are also returning a filtered version of the dates table, customer table, product table, product category table, and product subcategory table.

Explaining Red Sales 2

Let’s move on to the next calculation, which is Red Sales 2. We’ll start with the outer CALCULATE because if we’re nesting this function in any scenario, the outer CALCULATE should prepare the filter context for the inner CALCULATE.

On the slicer, we select the calendar year of 2008. When the SAMEPERIODLASTYEAR function gets the date for 2008, it is going to shift those dates in 2007 and it will be the FILTER context for the inner calculate. This inner CALCULATE will evaluate the year 2007.

The FILTER function will filter the sales table, and the sales table will only be limited for the row where the year is 2007. Once we have the rows for 2007, we are going to check that the related products is equal to red.

Unlike the first calculation, we are not returning two different tiers. Those two different tiers are not getting applied on the FILTER context so the table returned by the FILTER function will contain all the row of the sales table.

Once this filter is applied, we are going to get the sales amount for year 2007, as well as the products that are equal to red, whereas in the first calculation, we were returning the year 2008 and 2007.

This time, we have prepared the filter context for the sales table by using the SAMEPERIODLASTYEAR function and by nesting the CALCULATE function inside another CALCULATE.

Explaining Red Sales 3

For the third example, we have a very simple CALCULATE function with two statements: one is product color is equal to red, and then the SAMEPERIODLASTYEAR on dates.

The SAMEPERIODLASTYEAR function will be evaluated in a filter context, which will contain only the dates of 2007. The products’ color will apply a filter context of red over the product table, which will in turn filter the sales table and only contain the rows for the red products.

When these two values are applied in the filter context, they’re going to filter the sales table but not the expanded sales table. you can see that nowhere in the code did we reference the expanded sales table.

This is the ideal scenario that you should use if you’re trying to reference the expanded table. When using the expanded table, the calculations can become really convoluted and sometimes, you are not able to identify why your calculations are returning an incorrect result.

If you do not understand the concept of expanded tables, you might develop this particular code as a model measure, and then deploy it in production. Other users might start creating other measures over this measure, and they might not be able to figure out why the calculations are not working because they do not simply possess that level of understanding about the DAX language.

If you’re working with DAX, always try to put the filter over a single column because when you use a single column, the concept of expanded tables is not applied. For example, when you use PRODUCTS [Color] = “Red”, the filter reaches the sales table, but that filter cannot reach the customer’s table because we are not using the expanded sales table.

When using a single column, we do not refer to the concept of expanded tables. We are simply applying a basic filter over the sales table, which do not propagate to the other tables.

Using DAX Studio To Verify Expanded Tables

Before wrapping up, we’re going to use DAX studio to verify our code for Red Sales 3. Let’s go to the View tab, click on Performance Analyzer, and then start the recording. We’ll refresh that visual, and select a blank card.

From the external tools, we are going to launch DAX studio

We can see that in the results screen at the bottom, we get a blank.

We’ll turn on the Server Timings to understand the queries that are being generated behind the scene. We’ll go to the Server Timings tab, and expand it to see everything.

We are performing a JOIN on the sales table on the left hand side. We are also performing a JOIN on the product key from the product table.

In the WHERE clause, it says that the dates in the calendar year number should be equal to 2008 and that the product color should be red.

Then we have another condition which says Dates[Date], then followed by a range of date. So what are these dates? Ideally, these dates must belong to the 2008, but they are from 2007 when we change the formatting of these numbers in Excel.

This is why we are returning a blank value. We are saying that the calendar year should be 2008 and applying an AND condition that the dates should be in 2007. This 2007 is because of the SAMEPERIODLASTYEAR function we used in the calculation.

Let’s change the measure and reference to Red Sales 2. Once we execute the code, you can see that we are getting the dates from the dates table, then we are filtering the dates column for the calendar year number of 2008.

This time around, we’re able to return a result because we are not applying a filter on the WHERE clause and we’re saying that the dates in the calendar year number should be equal to 2008.

Conclusion

Hopefully, this tutorial was helpful in understanding what an expanded table is and how it can mess up your calculations. Expanded tables are not really intuitive and neither are they easy to understand.

For more information about expanded tables, don’t forget to subscribe to the Enterprise DNA TV channel. We have a huge amount of content coming out all the time from myself and a range of content creators, all dedicated to improving the way that you use Power BI and the Power Platform.

Antriksh

***** Related Links *****
Where To Find The Hidden DAX Trap
Counting Customers Over Time Using DISTINCTCOUNT In Power BI
New Customer Analysis Using Advanced DAX

***** Related Course Modules *****
Optimizing DAX
DAX Optimization Masterclass Using DAX Studio
DAX Studio For Power BI and SSAS – Beginner to Advanced

***** Related Support Forum Posts *****
Error Connecting To Protected Vendor Web Based Data
Do/While In Power Query To Obtain All Records From An API
Custom WorkDay Numbers For A Standard Calendar
For more on expanded tables, please see here…

Leave a Reply

This site uses Akismet to reduce spam. Learn how your comment data is processed.