In this tutorial, you’ll learn about the methods of encoding columns using VertiPaq in Power BI. You’ll understand how each method works and how it affects the performance of your DAX.
Each method has its own style of compressing data so that it can give you faster and smoother transactions.
Value Encoding With VertiPaq In Power BI
The first data compression type is Value Encoding. It occurs on integer-type columns.
This is an example of an original cost column:
You can see that there are values inside.
Now, if VertiPaq stores these data in the column as is, it would need 12 bits of memory.
However, VertiPaq looks for ways to minimize how much memory it needs to store data. It will find the minimum value of a column and then encode that column by subtracting 3004 from each value. By doing so, it would then only need 9 bits of memory.
VertiPaq lessened the memory it needs, making your code perform faster.
The next data compression method is Dictionary or Hash Encoding which occurs on non-integers or text values columns.
This is an example of a column with repeating product names:
VertiPaq has a feature that helps compress non-integer columns. It builds a list of distinct values in that column and then assigns a number to them.
You can see that it applies a number that corresponds to the product names.
Next, it replaces the values of the product names with the integer values.
Storing integer values requires lesser bits than storing text values.
This is something that DAX and the Storage Engine does when they’re encoding.
Run Length Encoding With VertiPaq In Power BI
The last encoding type is called Run Length Encoding. It looks for repeating values and stores a start and an end.
This is a sample table with Sales Person ID and Rows:
And this is the table that is encoded using the Run Length Encoding:
It knows that the salesperson with an ID of EMP1017 goes from row 1 to 363.
You’ll see that it has the Row Start, Row End, and it also shows how many repeating values there are in between. So, there’s no need to store EMP1017 362 times because it knows where it starts and where it ends.
Run Length Encoding Using Encoded Columns
Another great thing about VertiPaq is that it could Run Length Encoding on columns that were encoded with Value or Dictionary Encoding.
This is a table that’s encoded using the Dictionary Encoding:
You’ll notice that the NameID column is repeating. It means that you can use the Run Length Encoding with the column.
In this case, VertiPaq will only look for the NameID, Row Start, and Row End.
So instead of storing the integer values 3 times, it now stores it once.
Inside the VertiPaq Storage Engine, there are 3 data compression methods which are responsible for making your DAX code run smoother.
Once you’ve understood how each method works in your calculations, you can use and maximize any of them to improve the performance of your code and easily retrieve the data you need in your reports.