Storage Engine In Power BI – VertiPaq Operators & Queries

by | Power BI

This tutorial will discuss about the VertiPaq operators and storage engine queries in Power BI. You’ll learn how operators work behind the calculation and how the storage engine helps you to optimize your DAX codes’ performance.

The VertiPaq operators and queries allow you to monitor the execution of each code so that you can improve them if necessary.

VertiPaq Operators

VertiPaq has the ability to compute simple queries. It has logical operators that execute your DAX codes to get answers.

These are the VertiPaq logical operators:

  • Scan
  • GroupBy
  • Filter
  • Sum
  • Min
  • Max
  • Count
  • DistinctCount
  • Average
  • Stdev.S
  • Stdev.P
  • Var.S
  • Var.P

These are the Logical and Physical Query Plans generated by a sample query.

storage engine

Whenever a line has _Vertipaq, it means that the VertiPaq engine is doing the work by itself. If you look at the Logical Query Plan, you’ll see that it has the Sum_Vertipaq and Scan_Vertipaq. These two operators work to scan and sum the FactSales’ Quantity to generate an answer.

In the Physical Query Plan, you’ll also see a Sum_Vertipaq which uses a ProjectionSpool. It means that a query is being sent to the storage engine and then returned to the formula engine in a data cache. You’ll also notice that it only has one record.

The Records column shows how many rows are being materialized from the storage engine to the formula engine. It only needs one record to work efficiently. Multiple records take more time for the query to complete.

If the storage engine can’t perform a request, the formula engine takes over. The formula engine can do everything you ask, but it’s much slower.

Storage Engine Queries

This is an introduction to the storage engine queries and how to read and understand them. You can use these information to optimize your DAX query for a better performance. Always remember that the data cache returned by the storage engine to the formula engine is in an uncompressed format.

This is a sample query that will be used.

storage engine

If you run it and open the Physical Query Plan tab, you’ll see the ProjectionSpool Operator. You have to understand how it works in the background of the calculation to produce a result.

storage engine

Storage engine queries are expressed in a pseudo SQL language called xmSQL. It’s not a real programming language. It’s more of a visual representation of what the engine is doing in the background.

If you run the sample query in DAX Studio, it will show the xmSQL. It shows that the query selects the SalesData’s Quantity and sums it from the Sales Data table.

storage engine

You’ll notice that the table name in the query is different from the xmSQL. The xmSQL uses Sales Data instead of FactSales as the table name because DAX doesn’t like to use Fact or DIM as names.

Now, to have faster DAX performance, you need to let the storage engine work as much as possible. It can perform simple math and aggregations as it scans for the data. Since the sample query only uses SUM, the calculation was done in the storage engine.

The best optimization technique is to figure out the best set so that the storage engine does most of the work and the formula engine simply does what’s left.

***** Related Links *****
VertiPaq Storage Engine On Multi-Threaded Environment
VertiPaq Compression And Storage In Power BI
What Is DAX Studio In Power BI Desktop

Conclusion

The storage engine is responsible for making DAX work faster. It allows you to see the process behind the whole calculation with the help of the Logical and Physical Query Plan. You can track which process slows the whole performance down and then optimize it.

If you want to know more about how the VertiPaq works, utilize this tutorial and practice optimizing your DAX codes.

Nick

Related Posts

Understanding Data Models and Visualizations

DAX Table Functions Deep Dive

Explore an in-depth analysis of DAX table functions in Power BI, comparing SUMMARIZE and ADDCOLUMNS, and understanding INTERSECT and EXCEPT for enhanced data manipulation and analysis.