Dataflows have become a crucial feature of business intelligence and feature in the success of an organization-wide Power BI deployment. It’s also one of the key pillars for the effective implementation of Microsoft Power BI.
Surprisingly, Power BI dataflows are also vastly underutilized, and only a few users know their potential value. This blog will reveal what dataflows are, how they work, and why self-service data preparation is highly important for any organization.
Firstly, let’s clarify the basics, and I must note, you must be using a paid service such as Power BI Premium or Pro to utilize dataflow.
So what is a data flow?
What is a Power BI Dataflow?
A dataflow enables data access, transfer, and transformation within Microsft Power BI datasets. You can visualize it as a pipe. But instead of conveying liquid or gas, this pipe conveys data in its rawest form, a new dataflow into a Power BI desktop, file, or report.
You can also think of dataflow as a Power Query in the cloud. It’s taking Power Query outside an individual Power BI report and centralizing it in a cloud-based environment.
From an enterprise perspective, it’s important to utilize dataflows as it allows you to centralize and configure the data architecture of core data sets within the online service experience, irrelevant the data source, cloud, or on-premises.
Creating Date Tables with Power BI Dataflows
Whenever you use the Query Editor in Power BI, what you do within this environment is unique to that specific Power BI report. With dataflows, however, you can strip that out of an individual report and centralize it so that others can use the same transformations and data architecture you did.
This unique functionality makes dataflows ideal for core data sets that people frequently use. It also changed my way of creating date tables inside Power BI.
In the past, I would have used our date table code from the Analyst Hub and copied and pasted it into my file. I would then put it in the Advanced Editor and create my date table with some parameters.
Now though, I would put it into a centralized data flow so that every single person in my organization can use exactly the same date table.
Creating a date table is just one example of the almost endless applications of dataflows. You can also use it repeatedly on all your data sets, whether for customers, products, locations, or more.
In fact, you can put anything with a centralized structure, like filtering tables, into a dataflow, so everyone is working off the same thing and not doing their transformations.
And you can still transform and optimize things within the Power BI visualization experience after grabbing the raw data set from the dataflow. However, it may be much better for everyone to get closer to the same starting point.
Using a Dataflow
Much of the hesitancy in using dataflows is because it’s so unfamiliar. Many users perceive it as a brand-new and complex aspect of Power BI, but it’s not.
It’s very easy to move within a dataflow environment – just go to the Advanced Editor section and copy the existing code.
For instance, after creating a customer table or lookup table, all you need to do in most cases is to copy the Advanced Editor code and paste it into the new code editor in the online experience. It’s that simple!
It’s also important to note that you can have numerous queries within a singular data flow. Think of it as the main pipe connected by numerous other smaller pipes. It’s flowing through into your report or various reports around your organization.
Within the dataflows experience you can use numerous similar things to what you have in Power Bi Desktop and Power Query apps.
You can start by copying and pasting data from your files through the Advanced Editor, centralizing it within a data flow, and reconnecting to the data flow instead of getting the raw data.
Benefits of using Power BI Dataflows
So many users in large organizations use exactly the same data source over and over again in different areas of the business at all hours of the day. This approach can be costly, especially in large implementations and organizations with huge enterprise architectures around their data.
Centralizing data retrieval from a raw data source at any given time without database overloading concerns is just a far more optimized way. And with dataflows, you only need to do it once, which means there’s no need to constantly schedule and update your data all the time.
As an example in the image above, I created different types of date tables in a single dataflow. You will see under the Dates Query three date tables: a comprehensive table (Dates Full), a trimmed down version (Dates Slim), and a smaller one (Dates Small).
Again, the biggest benefit in using dataflow is convenience. For instance, if you want to make data changes for numerous reports, you only need to do it once if the reports link to a dataflow.
Without dataflows, you will have to make changes for every report, which can be pretty hectic, especially if you’re dealing with numerous reports. If you have ten reports, for instance, you’ll have to make ten changes!
Connecting Power BI Dataflows to Workspaces
Connecting your dates, reports, and other tables to dataflows is also very easy. Go to New Source at the top left corner of Power BI, select Power Platform followed by Power BI dataflows, and then click Connect.
You can also connect dataflows to numerous Power BI workspaces, which is pretty handy. You can have a data flow dedicated to a specific workspace but can connect to it from a different workspace.
That said, dataflows bring a lot of scale into how you share the data organization piece internally. In the image above, you can see my eDNA Common Datasets, which is how we have set up our internal reporting with the Power BI service.
Using data flows can do so much for any organization by building different types of data through one workspace and then centrally managing them.
Your data team, engineering team, or head analyst can manage these core data sets, and everyone else can tap into these data pipes for their Power BI files. And those with access to the data pipeline can do their simplified model and build their reports quickly through it.
And if you do this one layer well by building a proper foundation, it can speed up and improve your Power BI deployment’s overall consistency and productivity.
The Final Word
Now you know what dataflows are, how to use dataflows, and why they are a game changer in Power BI deployment in small and large organizations looking for data transformation.
As a final note, always prioritize your data architecture because it’s important in building a strong foundation inside the power bi solution.
Ideally, you want a workspace dedicated to common data sets and then utilize dataflows for that workspace alone. Simply put, you want to create a workspace of data flows.
Once the right data architecture for your core data sets is in place, everyone can tap into the data pipes for their Power BI files. They can even do their simplified model and create reports quickly through it!
It is only a matter of time until you produce some outstanding Power BI dashboards.
Also, just a side note, you can use dataflow in excel, but that is for another article.
Want to learn more about dataflow and how to make the most of your Power BI data with power query, check out the below video for real-world examples and use cases.
What is a Data Lake in Azure?
Azure Data Lake is a cloud-based platform that supports data analytics. It is essentially an unlimited data warehouse that works with Power Apps to hold data of all sizes and types.
What’s The Difference Between Power BI & Power Query?
Well, they are different things. Power Query comes with Power BI and is used to connect the data flow from many data sources.
What is DAX?
DAX, or Data Analysis Expressions, is a common programming language that’s used in Power BI to create custom tables, calculated columns, and measures.
What is ETL?
ETL stands for Extract, Transform & Load. To clarify, Power BI is not an ETL as such, but it is similar in the way that it uses ETL methodology.
What Does CMD Stand For?
CDM refers to the Common Data Model.
What is an Incremental Refresh?
Incremental refresh is a time-saving function within Power BI. Instead of fetching the entire dataset in Power BI, you can set up partitions
What is Schema in Power BI?
Schema refers to the structure and organization of data in a dataset.
What is a DataMart?
In relation to Power BI, a datamart refers to a dataset that has been developed to showcase data from a specific business unit.