Let me discuss some advanced transformations in Power BI through the query editor. You may watch the full video of this tutorial at the bottom of this blog.
Remember that you can combine different actions within the query editor to fully optimize your models. It’s just a matter of familiarizing yourself with the different transformations in Power BI and finding out what scenarios they’re useful for.
Exploring The Different Transformations In Power BI
There are many ways for you to explore the different transformations that you can do in Power BI.
One way to do this is by doing a right click on any of your columns and looking at the different options that would come out.
You can also look at the top of your page and use the ribbon to see the available transformations you can do. The same actions you’ll see by doing a right click can also be found in this area, plus a few additional actions like adding a column.
You can also click the icon on the upper left part of your table, and a list of transformations will show up.
This menu will also show you some additional column and row transformations like Keep Top Rows, Keep Bottom Rows, Keep Range of Rows, etc. It lets you do some filtering as well.
You can also use the dropdown found at the right of your column name. This will show you additional filtering options.
Note that no matter where you access those transformations, every transformation that you do will be recorded under the Applied Steps here on the right.
Adding Data From A Website Into Your Table
To show you an example of some advanced transformations, I’m going to add some data into this table below.
I’m adding this data for some additional analysis that my table isn’t giving me at the moment. In this case, I want updated data on the population in California because the data in my Locations table is outdated.
To start off, I’m going to do a Google search for the California demographics data.
I’m getting a number of results here, so I’m going to choose the one from california-demographics.com.
It shows a lot of information, but the data I’m after is found here in this link that says “See all California cities by population”.
Now, I have a breakdown of all the different cities and their respective populations.
What I’m going to do is copy this page’s URL.
Then I’ll go back to the query editor.
Under the Home ribbon, I’m going to click on New Source, then Web.
What I’m doing here is scraping data from the website I visited earlier.
Note that if you’re going to do the same thing in your model, you have to make sure that the data is presented using a good format.
So I’ll just paste the URL I copied earlier in the space provided here.
Once I click on OK, this window is going to show up. I just need to click on Connect.
Just a quick tip here. Since you’re getting data online, you have to make sure that your internet connection is pretty good. Otherwise, it might take a long time for you to get the data.
Now that I’m connected, the table that I saw on that website earlier has appeared in this window.
Data like this will come in as a snapshot from the original source. So when I embed this into a query, it’s automatically going to get fresh data from the website each time I refresh my table. And when the data on the website is updated, the table is going to be updated as well.
Transforming The Table With The New Data
The new data that I added is now here under Other Queries as Table 1.
I’ve mentioned before that I want every piece of data on my table to be clearly labeled, so I obviously need to change the name of Table 1.
One way to do that is through the Properties section on the right. Just to show you how it’s done, I’m going to rename that table as California Popn Data.
I can also change the name by double clicking on it here on the left pane. This time, I’m going to rename the same table as CA Population Data.
I don’t want that to stay under Other Queries because it’s a supporting table, and I already have an existing folder for Supporting Queries. So all I have to do is to drag that table into the right folder.
Before I move on, I want to double check what this new supporting table includes.
It shows the Rank, City and Population.
Now, I’m going to check on my Locations table.
I see that the city names from my Locations table is the same as the city names in the CA Population Data table.
Because of this, it doesn’t really make sense to have two separate tables. So I’m going to merge them.
I mentioned earlier that my Locations table has outdated population data. By merging the two tables, I’m hoping to replace this old data with updated numbers.
So in my Locations table, I’m going to click on the Name column. Then, I’m going to click on Merge Queries.
In the Merge window, I’m going to click on the Name column under the Locations tab.
On the dropdown menu, I’m going to choose my CA Population Data.
Then, I’m going to choose the City column.
I got a Privacy levels notice, so I’m just going to click through that.
Look at the bottom here. It says that the number of rows on the Name column matches the number of rows on the City column.
This makes it easy to merge these tables together. It means there’s enough data in the CA Population Data that covers all of the iterations in the Locations table where the raw data is.
There are a lot of different ways that these tables can be merged. Those options can be found in this dropdown menu under Join Kind.
I normally use Left Outer, but you can experiment with what works for you best.
Once I hit OK, you’ll see that my Applied Steps show that the merge was successful.
Here’s the new column that resulted from that merge. Notice that the rows say “Table” instead of displaying data.
That’s because the entire table was merged into this old table. So I’m going to check what data is in it by clicking on the dropdown beside the column name.
I definitely don’t need the Rank. I don’t need the City names as well, because I already have that in my raw data. So I’m just going to untick those and retain the Population data.
Now, I have up-to-date population data in my table.
That means I won’t be needing my old Population column anymore, so I’m going to remove that.
As for the fresh column that I added, I’m going to rename it Population Data.
Since I don’t need that entire table I imported from the web to be loaded into my model every time I open it, I’m going to disable the load by doing a right click and unticking “Enable load”.
I just need to make sure that “Include in report refresh” remains ticked so that the data I retained in my table is still there each time I refresh my model.
***** Related Links *****
My Practice Tips When Using The Power BI Advanced Editor
How To Download Exchange Rates In Power BI
Showcasing And Understanding Anomalies In Power BI
Conclusion
With these transformations, I can get even better insights from my model.
For example, I can now compare sales information across different regions based on population information. I can check what percentage of the total population in California is buying my products. I can check the impact of the population in one city versus another in terms of sales.
There is so much you can do within the query editor. This is just a beginner’s course, but once you move to more advanced courses, you’ll find even more amazing transformations in Power BI.
All the best
Sam