I’m going to talk about simple Power BI transformations you can do around the query editor. You may watch the full video of this tutorial at the bottom of this blog.
The query editor allows you to clean up your data and transform it from its rawest form. Although it allows you to do some really advanced transformations, I want to focus on the simple ones you can do.
Through these simple transformations, you can better optimize your data in Power BI.
Cleaning Up The Names Of The Data
The first thing I always do is clean up the names of my data. Depending on what kind of report I’m after, I decide whether I want proper names. I usually get rid of underscores and abbreviations as well.
Basically, I want names that are logical. People have to take one look at them and understand what’s in each particular table or column.
Take this first table here as an example. It’s called Customer_Data.
Since I don’t like having symbols like underscores on the names, all I need to do is double click on that and change it into the name I want.
Alternatively, I could also go to Properties and change the name there.
I’ll change Location_Data and Product_Data too because I want to be consistent in terms of the name format.
Aside from the names of the tables, I also go through each table and work across the column names. Again, I have to be sure that all the column names make sense.
The biggest reason why I want those column names to be specific and as clear as possible is because visualization titles are automatically created based on column names. So, if any of my column names aren’t clear enough, this could have an impact on the way the data is presented in my visualizations.
To change any of the column names, I just need to double click on the name and change it. For example, I’m changing this Cost column into Product Cost.
This way, it’s clear what kind of cost I’m talking about.
Cleaning Up The Data
The next thing I need to do is clean up data that I don’t need.
This Locations table is a good example. There are columns here that I don’t really need, like this column for Area Code.
I’m going to leave the Population column because I need that data. But I also don’t think the columns from Households to Time Zone are important.
To highlight these columns without removing the highlight on the column for Area Code, I’m going to hold down the Ctrl button and click on the columns I want to add to my current selection.
From there, I just need to do a right click and choose Remove Column.
Tracking Applied Steps In The Query Editor
The Applied Steps pane on the right side of your screen is a helpful tool as you work on simple Power BI transformations.
As an example, let’s look at the Applied Steps pane before I removed the columns I didn’t need.
But the moment I removed those columns, look at what happened next.
The last transformation I did was recorded under Applied Steps. In fact, it shows everything I’ve done so far in a sequential manner.
Now that all those steps have been recorded, all of these transformations will automatically happen each time I refresh the data.
This also allows me to go back and undo any transformation I made that didn’t really work out as I had planned. But, if I change my mind and realize that I need those columns after all, I can simply click on the X beside that step.
Once I do that, my query will go back to its previous state.
This shows that you can actually play around with the query editor without having to worry about making a mess of things. You know you can easily retrace your steps and undo any unnecessary moves you made.
Knowing this, just right click on any of the columns and explore any of the different transformations on the list.
This will allow you to master what the query editor allows you to do without having to worry that you’ll end up spoiling your data since you can undo the steps anyway.
Appending Queries In The Query Editor
Earlier, I worked on Customers, Locations and Products. But if you could remember, there were three other tables there, all of them for Sales.
I’m going to update those names too since I now have more recent data.
If I click on each of these three tables though, I can see that I have the same data structure in all of them. It’s basically the same data but from different time frames.
It’s unnecessary to have three different tables or queries for this. This means that it would be more efficient for me to append the data.
So first, I’ll select Sales_2018 and then click on Append Queries under the Home ribbon.
That will pull up this window.
I’ll tick on “Three or more tables” and a list of tables will show up underneath. I just need to add the tables that I want to append to the box on the right and click on OK.
I’m going to check if these tables have been merged into a single one. So I’ll look at the column showing the Purchase Date.
Since I have over 4,000 rows of data here, I’ll just click on the dropdown button and click on Sort Descending to display the most recent data first.
This shows that the tables have been appended. As you can see, I’m still on the Sales_2018 table but I now have data from 2020.
Now that I have all my sales data in this single table, I’m going to change the name to Sales.
Of course, I can’t just delete the tables for 2019 and 2020 because they still are queries. I still need the data inside these table.
So what I’m going to do instead of deleting them is to right click and untick Enable Load.
And because I still need the data from those tables to be included when I do a refresh, I have to make sure that “Include in report refresh” remains ticked even if it’s going to end up grayed out once I disable the load.
Once that’s done, I have effectively got rid of some dead weight in my model. This is a great way to optimize my tables.
Checking The Data Types
It’s important to check whether each column is showing the correct data type. This can be seen through these little icons on the column name.
Power BI is usually really good at figuring out what type of data you have in each column. But on the rare occasion that it shows the incorrect data type, it could have a big impact on how calculations are done in your reports.
For example, if a numeric column is tagged as a text column by mistake, you could have problems with something as simple as getting the SUM of all the data.
The good news is it’s easy to change the data type. Just click on the icon and choose the right type from the list that comes up.
Other Power BI Transformations
In this next example, I’m going to show a few more transformations that can be done in the process of creating a short code for California.
I’ll start by doing a right click on the column, then I’m going to choose Duplicate Column.
Now, I have an exact duplicate.
I’m going to do another transformation by right clicking and then clicking on Transform and choosing Uppercase.
This makes all the entries in that column show up in CAPS.
This time, I’m going to split the duplicate column.
To do that, I just need to do a right click and click on Split Column. I’ll also do that based on the number of characters.
I’m going to go for 4 characters here. This gives me the short code CALI.
I don’t need the 2nd part of the column that was split, so I’m going to remove it.
Then I’m going to rename the 1st part and call it State Short Code.
To make it more organized, I’m going to drag that right next to the additional column for State.
The M Code And The Advanced Editor
Everything I’ve done up to this point is noted down under the Applied Steps section.
The interesting thing is that I can actually see what’s going on in each of these steps because of what’s called the M code. This code can be seen in this formula bar at the top.
I can also see every step in the Advanced Editor.
Once the Advanced Editor opens up, I can see all the codes being written as I work on all the previous transformations.
Of course, this is something that you should worry about once you get to the more advanced side of Power BI. For now, the simple transformations I worked on are more than enough to optimize your reports.
These Power BI transformations may be simple, but having a good grasp of how they work can help with your transition to more advanced stuff later on.
Also, don’t forget the importance of the Applied Steps section – this is going to be your best bet in making sure you don’t get lost while you optimize your data.
All the best,