Python and Power BI
This tutorial will discuss about creating and running date tables in Power BI with Python scripts. You’ll learn how to create data frames and import them to Power BI using the Python Script Loader and Jupyter Notebook. You’ll also learn how to export them out into a CSV or Excel format.
How to use Python Script in Power BI
Importing Packages In Power BI With Python Scripts
In this tutorial, you’ll be creating a Date table using Pandas and the Jupyter Notebook. The first thing you nee to do is import packages. Inside the scripting cell in Jupyter, input import pandas as pd.
To open a new scripting cell, press Shift + Enter. Create a date range using the date.range() function. If you press Shift + Tab, you’ll see what the function needs.
Pressing Shift + Tab three times will show you a documentation of the function which includes the needed parameters.
Writing Scripts In Jupyter
For this demonstration, input the following script.
After you run the script, you will see a list of dates.
Setting Parameters
If you change the end date to 1/2/2021 and the default frequency from D (Daily) to H (Hourly), the list of dates will show an hourly range.
Another thing you can do is put periods. Remove both end and freq parameters and input periods = 3. Once done, run the script. You’ll then see 3 days and a daily frequency.
You can also use the variables start and end in the function. If you run the script, a dates list will also be created.
However, the format of the dates list is incorrect. You need to transform it into a table format for Power BI. A data frame is an equivalent structure that you can use. Wrap the date.range() function with the DataFrame function.
After that, you’ll have a date table in the correct structure.
You’ll notice that the table doesn’t have a column name. So, input a column name using the column parameter.
Rerun the script to show the column name.
Adding Data On The Date Table
If you want to create complex data frames, you can use time intelligence functions.
Now that you have created a data frame, resave the script in a variable and name it date_table. Then, call that variable at the end of the script.
To add more data to the table, input date_table[] and place the data you want to add in the square brackets. In this example, Year was added.
To get the value of the Year, you need to determine the datatype. In another cell, input date_table.info() and press Shift + Enter. You’ll then see information about the datatype which is a DateTime type. Since it’s a DateTime type, you can use different attributes or methods from the data frame.
Next, open a new cell and input the following.
The dt.year function gives you access to different functions and methods in the data frame. After running the script, you’ll have a table with the values of Year.
From there, copy the syntax and place it in the main scripting cell. Equate the script with the date_table[‘Year’] and run the script. After that, you’ll see a Year column in the date table.
Adding Month And Time Columns
If you want to add months to the table, copy the syntax and paste it on the next line. Next, change the column name to Month and the dt.year to dt.month. You’ll then see a Month column in the table. You can continue adding data to make a very deep table.
In the pd.date_range function documentation, there is a tz or time zone parameter. So to add a time zone, input tz=’EST’ in the function. In this example, the time zone used is the Eastern Standard Time (EST).
Next, copy the syntax of the Month data column and paste it in the next line. Change the name to Time and the dy.month to dt.time. Then, run the script.
You’ll then see a Time column with values of 00:00:00 in your date table.
If you add and set the frequency to H, the values in the Time column will have a 1-hour interval.
Adding Quarter And Day Of Week Columns
After adding Time, add Quarter column. Copy the Time data column and paste it in the next line. Change the name to Quarter and the dt.time to dt.quarter. Then, run the script. You’ll then see a Quarter column in the date table.
Lastly, add the Day Of Week column. Copy the same syntax and change the name to Day Of Week and the dt.quarter to dt.dayofweek. Click run and you’ll then see the Day Of Week column in the table.
Adding Data Using STRFTIME()
The dt method can be used to get additional customizations in your date table. There’s also another type of method that you can use and it’s called strftime().
This is a sample code with the strftime() and a format code inside the parentheses. In this example, %a is passed inside the function. If you run the script, it will give you the abbreviated days of the week.
Copy that syntax and paste it into the main scripting cell. Save it as a variable and name it DW Abb. Then, run the script. You’ll then see that the data has been added to the date table.
You can find all the format codes that you can use with the strftime() in docs.python.org.
If you find a format that you can use in your date table, copy its directive and use it in your cell. This example used the %p to add AM or PM in the date table.
Copy the previous code with the strftime() and change the name to AM_PM and the format code to %p. After you run the syntax, you’ll see the AM_PM column added in the table.
Setting Index For The Date Table
Once you’ve completed your data table, you can copy and paste the code in Power BI. You can also create an Excel sheet or a CSV file using the data frame function to.csv(). Inside the parentheses, write the file name of the Date table. In this syntax, Date_Table.csv is used.
The Date table file is in your working directory. To access that file, input pwd in a new cell and click Run. It will then show you the address of the file.
Locate the file and open it. You’ll then see your Date table.
In the A column, you’ll see the index. You need to set a column as the index to avoid problems once it goes into Power BI. Go back to Jupyter and input date_table.set_index () in the main scripting cell. Inside the parentheses, input a column that will be set to index. In this example, the Date column is used.
Now, you can resave the variable or use another parameter called inplace. In this case, inplace is used. If the parameter is set to True, the file will be saved.
After running the script, you can then see that the Date column is the new index.
Bringing The Table To Power BI with Python
You can bring the table to Power BI by opening the CSV file in the Get Data option. Click Get Data and select Text/CSV. From there, locate the file in the working directory and open it.
Another way is to use the Enter Data option.
Then, input Date Table in the table name, and press Load.
Next, click Transform Data in the Home tab.
Then, go to the Transform tab and click Run Python Script.
Go back to your Jupyter Notebook and copy the syntax from the cell. Then, paste it into the script editor.
If you press OK, you will get an error saying that ‘pd’ is not defined.
Importing Pandas To Power BI With Python Scripts
The reason for the error is that pandas wasn’t brought in inside the script editor. So, input import pandas as pd at the beginning of the script syntax and press OK.
You’ll have the dataset and the Date_Table in the first column.
If you click the Table value beside dataset, it will show you an empty table.
To fix this, click the gear icon beside the Run Python Script step in the Applied Steps pane.
Next, at the bottom of the code, equate dataset with the date_table and press OK.
You’ll then see the table in the dataset.
Resetting Index For The Date Table In Power BI With Python
The next thing to do is reset the index. Click the gear icon beside the Run Python Script step again and input date_table.reset_index(inplace=True). That function will pull out the Date that was set to index.
Using the inplace parameter again will run and set the new index. If you go back to the table, you’ll then see the Date column.
***** Related Links *****
Enter Data In Power BI: How To Use It For Data Entry
Data Loading And Transformation Best Practices
Python I For Power BI Users – New Course In The Enterprise DNA Education Platform
Conclusion
Data tables and reports can be made outside the Power BI Desktop. You can create and import them using the Python Script Loader and Jupyter Notebook. They’re an alternative way to load data and learn a new programming language to enhance your data development skills.
Gaelim