In today’s tutorial, I’m going to show how you can use path and text functions in Power BI to extract certain words from a given text string. You can watch the full video of this tutorial at the bottom of this blog.
In this example, we have two rows of data sets and we want to use text functions to be able to extract the first name, which is Antriksh and Brian, then the middle name, and then the final name.
So let’s get started. I will show you how to do that by using both different sets of functions that are available in DAX. I’m going to perform the activity in the DAX Studio so that you can see at each step what the code that I’m writing is actually going to return.
DAX Codes Using Text Functions In Power BI
Inside DAX Studio, we’ll write our code. The first thing we need to do is to write the keyword DEFINE, and then COLUMN. I’m creating a virtual column that exists only for the duration of this session that I opened in the DAX Studio. Any column that I create here would not actually persist in the dataset that we have in the Power BI.
Next, I’m going to provide the table name in which that column will get created. The column’s name would be First Name.
To extract the first name from the data set that we have already seen, I need to find the position of the first space. Once I have that position, I will extract the first character. I will extract the word that starts from position one until the position of that first space.
To do that, I’m going to create a variable (FirstSpace), and that would be basically a search operation. So I’ll write Names, then Data, which is the original column in my database. I can then RETURN the FirstSpace variable so that we can see what it is going to return.
To be able to visualize the result, we need to write EVALUATE, which will fire the DAX code and provide the table name that we have in the model.
If I press f5, you can see that against each name, we have a certain number search. It says nine in the first row and six in the second.
In the first row, the name Antriksh only contains eight characters and it is returning one-to-many because it is counting the position of the space. To get eight, I can write minus one. To return the first word, I’ll use the LEFT function and write the text string from which I want to extract the first name, and that is Names Data. And then, I want to provide the number of characters that I want to extract. I’ll write FirstSpace and then close this.
If I click f5, you can see that I have retrieved the first name from that string. That’s how you can use the LEFT text function for extracting the first name.
Now, let’s extract the middle keyword or middle name from your own text string. Note that there are multiple ways of extracting a middle keyword, but I’m going to use the most basic one that is very easy to understand.
I’ll create another column, Names [Middle Name]. Then, I’m going to create a variable, which will hold the string that is being currently iterated by the DAX code. So that would hold the full string. So, the FullName would be equal to Names [Data].
And then, I need to store in a variable the name that I’ve already extracted in the previously computed column. To do that, I’ll write VAR FirstName and I’ll use the column, Names[FirstName].
Next, I’ll create another variable, which will be named as NewText. I will simply remove the first name from the FullName variable, so the resulting variable or the resulting column would contain only DAX Sharma and R Julius.
One more thing that we need to take care of is that leading space before DAX and R. To remove that, I’ll add TRIM in the code. TRIM removes any additional spaces that are in your text strings.
The next thing to do is to identify the position of the space that exists after the DAX and the R keyword. And for that, I can write a variable, StartPos (start position), and that will basically be a search for the empty space in the NewText variable. Then, I need to subtract one, so that it returns only three, and one in the case of R and not two.
The last variable that I need to create is RESULT, wherein I’m only going to extract the value DAX and R using the LEFT function. Now we have retrieved the DAX and R keywords that we wanted to extract from the original string.
Next, we’re going to extract the last name from the original string. To do that, I’ll create another column, Names[LastName]. Then, I’ll first create a variable (FullName) that will store that original string.
The next variable that I’m going to create would be ReplaceSpaceWithTilde, which will replace every space that exists in the original string with a specific character that I think would not actually exist in the original string. I will later use that character to identify the position of the last space. With that, you can see that all of the space has been replaced with the tilde character (ꓥ).
Next, I’m going to create another variable (RemoveSpace), which will remove the space that I have in the original string. I want to find the space, and then replace it with nothing. The next variable, PosLastTilde (position last tilde) will basically subtract the length of RemoveSpace from the length of the ReplaceSpaceWithTilde variable. That gives us 2.
And now, I’ll write more codes to replace that with the tilde character.
Now all I need to do is to use that to extract the word, Sharma. I’ll create a variable, Result, and I’m going to use the MID function.
Using Path Functions To Shorten DAX Codes
Now let’s move on and see how we can use the path functions to shorten the amount of code that we have written with text functions in Power BI. You will be surprised to see that even though path functions are not actively designed for these kinds of operations but by using those functions, you are able to convert all these codes.
So, I’m going to create a new session of DAX Studio by pressing controlling and connecting to the Power BI file. I’ll specify the DEFINE and the column keyword, and then the path in the table name and the name of the virtual column, which will only persist for this session.
Then, I’ll use the SUBSTITUTE function, and I need to path in the original column in the first argument, and then substitute the space with the pipe operator. Path functions use the pipe operator to extract the child from the parent-child hierarchy.
Next, let’s return the result of this column by using EVALUATE and the Names and the name of the table. I press f5, and you’ll see that we have received a string or another column, which contains every character that we already have in the original string delimited by the pipe operator (|).
The path function is going to use that pipe operator to extract the first middle and the last character. So, I’ll create another column that is going to rely on the result of the second column. I can specify Column Names [First Name], and I’m going to use the function, PATHITEM.
All I need to do in the first argument is to specify the column that we have already computed in the previous column. In the second argument, I need to provide a keyword or the character that I want to extract. Then, I’ll provide the number, 1. For the Column, Middle Name, I’ll provide the number, 2, and 3 for Last Name.
I’ve shown you how to use text functions in Power BI within DAX codes to extract certain words from a given text string. I also demonstrated how to use path functions to shorten those codes. So by now, you can see how easy it is to extract words from a string by using the path functions. Whereas with text functions, we had to write a lot of codes.
In the case where there are multiple words in a given string, you can use another path function. Instead of manually calculating how many delimiters there are, and then providing that count of the delimiter in the PATHITEM function, you can use PATHITEMREVERSE function.
If you have questions about this tutorial, you can ask in the comment section below. Check out the links below and watch the full video as well for more details on this topic.
All the best!
Enterprise DNA Experts