In this tutorial, you’ll learn how to chain functions together using the dplyr pipe operator in the R programming language.
dplyr provides a set of basic functions for working with data frames, including filter(), group_by(), and summarize().
The dplyr package can be viewed as a grammar of data manipulation. Notice how each function is a verb, and when placed together, they constitute a form of command. All of these functions are meant to work together.
The pipe operator in R is written %>% and allows you to connect functions together.
In this tutorial, you’ll learn how to run common dplyr functions and then use the pipe operator to chain them together.
Using The Pipe Operator In R To Simplify Code
Open R. In a blank script, load the tidyverse and Lahman packages using the library() function.
For this example, let’s find the average, min, and max wins for each team since the year 2000.
You can write the code in a number of ways.
The first is to continuously reassign teams. You need to filter teams by yearID and then group the rows by teamID. To get the mean, min, and max, you need to use the summarize() function. Note that R is case sensitive, so the function name must be written in lowercase.
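The reassignment approach described above might look something like the following sketch. It assumes that teams is a copy of the Teams data frame from the Lahman package and that W is the wins column; the output column names (mean_wins, min_wins, max_wins) are illustrative choices, not names taken from the original script.

```r
library(tidyverse)
library(Lahman)

# Assumption: teams starts as a copy of Lahman's Teams data frame
teams <- Teams

# Keep only seasons from 2000 onward
teams <- filter(teams, yearID >= 2000)

# Group the remaining rows by team
teams <- group_by(teams, teamID)

# Compute the mean, min, and max wins (column W) for each team
teams <- summarize(
  teams,
  mean_wins = mean(W),
  min_wins = min(W),
  max_wins = max(W)
)

teams
```

Each step overwrites teams, so the original data is no longer available under that name once the script has run.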
When you run the R code, you’ll get a table showing the teamID, mean, min, and max.
However, this approach is repetitive. You have to retype and reassign teams on every line.
So, let’s try the other way to get the results in one go, and that’s by using the pipe operator.
In RStudio, the keyboard shortcut for the pipe operator is Ctrl+Shift+M (Cmd+Shift+M on macOS). The pipe operator passes your teams data frame into the next step.
For the code, you don’t need to reassign teams to each function. You only need to place the pipe operator between each function to carry the data frame from one step to the next.
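A piped version of the same analysis could be sketched as follows. As before, this assumes teams is a copy of Lahman's Teams data frame, and the summary column names are illustrative.

```r
library(tidyverse)
library(Lahman)

# Assumption: teams is a copy of Lahman's Teams data frame
teams <- Teams

# Each %>% passes the result on its left as the first argument
# of the function on its right
teams %>%
  filter(yearID >= 2000) %>%
  group_by(teamID) %>%
  summarize(
    mean_wins = mean(W),
    min_wins = min(W),
    max_wins = max(W)
  )
```

Because nothing is reassigned here, teams still holds the full, unfiltered data after the pipeline runs.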
When you run it, you’ll get the same results as the previous method.
The pipe operator allows you to streamline and simplify your code. It takes some time to get used to, but once you understand how it works, writing an R script becomes an easier task.
Making Changes To The Code
Using a pipe operator also makes it easier to make changes to your R code.
For example, if you want to add more commands, you only need to incorporate another line of code and chain it to the existing code using the pipe operator.
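As one possible illustration, the sketch below adds a single extra step to the end of the pipeline. The choice of arrange() with desc() to sort teams by average wins is an assumption made for this example, as are the summary column names.

```r
library(tidyverse)
library(Lahman)

# Assumption: teams is a copy of Lahman's Teams data frame
teams <- Teams

teams %>%
  filter(yearID >= 2000) %>%
  group_by(teamID) %>%
  summarize(
    mean_wins = mean(W),
    min_wins = min(W),
    max_wins = max(W)
  ) %>%
  # New step: sort teams from highest to lowest average wins
  arrange(desc(mean_wins))
```

The earlier lines are left untouched. The only changes are one more pipe and one more function call.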
Keep in mind that this isn’t assigning the results to an object. It’s only taking the teams data frame and running it through these functions to generate an output.
To assign the results to an object, you need to use the assignment operator (<-).
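For instance, the output of the pipeline can be stored by placing an object name and <- in front of it. The name teams_summary is hypothetical, and the summary column names are illustrative.

```r
library(tidyverse)
library(Lahman)

# Assumption: teams is a copy of Lahman's Teams data frame
teams <- Teams

# teams_summary is a hypothetical name for the stored result
teams_summary <- teams %>%
  filter(yearID >= 2000) %>%
  group_by(teamID) %>%
  summarize(
    mean_wins = mean(W),
    min_wins = min(W),
    max_wins = max(W)
  )

# Nothing prints on assignment, so inspect the first rows explicitly
head(teams_summary)
```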
Conclusion
The pipe operator allows you to streamline your code in R. It helps eliminate the process of having to continuously reassign variables and data throughout your R script. Along with the column and row functions in the tidyverse, it enables users to easily manipulate data in R.
This is one of the advantages of using the tidyverse library. It’s a great tool for users who deal with statistics and data science.
All the best,
George