Mastering Reusable Code and Analysis in R

by Sam McKay, CFA | R

Setting Up Your R Environment

Step 1: Install R

Download R from the CRAN website (https://cran.r-project.org/)
Follow the installation instructions for your operating system.

Step 2: Install RStudio

Download RStudio from the official website (https://www.rstudio.com/)
Follow the installation instructions for your operating system.

Step 3: Open RStudio

Launch RStudio from your applications or start menu.

Step 4: Set Up Your Working Directory

Set your working directory to the folder where your projects and scripts will be stored.

# Set the working directory to a folder of your choice
setwd("path/to/your/folder")

# Verify the working directory
getwd()

Step 5: Install Required Packages

Install any packages you’ll be using for your analysis.

# Example of installing commonly used packages
install.packages(c("tidyverse", "data.table", "ggplot2"))

# Load the installed packages
library(tidyverse)
library(data.table)
library(ggplot2)
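
Re-running install.packages() on every session is slow, so a common pattern is to install only what is missing. The following is a minimal sketch of that idea; the helper names missing_packages and install_if_missing are hypothetical, and note that requireNamespace() loads a package's namespace as a side effect of checking for it.

```r
# Return the subset of `pkgs` that cannot be loaded, i.e. is not installed.
missing_packages <- function(pkgs) {
  installed <- vapply(pkgs, requireNamespace, logical(1), quietly = TRUE)
  pkgs[!installed]
}

# Install only the packages that are missing; returns their names invisibly.
install_if_missing <- function(pkgs) {
  to_install <- missing_packages(pkgs)
  if (length(to_install) > 0) {
    install.packages(to_install)
  }
  invisible(to_install)
}

# "stats" and "utils" ship with R, so this call installs nothing.
install_if_missing(c("stats", "utils"))
```

In a real project you would pass the package names from the step above, for example install_if_missing(c("tidyverse", "data.table")).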

Step 6: Create a Project in RStudio

Click File -> New Project...
Choose New Directory -> Empty Project (labeled New Project in recent RStudio versions)
Name your project and specify a location
Click Create Project

Step 7: Create R Script

Click File -> New File -> R Script
Write your initial R code and save the script

# Example of a simple R script
print("Hello, R!")

Step 8: Run R Script

Highlight the code you want to run
Click Run or press Ctrl+Enter (Windows/Linux) or Cmd+Enter (Mac)

By following these steps, you will have a fully functional R environment set up and ready for efficient coding and analysis.

Project Structure and Organization

The layout below is one practical way to structure an R project so that code stays reusable and analyses are easy to rerun.

ProjectRoot/

data/
- raw/
  - Contains raw data files (e.g., data.csv, data2.csv)
- processed/
  - Contains processed data files (e.g., data_cleaned.csv)
docs/
- Contains documentation files (e.g., README.md, detailed analysis reports in .md or .Rmd)
R/
- data_preprocessing.R
  - Script for data cleaning and preprocessing functions
- analysis.R
  - Script for conducting analysis functions
- visualization.R
  - Script for visualization functions
notebooks/
- EDA.Rmd
  - Exploratory Data Analysis notebook
- analysis_report.Rmd
  - Analysis report notebook
tests/
- test_data_preprocessing.R
  - Unit tests for data preprocessing functions
- test_analysis.R
  - Unit tests for analysis functions
- test_visualization.R
  - Unit tests for visualization functions
scripts/
- run_preprocessing.R
  - Script to execute data preprocessing
- run_analysis.R
  - Script to execute the main analysis
- run_visualization.R
  - Script to execute data visualization
config/
- config.yml
  - Configuration file for setting parameters used across the project
.gitignore
- Ignore unnecessary files and folders, such as temp files and large datasets
```
/data/raw/
/data/processed/
/.RData
/.Rhistory
```
README.md
- High-level project description, how to run scripts, dependencies, etc.
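
The folder layout above can also be created from R rather than by hand. This is a rough sketch under the assumption that you want exactly the directories listed; the function name create_project_skeleton is hypothetical, and the example builds the tree under a temporary folder so nothing is written to your real project.

```r
# Create the project directories listed above under `root`.
create_project_skeleton <- function(root) {
  dirs <- c("data/raw", "data/processed", "docs", "R",
            "notebooks", "tests", "scripts", "config")
  paths <- file.path(root, dirs)
  for (p in paths) {
    # recursive = TRUE also creates missing parents such as data/
    dir.create(p, recursive = TRUE, showWarnings = FALSE)
  }
  invisible(paths)
}

# Demonstration in a throwaway location
demo_root <- file.path(tempdir(), "ProjectRoot")
created <- create_project_skeleton(demo_root)
list.dirs(demo_root, full.names = FALSE)
```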

Example Scripts

R/data_preprocessing.R

# Data Preprocessing Functions
clean_data <- function(data) {
  # Function to clean data
  data <- na.omit(data)
  # drop = FALSE keeps a data frame even when there is only one column
  data <- data[data$value > 0, , drop = FALSE]
  return(data)
}

R/analysis.R

# Analysis Functions
perform_analysis <- function(cleaned_data) {
  # Function to perform the analysis
  summary_stats <- summary(cleaned_data)
  return(summary_stats)
}

R/visualization.R

# Visualization Functions
plot_data <- function(cleaned_data) {
  # Function to plot data
  plot(cleaned_data$value, main = "Cleaned Data Plot", xlab = "Index", ylab = "Value")
}

scripts/run_preprocessing.R

# Script to Execute Data Preprocessing
source("R/data_preprocessing.R")

data <- read.csv("data/raw/data.csv")
cleaned_data <- clean_data(data)
write.csv(cleaned_data, "data/processed/data_cleaned.csv", row.names = FALSE)

scripts/run_analysis.R

# Script to Execute Analysis
source("R/analysis.R")

cleaned_data <- read.csv("data/processed/data_cleaned.csv")
analysis_results <- perform_analysis(cleaned_data)
print(analysis_results)

scripts/run_visualization.R

# Script to Execute Data Visualization
source("R/visualization.R")

cleaned_data <- read.csv("data/processed/data_cleaned.csv")
plot_data(cleaned_data)
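
The tests/ folder in the layout has no example, so here is a small sketch of what tests/test_data_preprocessing.R might contain. It uses base R's stopifnot() to stay dependency-free (testthat would be a natural alternative), and the input data frame is made up for illustration. The function is restated so the snippet runs on its own; in the project you would source("R/data_preprocessing.R") instead.

```r
# Restated from R/data_preprocessing.R so this snippet is self-contained
clean_data <- function(data) {
  data <- na.omit(data)
  data <- data[data$value > 0, ]
  return(data)
}

# Hypothetical input: one missing value and one non-positive value
raw <- data.frame(id = 1:4, value = c(5, NA, -2, 3))
cleaned <- clean_data(raw)

# The checks fail loudly if cleaning ever changes behavior
stopifnot(
  nrow(cleaned) == 2,
  !anyNA(cleaned),
  all(cleaned$value > 0)
)
```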

Configuration Example

config/config.yml

data:
  raw_path: "data/raw/"
  processed_path: "data/processed/"

analysis:
  significance_level: 0.05

visualization:
  plot_title: "Analysis Results"
  x_label: "X-Axis"
  y_label: "Y-Axis"
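
The configuration file is only useful once scripts read it. The sketch below assumes the yaml package is installed (it is not part of the original setup steps) and parses the same keys from an inline string so it runs without the file; in the project you would call read_yaml("config/config.yml") instead.

```r
library(yaml)  # assumption: install.packages("yaml") has been run

# Inline copy of part of config/config.yml, used here only for demonstration
config_text <- '
data:
  raw_path: "data/raw/"
  processed_path: "data/processed/"
analysis:
  significance_level: 0.05
'

config <- yaml.load(config_text)

# Nested keys become nested list elements
raw_file <- paste0(config$data$raw_path, "data.csv")
alpha <- config$analysis$significance_level
print(raw_file)
```

Keeping paths and thresholds in one file means the scripts under scripts/ do not need to hard-code them.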

Example of .gitignore

/data/raw/
/data/processed/
/.RData
/.Rhistory

Keep this structure consistent to maintain an organized and efficient workflow throughout your project.

Writing Simple Custom Functions in R

Example 1: Simple Addition Function

# Function to add two numbers
add <- function(a, b) {
  return(a + b)
}

# Usage (the result is named total so it does not mask base R's sum())
total <- add(10, 5)
print(total)  # Output: 15

Example 2: Function to Calculate the Square of a Number

# Function to square a number
square <- function(x) {
  return(x * x)
}

# Usage
result <- square(4)
print(result)  # Output: 16

Example 3: Function with Default Argument

# Function to multiply two numbers with a default for the second parameter
multiply <- function(a, b = 1) {
  return(a * b)
}

# Usage
product1 <- multiply(10, 5)
print(product1)  # Output: 50

product2 <- multiply(10)
print(product2)  # Output: 10

Example 4: Function to Check if a Number is Even or Odd

# Function to check even or odd
is_even <- function(num) {
  return(num %% 2 == 0)
}

# Usage
check1 <- is_even(4)
print(check1)  # Output: TRUE

check2 <- is_even(7)
print(check2)  # Output: FALSE

Example 5: Function to Return Multiple Values

# Function to return a vector of multiple values
calculate <- function(x, y) {
  sum <- x + y
  difference <- x - y
  product <- x * y
  return(c(sum, difference, product))
}

# Usage
values <- calculate(10, 5)
print(values)  # Output: 15 5 50
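
An unnamed vector forces callers to remember the position of each result. A possible variation, sketched here with the hypothetical name calculate_named, returns a named list so each value can be accessed by name.

```r
# Return several results as a named list instead of a positional vector
calculate_named <- function(x, y) {
  list(
    sum = x + y,
    difference = x - y,
    product = x * y
  )
}

# Usage
results <- calculate_named(10, 5)
print(results$product)  # Output: 50
```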

Implementing Control Structures in R

If-Else Statements

x <- 10

# If x is greater than 5, print "x is greater than 5", else print "x is 5 or less"
if (x > 5) {
  print("x is greater than 5")
} else {
  print("x is 5 or less")
}

If-Else If-Else Ladder

x <- 10

# Check multiple conditions
if (x > 10) {
  print("x is greater than 10")
} else if (x == 10) {
  print("x is exactly 10")
} else {
  print("x is less than 10")
}

For Loop

# Iterating through a sequence from 1 to 5
for (i in 1:5) {
  print(i)
}

While Loop

x <- 1

# Print numbers from 1 to 5
while (x <= 5) {
  print(x)
  x <- x + 1
}

Repeat Loop

x <- 1

# Print numbers from 1 to 5; repeat has no condition of its own, so it needs an explicit break
repeat {
  print(x)
  x <- x + 1
  if (x > 5) {
    break
  }
}

Switch Statement

# Define a variable
day <- "Tuesday"

# Print the day type based on the value of `day`
day_type <- switch(day,
  "Monday" = "Weekday",
  "Tuesday" = "Weekday",
  "Wednesday" = "Weekday",
  "Thursday" = "Weekday",
  "Friday" = "Weekday",
  "Saturday" = "Weekend",
  "Sunday" = "Weekend",
  "Invalid day"
)
print(day_type)
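
When several cases share a result, switch() lets an empty case fall through to the next non-empty one. The sketch below wraps the same mapping in a function (the name classify_day is hypothetical) to show that more compact form.

```r
# Empty cases ("Monday" = ,) fall through to the next value that is supplied
classify_day <- function(day) {
  switch(day,
    "Monday" = ,
    "Tuesday" = ,
    "Wednesday" = ,
    "Thursday" = ,
    "Friday" = "Weekday",
    "Saturday" = ,
    "Sunday" = "Weekend",
    "Invalid day"  # unnamed last argument acts as the default
  )
}

print(classify_day("Tuesday"))
print(classify_day("Sunday"))
```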

Apply Family Functions

lapply

# List of numeric vectors
lst <- list(a = 1:3, b = 4:6)

# Apply sum function to each vector in the list
result <- lapply(lst, sum)
print(result)

sapply

# Simplifying version of lapply: returns a vector when possible (reuses lst from the lapply example)
result <- sapply(lst, sum)
print(result)

tapply

# Compute the mean of grouped data
data <- c(1, 2, 2, 3, 4, 4, 4, 5)
group <- c("A", "A", "B", "B", "A", "A", "B", "B")

result <- tapply(data, group, mean)
print(result)

mapply

# Apply a function to multiple arguments
result <- mapply(sum, 1:5, 6:10)
print(result)

Implementing these control structures will help you write more efficient and reusable code in R.

Using the apply Family of Functions in R

Using apply()

# Sample matrix
mat <- matrix(1:9, nrow = 3, byrow = TRUE)

# Applying a function to rows
row_sums <- apply(mat, 1, sum)

# Applying a function to columns
col_means <- apply(mat, 2, mean)
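
For plain row and column sums or means, base R also provides dedicated functions that give the same numbers as the apply() calls above. A short comparison, using the same sample matrix:

```r
# Same sample matrix as above
mat <- matrix(1:9, nrow = 3, byrow = TRUE)

# Dedicated helpers for the most common row/column reductions
row_sums_fast <- rowSums(mat)
col_means_fast <- colMeans(mat)

# Both approaches agree
print(all.equal(row_sums_fast, apply(mat, 1, sum)))
print(all.equal(col_means_fast, apply(mat, 2, mean)))
```

apply() remains the general tool when the function is not a simple sum or mean.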

Using lapply()

# Sample list
my_list <- list(a = 1:5, b = 6:10)

# Applying a function to each element of the list
list_mean <- lapply(my_list, mean)

Using sapply()

# Sample list
my_list <- list(a = 1:5, b = 6:10)

# Applying a function to each element and returning a vector
vec_mean <- sapply(my_list, mean)

Using tapply()

# Sample data
values <- c(1, 2, 3, 4, 5, 6)
groups <- c("A", "A", "B", "B", "C", "C")

# Applying a function to subsets of a vector
group_sums <- tapply(values, groups, sum)

Using mapply()

# Sample vectors
vec1 <- c(1, 2, 3)
vec2 <- c(4, 5, 6)

# Applying a function element-wise across both vectors (not parallel computing)
sum_vec <- mapply(sum, vec1, vec2)

Using vapply()

# Sample list
my_list <- list(a = 1:5, b = 6:10)

# Applying a function with a specified return type
vec_mean <- vapply(my_list, mean, numeric(1))
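
The practical difference between sapply() and vapply() shows up on edge cases. The sketch below, with made-up inputs, illustrates two of them: an empty input and a result of the wrong type.

```r
empty <- list()

# sapply() cannot guess the type of an empty result and returns a list
loose <- sapply(empty, mean)
# vapply() honours the declared type and returns an empty numeric vector
strict <- vapply(empty, mean, numeric(1))

print(is.list(loose))
print(is.numeric(strict))

# vapply() raises an error when a result does not match the declared type
checked <- tryCatch(
  vapply(list(a = 1, b = "x"), function(v) v, numeric(1)),
  error = function(e) "type mismatch caught"
)
print(checked)
```

This predictability is why vapply() is often preferred inside functions and packages.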

These are practical examples you can implement directly in your existing R scripts.

Error Handling and Debugging in R

Error Handling

Functions for Error Handling

Using tryCatch to Handle Errors

safeDivide <- function(x, y) {
  tryCatch({
    # In R, x / 0 yields Inf or NaN rather than an error, so signal one explicitly
    if (y == 0) stop("division by zero")
    x / y
  }, warning = function(war) {
    message("Warning: ", conditionMessage(war))
    NA
  }, error = function(err) {
    message("Error: ", conditionMessage(err))
    NA
  }, finally = {
    # Runs whether or not an error occurred
    message("Clean up code here")
  })
}
# Example Usage
safeDivide(10, 2)  # Returns 5
safeDivide(10, 0)  # Reports the error and returns NA

Using stop, warning, message for Custom Errors

customFunc <- function(a, b) {
  if (!is.numeric(a) || !is.numeric(b)) {
    stop("Both arguments must be numeric")
  }
  if (b == 0) {
    warning("Division by zero, returning NA")
    return(NA)
  }
  
  result <- a / b
  message("Division successful")
  return(result)
}
# Example Usage
customFunc(10, 2)  # Division successful
customFunc(10, 0)  # Warning: Division by zero, returning NA
customFunc(10, "a")  # Error: Both arguments must be numeric
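
Error handling matters most when a function is applied to many inputs and one bad value should not abort the whole run. As a minimal sketch, with the hypothetical helper safe_log and made-up inputs, each call is wrapped so failures become NA:

```r
# Return NA instead of stopping when log() warns or errors
safe_log <- function(x) {
  tryCatch(
    log(x),
    warning = function(w) NA_real_,  # e.g. log(-1) warns "NaNs produced"
    error = function(e) NA_real_     # e.g. log("a") is an error
  )
}

inputs <- list(10, -1, "a")
log_results <- vapply(inputs, safe_log, numeric(1))
print(log_results)
```

All three inputs are processed, and the problematic ones are marked as missing for later inspection.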

Debugging

Using `print` and `cat` for Debugging

debugFunction <- function(vec) {
  total <- 0
  for (val in vec) {
    cat("Value: ", val, "\n")  # Debug: print each value
    total <- total + val
  }
  print(paste("Total Sum: ", total))  # Debug: print total sum
  return(total)
}
# Example Usage
debugFunction(c(1, 2, 3))  # Expect detailed output of the operations

Using `traceback` to Trace Errors

errorProneFunction <- function(x) {
  return(log(x))
}

# Calling the function with an invalid argument
errorProneFunction("a")

# Immediately after the error
traceback()
# Will output the call stack

Using `debug` and `browser`

Using debug

exampleDebugFunction <- function(x) {
  y <- x + 1
  z <- y * 2
  return(z)
}

# Setting debug
debug(exampleDebugFunction)
# Call the function
exampleDebugFunction(10)  # Will enter debug mode and step through
# To stop debugging
undebug(exampleDebugFunction)

Using browser for Step-by-Step Execution

exampleBrowserFunction <- function(x) {
  browser()  # Execution will pause here
  y <- x + 1
  z <- y * 2
  return(z)
}
# Call the function
exampleBrowserFunction(10)  # Console will enter interactive debugging mode

Using `options(error=recover)`

# Set this option to allow error recovery mode
options(error = recover)

# Calling a function that will error
errorProneFunction("a")

# R will enter a recovery mode allowing you to inspect the error state

These are practical methods for error handling and debugging in R that you can immediately incorporate into your R projects.

Creating and Using R Packages: A Practical Implementation

Step 1: Set Up Package Skeleton

# Load necessary library
library(devtools)

# Create a package directory skeleton in the current working directory
create_package("myPackage")

Step 2: Add Functions to Your Package

# Navigate to the R directory in the package to add R scripts
setwd("myPackage/R")

# Create a simple function in a new R script
writeLines(
'my_function <- function(x) {
  return(x^2)
}', con = "my_function.R"
)

Step 3: Document Functions

# Document the function with roxygen2 comments; every comment line must start with #'
writeLines(c(
  "#' Square a number",
  "#'",
  "#' @param x A numeric value.",
  "#' @return The square of x.",
  "#' @export",
  "my_function <- function(x) {",
  "  return(x^2)",
  "}"
), con = "my_function.R")

Step 4: Generate Documentation

# Load roxygen2 library
library(roxygen2)

# Generate documentation; the working directory is still myPackage/R,
# so the package root is one level up
roxygenize("..")

Step 5: Build the Package

# Build and install the package
setwd("..")  # Go back to the package's root directory
build()
install()

Step 6: Use the Package

# Load the package
library(myPackage)

# Use the function from the package
result <- my_function(5)
print(result)  # Output should be 25

Step 7: Adding Other Elements (Optional)

Adding Vignettes

# Create a vignette placeholder
use_vignette("my_vignette")

# Edit the vignette file created under vignettes/ to add detailed documentation

Adding Tests

# Create a test directory and a test file
use_testthat()
use_test("my_function")

# Write a test case in tests/testthat/test-my_function.R
writeLines(
'test_that("my_function works correctly", {
  expect_equal(my_function(2), 4)
  expect_equal(my_function(3), 9)
})', con = "tests/testthat/test-my_function.R"
)

# Run tests
devtools::test()

This series of commands and code snippets will create a basic R package and illustrate how to add, document, test, and use functions within it.

Implementing Reusable Data Wrangling Functions

Load Necessary Libraries

library(dplyr)
library(tidyr)

Data Wrangling Functions

Function: Filter Rows by Condition

filter_rows <- function(data, condition) {
  # {{ }} forwards the unquoted condition (e.g. score > 8) to filter()
  data %>%
    filter({{ condition }})
}

Function: Select Specific Columns

select_columns <- function(data, columns) {
  data %>%
    select(all_of(columns))
}

Function: Rename Columns

rename_columns <- function(data, new_names) {
  data %>%
    rename(!!!new_names)
}

Function: Mutate Existing Columns

mutate_columns <- function(data, ...) {
  data %>%
    mutate(...)
}

Function: Summarize Data

summarize_data <- function(data, ...) {
  data %>%
    summarise(...)
}

Function: Pivot Data (Long to Wide)

pivot_to_wide <- function(data, names_from, values_from) {
  data %>%
    pivot_wider(names_from = {{names_from}}, values_from = {{values_from}})
}

Function: Pivot Data (Wide to Long)

pivot_to_long <- function(data, cols, names_to, values_to) {
  data %>%
    pivot_longer(cols = all_of(cols), names_to = names_to, values_to = values_to)
}

Function: Handle Missing Data (NA)

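handle_na <- function(data, method = "remove", fill_value = 0) {
  if (method == "remove") {
    data %>%
      drop_na()
  } else if (method == "fill") {
    # Replace NAs in numeric columns with fill_value (0 by default)
    data %>%
      mutate(across(where(is.numeric), ~ replace_na(.x, fill_value)))
  } else {
    stop("Invalid method")
  }
}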

Usage Examples

# Sample Data
data <- tibble(
  id = 1:5,
  score = c(10, NA, 8, NA, 9),
  group = c("A", "B", "A", "B", "A")
)

# Filter Rows
filtered_data <- filter_rows(data, score > 8)

# Select Columns
selected_data <- select_columns(data, c("id", "score"))

# Rename Columns
renamed_data <- rename_columns(data, list(new_score = "score"))

# Mutate Columns
mutated_data <- mutate_columns(data, score2 = score * 2)

# Summarize Data
summarized_data <- summarize_data(data, avg_score = mean(score, na.rm = TRUE))

# Pivot to Wide
pivoted_wide_data <- pivot_to_wide(data, names_from = group, values_from = score)

# Pivot to Long
pivoted_long_data <- pivot_to_long(pivoted_wide_data, cols = c("A", "B"), names_to = "group", values_to = "score")

# Handle Missing Data (Remove NAs)
cleaned_data <- handle_na(data)

# Handle Missing Data (Fill NAs)
filled_data <- handle_na(data, method = "fill")
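
Helpers that take unquoted column names rely on the embrace operator {{ }}, the same mechanism pivot_to_wide uses above. The following sketch shows it in a grouped summary; the function name summarize_by and the sample tibble are illustrative only.

```r
library(dplyr)

# Group by one column and average another; both are passed unquoted
summarize_by <- function(data, group_col, value_col) {
  data %>%
    group_by({{ group_col }}) %>%
    summarise(
      mean_value = mean({{ value_col }}, na.rm = TRUE),
      .groups = "drop"
    )
}

scores <- tibble(
  id = 1:5,
  score = c(10, NA, 8, NA, 9),
  group = c("A", "B", "A", "B", "A")
)

group_means <- summarize_by(scores, group, score)
print(group_means)
```

Group B contains only missing scores, so its mean is NaN once the NAs are removed.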

Each function above is designed for reuse across various data wrangling tasks. Adjust inputs as needed to fit specific datasets and requirements.

Writing Reusable Visualization Functions in R

# Load necessary libraries for visualization
library(ggplot2)

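# Create a function for plotting scatter plots
# Note: aes_string() accepts column names as strings, but it is deprecated in
# recent ggplot2 releases in favor of aes() with .data[[name]]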
scatter_plot <- function(data, x_var, y_var, title="Scatter Plot", x_label=NULL, y_label=NULL, color_var=NULL) {
  p <- ggplot(data, aes_string(x=x_var, y=y_var, color=color_var)) +
    geom_point() +
    ggtitle(title) +
    xlab(ifelse(is.null(x_label), x_var, x_label)) +
    ylab(ifelse(is.null(y_label), y_var, y_label))
  return(p)
}

# Create a function for plotting bar charts
bar_chart <- function(data, x_var, y_var, title="Bar Chart", x_label=NULL, y_label=NULL, fill_var=NULL) {
  p <- ggplot(data, aes_string(x=x_var, y=y_var, fill=fill_var)) +
    geom_bar(stat="identity", position="dodge") +
    ggtitle(title) +
    xlab(ifelse(is.null(x_label), x_var, x_label)) +
    ylab(ifelse(is.null(y_label), y_var, y_label))
  return(p)
}

# Create a function for plotting histograms
histogram_plot <- function(data, x_var, title="Histogram", x_label=NULL, bins=30) {
  p <- ggplot(data, aes_string(x=x_var)) +
    # A bin count adapts to the data; a fixed binwidth of 30 is too wide for most variables
    geom_histogram(bins=bins, fill="blue", color="black", alpha=0.7) +
    ggtitle(title) +
    xlab(ifelse(is.null(x_label), x_var, x_label))
  return(p)
}

# Create a function for plotting line charts
line_plot <- function(data, x_var, y_var, title="Line Plot", x_label=NULL, y_label=NULL, group_var=NULL) {
  p <- ggplot(data, aes_string(x=x_var, y=y_var, group=group_var, color=group_var)) +
    geom_line() +
    ggtitle(title) +
    xlab(ifelse(is.null(x_label), x_var, x_label)) +
    ylab(ifelse(is.null(y_label), y_var, y_label))
  return(p)
}

# Example usage with the built-in mtcars dataset and ggplot2's economics dataset:
# scatter_plot(mtcars, "wt", "mpg", title="Weight vs. MPG")
# bar_chart(mtcars, "cyl", "mpg", title="Cylinders vs. MPG", fill_var="cyl")
# histogram_plot(mtcars, "mpg", title="Distribution of MPG")
# line_plot(economics, "date", "unemploy", title="Unemployment Over Time")
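
Because aes_string() is deprecated in current ggplot2, a string-based helper can instead index the .data pronoun inside aes(). This is a sketch of that alternative for the scatter plot only, under the assumption that no color variable is needed; the name scatter_plot_tidy is hypothetical.

```r
library(ggplot2)

# Scatter plot helper that takes column names as strings via .data[[ ]]
scatter_plot_tidy <- function(data, x_var, y_var, title = "Scatter Plot") {
  ggplot(data, aes(x = .data[[x_var]], y = .data[[y_var]])) +
    geom_point() +
    labs(title = title, x = x_var, y = y_var)
}

p <- scatter_plot_tidy(mtcars, "wt", "mpg", title = "Weight vs. MPG")
print(p)
```

The explicit labs() call keeps the axis labels as the plain column names rather than the .data expressions.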

Ensure proper handling of libraries and data to suit your specific project needs.

Creating Documentation for Your Functions in R

Documenting with roxygen2

Install and Load roxygen2 Package

install.packages("roxygen2")
library(roxygen2)

Prepare Your Function for Documentation

#' Add Two Numbers
#'
#' This function takes two numeric inputs and returns their sum.
#'
#' @param x A numeric value.
#' @param y A numeric value.
#'
#' @return The sum of x and y.
#'
#' @examples
#' add_numbers(5, 7)
#' add_numbers(10.5, 2.5)
#'
#' @export
add_numbers <- function(x, y) {
  return(x + y)
}

Generate Documentation Using roxygen2
- Ensure your function is saved in an R script inside the R/ directory of your package.
```
# In your R/ file, ensure your function and comments are saved
```
- Use roxygen2 to compile the documentation:
```
roxygen2::roxygenize("path_to_your_package")
```
- This command will generate or update the man/ directory with the .Rd files.

Documenting Inline Comments

Adding Simple Roxygen Comments

#' Calculate Factorial Using Recursion
#'
#' This function calculates the factorial of a number using a recursive approach.
#'
#' @param n A non-negative integer.
#' @return The factorial of the input integer.
#' @examples
#' recursive_factorial(5)
#' @export
recursive_factorial <- function(n) {
  # Named recursive_factorial so it does not mask base::factorial
  if (n == 0) return(1)
  else return(n * recursive_factorial(n - 1))
}

Store Documentation
- Ensure any new changes are documented by re-running:
```
roxygen2::roxygenize("path_to_your_package")
```

By following these steps, you can ensure your functions are documented in a manner that’s consistent and useful for users of your R package.

Version Control with Git and GitHub

Step 1: Initialize a Git Repository

Open a terminal or command prompt.
Navigate to your R project directory.
Initialize a new git repository:
```
git init
```

Step 2: Create a `.gitignore` File

Inside your project directory, create a file named .gitignore and add common R-specific files to ignore:
```
.Rhistory
.RData
.Ruserdata
.Rproj.user
```

Step 3: Commit Your Code

Add all files to the git staging area:
```
git add .
```
Commit the files with a meaningful message:
```
git commit -m "Initial commit"
```

Step 4: Create a Repository on GitHub

Go to GitHub.
Create a new repository. Do not initialize it with a README, .gitignore, or license, since your local repository already has its own files and history.

Step 5: Link Local Repository to GitHub

Copy the remote repository URL from GitHub.

Add the GitHub repository as the remote origin in your local repository:

git remote add origin https://github.com/your_username/your_repo_name.git

Step 6: Push Local Repository to GitHub

Push the current contents to the remote repository (use main instead of master if that is the name of your local branch):
```
git push -u origin master
```

Practical Example: Making Changes and Pushing to GitHub

Make changes to your R code.
Check the status of your repository:
```
git status
```
Add the changes to the staging area:
```
git add .
```
Commit the changes:
```
git commit -m "Describe your changes"
```
Push the changes to the remote repository:
```
git push
```

Practical Example: Viewing Commit History

View the commit history:
```
git log
```

Practical Example: Cloning a GitHub Repository

Copy the repository URL from GitHub.

Clone the repository to your local machine:

git clone https://github.com/your_username/your_repo_name.git

Step 7: Creating Branches and Merging

Create a new branch:
```
git checkout -b new-feature
```
Switch to an existing branch (e.g., master):
```
git checkout master
```
Merge a branch into master:
```
git merge new-feature
```

This concludes the practical implementation of version control with Git and GitHub for your R project.

Case Studies and Best Practices

Case Study 1: Data Cleaning and Visualization

Description

Clean a raw dataset and make an insightful visualization.

Implementation

Data Cleaning Function

clean_data <- function(df) {
  # Convert columns to appropriate types first, since failed conversions produce NA
  df$Date <- as.Date(df$Date, format="%Y-%m-%d")
  df$Value <- as.numeric(df$Value)
  
  # Remove rows with missing values, including any introduced by the conversions
  df <- na.omit(df)
  
  return(df)
}

Data Visualization Function

library(ggplot2)

visualize_data <- function(df) {
  ggplot(df, aes(x = Date, y = Value)) +
    geom_line() +
    ggtitle("Time Series Data Visualization") +
    xlab("Date") +
    ylab("Value")
}

Use Case

raw_data <- read.csv("raw_data.csv")

cleaned_data <- clean_data(raw_data)

visualize_data(cleaned_data)

Case Study 2: Machine Learning Workflow

Description

Implement a machine learning workflow including data splitting, model training, and evaluation.

Implementation

Data Splitting Function

library(caret)

split_data <- function(df, train_ratio = 0.7) {
  trainIndex <- createDataPartition(df$target, p = train_ratio, list = FALSE)
  train <- df[trainIndex, ]
  test <- df[-trainIndex, ]
  return(list(train = train, test = test))
}

Model Training Function

train_model <- function(train_data) {
  model <- train(target ~ ., data = train_data, method = "rf")
  return(model)
}

Model Evaluation Function

evaluate_model <- function(model, test_data) {
  predictions <- predict(model, test_data)
  confusion <- confusionMatrix(predictions, test_data$target)
  return(confusion)
}

Use Case

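data <- read.csv("dataset.csv")
# confusionMatrix() needs factor outcomes, so convert the target before splitting
data$target <- as.factor(data$target)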

split <- split_data(data)

model <- train_model(split$train)

eval_result <- evaluate_model(model, split$test)

print(eval_result)
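
If caret is not available, the splitting step can be approximated in base R. The sketch below is a simple random split, not the stratified sampling that createDataPartition() performs; the function name split_data_base and the seed argument are assumptions added for illustration, and the built-in iris data stands in for dataset.csv.

```r
# Random train/test split in base R (not stratified by the target)
split_data_base <- function(df, train_ratio = 0.7, seed = 42) {
  set.seed(seed)  # fixed seed so the split is reproducible
  n_train <- round(train_ratio * nrow(df))
  train_idx <- sample(seq_len(nrow(df)), size = n_train)
  list(
    train = df[train_idx, , drop = FALSE],
    test = df[-train_idx, , drop = FALSE]
  )
}

parts <- split_data_base(iris)
print(nrow(parts$train))
print(nrow(parts$test))
```

Setting a seed is worth doing in the caret version as well, so that model results can be reproduced.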

Case Study 3: Reusable Data Wrangling Function

Description

Implement a reusable function for common data wrangling tasks.

Implementation

Wrangling Function

library(dplyr)

wrangle_data <- function(df) {
  df <- df %>%
    filter(!is.na(Value)) %>%
    mutate(NormalizedValue = (Value - min(Value)) / (max(Value) - min(Value)))
  return(df)
}

Use Case

raw_data <- read.csv("wrangling_data.csv")

wrangled_data <- wrangle_data(raw_data)

head(wrangled_data)
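
The min-max formula in wrangle_data divides by max(Value) - min(Value), which is zero when every value is identical. A small base R sketch of a guarded version follows; the helper name normalize_min_max is hypothetical.

```r
# Min-max normalization that avoids dividing by zero for constant input
normalize_min_max <- function(x) {
  rng <- range(x, na.rm = TRUE)
  if (rng[1] == rng[2]) {
    # All values are equal: return zeros (NAs stay NA)
    return(x - rng[1])
  }
  (x - rng[1]) / (rng[2] - rng[1])
}

print(normalize_min_max(c(2, 4, 6)))  # Output: 0.0 0.5 1.0
print(normalize_min_max(c(3, 3, 3)))  # Output: 0 0 0
```

Inside wrangle_data this could replace the inline expression, for example mutate(NormalizedValue = normalize_min_max(Value)).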

Conclusion

These case studies showcase the application of best practices in writing efficient and reusable code for data cleaning, visualization, machine learning workflows, and data wrangling using R. Implement these solutions in your projects to enhance your data analysis capabilities.
