Chapter 54 Non-Standard Evaluation

What You’ll Learn:

  • Non-standard evaluation (NSE)
  • Tidy evaluation
  • Quasiquotation
  • Common metaprogramming errors

Difficulty: ⭐⭐⭐ Advanced

54.1 Introduction

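Non-standard evaluation (NSE) lets a function capture the code passed as an argument, instead of its value, and evaluate that code later in a different context, such as inside a data frame: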

library(dplyr)
library(rlang)

# Non-standard evaluation: cyl is a column of mtcars, not a variable in the calling environment
filter(mtcars, cyl == 4)
#>                 mpg cyl  disp hp drat    wt  qsec vs am gear carb cyl_factor
#> Datsun 710     22.8   4 108.0 93 3.85 2.320 18.61  1  1    4    1          4
#> Merc 240D      24.4   4 146.7 62 3.69 3.190 20.00  1  0    4    2          4
#> Merc 230       22.8   4 140.8 95 3.92 3.150 22.90  1  0    4    2          4
#> Fiat 128       32.4   4  78.7 66 4.08 2.200 19.47  1  1    4    1          4
#> Honda Civic    30.4   4  75.7 52 4.93 1.615 18.52  1  1    4    2          4
#> Toyota Corolla 33.9   4  71.1 65 4.22 1.835 19.90  1  1    4    1          4
#> Toyota Corona  21.5   4 120.1 97 3.70 2.465 20.01  1  0    3    1          4
#> Fiat X1-9      27.3   4  79.0 66 4.08 1.935 18.90  1  1    4    1          4
#>  [ reached 'max' / getOption("max.print") -- omitted 3 rows ]

# filter() captures 'cyl == 4' as an expression instead of evaluating it immediately,
# then evaluates it with the columns of mtcars in scope
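
To make "captured, not evaluated" concrete, here is a minimal sketch that uses rlang directly, outside dplyr. The helper `show_capture()` and the small data frame `toy` are hypothetical names introduced only for illustration.

```r
library(rlang)

# Hypothetical helper: return the code of the argument instead of its value
show_capture <- function(arg) {
  quo <- enquo(arg)    # defuse: keep the expression and its environment
  quo_get_expr(quo)    # extract the bare expression
}

captured <- show_capture(cyl == 4)
class(captured)        # a call object, not TRUE/FALSE

# The expression is evaluated only on request, here with a data frame
# supplying the column cyl (the "data mask")
toy <- data.frame(cyl = c(4, 6, 8))
result <- eval_tidy(captured, data = toy)
```

Note that no object named `cyl` has to exist when `show_capture()` is called; the name is resolved only when `eval_tidy()` is given a data frame that contains it.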

54.2 Tidy Evaluation

💡 Key Insight: Embrace and Inject

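# Problem: without embracing, filter() sees the symbol `condition`;
# evaluating it looks for cyl outside the data frame
my_filter <- function(df, condition) {
  filter(df, condition)  # Error: object 'cyl' not found
}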

# Solution: embrace with {{}}
my_filter <- function(df, condition) {
  filter(df, {{ condition }})
}

my_filter(mtcars, cyl == 4)
#>                 mpg cyl  disp hp drat    wt  qsec vs am gear carb cyl_factor
#> Datsun 710     22.8   4 108.0 93 3.85 2.320 18.61  1  1    4    1          4
#> Merc 240D      24.4   4 146.7 62 3.69 3.190 20.00  1  0    4    2          4
#> Merc 230       22.8   4 140.8 95 3.92 3.150 22.90  1  0    4    2          4
#> Fiat 128       32.4   4  78.7 66 4.08 2.200 19.47  1  1    4    1          4
#> Honda Civic    30.4   4  75.7 52 4.93 1.615 18.52  1  1    4    2          4
#> Toyota Corolla 33.9   4  71.1 65 4.22 1.835 19.90  1  1    4    1          4
#> Toyota Corona  21.5   4 120.1 97 3.70 2.465 20.01  1  0    3    1          4
#> Fiat X1-9      27.3   4  79.0 66 4.08 1.935 18.90  1  1    4    1          4
#>  [ reached 'max' / getOption("max.print") -- omitted 3 rows ]
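
The embrace operator can be read as shorthand for two steps: capture the argument with `enquo()`, then inject it with `!!`. The sketch below compares the two spellings; the function names `my_filter_embrace()` and `my_filter_enquo()` are hypothetical, and the built-in `mtcars` is assumed to be unmodified.

```r
library(dplyr)
library(rlang)

# One step: embrace the argument
my_filter_embrace <- function(df, condition) {
  filter(df, {{ condition }})
}

# Two steps: capture the argument, then inject it
my_filter_enquo <- function(df, condition) {
  condition <- enquo(condition)
  filter(df, !!condition)
}

a <- my_filter_embrace(mtcars, cyl == 4)
b <- my_filter_enquo(mtcars, cyl == 4)
identical(a, b)
```

The two-step form is mainly useful when the captured expression needs to be inspected or modified before it is injected; otherwise `{{ }}` is the shorter choice.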

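# For column names
# Call dplyr::select() explicitly: an "unused argument" error from select()
# usually means another attached package (for example MASS) is masking it
my_select <- function(df, col) {
  dplyr::select(df, {{ col }})
}

my_select(mtcars, mpg)
#> (a data frame containing only the mpg column)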

54.3 Quasiquotation

# Inject values with !!: threshold is replaced by its value (20) before the condition is evaluated
threshold <- 20
mtcars %>%
  filter(mpg > !!threshold)
#>                 mpg cyl  disp  hp drat    wt  qsec vs am gear carb cyl_factor
#> Mazda RX4      21.0   6 160.0 110 3.90 2.620 16.46  0  1    4    4          6
#> Mazda RX4 Wag  21.0   6 160.0 110 3.90 2.875 17.02  0  1    4    4          6
#> Datsun 710     22.8   4 108.0  93 3.85 2.320 18.61  1  1    4    1          4
#> Hornet 4 Drive 21.4   6 258.0 110 3.08 3.215 19.44  1  0    3    1          6
#> Merc 240D      24.4   4 146.7  62 3.69 3.190 20.00  1  0    4    2          4
#> Merc 230       22.8   4 140.8  95 3.92 3.150 22.90  1  0    4    2          4
#> Fiat 128       32.4   4  78.7  66 4.08 2.200 19.47  1  1    4    1          4
#> Honda Civic    30.4   4  75.7  52 4.93 1.615 18.52  1  1    4    2          4
#>  [ reached 'max' / getOption("max.print") -- omitted 6 rows ]

# Inject names with :=
name_col <- "efficiency"
mtcars %>%
  mutate(!!name_col := mpg / wt)
#>                    mpg cyl disp  hp drat    wt  qsec vs am gear carb cyl_factor
#> Mazda RX4         21.0   6  160 110 3.90 2.620 16.46  0  1    4    4          6
#> Mazda RX4 Wag     21.0   6  160 110 3.90 2.875 17.02  0  1    4    4          6
#> Datsun 710        22.8   4  108  93 3.85 2.320 18.61  1  1    4    1          4
#> Hornet 4 Drive    21.4   6  258 110 3.08 3.215 19.44  1  0    3    1          6
#> Hornet Sportabout 18.7   8  360 175 3.15 3.440 17.02  0  0    3    2          8
#> Valiant           18.1   6  225 105 2.76 3.460 20.22  1  0    3    1          6
#> Duster 360        14.3   8  360 245 3.21 3.570 15.84  0  0    3    4          8
#>                   efficiency
#> Mazda RX4           8.015267
#> Mazda RX4 Wag       7.304348
#> Datsun 710          9.827586
#> Hornet 4 Drive      6.656299
#> Hornet Sportabout   5.436047
#> Valiant             5.231214
#> Duster 360          4.005602
#>  [ reached 'max' / getOption("max.print") -- omitted 25 rows ]

# Splice multiple arguments with !!!
group_vars <- c("cyl", "gear")
mtcars %>%
  group_by(!!!syms(group_vars)) %>%
  summarize(mean_mpg = mean(mpg))
#> `summarise()` has grouped output by 'cyl'. You can override using the `.groups`
#> argument.
#> # A tibble: 8 × 3
#> # Groups:   cyl [3]
#>     cyl  gear mean_mpg
#>   <dbl> <dbl>    <dbl>
#> 1     4     3     21.5
#> 2     4     4     26.9
#> 3     4     5     28.2
#> 4     6     3     19.8
#> 5     6     4     19.8
#> 6     6     5     19.7
#> 7     8     3     15.0
#> 8     8     5     15.4
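
One way to see what `!!` and `!!!` do is to build the expressions with `rlang::expr()` and look at the result before any dplyr verb runs. This is a sketch for inspection only; `df` here is just a symbol inside the expression, not a real data frame.

```r
library(rlang)

threshold <- 20
group_vars <- c("cyl", "gear")

# !! replaces `threshold` with its value inside the captured expression
injected <- expr(mpg > !!threshold)

# !!! splices each element of the list as a separate argument
spliced <- expr(group_by(df, !!!syms(group_vars)))

deparse(injected)
deparse(spliced)
```

Printing `injected` and `spliced` shows the code that the dplyr verb would actually receive, which is often the quickest way to debug an injection that misbehaves.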

54.4 Common Errors

54.4.1 Error: object not found

# Problem: forgot to embrace, so `new_col` is taken as a literal column name
# and `expr` is evaluated outside the data frame
my_mutate <- function(df, new_col, expr) {
  mutate(df, new_col = expr)
}

my_mutate(mtcars, efficiency, mpg / wt)
#> Error in `mutate()`:
#> ℹ In argument: `new_col = expr`.
#> Caused by error:
#> ! object 'wt' not found

Solution: Embrace both arguments with {{ }} and use := so the name on the left-hand side can be injected

my_mutate <- function(df, new_col, expr) {
  mutate(df, {{new_col}} := {{expr}})
}

my_mutate(mtcars, efficiency, mpg / wt)
#>                    mpg cyl disp  hp drat    wt  qsec vs am gear carb cyl_factor
#> Mazda RX4         21.0   6  160 110 3.90 2.620 16.46  0  1    4    4          6
#> Mazda RX4 Wag     21.0   6  160 110 3.90 2.875 17.02  0  1    4    4          6
#> Datsun 710        22.8   4  108  93 3.85 2.320 18.61  1  1    4    1          4
#> Hornet 4 Drive    21.4   6  258 110 3.08 3.215 19.44  1  0    3    1          6
#> Hornet Sportabout 18.7   8  360 175 3.15 3.440 17.02  0  0    3    2          8
#> Valiant           18.1   6  225 105 2.76 3.460 20.22  1  0    3    1          6
#> Duster 360        14.3   8  360 245 3.21 3.570 15.84  0  0    3    4          8
#>                   efficiency
#> Mazda RX4           8.015267
#> Mazda RX4 Wag       7.304348
#> Datsun 710          9.827586
#> Hornet 4 Drive      6.656299
#> Hornet Sportabout   5.436047
#> Valiant             5.231214
#> Duster 360          4.005602
#>  [ reached 'max' / getOption("max.print") -- omitted 25 rows ]
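
A related pattern is to build the new column name from an embraced argument by placing `{{ }}` inside a string on the left of `:=`. The sketch below assumes a recent dplyr (1.0 or later) and an unmodified `mtcars`; the helper name `mean_by()` is hypothetical.

```r
library(dplyr)

# Hypothetical helper: group by one column and average another,
# naming the result after the averaged column
mean_by <- function(df, group, col) {
  df %>%
    group_by({{ group }}) %>%
    summarise("mean_{{ col }}" := mean({{ col }}), .groups = "drop")
}

res <- mean_by(mtcars, cyl, mpg)
names(res)
```

Because the name is a string containing `{{ col }}`, the `:=` operator is required here as well; a plain `=` would not accept a string template on its left-hand side.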

54.5 Summary

Key Takeaways:

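  1. {{ }} - Embrace a function argument: capture its expression and inject it
  2. !! - Inject (unquote) a single value or expression
  3. !!! - Splice a list of arguments into a call
  4. := - Allow an injected name on the left-hand side
  5. NSE - Lets dplyr verbs refer to columns by bare name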

Quick Reference:

# Embrace columns
my_select <- function(df, col) {
  df %>% select({{ col }})
}

# Inject values (df and value are placeholders)
x <- 5
filter(df, value > !!x)

# Dynamic names
name <- "new_col"
mutate(df, !!name := value * 2)

# Splice multiple
cols <- c("a", "b")
select(df, !!!syms(cols))
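
When column names arrive as character strings, there are alternatives to `!!!syms()` that avoid injection altogether: `all_of()` in selecting verbs and the `.data` pronoun in data-masking verbs. The sketch below assumes an unmodified `mtcars`; the variable names are illustrative.

```r
library(dplyr)

# Select columns named in a character vector
cols <- c("mpg", "wt")
picked <- dplyr::select(mtcars, all_of(cols))

# Refer to a single column by string inside filter()
col_name <- "mpg"
high <- filter(mtcars, .data[[col_name]] > 30)

names(picked)
nrow(high)
```

`all_of()` raises an error if any named column is missing, which makes typos in `cols` easier to catch than with spliced symbols.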