If you've been programming in R for a while, you probably know that there are several ways to specify function arguments:

  • by position
  • by (complete) name
  • by partial name
  • a combination of the above
do_something <- function(some_arg, another_arg) {
  message("some_arg: ", some_arg, " | another_arg: ", another_arg)
}

# by position
do_something("a", "b")
# some_arg: a | another_arg: b

# by complete name:
do_something(some_arg = "a", another_arg = "b")
# some_arg: a | another_arg: b

# by complete name (reversed order of arguments)
do_something(another_arg = "b", some_arg = "a")
# some_arg: a | another_arg: b

# by partial matching:
do_something(so = "a", a = "b")
# some_arg: a | another_arg: b

# by partial matching (reversed order of arguments)
do_something(a = "b", so = "a")
# some_arg: a | another_arg: b

Partial matching is fine for quick interactive tests, but not for code that is going to be read or run by others (including yourself in the near future...). In addition, any argument that appears after ... (the ellipsis) in the argument list must be specified by its complete name; it cannot be matched partially or by position.
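The rule about the ellipsis is easy to miss because R does not complain when it is violated. The following is a minimal sketch; join_words and its separator argument are hypothetical names made up for illustration:

```r
# hypothetical helper: separator is defined AFTER the ellipsis
join_words <- function(..., separator = "-") {
  paste(c(...), collapse = separator)
}

# by position: "+" is swallowed by ... and treated as one more word
by_position <- join_words("a", "b", "+")
by_position
# [1] "a-b-+"

# by partial name: sep is NOT matched to separator, it also ends up in ...
by_partial <- join_words("a", "b", sep = "+")
by_partial
# [1] "a-b-+"

# by complete name: the only form that actually sets separator
by_name <- join_words("a", "b", separator = "+")
by_name
# [1] "a+b"
```

Note that the first two calls fail silently: there is no error or warning, just a different result.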

Order of arguments

Changing the order of arguments when calling a function can improve readability. Here is an example:

# default order of arguments for DT::renderDT
DT::renderDT(
  expr = {
    DT::datatable(
      data = iris,
      options = list(lengthChange = FALSE),
      rownames = FALSE
    )
  },
  server = FALSE
)

# in practice, expr can become quite lengthy
# => changing the order of arguments can make the code easier to read
DT::renderDT(
  server = FALSE,
  expr = {
    DT::datatable(
      data = iris,
      options = list(lengthChange = FALSE),
      rownames = FALSE
    )
  }
)

Defining default arguments based on other arguments

Function arguments in R are evaluated lazily, i.e. they’re only evaluated if they’re actually used:

do_something <- function(a, b) {
  message("do_something")
}

# the following call runs without error, although b has no default and is not given:
# b is never used inside the function, so it is never evaluated
do_something(1)
# do_something
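Lazy evaluation can be made visible by passing an expression that would fail if it were ever evaluated. This is a small sketch; use_first and use_both are hypothetical helper names:

```r
# b is accepted but never used, so its expression is never evaluated
use_first <- function(a, b) {
  a * 2
}

result <- use_first(1, stop("b was evaluated"))
result
# [1] 2

# as soon as the function body touches b, the expression is evaluated
use_both <- function(a, b) {
  a + b
}

outcome <- tryCatch(
  use_both(1, stop("b was evaluated")),
  error = function(e) conditionMessage(e)
)
outcome
# [1] "b was evaluated"
```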

This has the following consequences for default function arguments:

  • a default value can be defined in terms of other arguments
  • a default value can even be defined in terms of variables created within the function

do_something <- function(a = 3, b = a * 2) {
  message("a: ", a, " | b: ", b)
}

# if no argument is given, the default arguments are used
do_something()
# a: 3 | b: 6

do_something(a = 4, b = 10)
# a: 4 | b: 10

# specifying only the first argument also works:
# the default for b is evaluated inside the function, where a is 10
do_something(a = 10)
# a: 10 | b: 20

# but supplying b in terms of a results in an error (unless a variable a
# happens to exist in the calling environment): supplied arguments are
# evaluated in the calling environment, not inside the function
do_something(a = 10, b = a * 4)
# Error in message("a: ", a, " | b: ", b) : object 'a' not found

# example from http://adv-r.had.co.nz/Functions.html#function-arguments
# default arguments can even be defined in terms of variables created within the function
# just DON'T DO THIS
h <- function(a = 1, b = d) {
  d <- (a + 1) ^ 2
  c(a, b)
}
h()
# [1] 1 4
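The key point is where an argument expression gets evaluated: a default is evaluated inside the function, while a supplied value is evaluated in the environment of the caller. The sketch below makes the difference visible by deliberately creating a variable a in the calling environment; scale_value is a hypothetical name:

```r
scale_value <- function(a = 3, b = a * 2) {
  c(a = a, b = b)
}

# a variable with the same name as the argument, in the calling environment
a <- 100

# default for b: evaluated inside the function, where a is 10
from_default <- scale_value(a = 10)
from_default
#  a  b
# 10 20

# supplied b: evaluated in the calling environment, where a is 100
from_caller <- scale_value(a = 10, b = a * 4)
from_caller
#   a   b
#  10 400
```

Without the line a <- 100, the second call would fail with an "object 'a' not found" error instead.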

Consider using purrr::partial instead of default arguments

I rarely find default arguments useful. They may seem helpful when a function has many arguments, but the risk of silently leaving out an argument that is actually needed is too high for me.

(Also, if your function has too many arguments, the problem might be your function...)

Consider the merge function:

?merge
# merge(x, y, by = intersect(names(x), names(y)),
#       by.x = by, by.y = by, all = FALSE, all.x = all, all.y = all,
#       sort = TRUE, suffixes = c(".x",".y"), no.dups = TRUE,
#       incomparables = NULL, ...)

I have mistakenly forgotten to specify any of the all arguments before. Now, every time I see merge called without them, I wonder whether the omission is intentional or a bug.

One way to avoid this question is to create meaningfully named functions with purrr::partial:

dt_1 <- data.table::data.table(
  id = letters[1:3],
  surname = LETTERS[1:3]
)

dt_2 <- data.table::data.table(
  id = letters[1:2],
  age = 40:41
)

# a full outer join:
merge(
  x = dt_1,
  y = dt_2,
  all = TRUE,
  by = "id"
)
#    id surname age
# 1:  a       A  40
# 2:  b       B  41
# 3:  c       C  NA

# if I forget to specify the all argument, I get an inner join.
# question: how does a reader know whether this is intentional or a bug?
merge(
  x = dt_1,
  y = dt_2,
  by = "id"
)
#    id surname age
# 1:  a       A  40
# 2:  b       B  41

# pre-fill all = TRUE and give the resulting function a name that states the intent
full_outer_join <- purrr::partial(.f = merge, all = TRUE)
full_outer_join(x = dt_1, y = dt_2, by = "id")
#    id surname age
# 1:  a       A  40
# 2:  b       B  41
# 3:  c       C  NA
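If an extra dependency on purrr is not wanted, a thin wrapper function gives the same named intent. This is only a sketch of that alternative, not what purrr::partial does internally; it uses plain data.frames (df_1 and df_2 are made-up names mirroring the tables above) so that it runs in base R:

```r
# a plain wrapper that hard-codes all = TRUE and forwards everything else
full_outer_join <- function(x, y, ...) {
  merge(x = x, y = y, all = TRUE, ...)
}

df_1 <- data.frame(id = letters[1:3], surname = LETTERS[1:3])
df_2 <- data.frame(id = letters[1:2], age = 40:41)

joined <- full_outer_join(df_1, df_2, by = "id")

nrow(joined)
# [1] 3
joined$age
# [1] 40 41 NA
```

Because all = TRUE is already set inside the wrapper, passing all again in the call fails with an error about the formal argument being matched by multiple actual arguments, so the join type cannot be changed by accident.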

 

https://xkcd.com/1349/
