(Don't read the title, which belongs to the same category as "how to read a book and remember everything", "how to learn a programming language in three days", or "how to lose weight by doing this five minutes per day"... What follows is a seed. It needs you and your effort to grow. With diligence and luck, you might thank me for the flower later...)
Here is a glimpse into functional programming in R for the future me.
Filter
Filter selects a subset of a sequence based on a predicate.
ll <- list(1, 2, "a", "b")
Filter(is.numeric, ll)
# [[1]]
# [1] 1
#
# [[2]]
# [1] 2
The equivalent function in the purrr package is keep:
purrr::keep(.x = ll, .p = is.numeric)
# [[1]]
# [1] 1
#
# [[2]]
# [1] 2
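To make the mechanics concrete, here is a minimal sketch of what a filter does: apply the predicate to every element, then subset with the resulting logical vector. The name my_filter is hypothetical and this is a simplification for illustration, not the source of base R's Filter.

```r
# Hypothetical helper, not part of base R: a simplified stand-in for Filter.
my_filter <- function(f, x) {
  # Evaluate the predicate once per element; anything that is not TRUE
  # (including NA) counts as "drop this element".
  keep <- vapply(x, function(el) isTRUE(f(el)), logical(1))
  # Single-bracket subsetting keeps the container type (a list stays a list).
  x[keep]
}

ll <- list(1, 2, "a", "b")
identical(my_filter(is.numeric, ll), Filter(is.numeric, ll))
# [1] TRUE
```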
Here are two examples from the book Functional Programming in R by Thomas Mailund:
is_even <- function(x){
  x %% 2 == 0
}
unlist(Filter(is_even, 1:10))
# [1] 2 4 6 8 10
Filtering is more powerful with closures:
larger_than <- function(x){
  function(y){
    y > x
  }
}
larger_than(1:10)(3)
# [1] TRUE TRUE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
`%>%` <- magrittr::`%>%`
Filter(larger_than(3), 1:10) %>%
  unlist(.)
# [1] 4 5 6 7 8 9 10
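The point of the closure is that every call to larger_than returns a new predicate that remembers its own threshold. A small sketch, where the names gt3 and gt7 are made up for illustration:

```r
larger_than <- function(x) {
  function(y) y > x
}

# Two predicates built from the same factory, each with its own captured x.
gt3 <- larger_than(3)
gt7 <- larger_than(7)

# Filter on an atomic vector already returns an atomic vector,
# so unlist() is optional here.
Filter(gt3, 1:10)
# [1]  4  5  6  7  8  9 10
Filter(gt7, 1:10)
# [1]  8  9 10

# The captured value lives in the environment of the returned function.
environment(gt3)$x
# [1] 3
```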
Negate
Negate negates a predicate function.
isTRUE(TRUE)
# TRUE
Negate(isTRUE)(TRUE)
# FALSE
ll <- list(1, 2, "a", "b")
Filter(f = Negate(is.numeric), x = ll)
# [[1]]
# [1] "a"
#
# [[2]]
# [1] "b"
larger_than <- function(x){
  function(y){
    y > x
  }
}
larger_than(1:10)(3)
# [1] TRUE TRUE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
Negate(larger_than(1:10))(3)
# [1] FALSE FALSE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE
`%>%` <- magrittr::`%>%`
Filter(larger_than(3), 1:10) %>%
  unlist(.)
# [1] 4 5 6 7 8 9 10
Filter(Negate(larger_than(3)), 1:10) %>%
  unlist(.)
# [1] 1 2 3
The equivalent function in the purrr package is discard:
ll <- list(1, 2, "a", "b")
purrr::discard(.x = ll, .p = is.numeric)
# [[1]]
# [1] "a"
#
# [[2]]
# [1] "b"
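Negate is itself a function that takes a function and returns a function. A rough sketch of the idea, using the hypothetical name my_negate (a simplification, not the actual base R implementation):

```r
# Hypothetical helper: wrap a predicate and flip its result.
my_negate <- function(f) {
  function(...) !f(...)
}

ll <- list(1, 2, "a", "b")
identical(
  Filter(my_negate(is.numeric), ll),
  Filter(Negate(is.numeric), ll)
)
# [1] TRUE
```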
Map
The Map function applies a function to each element of a vector or list and always returns a list with the results (which is why the examples below call unlist):
is_even <- function(x){
  x %% 2 == 0
}
`%>%` <- magrittr::`%>%`
Map(is_even, 1:5) %>%
  unlist(.)
# [1] FALSE TRUE FALSE TRUE FALSE
Map(isTRUE, list(FALSE, TRUE, NULL, NA, "", 1)) %>%
  unlist(.)
# [1] FALSE TRUE FALSE FALSE FALSE FALSE
purrr::map(.x = 1:5, .f = is_even) %>%
  unlist(.)
# [1] FALSE TRUE FALSE TRUE FALSE
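When every result is a single value of a known type, a type-stable alternative to mapping and then unlisting is base R's vapply, which returns an atomic vector directly and fails loudly if a result does not match the declared template. A minimal sketch:

```r
is_even <- function(x) x %% 2 == 0

# logical(1) declares that each call must return exactly one logical value.
res <- vapply(1:5, is_even, logical(1))
res
# [1] FALSE  TRUE FALSE  TRUE FALSE
```

In purrr, map_lgl plays a similar role.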
Similar to the Filter function, Map can be combined with closures:
add <- function(x) {
  function(y) {
    x + y
  }
}
Map(add(3), 1:10) %>%
  unlist(.)
# [1] 4 5 6 7 8 9 10 11 12 13
purrr::map(.x = 1:10, .f = add(3)) %>%
  unlist(.)
# [1] 4 5 6 7 8 9 10 11 12 13
One can pass named arguments to a Map call via a list provided to the MoreArgs parameter:
add <- function(x, y) {
  x + y
}
Map(f = add, 1:10, MoreArgs = list(y = 3)) %>%
  unlist(.)
# [1] 4 5 6 7 8 9 10 11 12 13
If you want to loop over two inputs in parallel, use purrr::map2 (a length-one .y is recycled):
purrr::map2(.x = 1:10, .y = 3, .f = `+`) %>%
  unlist(.)
# [1] 4 5 6 7 8 9 10 11 12 13
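Base R's Map can also walk several vectors in parallel, without purrr: every vector passed after the function supplies one argument per call. A short sketch with made-up inputs:

```r
# Element-wise sum of two vectors: calls `+`(1, 4), `+`(2, 5), `+`(3, 6).
sums <- unlist(Map(`+`, 1:3, 4:6))
sums
# [1] 5 7 9

# The same pattern with an anonymous two-argument function.
products <- unlist(Map(function(x, y) x * y, 1:3, 4:6))
products
# [1]  4 10 18
```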
Applications with data.table
dt <- data.table::data.table(
  market = letters[1:3],
  symbol = LETTERS[1:3],
  price = 10:12,
  ts = bit64::as.integer64(1:3)
)
dt
# market symbol price ts
# 1: a A 10 1
# 2: b B 11 2
# 3: c C 12 3
Filter(is.numeric, dt)
# price ts
# 1: 10 1
# 2: 11 2
# 3: 12 3
purrr::keep(.x = dt, .p = is.numeric)
# price ts
# 1: 10 1
# 2: 11 2
# 3: 12 3
Filter(Negate(is.numeric), dt)
# market symbol
# 1: a A
# 2: b B
# 3: c C
purrr::discard(.x = dt, .p = is.numeric)
# market symbol
# 1: a A
# 2: b B
# 3: c C
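These calls select columns because a data.table, like a plain data.frame, is a list of columns, so the predicate sees one whole column at a time. The same idea with a base data.frame (the object df is a stand-in made up for this sketch, so it runs without data.table):

```r
df <- data.frame(
  market = letters[1:3],
  symbol = LETTERS[1:3],
  price = 10:12,
  stringsAsFactors = FALSE
)

# The predicate is applied per column, not per row or per cell.
names(Filter(is.numeric, df))
# [1] "price"
names(Filter(Negate(is.numeric), df))
# [1] "market" "symbol"
```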
Convert columns of type integer64 to numeric:
str(dt)
# Classes ‘data.table’ and 'data.frame': 3 obs. of 4 variables:
# $ market: chr "a" "b" "c"
# $ symbol: chr "A" "B" "C"
# $ price : int 10 11 12
# $ ts :integer64 1 2 3
cols_to_convert <- names(Filter(bit64::is.integer64, dt))
dt[, (cols_to_convert) := lapply(.SD, as.numeric), .SDcols = cols_to_convert]
str(dt)
# Classes ‘data.table’ and 'data.frame': 3 obs. of 4 variables:
# $ market: chr "a" "b" "c"
# $ symbol: chr "A" "B" "C"
# $ price : int 10 11 12
# $ ts : num 1 2 3
Create a column based on other columns:
dt[, id := Reduce(x = .SD, f = paste), .SDcols = c("market", "symbol")]
str(dt)
# Classes ‘data.table’ and 'data.frame': 3 obs. of 5 variables:
# $ market: chr "a" "b" "c"
# $ symbol: chr "A" "B" "C"
# $ price : int 10 11 12
# $ ts : num 1 2 3
# $ id : chr "a A" "b B" "c C"
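Reduce folds a list from the left: it combines the first two elements with f, then combines that result with the third, and so on. Since .SD is a list of columns, paste ends up joining the columns row by row. A base-only sketch with three made-up character vectors standing in for columns; accumulate = TRUE exposes the intermediate steps:

```r
# Hypothetical stand-ins for three columns.
cols <- list(c("a", "b", "c"), c("A", "B", "C"), c("x", "y", "z"))

# Equivalent to paste(paste(cols[[1]], cols[[2]]), cols[[3]]).
folded <- Reduce(paste, cols)
folded
# [1] "a A x" "b B y" "c C z"

# Keep every intermediate result of the fold.
steps <- Reduce(paste, cols, accumulate = TRUE)
steps[[2]]
# [1] "a A" "b B" "c C"
```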
my_paste <- function(...){
  paste(..., sep = "@")
}
dt[, id2 := Reduce(x = .SD, f = my_paste), .SDcols = c("market", "symbol")]
dt
# market symbol price ts id id2
# 1: a A 10 1 a A a@A
# 2: b B 11 2 b B b@B
# 3: c C 12 3 c C c@C
dt[
  , id3 := purrr::reduce(.x = .SD, .f = paste, sep = "@"),
  .SDcols = c("market", "symbol")
]
dt
# market symbol price ts id id2 id3
# 1: a A 10 1 a A a@A a@A
# 2: b B 11 2 b B b@B b@B
# 3: c C 12 3 c C c@C c@C
What if you want to use only base functions and not the purrr package, but still want to choose the "sep" argument freely? Unlike purrr::reduce, the base Reduce function has no "..." parameter, so extra arguments cannot be passed through to "f".
However, we can use closures:
# this does not work with Reduce, which calls f(a, b) and never supplies sep:
my_paste_2 <- function(..., sep) {
  paste(..., sep = sep)
}
# but this does: the outer function fixes sep, the inner one is handed to Reduce
my_paste_3 <- function(sep){
  function(...){
    paste(..., sep = sep)
  }
}
dt[
  , id4 := Reduce(x = .SD, f = my_paste_3(sep = "|")),
  .SDcols = c("market", "symbol")
]
dt
# market symbol price ts id id2 id3 id4
# 1: a A 10 1 a A a@A a@A a|A
# 2: b B 11 2 b B b@B b@B b|B
# 3: c C 12 3 c C c@C c@C c|C
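Two other base-only routes give the same result as the closure factory and may be worth knowing: an anonymous function written inline, or skipping the fold entirely with do.call. The sketch below uses a plain named list (cols, made up here) in place of .SD so that it runs without data.table:

```r
# Hypothetical stand-in for .SD restricted to two columns.
cols <- list(market = c("a", "b", "c"), symbol = c("A", "B", "C"))

# Inline closure: sep is fixed inside the anonymous function.
res_anon <- Reduce(function(a, b) paste(a, b, sep = "|"), cols)
res_anon
# [1] "a|A" "b|B" "c|C"

# do.call passes all columns plus sep to a single paste() call.
res_docall <- do.call(paste, c(unname(cols), sep = "|"))
res_docall
# [1] "a|A" "b|B" "c|C"
```

The do.call version calls paste only once, whatever the number of columns, whereas Reduce calls it once per additional column.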
Too much?
https://sketchplanations.com/cognitive-overhead
No? Here you can find more food for thought: http://www.pointer.io/
Do you have any other applications in mind? Let me know in the comments, I'd be happy to add them.
Make a promise. Show up. Do the work. Repeat.