Here is another result of interrupted thoughts. Do you ever end up in situations where the outcome is obvious to you, but not to the other people involved?
Train, a young man with a bicycle. He leaves the bicycle leaning against the closed door, with the bicycle lock hanging on the handlebars, and hurries to one of the seats in the back of the wagon.
I stare at the bicycle, then at him, and wonder how he can leave the bicycle like that. Most probably it will fall down soon after the train starts to move, and most probably at the first station, when the door opens, the lock will fall outside the train...
I think for a second that I should draw this to his attention. Then again: why should I care more about his stuff than he does? I say nothing, still lost... The train moves, the bicycle goes slowly to the ground, at the first station, the door opens, the lock falls outside in the space between the wagon and the platform... some time later, the young man comes, lifts up his bicycle, notices that the lock disappeared, I try to explain to him what happened, he shrugs and gets out...
Yet another moment where I wish there were manuals about dealing with people...
I just re-read "Games People Play", a book written in 1964 by Eric Berne, the creator of transactional analysis (yes, I'm really up-to-date with what happens in the world...). Among other issues, the book discusses what people do when they meet and what comes after the introductory small talk, like "Don't you think the walls are perpendicular tonight?".
Other things I read about:
- books were originally quite expensive because they were written by hand. People would think hard about whether something was worth writing, contrary to nowadays. (And yes, I also thought twice: would I have written this blog entry by hand as well?)
- "Plato's three sieves of truth"
- "The humble programmer"
- people communicate mostly by talking, seeing, reading, writing. While the lucky ones among us spend many years in school learning how to write and read, there is no curriculum about learning how to listen. I noticed that sometimes in a conversation just waiting a few seconds before replying can make a difference...
I kept waiting for a while, and since I heard nothing from you, here is me talking to myself (I find it fascinating how I can avoid even listening to myself...). As I recently switched from a group working mainly with data.table to a group using the tidyverse approach, here is a little survival guide.
Filter a data.frame by a variable that has the same name as a column, using the bang bang operator
df <- tibble::tribble(
~c1, ~c2,
"A", 1,
"B", 2,
"B", 10
)
c1 <- "B"
`%>%` <- magrittr::`%>%`
# filter using the injection operator (also called 'bang bang operator')
# for more info: ?`!!`
df %>%
dplyr::filter(c1 == !!c1)
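If the double exclamation mark feels too cryptic, the same filter can be written with the ".data" and ".env" pronouns that dplyr's data masking provides. This is only a sketch of an alternative, assuming a reasonably recent dplyr; the name "res" is just for illustration:

```r
df <- tibble::tribble(
  ~c1, ~c2,
  "A", 1,
  "B", 2,
  "B", 10
)
c1 <- "B"

# .data$c1 is the column of the data frame,
# .env$c1 is the variable defined outside of it
res <- dplyr::filter(df, .data$c1 == .env$c1)
res
```

Both spellings should select the two "B" rows; which one reads better is mostly a matter of taste.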
A word of caution in case you use the inequality operator ("!=") and you have missing values in your column, since rows where the comparison returns NA are dropped:
df <- tibble::tribble(
~c1, ~c2,
"A", 1,
NA_character_, 2,
"B", 10
)
df %>%
dplyr::filter(c1 != !!c1)
# A tibble: 1 × 2
c1 c2
<chr> <dbl>
1 A 1
# this is probably not what you want.
# compare with this:
df %>%
dplyr::filter(is.na(c1) | c1 != !!c1)
# A tibble: 2 × 2
c1 c2
<chr> <dbl>
1 A 1
2 NA 2
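Another way to keep the rows with missing values, sketched here as an option rather than a recommendation, is to negate "%in%" instead of using "!=". "%in%" returns FALSE (never NA) for a missing value, so no separate is.na() check is needed. The name "res" is only for illustration:

```r
df <- tibble::tribble(
  ~c1, ~c2,
  "A", 1,
  NA_character_, 2,
  "B", 10
)
c1 <- "B"

# NA %in% "B" is FALSE, so its negation is TRUE and the row is kept
res <- dplyr::filter(df, !(c1 %in% !!c1))
res
```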
Find maximum of a value by group using slice_max()
df <- tibble::tribble(
~grp, ~value,
"A", 1,
"A", 3,
NA_character_, 2,
"B", 10,
"B", 10,
"B", 8
)
`%>%` <- magrittr::`%>%`
df %>%
dplyr::group_by(grp) %>%
dplyr::slice_max(value)
# A tibble: 4 × 2
# Groups: grp [3]
grp value
<chr> <dbl>
1 A 3
2 B 10
3 B 10
4 NA 2
# Note that you have ties for "B". Use the argument "with_ties" in order to
# return the first "n" rows:
df %>%
dplyr::group_by(grp) %>%
dplyr::slice_max(value, with_ties = FALSE) %>%
dplyr::ungroup()
# A tibble: 3 × 2
grp value
<chr> <dbl>
1 A 3
2 B 10
3 NA 2
Find maximum of a value by group using row_number()
df <- tibble::tribble(
~grp, ~value,
"A", 1,
"A", 3,
NA_character_, 2,
"B", 10,
"B", 10,
"B", 8
)
`%>%` <- magrittr::`%>%`
df %>%
dplyr::group_by(grp) %>%
dplyr::arrange(dplyr::desc(value)) %>%
dplyr::filter(dplyr::row_number() == 1) %>%
dplyr::ungroup()
mutate: call a function which creates multiple columns
You can create multiple columns simultaneously if you use a function which returns a data.frame:
df <- data.frame(
stringsAsFactors = FALSE,
c1 = 1:3,
c2 = 10:12
)
some_function <- function(c1, c2) {
data.frame(
stringsAsFactors = FALSE,
c10 = c1 * 10,
c11 = c2 / 10
)
}
df %>%
dplyr::mutate(
some_function(c1 = c1, c2 = c2)
)
#   c1 c2 c10 c11
# 1  1 10  10 1.0
# 2  2 11  20 1.1
# 3  3 12  30 1.2
Select only columns which have no missing values
tibble::tribble(
~c1, ~c2,
2, 1,
1, NA_real_
) %>%
dplyr::select(
tidyselect::where(~ !any(is.na(.x)))
)
# A tibble: 2 × 1
# c1
# <dbl>
# 1 2
# 2 1
Replace missing characters with empty strings
tibble::tribble(
~c1, ~c2,
"a", NA_character_,
NA_character_, "b"
) %>%
dplyr::mutate(
dplyr::across(
.cols = tidyselect::where(is.character),
.fns = ~ tidyr::replace_na(., "")
)
)
# A tibble: 2 × 2
# c1 c2
# <chr> <chr>
# 1 "a" ""
# 2 "" "b"
Dynamically paste columns
cols_to_paste <- c("c1", "c3")
tibble::tribble(
~c1, ~c2, ~c3,
"a", "b", "Z",
"d", "u", "Q"
) %>%
tidyr::unite(
col = "new_column",
tidyselect::all_of(cols_to_paste),
remove = FALSE
)
# A tibble: 2 × 4
# new_column c1 c2 c3
# <chr> <chr> <chr> <chr>
# 1 a_Z a b Z
# 2 d_Q d u Q
Use the walrus operator (":=") to dynamically assign variable names
new_col_name <- "carb2"
mtcars %>%
dplyr::mutate(
{{new_col_name}} := carb * 10
) %>%
head(n = 3)
# mpg cyl disp hp drat wt qsec vs am gear carb carb2
# Mazda RX4 21.0 6 160 110 3.90 2.620 16.46 0 1 4 4 40
# Mazda RX4 Wag 21.0 6 160 110 3.90 2.875 17.02 0 1 4 4 40
# Datsun 710 22.8 4 108 93 3.85 2.320 18.61 1 1 4 1 10
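The left-hand side of ":=" can also be a string with glue-style braces, which is handy when the new name has to be built from pieces. A small sketch, assuming a recent dplyr/rlang; "res" is just an illustrative name:

```r
new_col_name <- "carb2"

# the string on the left-hand side is interpolated with glue syntax,
# so "{new_col_name}" becomes the column name "carb2"
res <- dplyr::mutate(mtcars, "{new_col_name}" := carb * 10)
head(res, n = 3)
```

With this form you could also write something like "{new_col_name}_scaled" to add a suffix to the generated name.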
Keep only rows containing missing values
tibble::tribble(
~c1, ~c2,
NA_character_, 1,
"a", NA_real_,
"b", 10
) %>%
dplyr::filter(
!stats::complete.cases(.)
)
# A tibble: 2 × 2
# c1 c2
# <chr> <dbl>
# 1 NA 1
# 2 a NA
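If you prefer to stay within dplyr verbs, the same rows can probably be selected with if_any(), which is available in newer dplyr versions (1.0.4 or later, as far as I know). A sketch, with "res" as an illustrative name:

```r
df <- tibble::tribble(
  ~c1, ~c2,
  NA_character_, 1,
  "a", NA_real_,
  "b", 10
)

# keep a row if at least one of its columns is missing
res <- dplyr::filter(df, dplyr::if_any(dplyr::everything(), is.na))
res
```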
Make a promise. Show up. Do the work. Repeat.