CRAN currently lists more than 16000 packages: https://cran.r-project.org/web/packages/. There might be several packages, suited for a specific task, how do you decide on which one to use? There are some obvious criteria:

  • the package was recommended to you by somebody else
  • the package is updated on a regular basis (check the code in the repository for this)
  • you read the documentation and the package suits your needs

But there is another aspect which you should pay attention to. This is a lesson I learned the hard way, this is how the story goes.

Day 1 (sometimes October 2020): 
The users of the app displaying some data of the current day say: The data is only shown after 1 a.m. o'clock. Between midnight and 1 o'clock, we see the yesterday's data.'
The app developer (that is me): Please make sure you update your profile by logging out and logging in again. 
The users: We already did this, but we'll do it again and report.

Day 2:
The users: Logging out and in again did not help, the data was updated only from 1 a.m. See attached screenshots.
The app developer: We'll investigate and get back to you.

Investigate what? There was nothing complicated, a trivial SQL query via the R package RPostgres , something like:

SELECT * FROM a_table
WHERE the_date = CURRENT_DATE

What could go wrong here? I had no idea, so sure enough, close to midnight, me in front of the computer:

  • 23:58 p.m. o'clock: fire up the query, data ok
  • 00:00 a.m.  o'clock: fire up the query, data is not for today, but for yesterday. But why?

And yes, I verified beforehand that the computers had the correct timestamps, the SQL database had the correct default timezone, which was UTC + 1.
It look like it had something to do with the timezone, but where and why?
  
First wild searches in the internet (what keywords should I use?) gave me no answers.

https://xkcd.com/979/

wisdom of the ancients

RPostgres had a timezone bug, which hit me because I used CURRENT_DATE. So the query was working all the time with the UTC timezone, and not with the local timezone. Ok, I'm clever (I'm pretty sure that the developers of my app would beg to differ), I can fix this, I just need to adjust the query:

glue::glue("SELECT * FROM a_table WHERE the_date = {Sys.Date()}")

 and that's it.

This worked now. But applying the fix and releasing the app took me more than 20 minutes, so it was past 01:00 a.m. So the users could confirm only the next night that the app works as expected.

Note: the RPostgres bug was fixed at the beginning of January 2021. 

The lesson I learned: 

  • as an R package user: when you choose an R package, is not enough to read the documentation and the vignettes, if any. More important I think is to look for the open issues, and check if any of the them might affect your code. I hope this advice will help you and spare you a midnight. Or two.
  • as an R package developer: Apart from testing your package carefully: If there are open issues which might trigger wrong results, it might help to drop a hint in the package documentation. Such that the users are warned.

Make a promise. Show up. Do the work. Repeat.