In this exercise we will learn how to manipulate dates in R using the package lubridate. We will also use the package zoo to calculate the rolling mean of a variable.
In our example, we will use Covid-19 data from Recife in Pernambuco, Brazil.
This data was downloaded from the portal Brasil.IO.
covid <- read.csv("data/raw/covid19-dd7bc8e57412439098d9b25129ae6f35.csv")
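# First, check the class of the date column
class(covid$date)
[1] "character"
# Convert the column to class Date; as_date() from lubridate is one way to do it
library(lubridate)
covid$date <- as_date(covid$date)
class(covid$date)
[1] "Date"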
# Now we can perform numeric operations on the dates
range(covid$date)
[1] "2020-03-12" "2022-03-27"
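As a small illustration of what the Date class allows, the sketch below uses a made-up vector of three dates (not the Recife data) to compute a time span, extract date components, and shift dates with lubridate. The object names are hypothetical and chosen only for this example.

```r
library(lubridate)

# Hypothetical toy dates, used only for illustration
toy_dates <- as_date(c("2020-03-12", "2021-01-01", "2022-03-27"))

# Number of days between the first and last date
span_days <- as.numeric(diff(range(toy_dates)))

# Extract the year and the month of each date
toy_years  <- year(toy_dates)
toy_months <- month(toy_dates)

# Shift every date forward by one week
toy_next_week <- toy_dates + days(7)
```

None of these operations would work as intended on a character column, which is why the class is checked first.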
First, we will plot the number of new cases per day, stored in the column new_confirmed.
library(ggplot2)
ggplot(covid) +
  geom_line(aes(x = date, y = new_confirmed)) +
  theme_minimal()
Oops. We have negative numbers of new cases, so we will replace the negative values with zero.
covid$new_confirmed[covid$new_confirmed < 0] <- 0
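One plausible reason for negative values is that daily counts are often derived as differences of a cumulative series, which turn negative whenever the cumulative total is revised downward. The sketch below illustrates this with made-up numbers; the vector cum_cases is hypothetical and is not taken from the Brasil.IO file.

```r
# Hypothetical cumulative case counts; the fourth value is a downward revision
cum_cases <- c(10, 15, 22, 20, 30)

# Daily new cases: keep the first value, then take successive differences
new_cases <- c(cum_cases[1], diff(cum_cases))

# Count the negative values before overwriting them
n_negative <- sum(new_cases < 0)

# Replace negative values with zero, mirroring the step applied to covid$new_confirmed
new_cases[new_cases < 0] <- 0
```

Note that after this replacement the daily values no longer add up to the last cumulative total, so it is worth counting how many values were changed.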
Let’s try again.
ggplot(covid) +
  geom_line(aes(x = date, y = new_confirmed)) +
  theme_minimal() +
  labs(x = "Date", y = "New cases")