class: center, middle, inverse, title-slide

.title[
# Introduction to biological diversity analyses
]
.subtitle[
## Serrapilheira/ICTP-SAIFR Training Program in Quantitative Biology and Ecology
]
.author[
### Andrea Sánchez-Tapia & Sara Mortara
]
.date[
### 28 July 2022
]

---

## Ecological community data as multivariate data

---

## CESTES database

[Jeliazkov et al. 2020, Sci. Data](https://doi.org/10.1038/s41597-019-0344-7)

---
## Today

+ Short exercises

+ __Pseudocode:__ a _verbal model_ of the steps you need to take to solve a problem

+ Code

+ Collaborative notes and answers [__here__](https://hackmd.io/@andreasancheztapia/comp_methods/edit)

---
## Species abundance, frequency, richness

1. Which are the 5 most abundant species overall in the dataset?

2. How many species are there in each site? (Richness)

3. Which species is the most abundant in each site?
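
One way to approach these three questions in base R is sketched below. The `toy` data frame is invented for illustration and only mimics a sites-by-species layout (a `Sites` column followed by one column per species); it is not the CESTES data, and the object names are arbitrary.

```r
# Hypothetical toy community table: 3 sites x 4 species (not the CESTES data)
toy <- data.frame(
  Sites = 1:3,
  sp1 = c(0, 2, 1),
  sp2 = c(5, 0, 3),
  sp3 = c(1, 1, 0),
  sp4 = c(0, 0, 7)
)

# Keep only the species columns as a numeric matrix
abund <- as.matrix(toy[, -1])

# 1. Total abundance of each species, sorted from most to least abundant
total_abund <- sort(colSums(abund), decreasing = TRUE)
head(total_abund, 5)

# 2. Richness: number of species with abundance > 0 in each site
richness <- rowSums(abund > 0)

# 3. Most abundant species in each site (ties resolve to the first column)
dominant <- colnames(abund)[apply(abund, 1, which.max)]
```

Note that `which.max()` returns only the first maximum, so sites with tied species would need extra handling.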

---
## Diversity indices

+ Shannon diversity index

$$
H = - \sum_{i=1}^{S} p_i \ln p_i
$$

+ Simpson's diversity index

$$
1 - \sum_{i=1}^{S} p_i^2
$$

+ Inverse Simpson index

$$
1 / \sum_{i=1}^{S} p_i^2
$$
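
As a worked illustration of the three formulas, the sketch below computes them for a single hypothetical site. The abundance vector is made up; $p_i$ is taken to be the relative abundance of species $i$ (its abundance divided by the site total), and absent species are dropped so that $\ln p_i$ is defined.

```r
# Hypothetical abundances of four species at one site
abundance <- c(10, 10, 10, 10)

# Relative abundances p_i, ignoring absent species
present <- abundance[abundance > 0]
p <- present / sum(present)

shannon_H   <- -sum(p * log(p))   # log() is the natural logarithm in R
simpson_D   <- 1 - sum(p^2)
inv_simpson <- 1 / sum(p^2)
```

For a perfectly even community such as this one, the Shannon index equals $\ln S$ and the inverse Simpson index equals $S$, which makes a convenient sanity check.
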
---
## Creating functions

+ Write code to calculate Shannon and Simpson diversity for the `comm` dataset (leave it in the notes!)

+ Let's create _functions_ from this code
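
One possible way to turn index calculations into functions is sketched below. The names `shannon()` and `simpson()` and the `inverse` argument are illustrative choices, not part of the course material; each function is assumed to receive a numeric vector of abundances for a single site.

```r
# Shannon diversity for one vector of abundances
shannon <- function(x) {
  x <- x[x > 0]            # absent species contribute nothing
  p <- x / sum(x)          # relative abundances
  -sum(p * log(p))
}

# Simpson diversity (1 - sum p^2), or inverse Simpson if inverse = TRUE
simpson <- function(x, inverse = FALSE) {
  p <- x / sum(x)
  d <- sum(p^2)
  if (inverse) 1 / d else 1 - d
}

# Example calls on a made-up abundance vector
shannon(c(5, 3, 0, 2))
simpson(c(5, 3, 0, 2))
simpson(c(5, 3, 0, 2), inverse = TRUE)
```

Neither function guards against a site with no individuals at all, where the site total is zero and the result is `NaN`; whether to return `NA`, `0`, or an error in that case is a design decision left open here.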

---

```r
comm <- read.csv("data/raw/cestes/comm.csv")
```

```r
dim(comm)
```

```
## [1] 97 57
```

```r
head(comm[, 1:6])
```

```
##   Sites sp1 sp2 sp3 sp4 sp5
## 1     1   0   0   0   0   0
## 2     2   0   1   0   0   0
## 3     3   0   1   0   0   0
## 4     4   0   0   0   0   0
## 5     5   0   0   0   0   0
## 6     6   0   0   0   0   0
```
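
The output above suggests one row per site, with a `Sites` identifier column followed by the species columns. Under that assumption, a per-site index can be computed by dropping the first column and applying a function to each row. The sketch uses a made-up data frame with the same layout and a hypothetical helper `shannon_site()`, so it runs without the CESTES file.

```r
# Made-up table with the same layout as comm: Sites column + species columns
toy_comm <- data.frame(
  Sites = 1:3,
  sp1 = c(4, 0, 1),
  sp2 = c(4, 6, 1),
  sp3 = c(0, 0, 1)
)

# Hypothetical helper: Shannon index for one site's abundances
shannon_site <- function(x) {
  p <- x[x > 0] / sum(x)
  -sum(p * log(p))
}

# Drop the identifier column, then apply the helper row by row (MARGIN = 1)
species_only <- toy_comm[, -1]
H <- apply(species_only, 1, shannon_site)

# Collect the results next to the site identifiers
data.frame(Sites = toy_comm$Sites, shannon = H)
```

Leaving the `Sites` column in would treat the site identifier as if it were a species abundance, so removing it first matters.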