class: center, middle, inverse, title-slide

.title[
# Introduction to biological diversity analyses
]
.subtitle[
## Serrapilheira/ICTP-SAIFR Training Program in Quantitative Biology and Ecology
]
.author[
### Andrea Sánchez-Tapia & Sara Mortara
]
.date[
### 28 July 2022
]

---

## Ecological community data as multivariate data

<img src="figs/4thcorner.jpg" width="480" style="display: block; margin: auto;" />

---

## CESTES database

<img src="figs/cestesdatabase.png" width="800" style="display: block; margin: auto;" />

[Jeliazkov et al 2020 Sci Data](https://doi.org/10.1038/s41597-019-0344-7)

---

## Today

+ Short exercises
+ __Pseudocode:__ a _verbal model_ of the steps you need to take to solve a problem
+ Code
+ Collaborative notes and answers [__here__](https://hackmd.io/@andreasancheztapia/comp_methods/edit)

---

## Species abundance, frequency, richness

1. Which are the 5 most abundant species overall in the dataset?
2. How many species are there in each site? (Richness)
3. Which species is the most abundant in each site?

---

## Diversity indices

Here $p_i$ is the relative abundance of species $i$ and $S$ is the number of species.

+ Shannon diversity index

$$ H = - \sum_{i=1}^{S} p_i \ln p_i $$

+ Simpson's diversity index

$$ 1 - \sum_{i=1}^{S} p_i^2 $$

+ Inverse Simpson

$$ \frac{1}{\sum_{i=1}^{S} p_i^2} $$

---

## Creating functions

+ Create code to calculate Shannon and Simpson diversity in the `comm` dataset (leave it in the notes!)
+ Let's create _functions_ from this code

---

```r
comm <- read.csv("data/raw/cestes/comm.csv")
```

```r
dim(comm)
```

```
## [1] 97 57
```

```r
head(comm[,1:6])
```

```
##   Sites sp1 sp2 sp3 sp4 sp5
## 1     1   0   0   0   0   0
## 2     2   0   1   0   0   0
## 3     3   0   1   0   0   0
## 4     4   0   0   0   0   0
## 5     5   0   0   0   0   0
## 6     6   0   0   0   0   0
```
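
---

## One possible sketch of the exercises

The questions on abundance and richness and the three diversity indices can be sketched in base R as follows. This is only one way to do it, not the course's reference solution. The `toy` data frame and the function names `shannon`, `simpson` and `inv_simpson` are made up for illustration; with the CESTES data, `comm[, -1]` (the table without the `Sites` column) would take the place of `toy`.

```r
# Toy community matrix (hypothetical values): rows are sites, columns are species
toy <- data.frame(
  sp1 = c(10, 0, 5),
  sp2 = c(10, 3, 5),
  sp3 = c(0, 0, 5)
)

# 1. Most abundant species overall: total abundance per species, sorted
top_species <- head(sort(colSums(toy), decreasing = TRUE), 5)

# 2. Richness: number of species with non-zero abundance in each site
richness <- rowSums(toy > 0)

# 3. Most abundant species in each site (ties go to the first species found)
dominant <- names(toy)[apply(toy, 1, which.max)]

# Shannon index: H = -sum(p_i * ln(p_i))
shannon <- function(x) {
  x <- x[x > 0]      # drop absent species, otherwise 0 * log(0) gives NaN
  p <- x / sum(x)    # relative abundances
  -sum(p * log(p))
}

# Simpson index: 1 - sum(p_i^2)
# Note: a site with no individuals at all would return NaN here
simpson <- function(x) {
  p <- x / sum(x)
  1 - sum(p^2)
}

# Inverse Simpson: 1 / sum(p_i^2)
inv_simpson <- function(x) {
  p <- x / sum(x)
  1 / sum(p^2)
}

# Apply each function row by row, that is, once per site
H    <- apply(toy, 1, shannon)
D    <- apply(toy, 1, simpson)
invD <- apply(toy, 1, inv_simpson)
```

In the toy data, site 1 has two equally abundant species, so its Shannon index is ln 2, its Simpson index is 0.5 and its inverse Simpson is 2. Results on real data can be cross-checked against `vegan::diversity()`, which takes `index = "shannon"`, `"simpson"` or `"invsimpson"`.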