get_summary calculates summary statistics of density data: the total density and the number of taxa per descriptor, and the number of occurrences of each taxon.

get_summary(data, descriptor, taxon, value, averageOver, taxonomy = NULL, 
            subset, what=c("density", "taxa", "occurrence"), 
            wide.output = FALSE)

Arguments

data

data.frame to use for extracting the arguments descriptor, taxon, value, averageOver. Can be missing.

descriptor

variable(s) describing *where* the data were taken, e.g. sampling stations. If data is not missing: one or more column(s) from data; use cbind or data.frame to select several columns. If data is missing: a vector, a list, a data.frame or a matrix (with one or multiple columns). It can be numeric, character, or a factor. In theory, descriptor can also be a single number, NA, or missing; however, care needs to be taken when this is combined with subset and averageOver.

taxon

variable describing *what* the data are; it gives the taxonomic name (e.g. species). If data is not missing: one column from data. If data is missing: a list (or data.frame with one column), or a vector. When taxon is a data.frame or a list, its "name" will be used in the output; when it is a vector, the argument name will be used.

value

variable that contains the *values* of the data, usually density. If data is not missing: one or more column(s) from data; use cbind or data.frame to select several columns. If data is missing: a vector, a list, a data.frame or a matrix (with one or multiple columns). It should have the same length (or the same number of rows) as descriptor and taxon. Should contain numeric values. Should always be present.

averageOver

*replicates* over which averages need to be taken. If data is not missing: one or more column(s) from data; use cbind or data.frame to select several columns. If data is missing: a vector, a list, a data.frame or a matrix (with one or multiple columns). It can be numeric, character, or a factor. Can be absent.

taxonomy

taxonomic information; first column will be matched with taxon, regardless of its name.

subset

logical expression indicating elements to keep: missing values are taken as FALSE. If NULL, or absent, then all elements are used. Note that the subset is taken *after* the number of samples to average per descriptor is calculated, so this will also work for selecting certain taxa that may not be present in all replicates over which the average is taken.

what

the summary statistic(s) to be returned, one or more of "density": the total density per descriptor; "taxa": the number of taxa per descriptor; "occurrence": the number of occurrences of each species.

wide.output

when TRUE, will recast the output in wide format (the default is long format). This only makes sense when descriptor is a matrix or data.frame with more than one column; in this case, the last column will make up the columns.
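The ordering described for subset (replicates are counted before the subset is applied) can be illustrated with a few lines of base R. This is only a sketch of the arithmetic implied by that description, not the package's implementation; the toy data frame and the names n.rep, sel, total, after and before are hypothetical.

```r
# Toy data (hypothetical): st.a has 2 replicates, sp.2 occurs in only one of them
toy <- data.frame(
  station   = c("st.a", "st.a", "st.a", "st.b", "st.b", "st.b"),
  replicate = c(1, 1, 2, 1, 1, 1),
  species   = c("sp.1", "sp.2", "sp.1", "sp.3", "sp.4", "sp.5"),
  density   = c(1, 2, 3, 3, 1, 3),
  stringsAsFactors = FALSE
)

# Number of replicates per station, counted on the FULL data
n.rep <- with(toy, tapply(replicate, station, function(x) length(unique(x))))

# Select one taxon, then sum its density per station
sel   <- toy[toy$species == "sp.2", ]
total <- with(sel, tapply(density, station, sum))

# Subset applied after counting replicates: 2 / 2 replicates
after <- total / n.rep[names(total)]

# Subset applied before counting: sp.2 is seen in 1 replicate only, so 2 / 1
n.sub  <- with(sel, tapply(replicate, station, function(x) length(unique(x))))
before <- total / n.sub

after   # st.a: 1
before  # st.a: 2
```

Counting replicates on the full data treats the replicate in which sp.2 is absent as a zero, which is what the note on subset describes.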

Value

get_summary returns a list, containing the mean densities (density) and the number of taxa (taxa) for each descriptor, and the number of descriptors in which the taxa were found (occurrence).
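As a rough cross-check of what the three elements contain, the same quantities can be computed by hand in base R. The sketch below is not how get_summary is implemented; it assumes that density is the mean over replicates of the total density per replicate, which is consistent with the output printed in the Examples section. The object names (toy, per.rep, dens, taxa, occ) are hypothetical.

```r
# Toy data with the same values as Bdata.rep in the Examples section
toy <- data.frame(
  station   = c("st.a", "st.a", "st.a", "st.b", "st.b", "st.b"),
  replicate = c(1, 1, 2, 1, 1, 1),
  species   = c("sp.1", "sp.2", "sp.1", "sp.3", "sp.4", "sp.5"),
  density   = c(1, 2, 3, 3, 1, 3),
  stringsAsFactors = FALSE
)

# density: total per station x replicate, then mean over the replicates present
per.rep <- with(toy, tapply(density, list(station, replicate), sum))
dens    <- rowMeans(per.rep, na.rm = TRUE)

# taxa: number of distinct species per station
taxa <- with(toy, tapply(species, station, function(x) length(unique(x))))

# occurrence: number of stations in which each species was found
occ <- table(unique(toy[, c("station", "species")])$species)

dens  # st.a = 3, st.b = 7
taxa  # st.a = 2, st.b = 3
occ   # 1 for each of sp.1 ... sp.5
```

These values agree with the "average over replicates" example in the Examples section.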

In case wide.output is FALSE, this will be a data.frame in long format (descriptor, value); when TRUE, the descriptors will feature both in the rows (all but the last descriptor) and in the columns (the last descriptor).
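The difference between the two layouts can be pictured with base R's reshape. This is only an illustration of long versus wide format for a two-column descriptor (station, replicate); the column names that get_summary itself produces in wide format, and how it fills combinations that do not occur, may differ. The objects long and wide are hypothetical.

```r
# Long format: one row per (station, replicate) combination
long <- data.frame(
  station   = c("st.a", "st.a", "st.b"),
  replicate = c(1, 2, 1),
  density   = c(3, 3, 7),
  stringsAsFactors = FALSE
)

# Wide format: the last descriptor (replicate) is spread over the columns
wide <- reshape(long, idvar = "station", timevar = "replicate",
                direction = "wide")
wide
# columns: station, density.1, density.2
# st.b has no second replicate, so reshape leaves density.2 as NA there
```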

Author

Karline Soetaert <karline.soetaert@nioz.nl>, Olivier Beauchard

See also

MWTL for the data sets.

map_key for simple plotting functions.

get_density for functions working with density data.

get_trait_density for functions combining density and traits.

get_Db_index for extracting bioturbation and bioirrigation indices.

extend_trait for functions working with traits.

get_trait

Examples


## ====================================================
## A small dataset with replicates
## ====================================================

# 2 stations, 2 replicates for st.a, one replicate for st.b
Bdata.rep <- data.frame(
  station   = c("st.a","st.a","st.a","st.b","st.b","st.b"),
  replicate = c(     1,     1,    2,     1,     1,     1),
  species   = c("sp.1","sp.2","sp.1","sp.3","sp.4","sp.5"),
  density   = c(     1,     2,    3,     3,     1,     3)
)
Bdata.rep
#>   station replicate species density
#> 1    st.a         1    sp.1       1
#> 2    st.a         1    sp.2       2
#> 3    st.a         2    sp.1       3
#> 4    st.b         1    sp.3       3
#> 5    st.b         1    sp.4       1
#> 6    st.b         1    sp.5       3

##-----------------------------------------------------
## Summary statistics
##-----------------------------------------------------

# keep replicates
  get_summary(data        = Bdata.rep, 
              descriptor  = cbind(station, replicate), 
              value       = density, 
              taxon       = species, 
              wide.output = FALSE)
#> $density
#>   station replicate density
#> 1    st.a         1       3
#> 2    st.a         2       3
#> 3    st.b         1       7
#> 
#> $taxa
#>   station replicate species
#> 1    st.a         1       2
#> 2    st.a         2       1
#> 3    st.b         1       3
#> 
#> $occurrence
#>   species occurrence
#> 1    sp.1          2
#> 2    sp.2          1
#> 3    sp.3          1
#> 4    sp.4          1
#> 5    sp.5          1
#> 

# average over replicates (uses pipe |> to pass data)
  Bdata.rep |>  
   get_summary(descriptor  = station, 
               averageOver = replicate,
               value       = density, 
               taxon       = species, 
               wide.output = FALSE)
#> $density
#>   station density
#> 1    st.a       3
#> 2    st.b       7
#> 
#> $taxa
#>   station species
#> 1    st.a       2
#> 2    st.b       3
#> 
#> $occurrence
#>   species occurrence
#> 1    sp.1          1
#> 2    sp.2          1
#> 3    sp.3          1
#> 4    sp.4          1
#> 5    sp.5          1
#> 

## ====================================================
## North Sea dataset
## ====================================================

NSsumm <- with(MWTL$density, 
   get_summary(descriptor  = station,
               averageOver = year,
               taxon       = taxon,
               value       = density))
head(NSsumm$density)
#>   descriptor     value
#> 1  BREEVTN02 1215.5389
#> 2  BREEVTN03  768.3955
#> 3  BREEVTN04 1519.1157
#> 4  BREEVTN05 1944.8397
#> 5  BREEVTN06 1203.5062
#> 6  BREEVTN07 1230.1772
head(NSsumm$taxa)
#>   descriptor taxon
#> 1  BREEVTN02    88
#> 2  BREEVTN03    74
#> 3  BREEVTN04    69
#> 4  BREEVTN05    66
#> 5  BREEVTN06    60
#> 6  BREEVTN07    65
head(NSsumm$occurrence)
#>                   taxon occurrence
#> 1 Abludomelita obtusata         12
#> 2             Abra alba         70
#> 3           Abra nitida         34
#> 4       Abra prismatica         30
#> 5           Abra tenuis          3
#> 6 Abyssoninoe hibernica          9