get_summary calculates summaries of density data, i.e. total densities and number of taxa.
get_summary(data, descriptor, taxon, value, averageOver, taxonomy = NULL,
subset, what=c("density", "taxa", "occurrence"),
wide.output = FALSE)
data
data.frame from which the arguments descriptor, taxon, value and averageOver are extracted. Can be missing.
descriptor
variable(s) describing *where* the data were taken, e.g. sampling stations. If data is not missing: one or more column(s) from data; use cbind or data.frame to select several columns. If data is missing: a vector, a list, a data.frame or a matrix (with one or multiple columns). It can be numeric, character, or a factor. In theory, descriptor can also be a single number, NA or missing; however, care needs to be taken when this is combined with subset and averageOver.
taxon
variable describing *what* the data are; it gives the taxonomic name (e.g. species). If data is not missing: one column from data. If data is missing: a list (or data.frame with one column), or a vector. When a data.frame or a list, its "name" will be used in the output; when a vector, the argument name will be used.
value
variable that contains the *values* of the data, usually density. If data is not missing: one or more column(s) from data; use cbind or data.frame to select several columns. If data is missing: a vector, a list, a data.frame or a matrix (with one or multiple columns). It should have the same length (or the same number of rows) as descriptor and taxon, and should contain numerical values. This argument is always required.
averageOver
*replicates* over which averages need to be taken. If data is not missing: one or more column(s) from data; use cbind or data.frame to select several columns. If data is missing: a vector, a list, a data.frame or a matrix (with one or multiple columns). It can be numeric, character, or a factor. Can be absent.
taxonomy
taxonomic information; its first column will be matched with taxon, regardless of its name.
subset
logical expression indicating elements to keep: missing values are taken as FALSE. If NULL or absent, all elements are used. Note that the subset is taken *after* the number of samples to average per descriptor is calculated, so this also works for selecting certain taxa that may not be present in all replicates over which the average is taken.
what
the summary statistic(s) to be returned, one or more of "density": total density per descriptor, "taxa": the number of taxa per descriptor, "occurrence": the number of occurrences of each species.
wide.output
when TRUE, will recast the output in wide format (the default is long format). This only makes sense when descriptor is a matrix or data.frame with more than one column; in this case, the last descriptor column will make up the columns of the output.
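The note on subset describes an order of operations that is easy to get wrong when doing the calculation by hand. The following base-R sketch illustrates one reading of that note on a small made-up data set: the number of replicates per station is counted on the full data, and only then is the subset applied. It is not the package's implementation; the object names (n.rep, mean.dens, naive) are hypothetical.

```r
# Small data set: st.a has 2 replicates, st.b has 1
Bdata.rep <- data.frame(
  station   = c("st.a", "st.a", "st.a", "st.b", "st.b", "st.b"),
  replicate = c(1, 1, 2, 1, 1, 1),
  species   = c("sp.1", "sp.2", "sp.1", "sp.3", "sp.4", "sp.5"),
  density   = c(1, 2, 3, 3, 1, 3)
)

# Step 1: count replicates per station on the FULL data
n.rep <- tapply(Bdata.rep$replicate, Bdata.rep$station,
                function(x) length(unique(x)))

# Step 2: apply the subset (here: keep only sp.2, found in one replicate)
sub <- Bdata.rep[Bdata.rep$species == "sp.2", ]

# Step 3: sum per station and divide by the replicate count from step 1
tot       <- tapply(sub$density, sub$station, sum)
mean.dens <- tot / n.rep[names(tot)]

# For comparison: counting replicates AFTER subsetting misses the
# replicate in which sp.2 was absent
n.naive <- tapply(sub$replicate, sub$station, function(x) length(unique(x)))
naive   <- tot / n.naive[names(tot)]

mean.dens[["st.a"]]  # averaged over both replicates of st.a
naive[["st.a"]]      # averaged over the single replicate containing sp.2
```

Under this reading, the replicate in which a taxon is absent still counts as a zero observation, which is why the two results differ.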
get_summary returns a list, containing the mean densities (density)
and the number of taxa (taxa) for each descriptor, and the number of descriptors in which the
taxa were found (occurrence).
If wide.output is FALSE, each element is a data.frame in long
format (descriptor, value); when TRUE, the descriptors feature both
in the rows (all but the last descriptor) and in the columns (the last descriptor).
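To make the difference between the two layouts concrete, here is a minimal base-R sketch that recasts a long-format density table with two descriptors (station, replicate) into a wide one. It only illustrates the layout; it does not call get_summary, and how the package fills descriptor combinations that do not occur is an assumption (tapply leaves them NA).

```r
# Long format: one row per descriptor combination
long <- data.frame(
  station   = c("st.a", "st.a", "st.b"),
  replicate = c(1, 2, 1),
  density   = c(3, 3, 7)
)

# Wide format: all but the last descriptor (station) in the rows,
# the last descriptor (replicate) in the columns
wide <- with(long,
             tapply(density,
                    list(station = station, replicate = replicate),
                    sum))
wide
```

Here st.b has no second replicate, so the corresponding cell is empty in the wide table.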
MWTL for the data sets.
map_key for simple plotting functions.
get_density for functions working with density data.
get_trait_density for functions combining density and traits.
get_Db_index for extracting bioturbation and bioirrigation indices.
extend_trait for functions working with traits.
## ====================================================
## A small dataset with replicates
## ====================================================
# 2 stations, 2 replicates for st.a, one replicate for st.b
Bdata.rep <- data.frame(
station = c("st.a","st.a","st.a","st.b","st.b","st.b"),
replicate = c( 1, 1, 2, 1, 1, 1),
species = c("sp.1","sp.2","sp.1","sp.3","sp.4","sp.5"),
density = c( 1, 2, 3, 3, 1, 3)
)
Bdata.rep
#> station replicate species density
#> 1 st.a 1 sp.1 1
#> 2 st.a 1 sp.2 2
#> 3 st.a 2 sp.1 3
#> 4 st.b 1 sp.3 3
#> 5 st.b 1 sp.4 1
#> 6 st.b 1 sp.5 3
##-----------------------------------------------------
## Summary statistics
##-----------------------------------------------------
# keep replicates
get_summary(data = Bdata.rep,
descriptor = cbind(station, replicate),
value = density,
taxon = species,
wide.output = FALSE)
#> $density
#> station replicate density
#> 1 st.a 1 3
#> 2 st.a 2 3
#> 3 st.b 1 7
#>
#> $taxa
#> station replicate species
#> 1 st.a 1 2
#> 2 st.a 2 1
#> 3 st.b 1 3
#>
#> $occurrence
#> species occurrence
#> 1 sp.1 2
#> 2 sp.2 1
#> 3 sp.3 1
#> 4 sp.4 1
#> 5 sp.5 1
#>
# average over replicates (uses pipe |> to pass data)
Bdata.rep |>
get_summary(descriptor = station,
averageOver = replicate,
value = density,
taxon = species,
wide.output = FALSE)
#> $density
#> station density
#> 1 st.a 3
#> 2 st.b 7
#>
#> $taxa
#> station species
#> 1 st.a 2
#> 2 st.b 3
#>
#> $occurrence
#> species occurrence
#> 1 sp.1 1
#> 2 sp.2 1
#> 3 sp.3 1
#> 4 sp.4 1
#> 5 sp.5 1
#>
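The three statistics of the first example (replicates kept as part of the descriptor) can be cross-checked by hand. The sketch below uses base R only and is an illustration of what the statistics mean, not the code used inside get_summary; the object names dens, taxa, combo and occ are hypothetical.

```r
Bdata.rep <- data.frame(
  station   = c("st.a", "st.a", "st.a", "st.b", "st.b", "st.b"),
  replicate = c(1, 1, 2, 1, 1, 1),
  species   = c("sp.1", "sp.2", "sp.1", "sp.3", "sp.4", "sp.5"),
  density   = c(1, 2, 3, 3, 1, 3)
)

# Descriptor = station x replicate
by.desc <- Bdata.rep[c("station", "replicate")]

# "density": total density per descriptor combination
dens <- aggregate(Bdata.rep["density"], by = by.desc, FUN = sum)

# "taxa": number of distinct taxa per descriptor combination
taxa <- aggregate(Bdata.rep["species"], by = by.desc,
                  FUN = function(x) length(unique(x)))

# "occurrence": number of descriptor combinations in which each taxon occurs
combo <- unique(Bdata.rep[c("station", "replicate", "species")])
occ   <- table(combo$species)

dens
taxa
occ
```

Note that aggregate may order the rows differently from the get_summary output shown above; the values per station and replicate are what should agree.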
## ====================================================
## Northsea dataset
## ====================================================
NSsumm <- with(MWTL$density,
get_summary(descriptor = station,
averageOver = year,
taxon = taxon,
value = density))
head(NSsumm$density)
#> descriptor value
#> 1 BREEVTN02 1215.5389
#> 2 BREEVTN03 768.3955
#> 3 BREEVTN04 1519.1157
#> 4 BREEVTN05 1944.8397
#> 5 BREEVTN06 1203.5062
#> 6 BREEVTN07 1230.1772
head(NSsumm$taxa)
#> descriptor taxon
#> 1 BREEVTN02 88
#> 2 BREEVTN03 74
#> 3 BREEVTN04 69
#> 4 BREEVTN05 66
#> 5 BREEVTN06 60
#> 6 BREEVTN07 65
head(NSsumm$occurrence)
#> taxon occurrence
#> 1 Abludomelita obtusata 12
#> 2 Abra alba 70
#> 3 Abra nitida 34
#> 4 Abra prismatica 30
#> 5 Abra tenuis 3
#> 6 Abyssoninoe hibernica 9