get Summary for station x taxon data (getSummary)

getSummary calculates summaries of density data, i.e. total densities and number of taxa.

Usage

getSummary(descriptor, taxon, value, averageOver = NULL, taxonomy = NULL, 
        subset, what=c("density", "taxa", "occurrence"), 
        wide.output = FALSE)

Arguments

descriptor: name(s) of the descriptor, i.e. *where* the data were taken, e.g. station names. Either a vector, a list, a data.frame or a matrix (possibly with multiple columns). It can be of type numeric, character, or a factor. When a data.frame or a list, the "names" will be used in the output; when a vector, the argument name will be used in the output.
taxon: vector describing *what* the data are; it gives the taxonomic name (e.g. species). Should be of the same length as (the number of rows of) descriptor. Can be a list (or data.frame with one column), or a vector. When a data.frame or a list, the "name" will be used in the output; when a vector, the argument name will be used.
value: vector or list that contains the *values* of the data, usually density. Should be of the same length as (the number of rows of) descriptor and taxon. For function getDensity, value can also be a multi-column data.frame or matrix.
averageOver: vector with *replicates* over which averages need to be taken. Should be of the same length as (the number of rows of) descriptor.
taxonomy: taxonomic information; first column will be matched with taxon, regardless of its name.
subset: logical expression indicating elements to keep: missing values are taken as FALSE. If NULL, or absent, then all elements are used. Note that the subset is taken *after* the number of samples to average per descriptor is calculated, so this will also work for selecting certain taxa that may not be present in all replicates that are averaged over.
what: the summary statistic to be returned, one of "density": total density per descriptor; "taxa": the number of taxa per descriptor; "occurrence": the number of occurrences of each species.
wide.output: when TRUE, will recast the output in wide format (the default is long format). This only makes sense when descriptor is a matrix or data.frame with more than one column; in this case, the last column will make up the columns.
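
The way descriptor, averageOver, value and subset interact can be illustrated with a minimal base-R sketch. It does not call getSummary and is not the package's implementation; it only assumes, from the argument descriptions above, that summed values per descriptor are divided by the number of distinct replicates of that descriptor, and that this replicate count is fixed before any subset is applied. All object names (dat, n.rep, total, avg.density, n.taxa, sp2.mean) are hypothetical.

```r
# Small dataset: 2 replicates for st.a, 1 replicate for st.b
dat <- data.frame(
  station   = c("st.a", "st.a", "st.a", "st.b", "st.b", "st.b"),
  replicate = c(1, 1, 2, 1, 1, 1),
  species   = c("sp.1", "sp.2", "sp.1", "sp.3", "sp.4", "sp.5"),
  density   = c(1, 2, 3, 3, 1, 3)
)

# Number of distinct replicates per station (assumed averaging unit)
n.rep <- tapply(dat$replicate, dat$station, function(x) length(unique(x)))

# Summed density per station, divided by the number of replicates
total       <- tapply(dat$density, dat$station, sum)
avg.density <- total / n.rep[names(total)]
avg.density   # st.a: (1 + 2 + 3) / 2 = 3, st.b: 7 / 1 = 7

# Number of distinct taxa per station
n.taxa <- tapply(dat$species, dat$station, function(x) length(unique(x)))
n.taxa        # st.a: 2, st.b: 3

# Subsetting one taxon AFTER the replicate count was fixed:
# sp.2 occurs in only one of the two st.a replicates, but its summed
# density is still divided by both replicates.
keep     <- dat$species == "sp.2" & dat$station == "st.a"
sp2.mean <- sum(dat$density[keep]) / n.rep[["st.a"]]
sp2.mean      # 2 / 2 = 1
```

Counting the replicates before subsetting is what keeps the denominator at 2 for sp.2; counting them afterwards would give a denominator of 1 and overestimate the mean density.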

Value

getSummary returns a list, containing the mean densities (density) and the number of taxa (taxa) for each descriptor, and the number of descriptors in which the taxa were found (occurrence).

When wide.output is FALSE, each summary is a data.frame in long format (descriptor, value); when TRUE, the descriptors feature both in the rows (all but the last descriptor) and in the columns (the last descriptor).
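
The difference between the long and the wide layout can be sketched in base R. The snippet below is only an illustration of the reshaping idea and does not call getSummary; the exact layout that getSummary returns (for instance how descriptor combinations without data are filled) is not shown here and may differ. The objects long and wide are hypothetical; the numbers are the per-station, per-replicate densities of the small example dataset.

```r
# Long format: one row per (station, replicate) combination
long <- data.frame(
  station   = c("st.a", "st.a", "st.b"),
  replicate = c(1, 2, 1),
  density   = c(3, 3, 7)
)

# Wide format: all but the last descriptor (station) in the rows,
# the last descriptor (replicate) in the columns
wide <- tapply(long$density,
               list(station = long$station, replicate = long$replicate),
               sum)
wide
# Combinations that do not occur (st.b, replicate 2) are NA in this sketch
```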

Author

Karline Soetaert <karline.soetaert@nioz.nl>, Olivier Beauchard

Examples


## ====================================================
## A small dataset with replicates
## ====================================================

# 2 stations, 2 replicates for st.a, one replicate for st.b
Bdata.rep <- data.frame(
  station   = c("st.a","st.a","st.a","st.b","st.b","st.b"),
  replicate = c(     1,     1,    2,     1,     1,     1),
  species   = c("sp.1","sp.2","sp.1","sp.3","sp.4","sp.5"),
  density   = c(     1,     2,    3,     3,     1,     3)
)
Bdata.rep
#>   station replicate species density
#> 1    st.a         1    sp.1       1
#> 2    st.a         1    sp.2       2
#> 3    st.a         2    sp.1       3
#> 4    st.b         1    sp.3       3
#> 5    st.b         1    sp.4       1
#> 6    st.b         1    sp.5       3

##-----------------------------------------------------
## Summary statistics
##-----------------------------------------------------

with (Bdata.rep,  
  getSummary(descriptor  = cbind(station, replicate), 
             value       = density, 
             taxon       = species, 
             wide.output = FALSE))
#> $density
#>   station replicate density
#> 1    st.a         1       3
#> 2    st.a         2       3
#> 3    st.b         1       7
#> 
#> $taxa
#>   station replicate taxa
#> 1    st.a         1    2
#> 2    st.a         2    1
#> 3    st.b         1    3
#> 
#> $occurrence
#>   taxon occurrence
#> 1  sp.1          2
#> 2  sp.2          1
#> 3  sp.3          1
#> 4  sp.4          1
#> 5  sp.5          1
#> 

with (Bdata.rep,  
  getSummary(descriptor  = station, 
             averageOver = replicate,
             value       = density, 
             taxon       = species, 
             wide.output = FALSE))
#> $density
#>   descriptor density
#> 1       st.a       3
#> 2       st.b       7
#> 
#> $taxa
#>   descriptor taxa
#> 1       st.a    2
#> 2       st.b    3
#> 
#> $occurrence
#>   taxon occurrence
#> 1  sp.1          1
#> 2  sp.2          1
#> 3  sp.3          1
#> 4  sp.4          1
#> 5  sp.5          1
#> 

## ====================================================
## Northsea dataset
## ====================================================

NSsumm <- with(MWTL$density, 
   getSummary(descriptor  = station,
              averageOver = year,
              taxon       = taxon,
              value       = density))
head(NSsumm$density)
#>   descriptor   density
#> 1  BREEVTN02 1215.5389
#> 2  BREEVTN03  768.3955
#> 3  BREEVTN04 1519.1157
#> 4  BREEVTN05 1944.8397
#> 5  BREEVTN06 1203.5062
#> 6  BREEVTN07 1230.1772
head(NSsumm$taxa)
#>   descriptor taxa
#> 1  BREEVTN02   88
#> 2  BREEVTN03   74
#> 3  BREEVTN04   69
#> 4  BREEVTN05   66
#> 5  BREEVTN06   60
#> 6  BREEVTN07   65
head(NSsumm$occurrence)
#>                   taxon occurrence
#> 1 Abludomelita obtusata         12
#> 2             Abra alba         70
#> 3           Abra nitida         34
#> 4       Abra prismatica         30
#> 5           Abra tenuis          3
#> 6 Abyssoninoe hibernica          9
