get_trait_density combines (descriptor x taxon) density data with (taxon x trait) data into a (descriptor x trait) data set.
get_trait_density(data, descriptor, taxon, value, averageOver,
wide = NULL, descriptor_column = 1, trait, taxon_column = 1,
trait_class = NULL, trait_score = NULL, taxonomy = NULL,
scalewithvalue = TRUE, verbose = FALSE)
data: data.frame from which the arguments descriptor, taxon, value and averageOver are extracted. Can be missing.
descriptor: variable(s) describing *where* the data were taken, e.g. sampling stations. If data is not missing: one or more column(s) from data; use cbind or data.frame to select several columns. If data is missing: a vector, a list, a data.frame or a matrix (with one or multiple columns). It can be numeric, character, or a factor. In theory, descriptor can also be a single number, NA, or missing; however, care needs to be taken when this is combined with subset and averageOver.
taxon: variable describing *what* the data are; it gives the taxonomic name (e.g. species). If data is not missing: one column from data. If data is missing: a list (or a data.frame with one column), or a vector. For a data.frame or a list, its "name" will be used in the output; for a vector, the argument name will be used.
value: variable that contains the *values* of the data, usually density. If data is not missing: one or more column(s) from data; use cbind or data.frame to select several columns. If data is missing: a vector, a list, a data.frame or a matrix (with one or multiple columns). It should have the same length (or the same number of rows) as descriptor and taxon, and should contain numerical values. Should always be present.
averageOver: *replicates* over which averages need to be taken. If data is not missing: one or more column(s) from data; use cbind or data.frame to select several columns. If data is missing: a vector, a list, a data.frame or a matrix (with one or multiple columns). It can be numeric, character, or a factor. Can be absent.
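The base R sketch below illustrates what averaging over replicates amounts to for the small data set used in the examples. It assumes, judging from the get_density output shown in the examples, that a taxon absent from a replicate counts as zero density in that replicate. The object names n_rep and total are illustrative and not part of the package.

```r
# Example data: 2 replicates at st.a, 1 replicate at st.b
Bdata.rep <- data.frame(
  station   = c("st.a", "st.a", "st.a", "st.b", "st.b", "st.b"),
  replicate = c(1, 1, 2, 1, 1, 1),
  species   = c("sp.1", "sp.2", "sp.1", "sp.3", "sp.4", "sp.5"),
  density   = c(1, 2, 3, 3, 1, 3)
)

# Number of distinct replicates per station (illustrative helper)
n_rep <- tapply(Bdata.rep$replicate, Bdata.rep$station,
                function(x) length(unique(x)))

# Summed density per (station, species) combination
total <- aggregate(density ~ station + species, data = Bdata.rep, FUN = sum)

# Divide by the replicate count of the station, so that a taxon missing
# from a replicate contributes a zero to the average
total$density <- total$density / as.vector(n_rep[as.character(total$station)])
total <- total[order(total$station, total$species), ]
total$density   # 2 1 3 1 3, as in the get_density example
```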
wide: density data, in *WIDE* format. This is a data.frame or matrix with (descriptor x taxon) information. If NULL, this data.frame will be calculated from the descriptor, taxon, (replicate) and value data. The number of descriptor columns is specified with descriptor_column. If wide is a data.frame, the first column usually contains the descriptor name; in this case the dimensions of wide are (number of descriptors x number of species+1), and descriptor_column=1. It is also allowed to have the descriptors as row.names of the data.frame; this requires setting descriptor_column=0.
trait: (taxon x trait) data or (descriptor x trait) data, in *WIDE* format, and containing numerical values only. Traits can be fuzzy coded. The number of columns with taxonomic information is specified with taxon_column. The default is to have the first column contain the name of the taxon, and taxon_column=1. It is also allowed to have the taxa as row.names of the data.frame; in this case, taxon_column=0. See the last example for how to deal with categorical traits.
trait_class: indices to trait levels, a vector. The length of this vector should equal the number of columns of trait or wide minus the value of taxon_column. The order should match the order of the trait columns (this is not checked). If present, this will be used, together with trait_score, to convert the trait matrix from fuzzy to crisp format.
trait_score: trait values or scores, a vector. Should be of the same length as trait_class.
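As a rough illustration of how trait_class and trait_score can turn fuzzy-coded modalities into one crisp value per trait, the base R sketch below computes, per taxon, the score-weighted sum of the modality columns belonging to each trait. This is an assumption about the conversion, not the package source; it is consistent with the crisp results shown in the examples and relies on the fuzzy scores of each trait summing to one. The object names fuzzy, weighted and crisp are illustrative.

```r
# Fuzzy-coded traits for two taxa (rows) and five modality columns
fuzzy <- matrix(
  c(0,   0.5, 0.5, 1,   0,      # sp.3
    0.2, 0.3, 0.5, 0.5, 0.5),   # sp.5
  nrow = 2, byrow = TRUE,
  dimnames = list(c("sp.3", "sp.5"),
                  c("T1_M1", "T1_M2", "T1_M3", "T2_M1", "T2_M2"))
)

trait_class <- c("T1", "T1", "T1", "T2", "T2")  # trait of each column
trait_score <- c(0, 0.5, 1, 0.2, 2)             # score of each modality

# Multiply each column by its score, then sum the columns of the same trait
weighted <- sweep(fuzzy, 2, trait_score, `*`)
crisp <- t(rowsum(t(weighted), group = trait_class))
crisp
#>        T1  T2
#> sp.3 0.75 0.2
#> sp.5 0.65 1.1
```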
scalewithvalue: when TRUE, will standardize with respect to total density, so that the average trait value is obtained (not the summed value). Note that total density is estimated only from those taxa whose traits are known.
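A minimal base R sketch of the two modes, assuming the calculation is a matrix product of densities and traits. This interpretation reproduces the numbers in the examples but is not taken from the package source. The object names W, Tr, known, summed, total and scaled are illustrative.

```r
# Densities (station x taxon); sp.4 has no trait data
W <- matrix(c(2, 1, 0, 0, 0,
              0, 0, 3, 1, 3),
            nrow = 2, byrow = TRUE,
            dimnames = list(c("st.a", "st.b"),
                            c("sp.1", "sp.2", "sp.3", "sp.4", "sp.5")))

# Fuzzy-coded trait T1 (taxon x modality) for the taxa with known traits
Tr <- matrix(c(0,   1,   0,
               0,   0,   1,
               0,   0.5, 0.5,
               0.2, 0.3, 0.5),
             nrow = 4, byrow = TRUE,
             dimnames = list(c("sp.1", "sp.2", "sp.3", "sp.5"),
                             c("T1_M1", "T1_M2", "T1_M3")))

known  <- intersect(colnames(W), rownames(Tr))  # taxa with trait data
summed <- W[, known] %*% Tr[known, ]            # summed values (as with scalewithvalue = FALSE)
total  <- rowSums(W[, known])                   # density of the known taxa only
scaled <- summed / total                        # averages (as with scalewithvalue = TRUE)
scaled
```

For st.b the divisor is 6 rather than 7, because the density of sp.4 is left out of the total.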
verbose: when TRUE, will write warnings to the screen.
taxonomy: taxonomic information (the relationships between the taxa), a data.frame; the first column will be matched with taxon (in the data.frames data and trait). The subsequent columns should have increasing taxonomic level (e.g. the column order should be *species, genus, family, ...*). The taxonomic relations are used to estimate the traits of taxa from data that are not accounted for in trait; these are estimated from taxa at the nearest taxonomic level. See details.
descriptor_column, taxon_column: position(s) or name(s) of the column(s) that hold the descriptor of the density data set (data.frame wide), and the taxa in the trait data set (data.frame trait). The default is to have the first column hold the descriptors or taxa. If NULL or 0, there is no separate column with names, and the row.names of the data set are used as descriptor or taxon names.
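The two layouts accepted for wide differ only in where the descriptor names are stored. A minimal base R sketch of moving them from the first column to the row.names, which is the layout described for descriptor_column = 0 (the object name Bwide0 is illustrative):

```r
# Layout with descriptor_column = 1: names in the first column
Bwide <- data.frame(
  station = c("st.a", "st.b"),
  sp.1 = c(2, 0), sp.2 = c(1, 0), sp.3 = c(0, 3),
  sp.4 = c(0, 1), sp.5 = c(0, 3)
)

# Layout with descriptor_column = 0: names in row.names, taxon columns only
Bwide0 <- Bwide[, -1]
row.names(Bwide0) <- Bwide$station
Bwide0
#>      sp.1 sp.2 sp.3 sp.4 sp.5
#> st.a    2    1    0    0    0
#> st.b    0    0    3    1    3
```

The same conversion applies to a trait data set when taxon_column = 0 is used.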
The taxonomy is used to fill in the gaps of the trait information, assuming that closely related taxa will share similar traits. This is done in two steps:
The trait database is first extended with information on higher taxonomic levels (using *extend_trait*). The traits for a taxonomic level are estimated as the average of the traits at the lower level. For instance, traits on genus level will be averages of known traits of all species in the database belonging to this genus.
Then, for each taxon that is not present in the trait database, the traits on the closest taxonomic level are used. For instance, for an unrecorded species, it is first checked whether the trait is known at genus level; if not, at family level, and so on.
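The two steps can be sketched in base R for a single taxonomic level (genus). This is an illustration under simplifying assumptions, not the code of extend_trait: only two modality columns are used, only the genus level is considered, and the object names with_genus, genus_trait, missing_sp, lookup and filled are hypothetical.

```r
traits <- data.frame(
  taxon = c("sp.1", "sp.2", "sp.3", "sp.5"),
  T1_M2 = c(1, 0, 0.5, 0.3),
  T1_M3 = c(0, 1, 0.5, 0.5)
)
taxonomy <- data.frame(
  species = c("sp.1", "sp.2", "sp.3", "sp.4", "sp.5"),
  genus   = c("g.1",  "g.2",  "g.2",  "g.2",  "g.3")
)

# Step 1: genus-level traits = mean of the known species in each genus
with_genus  <- merge(traits, taxonomy, by.x = "taxon", by.y = "species")
genus_trait <- aggregate(cbind(T1_M2, T1_M3) ~ genus,
                         data = with_genus, FUN = mean)

# Step 2: a species without trait data borrows the traits of its genus
missing_sp <- setdiff(taxonomy$species, traits$taxon)
lookup     <- taxonomy[taxonomy$species %in% missing_sp, ]
filled     <- merge(lookup, genus_trait, by = "genus")
filled   # sp.4 receives the g.2 averages: T1_M2 = 0.25, T1_M3 = 0.75
```

If the genus had no known species either, the same lookup would be repeated at family level, and so on up the taxonomy.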
get_trait_density returns the descriptor x trait density matrix.
Depending on whether argument data is passed or not, the output columns may be labelled differently:
if data is passed: the original names in data will be kept;
if data is not passed: the names will only be kept if explicitly passed.
See the example labelled "use data argument or explicit input".
MWTL for the data sets.
map_key for simple plotting functions.
get_density for functions working with density data.
get_Db_index for extracting bioturbation and bioirrigation indices.
extend_trait for functions working with traits.
When the traits of certain taxa are not found, they are set to 0 and the calculation proceeds; this is reported only when verbose is set to TRUE.
When that happens, the taxa that were ignored can be found in attributes(..)$notrait.
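To anticipate which taxa are likely to end up in the notrait attribute, the taxon names of the two inputs can be compared beforehand. A minimal base R sketch, in which density_taxa and trait_taxa are illustrative vectors standing in for the taxon names of the density and trait data:

```r
density_taxa <- c("sp.1", "sp.2", "sp.3", "sp.4", "sp.5")  # taxa in the density data
trait_taxa   <- c("sp.1", "sp.2", "sp.3", "sp.5", "sp.6")  # taxa in the trait data

# Taxa with density records but no trait records
setdiff(density_taxa, trait_taxa)
#> [1] "sp.4"
```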
## ====================================================
## A small dataset with replicates
## ====================================================
# 2 stations, 2 replicates for st.a, one replicate for st.b
Bdata.rep <- data.frame(
station = c("st.a","st.a","st.a","st.b","st.b","st.b"),
replicate = c( 1, 1, 2, 1, 1, 1),
species = c("sp.1","sp.2","sp.1","sp.3","sp.4","sp.5"),
density = c( 1, 2, 3, 3, 1, 3)
)
Bdata.rep
#> station replicate species density
#> 1 st.a 1 sp.1 1
#> 2 st.a 1 sp.2 2
#> 3 st.a 2 sp.1 3
#> 4 st.b 1 sp.3 3
#> 5 st.b 1 sp.4 1
#> 6 st.b 1 sp.5 3
##-----------------------------------------------------
## averaging over replicates - long format
##-----------------------------------------------------
Blong <- get_density(
data = Bdata.rep,
descriptor = station,
taxon = species,
averageOver = replicate,
value = density)
Blong
#> station species density
#> 1 st.a sp.1 2
#> 2 st.a sp.2 1
#> 3 st.b sp.3 3
#> 4 st.b sp.4 1
#> 5 st.b sp.5 3
##-----------------------------------------------------
## From long to wide format, averaging over replicates
##-----------------------------------------------------
Bwide <- l2w_density (
data = Bdata.rep,
value = density,
descriptor = station,
averageOver = replicate,
taxon = species)
Bwide
#> station sp.1 sp.2 sp.3 sp.4 sp.5
#> 1 st.a 2 1 0 0 0
#> 2 st.b 0 0 3 1 3
##-----------------------------------------------------
## Small dataset: fuzzy-coded traits
##-----------------------------------------------------
# Note: no data for "sp.4"
Btraits <- data.frame(
taxon = c("sp.1","sp.2","sp.3","sp.5","sp.6"),
T1_M1 = c(0 , 0 , 0 , 0.2 , 1),
T1_M2 = c(1 , 0 , 0.5 , 0.3 , 0),
T1_M3 = c(0 , 1 , 0.5 , 0.5 , 0),
T2_M1 = c(0 , 0 , 1 , 0.5 , 1),
T2_M2 = c(1 , 1 , 0 , 0.5 , 0)
)
# The metadata for this trait
Btraits.lab <- data.frame(
colname = c("T1_M1", "T1_M2", "T1_M3", "T2_M1", "T2_M2"),
trait = c("T1" , "T1" , "T1" , "T2" , "T2"),
modality = c("M1" , "M2" , "M3" , "M1" , "M2"),
score = c(0 , 0.5 , 1 , 0.2 , 2)
)
##-----------------------------------------------------
## Small dataset: taxonomy
##-----------------------------------------------------
Btaxonomy <- data.frame(
species = c("sp.1","sp.2","sp.3","sp.4","sp.5","sp.6"),
genus = c( "g.1", "g.2", "g.2", "g.2", "g.3", "g.4"),
family = c( "f.1", "f.1", "f.1", "f.1", "f.2", "f.3"),
order = c( "o.1", "o.1", "o.1", "o.1", "o.2", "o.2"),
class = c( "c.1", "c.1", "c.1", "c.1", "c.1", "c.1")
)
## ====================================================
## Use of get_trait_density: long and wide input
## ====================================================
# input in wide format (Bwide)
cwm.1 <- get_trait_density (
wide = Bwide,
trait = Btraits)
cwm.1
#> station T1_M1 T1_M2 T1_M3 T2_M1 T2_M2
#> 1 st.a 0.0 0.6666667 0.3333333 0.00 1.00
#> 2 st.b 0.1 0.4000000 0.5000000 0.75 0.25
# input in long format (Blong)
cwm.2 <- get_trait_density (
data = Blong,
taxon = species,
descriptor = station,
value = density,
trait = Btraits)
cwm.2
#> station T1_M1 T1_M2 T1_M3 T2_M1 T2_M2
#> 1 st.a 0.0 0.6666667 0.3333333 0.00 1.00
#> 2 st.b 0.1 0.4000000 0.5000000 0.75 0.25
# input in original long format (Bdata.rep) - average over replicates
cwm.3 <- get_trait_density (
data = Bdata.rep,
taxon = species,
descriptor = station,
averageOver = replicate,
value = density,
trait = Btraits)
cwm.3
#> station T1_M1 T1_M2 T1_M3 T2_M1 T2_M2
#> 1 st.a 0.0 0.6666667 0.3333333 0.00 1.00
#> 2 st.b 0.1 0.4000000 0.5000000 0.75 0.25
# input in original long format (Bdata.rep) - keep replicates (as descriptor)
cwm.3b <- get_trait_density (
data = Bdata.rep,
taxon = species,
descriptor = data.frame(station, replicate),
value = density,
trait = Btraits)
cwm.3b
#> station replicate T1_M1 T1_M2 T1_M3 T2_M1 T2_M2
#> 1 st.a 1 0.0 0.3333333 0.6666667 0.00 1.00
#> 2 st.b 1 0.1 0.4000000 0.5000000 0.75 0.25
#> 3 st.a 2 0.0 1.0000000 0.0000000 0.00 1.00
## ====================================================
## use data argument or explicit input
## ====================================================
# -----------------------------------------------------
# use data argument
# -----------------------------------------------------
cwm.2 <- get_trait_density (
data = Blong,
taxon = species,
descriptor = station,
value = density,
trait = Btraits)
cwm.2 # keeps the names of the original data
#> station T1_M1 T1_M2 T1_M3 T2_M1 T2_M2
#> 1 st.a 0.0 0.6666667 0.3333333 0.00 1.00
#> 2 st.b 0.1 0.4000000 0.5000000 0.75 0.25
# -----------------------------------------------------
# use with() to create a data environment
# -----------------------------------------------------
cwm.2b <- with(Blong, get_trait_density (
taxon = species,
descriptor = station,
value = density,
trait = Btraits))
cwm.2b # Note: first column is called "descriptor"
#> descriptor T1_M1 T1_M2 T1_M3 T2_M1 T2_M2
#> 1 st.a 0.0 0.6666667 0.3333333 0.00 1.00
#> 2 st.b 0.1 0.4000000 0.5000000 0.75 0.25
# explicitly name the descriptor argument
cwm.2c <- with(Blong, get_trait_density (
taxon = species,
descriptor = list(station = station),
value = density,
trait = Btraits))
cwm.2c # called "station"
#> station T1_M1 T1_M2 T1_M3 T2_M1 T2_M2
#> 1 st.a 0.0 0.6666667 0.3333333 0.00 1.00
#> 2 st.b 0.1 0.4000000 0.5000000 0.75 0.25
# -----------------------------------------------------
# explicit arguments
# -----------------------------------------------------
cwm.2d <- get_trait_density (
taxon = Blong$species,
descriptor = Blong$station,
value = Blong$density,
trait = Btraits)
cwm.2d
#> descriptor T1_M1 T1_M2 T1_M3 T2_M1 T2_M2
#> 1 st.a 0.0 0.6666667 0.3333333 0.00 1.00
#> 2 st.b 0.1 0.4000000 0.5000000 0.75 0.25
# explicitly name the descriptor argument
cwm.2e <- get_trait_density (
taxon = Blong$species,
descriptor = list(station = Blong$station),
value = Blong$density,
trait = Btraits)
cwm.2e
#> station T1_M1 T1_M2 T1_M3 T2_M1 T2_M2
#> 1 st.a 0.0 0.6666667 0.3333333 0.00 1.00
#> 2 st.b 0.1 0.4000000 0.5000000 0.75 0.25
## ====================================================
## use of scalewithvalue
## ====================================================
# if verbose=TRUE, a warning will be given, as there is no trait information
# for sp.4 - this affects the trait density for station "st.b"
cwm.trait.1 <- get_trait_density (
wide = Bwide,
trait = Btraits,
trait_class = Btraits.lab$trait,
trait_score = Btraits.lab$score,
scalewithvalue = TRUE,
verbose = FALSE)
cwm.trait.1
#> station T1 T2
#> 1 st.a 0.6666667 2.00
#> 2 st.b 0.7000000 0.65
attributes(cwm.trait.1)$notrait # species that was ignored
#> [1] "sp.4"
# same, but total trait values
cwm.trait.2 <- get_trait_density (
wide = Bwide,
trait = Btraits,
trait_class = Btraits.lab$trait,
trait_score = Btraits.lab$score,
scalewithvalue = FALSE,
verbose = FALSE)
cwm.trait.2
#> station T1 T2
#> 1 st.a 2.0 6.0
#> 2 st.b 4.2 3.9
## ====================================================
## pass taxonomy to estimate traits for unrecorded species
## ====================================================
cwm.trait.2 <- get_trait_density (
wide = Bwide,
trait = Btraits,
trait_class = Btraits.lab$trait,
trait_score = Btraits.lab$score,
taxonomy = Btaxonomy,
scalewithvalue = TRUE)
cwm.trait.2
#> station T1 T2
#> 1 st.a 0.6666667 2.0000000
#> 2 st.b 0.7250000 0.7142857
attributes(cwm.trait.2)$notrait # none
#> [1] NA
##-----------------------------------------------------
## Same but keeping fuzzy scores
##-----------------------------------------------------
cwm.trait.3 <- get_trait_density (
wide = Bwide,
trait = Btraits,
taxonomy = Btaxonomy,
scalewithvalue = TRUE)
cwm.trait.3
#> station T1_M1 T1_M2 T1_M3 T2_M1 T2_M2
#> 1 st.a 0.00000000 0.6666667 0.3333333 0.0000000 1.0000000
#> 2 st.b 0.08571429 0.3785714 0.5357143 0.7142857 0.2857143
## ====================================================
## categorical traits
## ====================================================
# Note: no data for "sp.4"
Bcategory <- data.frame(
taxon = c("sp.1","sp.2","sp.3","sp.5","sp.6"),
C1 = c( "A", "B", "A", "C", "C")
)
if (FALSE) { # \dontrun{
# this will not work, as trait should be numerical
cwm.trait.4 <- get_trait_density (
wide = Bwide,
trait = Bcategory,
taxonomy = Btaxonomy,
scalewithvalue = TRUE)
} # }
# convert categorical traits to numerical values
crisp2fuzzy(Bcategory)
#> taxon C1_A C1_B C1_C
#> 1 sp.1 1 0 0
#> 2 sp.2 0 1 0
#> 3 sp.3 1 0 0
#> 4 sp.5 0 0 1
#> 5 sp.6 0 0 1
# this will work, as categorical traits -> numerical
cwm.trait.4 <- get_trait_density (
wide = Bwide,
trait = crisp2fuzzy(Bcategory),
taxonomy = Btaxonomy,
scalewithvalue = TRUE)
cwm.trait.4
#> station C1_A C1_B C1_C
#> 1 st.a 0.6666667 0.33333333 0.0000000
#> 2 st.b 0.5000000 0.07142857 0.4285714