Functions for combining station x taxon and trait data.
getTraitDensity combines (descriptor x taxon) density data with (taxon x trait) data into a (descriptor x trait) data set.
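As a rough illustration of the idea (not the package's implementation), this combination amounts to a matrix product of the (descriptor x taxon) density matrix with the (taxon x trait) matrix. The sketch uses base R only; the objects dens and traits and their values are made up for this illustration.

```r
# Hypothetical toy data: 2 stations x 3 taxa, and 3 taxa x 2 traits
dens <- matrix(c(2, 1, 0,
                 0, 3, 3),
               nrow = 2, byrow = TRUE,
               dimnames = list(c("st.a", "st.b"), c("sp.1", "sp.2", "sp.3")))
traits <- matrix(c(1.0, 0.0,
                   0.0, 1.0,
                   0.5, 0.5),
                 nrow = 3, byrow = TRUE,
                 dimnames = list(c("sp.1", "sp.2", "sp.3"), c("T1", "T2")))

# Summed trait values per station: (descriptor x taxon) %*% (taxon x trait)
trait.sum <- dens %*% traits[colnames(dens), , drop = FALSE]

# Density-weighted mean trait values: divide each row by the station's total density
trait.mean <- trait.sum / rowSums(dens)

trait.sum
trait.mean
```

Matching the taxa by name (traits[colnames(dens), ]) rather than by position guards against the two tables listing taxa in a different order.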
Usage
getTraitDensity(descriptor, taxon, value, averageOver = NULL,
wide = NULL, d.column = 1, trait, t.column = 1,
trait.class = NULL, trait.score = NULL, taxonomy = NULL,
scalewithvalue = TRUE, verbose = FALSE)
Arguments
- descriptor
name(s) of the descriptor, i.e. *where* the data were taken, e.g. station names. Either a vector, a list, a data.frame or a matrix (possibly with multiple columns). It can be numerical, character, or a factor. When a data.frame or a list, the "names" will be used in the output; when a vector, the argument name will be used in the output.
- taxon
vector describing *what* the data are; it gives the taxonomic name (e.g. species). Should be of the same length as (the number of rows of) descriptor. Can be a list (or data.frame with one column), or a vector. When a data.frame or a list the "name" will be used in the output; when a vector, the argument name will be used.
- value
vector or list that contains the *values* of the data, usually density. Should be of the same length as (the number of rows of) descriptor and taxon. For function getDensity, value can also be a multi-column data.frame or matrix.
- averageOver
vector with *replicates* over which averages need to be taken. Should be of the same length as (the number of rows of) descriptor.
- wide
density data, in *WIDE* format. This is a data.frame or matrix with (descriptor x taxon) information. If NULL, this data.frame will be calculated from the descriptor, taxon, (replicate) and value data. The number of descriptor columns is specified with d.column. If a data.frame, then the first column usually contains the descriptor name; in this case the dimensions of wide are (number of descriptors x number of species+1), and d.column=1. It is also allowed to have the descriptors as row.names of the data.frame; this requires setting d.column=0.
- trait
(taxon x trait) data or (descriptor x trait) data, in *WIDE* format, and containing numerical values only. Traits can be fuzzy coded. The number of columns with taxonomic information is specified with t.column. The default is to have the first column contain the name of the taxon, and t.column=1. It is also allowed to have the taxa as row.names of the data.frame; in this case, t.column=0. See the last example for how to deal with categorical traits.
- trait.class
indices to trait levels, a vector. The length of this vector should equal the number of columns of trait or wide minus the value of t.column. If present, this, together with trait.score, will be used to convert the trait matrix from fuzzy to crisp format.
- trait.score
trait values or scores, a vector. Should be of the same length as trait.class.
- scalewithvalue
when TRUE, will standardize with respect to total density, so that the average trait value is obtained (not the summed value). Note that total density will be estimated only for those taxa whose traits are known.
- verbose
when TRUE, will write warnings to the screen.
- taxonomy
taxonomic information (the relationships between the taxa), a data.frame; the first column will be matched with taxon, regardless of its name. This is used to estimate traits of taxa that are not accounted for, and that will be estimated based on taxa at the nearest taxonomic level. See details.
- d.column, t.column
position(s) or name(s) of the column(s) that hold the descriptor of the density data set (data.frame wide), and the taxa in the trait data set (data.frame trait). The default is to have the first column holding the descriptors or taxa. If NULL, or 0, then there is no separate column with names, so the row.names of the dataset are used as descriptor or taxon names.
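How trait.class and trait.score can turn fuzzy-coded modalities into one crisp value per trait is sketched here in base R. The sketch assumes that the crisp value is the score-weighted sum of the modality columns within each class, and that the modality values of each trait sum to 1; under those assumptions it agrees with the numbers in the Examples section. It is an illustration, not the package's code, and the object names are chosen for this sketch only.

```r
# Fuzzy-coded traits for two taxa (values taken from the Examples section)
fuzzy <- matrix(c(0, 1.0, 0.0, 0, 1,
                  0, 0.5, 0.5, 1, 0),
                nrow = 2, byrow = TRUE,
                dimnames = list(c("sp.1", "sp.3"),
                                c("T1_M1", "T1_M2", "T1_M3", "T2_M1", "T2_M2")))
trait.class <- c("T1", "T1", "T1", "T2", "T2")  # trait that each column belongs to
trait.score <- c(0, 0.5, 1, 0.2, 2)             # score of each modality

# Multiply each column by its score, then sum the columns within each class
weighted <- sweep(fuzzy, 2, trait.score, "*")
crisp <- t(rowsum(t(weighted), group = trait.class))
crisp
```

The result has one row per taxon and one column per trait class, so it can be used in place of the fuzzy matrix wherever a (taxon x trait) table is expected.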
Details
The taxonomy is used to fill in the gaps of the trait information, assuming that closely related taxa will share similar traits. This is done in two steps:
The trait database is first extended with information on higher taxonomic levels (using *extendTrait*). The traits for a taxonomic level are estimated as the average of the traits at the lower level. For instance, traits on genus level will be averages of known traits of all species in the database belonging to this genus.
Then, for each taxon that is not present in the trait database, the traits on the closest taxonomic level are used. For instance, for an unrecorded species, it is first checked whether the trait is known on genus level; if not, the family level is checked, and so on.
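The gap-filling logic can be sketched in base R as follows. This is a simplified illustration under stated assumptions, not the code of getTraitDensity or extendTrait: the helper lookup_trait is hypothetical, it handles a single crisp trait, and it averages all known species within a level directly. That coincides with the stepwise averaging described above at genus level, but may differ at higher levels.

```r
# Toy data: crisp T1 values for four species; sp.4 is not in the trait table
traits <- data.frame(
  taxon = c("sp.1", "sp.2", "sp.3", "sp.5"),
  T1    = c(0.50, 1.00, 0.75, 0.65)
)
taxonomy <- data.frame(
  species = c("sp.1", "sp.2", "sp.3", "sp.4", "sp.5"),
  genus   = c("g.1",  "g.2",  "g.2",  "g.2",  "g.3"),
  family  = c("f.1",  "f.1",  "f.1",  "f.1",  "f.2")
)

# Hypothetical helper: look up a trait value, climbing the taxonomy if needed
lookup_trait <- function(sp, traits, taxonomy) {
  hit <- traits$T1[traits$taxon == sp]
  if (length(hit) == 1) return(hit)            # trait known at species level
  row <- taxonomy[taxonomy$species == sp, ]
  if (nrow(row) == 0) return(NA_real_)         # taxon not in the taxonomy
  for (level in names(taxonomy)[-1]) {         # genus first, then family, ...
    relatives <- taxonomy$species[taxonomy[[level]] == row[[level]]]
    known <- traits$T1[traits$taxon %in% relatives]
    if (length(known) > 0) return(mean(known)) # average over known relatives
  }
  NA_real_                                     # nothing found at any level
}

lookup_trait("sp.4", traits, taxonomy)  # mean of sp.2 and sp.3 (genus g.2)
```

Returning NA when no relative is found keeps unmatched taxa visible to the caller instead of silently treating them as zero.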
See also
MWTL for the data sets.
mapBtrait for simple plotting functions.
getDensity for functions working with density data.
getDbIndex for extracting bioturbation and bioirrigation indices.
extendTrait for functions working with traits.
Note
When traits of certain taxa are not found, they are set to 0 and the calculation proceeds; this is reported only when verbose is set to TRUE.
When that happens, the taxa that were ignored can be found in attributes(..)$notrait
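The effect of a taxon without trait information can be checked by hand. The base R sketch below redoes the arithmetic for station st.b of the Examples section, assuming (as stated for scalewithvalue) that the total density is taken over taxa with known traits only. The variable names are illustrative and the crisp T1 values were derived from the fuzzy scores in the examples.

```r
# Densities at st.b and crisp T1 values; sp.4 has no trait information
density <- c(sp.3 = 3, sp.4 = 1, sp.5 = 3)
T1      <- c(sp.3 = 0.75, sp.5 = 0.65)

known   <- intersect(names(density), names(T1))
notrait <- setdiff(names(density), names(T1))   # taxa that are ignored

# Summed trait value (scalewithvalue = FALSE): the missing taxon contributes 0
T1.sum <- sum(density[known] * T1[known])

# Mean trait value (scalewithvalue = TRUE): divide by the density of known taxa only
T1.mean <- T1.sum / sum(density[known])

notrait  # "sp.4"
T1.sum   # 4.2
T1.mean  # 0.7
```

Because sp.4 is left out of the denominator as well, the mean is taken over a total density of 6 rather than 7.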
Examples
## ====================================================
## A small dataset with replicates
## ====================================================
# 2 stations, 2 replicates for st.a, one replicate for st.b
Bdata.rep <- data.frame(
station = c("st.a","st.a","st.a","st.b","st.b","st.b"),
replicate = c( 1, 1, 2, 1, 1, 1),
species = c("sp.1","sp.2","sp.1","sp.3","sp.4","sp.5"),
density = c( 1, 2, 3, 3, 1, 3)
)
Bdata.rep
#> station replicate species density
#> 1 st.a 1 sp.1 1
#> 2 st.a 1 sp.2 2
#> 3 st.a 2 sp.1 3
#> 4 st.b 1 sp.3 3
#> 5 st.b 1 sp.4 1
#> 6 st.b 1 sp.5 3
##-----------------------------------------------------
## From long to wide format, averaging over replicates
##-----------------------------------------------------
Bwide <- with (Bdata.rep,
l2wDensity (value = density,
descriptor = station,
averageOver = replicate,
taxon = species))
Bwide
#> descriptor sp.1 sp.2 sp.3 sp.4 sp.5
#> 1 st.a 2 1 0 0 0
#> 2 st.b 0 0 3 1 3
##-----------------------------------------------------
## Small dataset: fuzzy-coded traits
##-----------------------------------------------------
# Note: no data for "sp.4"
Btraits <- data.frame(
taxon = c("sp.1","sp.2","sp.3","sp.5","sp.6"),
T1_M1 = c(0 , 0 , 0 , 0.2 , 1),
T1_M2 = c(1 , 0 , 0.5 , 0.3 , 0),
T1_M3 = c(0 , 1 , 0.5 , 0.5 , 0),
T2_M1 = c(0 , 0 , 1 , 0.5 , 1),
T2_M2 = c(1 , 1 , 0 , 0.5 , 0)
)
# The metadata for these traits
Btraits.lab <- data.frame(
colname =c("T1_M1","T1_M2","T1_M3","T2_M1","T2_M2"),
trait =c("T1" ,"T1" ,"T1" ,"T2" ,"T2"),
modality =c("M1" ,"M2" ,"M3" ,"M1" ,"M2"),
score =c(0 , 0.5 , 1 , 0.2 , 2)
)
##-----------------------------------------------------
## Small dataset: taxonomy
##-----------------------------------------------------
Btaxonomy <- data.frame(
species = c("sp.1","sp.2","sp.3","sp.4","sp.5","sp.6"),
genus = c( "g.1", "g.2", "g.2", "g.2", "g.3", "g.4"),
family = c( "f.1", "f.1", "f.1", "f.1", "f.2", "f.3"),
order = c( "o.1", "o.1", "o.1", "o.1", "o.2", "o.2"),
class = c( "c.1", "c.1", "c.1", "c.1", "c.1", "c.1")
)
##-----------------------------------------------------
## Community weighted mean score
##-----------------------------------------------------
# if verbose=TRUE, a warning will be given, as there is no trait information
# for sp.4; this affects the trait density for station "st.b"
cwm.trait <- getTraitDensity (
wide = Bwide,
trait = Btraits,
trait.class = Btraits.lab$trait,
trait.score = Btraits.lab$score,
scalewithvalue = TRUE,
verbose = FALSE)
cwm.trait
#> descriptor T1 T2
#> 1 st.a 0.6666667 2.00
#> 2 st.b 0.7000000 0.65
attributes(cwm.trait)$notrait # species that was ignored
#> [1] "sp.4"
# Same, but summed (not averaged) trait values
cwm.trait.2 <- getTraitDensity (
wide = Bwide,
trait = Btraits,
trait.class = Btraits.lab$trait,
trait.score = Btraits.lab$score,
scalewithvalue = FALSE,
verbose = FALSE)
cwm.trait.2
#> descriptor T1 T2
#> 1 st.a 2.0 6.0
#> 2 st.b 4.2 3.9
# Traits for all taxa in the dataset, using the taxonomy to fill gaps from higher taxonomic levels
cwm.trait.2 <- getTraitDensity (
wide = Bwide,
trait = Btraits,
trait.class = Btraits.lab$trait,
trait.score = Btraits.lab$score,
taxonomy = Btaxonomy,
scalewithvalue = TRUE)
cwm.trait.2
#> descriptor T1 T2
#> 1 st.a 0.6666667 2.0000000
#> 2 st.b 0.7250000 0.7142857
attributes(cwm.trait.2)$notrait
#> [1] NA
# Same but keeping fuzzy scores
cwm.trait.3 <- getTraitDensity (
wide = Bwide,
trait = Btraits,
taxonomy = Btaxonomy,
scalewithvalue = TRUE)
cwm.trait.3
#> descriptor T1_M1 T1_M2 T1_M3 T2_M1 T2_M2
#> 1 st.a 0.00000000 0.6666667 0.3333333 0.0000000 1.0000000
#> 2 st.b 0.08571429 0.3785714 0.5357143 0.7142857 0.2857143
##-----------------------------------------------------
## categorical traits
##-----------------------------------------------------
# Note: no data for "sp.4"
Bcategory <- data.frame(
taxon = c("sp.1","sp.2","sp.3","sp.5","sp.6"),
C1 = c( "A", "B", "A", "C", "C")
)
if (FALSE) {
# this will not work, as trait should be numerical
cwm.trait.4 <- getTraitDensity (
wide = Bwide,
trait = Bcategory,
taxonomy = Btaxonomy,
scalewithvalue = TRUE)
}
crisp2fuzzy(Bcategory)
#> taxon C1_A C1_B C1_C
#> 1 sp.1 1 0 0
#> 2 sp.2 0 1 0
#> 3 sp.3 1 0 0
#> 4 sp.5 0 0 1
#> 5 sp.6 0 0 1
# this will work, as categorical traits -> numerical
cwm.trait.4 <- getTraitDensity (
wide = Bwide,
trait = crisp2fuzzy(Bcategory),
taxonomy = Btaxonomy,
scalewithvalue = TRUE)
cwm.trait.4
#> descriptor C1_A C1_B C1_C
#> 1 st.a 0.6666667 0.33333333 0.0000000
#> 2 st.b 0.5000000 0.07142857 0.4285714
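The conversion of a categorical trait into numerical columns in the last example is essentially indicator (one-hot) coding. A minimal base R sketch that yields the same column layout for this toy data is given here; it does not claim to mirror how crisp2fuzzy is implemented, and the names levs, onehot and fuzzy are chosen for this illustration.

```r
Bcategory <- data.frame(
  taxon = c("sp.1", "sp.2", "sp.3", "sp.5", "sp.6"),
  C1    = c("A", "B", "A", "C", "C")
)

# One indicator (0/1) column per category level
levs   <- sort(unique(Bcategory$C1))
onehot <- sapply(levs, function(l) as.numeric(Bcategory$C1 == l))
colnames(onehot) <- paste0("C1_", levs)

# Put the taxon names back in the first column
fuzzy <- data.frame(taxon = Bcategory$taxon, onehot)
fuzzy
```

Each row has exactly one 1, so a density-weighted mean of these columns gives the share of the total density that falls in each category.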