Functions for working with trait and taxonomic data.

getTrait gets trait information for (unique) taxa.

fuzzy2crisp translates fuzzy traits into crisp traits.

crisp2fuzzy translates crisp traits into fuzzy traits. Note: this makes sense only for categorical traits.

extendTrait expands taxon x trait data based on their taxonomic composition.

Usage

getTrait(taxon, trait, trait.class=NULL, trait.score=NULL, 
         taxonomy = NULL, t.column = 1, 
         standardize = FALSE, verbose = FALSE)

extendTrait(trait, taxonomy, t.column = 1) 

fuzzy2crisp(trait, trait.class, trait.score, t.column = 1, 
        standardize = TRUE)

crisp2fuzzy (trait, t.column = 1)

Arguments

taxon: vector with the taxon names for which to get the trait information.
trait: (taxon x trait) data or (descriptor x trait) data, in WIDE format. Traits can be fuzzy coded. In the default setting, the first column contains the name of the taxon, and t.column=1. The taxa may also be stored as the row.names of the data.frame; in that case, set t.column=0.
trait.class: indices to trait classes, a vector. The length of this vector should equal the number of columns of trait minus the number of descriptor columns (ncol(trait) - t.column in the default setting). If not NULL, the fuzzy traits will be converted to crisp values.
trait.score: trait values or scores, a vector. Should be of same length as trait.class.
standardize: when TRUE, will standardize the trait modalities, so that for each trait class they sum to 1. Only used when trait.class is given a value.
verbose: when TRUE, will write warnings to the screen.
taxonomy: taxonomic information (the relationships between the taxa), a data.frame; the first column will be matched with taxon, regardless of its name. This is used to estimate the traits of taxa that are not in the trait data, based on the traits of taxa at the nearest taxonomic level. See details.
t.column: position(s) or name(s) of the column(s) that hold the taxon names in the data set. The default is to have the first column holding the taxon. If NULL or 0, there is no separate column with names, and the row.names of the data set are used as the taxon names.
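
How trait.class, trait.score and standardize work together for numeric scores can be sketched in base R. This is a minimal illustration that assumes the weighted-mean behaviour described in the examples; it is not the package's implementation. The variable names (fuzzy, trait_class, trait_score) are chosen here for illustration, and the values are those of sp.5 in the small example data set.

```r
# Fuzzy scores of one taxon (sp.5 in the example data), one value per modality
fuzzy       <- c(T1_M1 = 0.2, T1_M2 = 0.3, T1_M3 = 0.5, T2_M1 = 0.5, T2_M2 = 0.5)
trait_class <- c("T1", "T1", "T1", "T2", "T2")   # trait to which each column belongs
trait_score <- c(0, 0.5, 1, 0.2, 2)              # numeric score of each modality

# Standardize: rescale so that the modalities of each trait class sum to 1
# (a no-op for these values, which already sum to 1 per class)
class_sum <- ave(fuzzy, trait_class, FUN = sum)
std       <- fuzzy / class_sum

# Crisp value per trait: score-weighted sum of the standardized modalities
crisp <- tapply(std * trait_score, trait_class, sum)
print(crisp)
```

For these inputs the sketch gives 0.65 for T1 and 1.1 for T2, the values listed for sp.5 in the crisp output of the examples.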

Value

extendTrait returns the trait matrix, augmented with higher taxonomic levels.

getTrait returns the taxon x trait matrix, in fuzzy format (if the trait database was in fuzzy format). However, when trait.class is not NULL, the matrix will be in crisp format.

fuzzy2crisp returns the trait matrix, in crisp format (not fuzzy coded).

crisp2fuzzy returns the trait matrix, in fuzzy format. If the original crisp trait matrix was categorical, the output will include a description of each column in its attribute description; use metadata(...) to extract this.
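
The binary coding that results from converting a categorical crisp trait can be sketched in base R as follows. This is an illustration of the idea only, under the assumption that each category becomes one 0/1 column named trait_modality; it does not reproduce the package's code, and the object names are hypothetical.

```r
# Crisp categorical trait for three taxa (values as in the small example data)
crisp <- data.frame(species = c("sp.1", "sp.2", "sp.3"),
                    T1      = c("M2", "M3", "M2"),
                    stringsAsFactors = FALSE)

# One 0/1 column per category: 1 where the taxon has that category
modalities <- sort(unique(crisp$T1))
binary     <- sapply(modalities, function(m) as.numeric(crisp$T1 == m))
colnames(binary) <- paste("T1", modalities, sep = "_")

fuzzy <- data.frame(species = crisp$species, binary)
print(fuzzy)
```

Each row of the result has exactly one 1 per trait, which is why the conversion is only meaningful for categorical traits.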

Author

Karline Soetaert <karline.soetaert@nioz.nl>, Olivier Beauchard

Details

The taxonomy is used to fill in the gaps of the trait information, assuming that closely related taxa will share similar traits. This is done in several steps:

In function extendTrait the traits are extended with information on higher taxonomic levels, provided that information is not yet in the trait database. The traits for a taxonomic level are estimated as the average of the traits at the lower level. For instance, traits on genus level will be averages of known traits of all species in the database belonging to this genus.
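
The averaging step can be sketched in base R for the genus level. This is a minimal illustration under the assumption that a genus receives the plain mean of the species records available for it; it is not the code used by extendTrait, and the object names are chosen here for illustration.

```r
# Taxonomy and trait records (subset of the small example data)
taxonomy <- data.frame(species = c("sp.1", "sp.2", "sp.3", "sp.4"),
                       genus   = c("g.1",  "g.2",  "g.2",  "g.2"),
                       stringsAsFactors = FALSE)
traits <- data.frame(species = c("sp.1", "sp.2", "sp.3"),   # sp.4 has no record
                     T1_M1   = c(0, 0, 0),
                     T1_M2   = c(1, 0, 0.5),
                     T1_M3   = c(0, 1, 0.5),
                     stringsAsFactors = FALSE)

# Attach the genus to every species that has a trait record
joined <- merge(traits, taxonomy, by = "species")

# Average the trait columns per genus
trait_cols   <- c("T1_M1", "T1_M2", "T1_M3")
genus_traits <- aggregate(joined[, trait_cols],
                          by = list(genus = joined$genus), FUN = mean)
print(genus_traits)
```

Here g.2 is the mean of sp.2 and sp.3 only, because sp.4 has no trait record to contribute.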

In function getTrait, the trait database is first extended with information on higher taxonomic levels (using extendTrait). Then, for each taxon that is not present in the trait database, the traits on the closest taxonomic level are used. For instance, for an unrecorded species, the function first checks whether the trait is known on genus level; if not, it checks the family level, and so on.
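
The search for the closest taxonomic level can be sketched in base R. The helper nearest_trait below is hypothetical and written only for this illustration; the trait values in known are made up, and the sketch handles a single trait column T1 rather than a full trait matrix.

```r
# Taxonomy: first column is the taxon, remaining columns go from low to high level
taxonomy <- data.frame(species = c("sp.3", "sp.4"),
                       genus   = c("g.2",  "g.2"),
                       family  = c("f.1",  "f.1"),
                       stringsAsFactors = FALSE)

# Known trait values (illustrative numbers); note that g.2 is deliberately absent
known <- data.frame(taxon = c("sp.3", "f.1"),
                    T1    = c(0.75, 0.6),
                    stringsAsFactors = FALSE)

# Hypothetical helper: return the trait of the taxon itself, else of the
# nearest higher level that has a record, else NA
nearest_trait <- function(taxon, taxonomy, known) {
  hit <- match(taxon, known$taxon)
  if (!is.na(hit)) return(known$T1[hit])
  row <- match(taxon, taxonomy[[1]])
  if (is.na(row)) return(NA_real_)
  for (level in names(taxonomy)[-1]) {
    hit <- match(taxonomy[[level]][row], known$taxon)
    if (!is.na(hit)) return(known$T1[hit])
  }
  NA_real_
}

nearest_trait("sp.3", taxonomy, known)  # direct record
nearest_trait("sp.4", taxonomy, known)  # genus missing, falls back to family
nearest_trait("sp.9", taxonomy, known)  # not in the taxonomy at all
```

In this sketch sp.4 skips the genus level, where no record exists, and takes the family value; a taxon that is absent from the taxonomy stays NA.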

For getTrait, only the traits for unique taxa are returned. Traits will be NA if they were not present in the trait database and could not be derived based on taxonomic closeness. The list of taxa whose traits remain unknown can be found in attributes(..)$notrait.

For subsequent calculations, it is best to remove the NAs from the result, or to set them to 0 (when the number of remaining unassigned taxa is deemed acceptable).
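
Both ways of handling the NAs can be sketched in base R. The data.frame res below is a hand-made stand-in for a getTrait result, with the notrait attribute set manually for the purpose of the illustration; the taxon sp.9 and the object names are hypothetical.

```r
# Mock result: sp.9 could not be assigned any traits
res <- data.frame(taxon = c("sp.1", "sp.4", "sp.9"),
                  T1    = c(0.5, 0.875, NA),
                  T2    = c(2.0, 1.1,   NA),
                  stringsAsFactors = FALSE)
attr(res, "notrait") <- "sp.9"   # set by hand here, to mimic the documented attribute

# Taxa for which no traits were found
missing_taxa <- attributes(res)$notrait

# Option 1: drop the taxa whose traits are NA
res_complete <- res[complete.cases(res[, -1]), ]

# Option 2: keep all taxa and set the NAs in the numeric columns to 0
res_zero <- res
num <- sapply(res_zero, is.numeric)
res_zero[num] <- lapply(res_zero[num], function(v) replace(v, is.na(v), 0))

print(missing_taxa)
print(res_complete)
print(res_zero)
```

Which option is appropriate depends on the subsequent calculation: dropping rows changes the set of taxa, whereas zeros keep the taxa but treat them as not expressing any modality.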

Examples


## ====================================================
## Small dataset: taxonomy
## ====================================================

Btaxonomy <- data.frame(
  species = c("sp.1","sp.2","sp.3","sp.4","sp.5","sp.6"),
  genus   = c( "g.1", "g.2", "g.2", "g.2", "g.3", "g.4"),
  family  = c( "f.1", "f.1", "f.1", "f.1", "f.2", "f.3"),
  order   = c( "o.1", "o.1", "o.1", "o.1", "o.2", "o.2"),
  class   = c( "c.1", "c.1", "c.1", "c.1", "c.1", "c.1")
  )

## ====================================================
## Small dataset: fuzzy-coded traits
## ====================================================

Btraits <- data.frame(
  species = c("sp.1","sp.2","sp.3","sp.5","sp.6"),
  T1_M1   = c(0     , 0    ,   0  , 0.2  ,     1),
  T1_M2   = c(1     , 0    , 0.5  , 0.3  ,     0),
  T1_M3   = c(0     , 1    , 0.5  , 0.5  ,     0),
  T2_M1   = c(0     , 0    ,   1  , 0.5  ,     1),
  T2_M2   = c(1     , 1    ,   0  , 0.5  ,     0)
)

# The 'metadata' of this trait database
Btraits.lab <- data.frame(
  colname  =c("T1_M1","T1_M2","T1_M3","T2_M1","T2_M2"),
  trait    =c("T1"   ,"T1"   ,"T1"   ,"T2"   ,"T2"),
  modality =c("M1"   ,"M2"   ,"M3"   ,"M1"   ,"M2"), 
  score    =c(0      , 0.5   , 1     , 0.2   , 2)
)

##-----------------------------------------------------
## Show traits
##-----------------------------------------------------

# sp.4 is not in Btraits 
getTrait (taxon      = "sp.4", 
          trait      = Btraits)
#>   taxon T1_M1 T1_M2 T1_M3 T2_M1 T2_M2
#> 1  sp.4    NA    NA    NA    NA    NA

# sp.4 traits derived from taxonomic tree (g.2)
getTrait (taxon      = "sp.4", 
          trait      = Btraits, 
          taxonomy   = Btaxonomy)
#>   species T1_M1 T1_M2 T1_M3 T2_M1 T2_M2
#> 1    sp.4     0  0.25  0.75   0.5   0.5

getTrait (taxon      = c("g.2"), 
          trait      = Btraits, 
          taxonomy   = Btaxonomy)
#>   species T1_M1 T1_M2 T1_M3 T2_M1 T2_M2
#> 1     g.2     0  0.25  0.75   0.5   0.5

# g.2 is derived as the mean of sp.2 and sp.3 (sp.4 has no record and inherits from g.2)
getTrait (taxon      = c("sp.2", "sp.3", "sp.4"), 
          trait      = Btraits, 
          taxonomy   = Btaxonomy)
#>   species T1_M1 T1_M2 T1_M3 T2_M1 T2_M2
#> 1    sp.2     0  0.00  1.00   0.0   1.0
#> 2    sp.3     0  0.50  0.50   1.0   0.0
#> 3    sp.4     0  0.25  0.75   0.5   0.5

getTrait (taxon      = c("g.2"), 
          trait      = Btraits, 
          taxonomy   = Btaxonomy)
#>   species T1_M1 T1_M2 T1_M3 T2_M1 T2_M2
#> 1     g.2     0  0.25  0.75   0.5   0.5

##-----------------------------------------------------
## crisp values of traits
##-----------------------------------------------------

# for categorical values: the most abundant category is taken
 C1 <- fuzzy2crisp(trait       = Btraits, 
                   trait.class = Btraits.lab$trait, 
                   trait.score = Btraits.lab$modality, 
                   standardize = TRUE)
 C1
#>   species T1 T2
#> 1    sp.1 M2 M2
#> 2    sp.2 M3 M2
#> 3    sp.3 M2 M1
#> 4    sp.5 M3 M1
#> 5    sp.6 M1 M1
 
# the reverse returns a binary-coded value
 (C2fuz <- crisp2fuzzy(trait       = C1))
#>   species T1_M1 T1_M2 T1_M3 T2_M1 T2_M2
#> 1    sp.1     0     1     0     0     1
#> 2    sp.2     0     0     1     0     1
#> 3    sp.3     0     1     0     1     0
#> 4    sp.5     0     0     1     1     0
#> 5    sp.6     1     0     0     1     0
 metadata(C2fuz)
#>     colname trait modality
#> T11   T1_M1    T1       M1
#> T12   T1_M2    T1       M2
#> T13   T1_M3    T1       M3
#> T21   T2_M1    T2       M1
#> T22   T2_M2    T2       M2
 
# for numeric (or binary) values: the weighted mean is calculated
 C2 <- fuzzy2crisp(trait       = Btraits, 
                   trait.class = Btraits.lab$trait, 
                   trait.score = Btraits.lab$score, 
                   standardize = TRUE)
 C2
#>   species   T1  T2
#> 1    sp.1 0.50 2.0
#> 2    sp.2 1.00 2.0
#> 3    sp.3 0.75 0.2
#> 4    sp.5 0.65 1.1
#> 5    sp.6 0.00 0.2
 (C2f <- crisp2fuzzy(C2))  # this has no effect
#>   species   T1  T2
#> 1    sp.1 0.50 2.0
#> 2    sp.2 1.00 2.0
#> 3    sp.3 0.75 0.2
#> 4    sp.5 0.65 1.1
#> 5    sp.6 0.00 0.2
 
##-----------------------------------------------------
## Extend traits with higher level information
##-----------------------------------------------------

Btraits.ext <- extendTrait (trait    = Btraits, 
                            taxonomy = Btaxonomy)

Btraits.all <- rbind(Btraits, Btraits.ext)
Btraits.all
#>    species T1_M1 T1_M2 T1_M3     T2_M1     T2_M2
#> 1     sp.1  0.00  1.00  0.00 0.0000000 1.0000000
#> 2     sp.2  0.00  0.00  1.00 0.0000000 1.0000000
#> 3     sp.3  0.00  0.50  0.50 1.0000000 0.0000000
#> 4     sp.5  0.20  0.30  0.50 0.5000000 0.5000000
#> 5     sp.6  1.00  0.00  0.00 1.0000000 0.0000000
#> 6      g.1  0.00  1.00  0.00 0.0000000 1.0000000
#> 7      g.2  0.00  0.25  0.75 0.5000000 0.5000000
#> 8      g.3  0.20  0.30  0.50 0.5000000 0.5000000
#> 9      g.4  1.00  0.00  0.00 1.0000000 0.0000000
#> 10     f.1  0.00  0.50  0.50 0.3333333 0.6666667
#> 11     f.2  0.20  0.30  0.50 0.5000000 0.5000000
#> 12     f.3  1.00  0.00  0.00 1.0000000 0.0000000
#> 13     o.1  0.00  0.50  0.50 0.3333333 0.6666667
#> 14     o.2  0.60  0.15  0.25 0.7500000 0.2500000
#> 15     c.1  0.24  0.36  0.40 0.5000000 0.5000000

# same, but in crisp format
fuzzy2crisp(trait       = Btraits.all, 
            trait.score = Btraits.lab$score, 
            trait.class = Btraits.lab$trait)
#>    species    T1   T2
#> 1     sp.1 0.500 2.00
#> 2     sp.2 1.000 2.00
#> 3     sp.3 0.750 0.20
#> 4     sp.5 0.650 1.10
#> 5     sp.6 0.000 0.20
#> 6      g.1 0.500 2.00
#> 7      g.2 0.875 1.10
#> 8      g.3 0.650 1.10
#> 9      g.4 0.000 0.20
#> 10     f.1 0.750 1.40
#> 11     f.2 0.650 1.10
#> 12     f.3 0.000 0.20
#> 13     o.1 0.750 1.40
#> 14     o.2 0.325 0.65
#> 15     c.1 0.580 1.10

# In one go, and including sp.4
getTrait (taxon    = c("sp.1", "sp.2", "sp.3", "sp.4", "sp.5", "sp.6"),
          trait    = Btraits, 
          taxonomy = Btaxonomy, 
          trait.score = Btraits.lab$score, 
          trait.class = Btraits.lab$trait)
#>   species    T1  T2
#> 1    sp.1 0.500 2.0
#> 2    sp.2 1.000 2.0
#> 3    sp.3 0.750 0.2
#> 4    sp.4 0.875 1.1
#> 5    sp.5 0.650 1.1
#> 6    sp.6 0.000 0.2
          
          
# Small example with real taxon names: extend species traits to higher levels
TraitSmall <- data.frame(species=c("Alcyonium acaule",  "Alcyonium coralloides"),
                         T1.M1=c(0.9,1), T1.M2=c(0.1,0), T1.M3=c(0,0))
TaxSmall <- data.frame(species = c("Alcyonium acaule",  "Alcyonium coralloides"), 
              Genus   = c("Alcyonium", "Alcyonium"), 
              Family  = c("Alcyoniidae", "Alcyoniidae"), 
              order   = c("Alcyonacea", "Alcyonacea"))
T1 <- extendTrait(trait=TraitSmall, taxonomy=TaxSmall)