Hexagonal bin plot of the predicted probability of presence.
Performance of the best method used for prediction
The performance indices shown below are based on the confusion matrix of the best model and the best method, applied to the whole set of presences and pseudo-absences for this species distribution. More details are in the 'method and parameters' tab.
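As a rough illustration of how such indices can be derived from a confusion matrix, here is a minimal Python sketch. It is not the code behind this page (the original work was done in R); the function names and the 0/1 coding (1 = presence, 0 = pseudo-absence) are assumptions made for the example, and the exact set of indices shown on the page may differ. The TSS formula is the one given in the method notes (sensitivity + specificity - 1).

```python
def confusion_counts(observed, predicted):
    """Count (tp, fp, tn, fn) for 0/1 labels, where 1 = presence.

    Hypothetical helper: assumes both sequences contain only 0 and 1.
    """
    tp = fp = tn = fn = 0
    for obs, pred in zip(observed, predicted):
        if pred == 1 and obs == 1:
            tp += 1  # presence correctly predicted
        elif pred == 1 and obs == 0:
            fp += 1  # pseudo-absence predicted as presence
        elif pred == 0 and obs == 0:
            tn += 1  # pseudo-absence correctly predicted
        else:
            fn += 1  # presence predicted as absence
    return tp, fp, tn, fn


def performance_indices(tp, fp, tn, fn):
    """Derive common indices from confusion-matrix counts.

    Assumes at least one presence and one pseudo-absence,
    otherwise the divisions below are undefined.
    """
    sensitivity = tp / (tp + fn)  # share of presences recovered
    specificity = tn / (tn + fp)  # share of pseudo-absences recovered
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "accuracy": accuracy,
        "tss": sensitivity + specificity - 1,
    }
```

The TSS ranges from -1 to 1, and a value of 0 means the model does no better than chance on this definition.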
Summary of probability at map extent
Variable importance evaluates the weight of each predictor in the model. More info on
Summary of predictors at map extent
Short description of the environmental variables used as predictors.
Each environmental variable was extracted from a raster layer at the river segment location, with a base resolution of 25 meters.
caco3 : Content of calcium carbonate in the underlying bedrock. Classes from 0=low to 6=high.
dem25 : Digital elevation model in meters at a resolution of 25 meters.
slope : Slope in percent, derived from dem25.
agri : Number of agricultural cells in a buffer of 125m around each river segment, computed with a moving-window algorithm.
forest : Number of forest cells in a buffer of 125m around each river segment, computed with a moving-window algorithm.
sundur : Mean yearly sunshine duration, in percent.
ectemp : Yearly temperature range between the hottest and coldest months.
mtemp : Mean yearly air temperature in °C.
stream : Stream order according to Strahler. Defines the hierarchy of stream segments.
mean : Mean annual flow in m³/s.
high : Number of high-flow conditions: mean of Q90 of all years / median over the whole period.
hispell : Total number of high-flow spells over the whole period (threshold: 0.9 * mean flow value).
lospell : Total number of low-flow spells over the whole period (threshold: 0.05 * mean flow value).
dure : Mean duration of high-flow spells, in days.
zero : Mean number of zero-flow days over the whole period.
cons : Mean constancy of flow events, based on seasonal mean daily flow.
rise : Mean number of rises.
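The stream predictor listed above relies on Strahler ordering. As a hedged illustration of the general rule (not the procedure actually used to build this layer), the sketch below assigns order 1 to headwater segments, raises the order by one where two or more tributaries of the same highest order meet, and otherwise carries the highest tributary order downstream. The function name and the `upstream` dictionary layout are assumptions made for the example, and the network is assumed to be a tree without loops.

```python
def strahler_orders(upstream):
    """Compute the Strahler order of every segment in a river network.

    `upstream` is a hypothetical layout: it maps a segment id to the list
    of segment ids that flow directly into it. Segments that never appear
    as a key are treated as headwaters. The network must be acyclic.
    """
    orders = {}
    for start in upstream:
        stack = [start]  # explicit stack avoids deep recursion on long rivers
        while stack:
            seg = stack[-1]
            if seg in orders:
                stack.pop()
                continue
            tributaries = upstream.get(seg, [])
            pending = [t for t in tributaries if t not in orders]
            if pending:
                # Resolve the tributaries first, then come back to `seg`.
                stack.extend(pending)
                continue
            if not tributaries:
                orders[seg] = 1  # headwater segment
            else:
                child_orders = [orders[t] for t in tributaries]
                highest = max(child_orders)
                # Order increases only when the highest order occurs twice.
                if child_orders.count(highest) >= 2:
                    orders[seg] = highest + 1
                else:
                    orders[seg] = highest
            stack.pop()
    return orders
```

For example, two order-1 headwaters joining give an order-2 segment, while an order-2 segment joined by an order-1 tributary stays at order 2.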
Model's tuning parameters
Each species distribution was modeled with a set of 12 methods: mlp, mlpWeightDecay, pcaNNet, nnet, avNNet, bdk, xyf, gbm, glm, gam, earth, and svmRadial (Method definition and parameters list). The best one was selected based on its true skill statistic index (TSS = sensitivity + specificity - 1). This index was computed from a train/test split in which 80% of the data was used to fit a model and the remaining 20% was kept to evaluate it. The parameters of each model were tuned with the caret package in R. Each resample generated during this step was tested with repeated K-fold cross-validation, with K=10, repeated 2 times. The best one, based on its TSS, was kept to model the whole training set. All tested parameters and their corresponding performance indices are listed below.
For my master's thesis, I compared multiple prediction methods, mostly from the neural-network family, to evaluate changes in the distribution of benthic macroinvertebrates in the Rhone watershed, canton of Valais, Switzerland. The purpose of this web page is to share the results of this work in an interactive way.
The two coverage periods run from 1980 to 1990 and from 1991 to 2008. For each period, 17 environmental variables and presence-only data for 88 species were used to feed 12 models in R. The best model for each species was used to predict its presence on 217000 river segments. The work was done mainly in R.
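To make the resampling scheme described above more concrete, here is a minimal Python sketch of an 80/20 split followed by 10-fold cross-validation repeated 2 times, with the best candidate picked by TSS. It only illustrates the general idea: the original tuning was done with caret in R, whose own resampling may differ (for instance in how rows are stratified), and all function names here are hypothetical.

```python
import random


def train_test_split(n_rows, train_fraction=0.8, seed=0):
    """Shuffle row indices and split them into train and test parts."""
    rng = random.Random(seed)
    indices = list(range(n_rows))
    rng.shuffle(indices)
    cut = int(round(n_rows * train_fraction))
    return indices[:cut], indices[cut:]


def repeated_kfold(indices, k=10, repeats=2, seed=0):
    """Yield (repeat, fold, fit_rows, validation_rows) for each resample.

    Each repeat reshuffles the rows, so k * repeats resamples are produced.
    """
    rng = random.Random(seed)
    for repeat in range(repeats):
        shuffled = list(indices)
        rng.shuffle(shuffled)
        # Deal the rows into k folds of near-equal size.
        folds = [shuffled[i::k] for i in range(k)]
        for held_out in range(k):
            validation_rows = folds[held_out]
            fit_rows = [row
                        for j, fold in enumerate(folds) if j != held_out
                        for row in fold]
            yield repeat, held_out, fit_rows, validation_rows


def best_by_tss(tss_by_candidate):
    """Return the candidate (method or parameter set) with the highest TSS."""
    return max(tss_by_candidate, key=tss_by_candidate.get)
```

With K=10 and 2 repeats, each candidate parameter set is evaluated on 20 resamples of the training rows; the 20% test rows are left untouched until the final evaluation.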