GSMNP:Notebook/Maxent/Creating a Range Map

From OpenWetWare

(Difference between revisions)
Line 1: Line 1:
-
==Motivation and Background==
+
A desirable model output not included in the standard HTML output is a binary range map for the species. This map has only two categories, habitat and non-habitat. The user must choose the threshold that delineates habitat, based on the binomial test results in the Analysis of omission and commission section (5.1). As an example, we chose the "Balance training omission, predicted area and threshold value" threshold for the Hooded Warbler model, which gave a logistic threshold of 0.139. Our task is to create a binary map from the original logistic projection map in which values greater than 0.139 are true (1) and values less than 0.139 are false (0). The procedure follows:
-
===Motivation===
+
*Import the output logistic ASCII grid into ArcMap.
-
*Modeling of species distributions in parks holds many values for the scientific community, but for stewardship of park resources by the NPS, it is critical.  
+
ArcToolbox<math>\Rightarrow</math>Conversion Tools<math>\Rightarrow</math>To Raster<math>\Rightarrow</math>ASCII to Raster
-
**Only having species occurrences as points is of limited usefulness to park managers, since they cannot infer what is between the points.
+
**Input ASCII raster file: path to output folder/species name_ASCII.asc – Output raster: path to output folder/pred_species abbrev
-
*Knowing with some probability where species are in large natural areas is essential to taking actions to protect them, including monitoring, stewardship of rare species, reacting to a species that is suddenly found to be at-risk, and modeling future scenarios that place species in jeopardy.
+
**Output data type (optional): FLOAT
-
*Currently there are many threats to natural systems and native species at Great Smoky Mountains National Park.  
+
**Set Spatial Analyst Workspace to output folder Spatial Analyst<math>\Rightarrow</math>Options...<math>\Rightarrow</math>General
-
**The biological complexity, interactive stressors, and limited agency resources at the Smokies make knowing where to take the most effective actions imperative.
+
**Working Directory: path to output folder
-
===Background===
+
*Compute the logical comparisons
-
*Maxent is a method for generating predictive distributions given a set of occurrence data and known environmental variables at those locations.  
+
*Spatial Analyst<math>\Rightarrow</math>Raster Calculator
-
**This predicted distribution is constrained such that it is close to the empirical average of environmental variables at the occurrence locations.  
+
**hab_species abbrev = pred_species abbrev > threshold
-
**Among all possible models that fulfill these constraints the model of maximum entropy is the model which fits only the minimum constraints
+
**OK
-
**(i.e. it avoids over-fitting by choosing the most unconstrained model possible given the constraints set by the environmental variables at presence locations).  
+
After following the above recipe, the resulting grid will be in Arc binary grid format and have a value of 1 for habitat pixels, 0 for non-habitat pixels, and NODATA for pixels not inside the analysis mask [[File:figure3.png|frame|alt=Binary Range Map|Figure 3, Binary Range Map]]
-
*Maxent has been used extensively in physics and economics applications.
+
-
**It is just one among many different options for generating species prediction distributions using environmental variables at species presence sites ([http://www.nhm.ku.edu/desktopgarp/ GARP], [http://data.princeton.edu/R/glms.html GLM], [http://cran.r-project.org/web/packages/gam/index.html GAM]), but it has several advantages. Taken from [http://www.cs.princeton.edu/~schapire/papers/ecolmod.pdf Phillips et al. (2006)], maxent:
+
-
#requires only presence data, not presence/absence data
+
-
#can use both continuous and categorical variables
+
-
#the optimization is efficient,
+
-
#has a concise probabilistic definition,
+
-
#it avoids over-fitting through regularization
+
-
#can address sampling bias formally,
+
-
#output is continuous (not just yes/no), and
+
-
#is generative rather than discriminative which makes it better for small sample sizes.
+
-
===Strengths & Weaknesses===
+
-
*There is some criticism against using Maxent for species distribution modelling. Specifically, Maxent considers only presence data instead of both presence and absence data. As a result, capture probabilities are not explicitly included in the model. This is nearly anathema in the field of Wildlife Biology where predictions based on mark-recapture studies have been the norm for years.
+
-
*There are at least 3 practical answers to this criticism:
+
-
#The first is to be explicit about the prediction probabilities that maxent produces.
+
-
##Rather than modelling the probability of an occurrence, maxent models the probability that an occurrence at a given location is different from a randomly selected location.
+
-
##The difference from true occurrence prediction is subtle, and in many cases probably does not matter.
+
-
#Second, outside of animal studies, presence data, not presence/absence data or multiple observer data, is the norm.
+
-
##We know of no published data on plants where multiple observers were used to assess the observation probability of a species. Longitudinal studies are common, but they are not used in the same way that mark-recapture studies are used with animals.
+
-
#Finally, because of the advantages outlined above, maxent is the easiest model to implement for the large number of species that must be modeled in the GRSM.
+
-
##Developing an in-house model with all the advantages of maxent that includes both presence/absence data would be extremely costly.
+
-
##It is likely that support for presence/absence data will be included in future versions of maxent, at which point the prediction surfaces can easily be recalculated without the cost of developing an in-house solution.
+

Revision as of 22:26, 6 August 2014

A desirable model output not included in the standard HTML output is a binary range map for the species. This map has only two categories, habitat and non-habitat. The user must choose the threshold that delineates habitat, based on the binomial test results in the Analysis of omission and commission section (5.1). As an example, we chose the "Balance training omission, predicted area and threshold value" threshold for the Hooded Warbler model, which gave a logistic threshold of 0.139. Our task is to create a binary map from the original logistic projection map in which values greater than 0.139 are true (1) and values less than 0.139 are false (0). The procedure follows:

  • Import the output logistic ASCII grid into ArcMap.

ArcToolbox ⇒ Conversion Tools ⇒ To Raster ⇒ ASCII to Raster

    • Input ASCII raster file: path to output folder/species name_ASCII.asc
    • Output raster: path to output folder/pred_species abbrev
    • Output data type (optional): FLOAT
    • Set Spatial Analyst Workspace to output folder: Spatial Analyst ⇒ Options... ⇒ General
    • Working Directory: path to output folder
  • Compute the logical comparisons
  • Spatial Analyst ⇒ Raster Calculator
    • hab_species abbrev = pred_species abbrev > threshold
    • OK

After following the above recipe, the resulting grid will be in Arc binary grid format and have a value of 1 for habitat pixels, 0 for non-habitat pixels, and NODATA for pixels not inside the analysis mask.

Figure 3: Binary Range Map
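For readers who want to check the logic outside ArcMap, the thresholding step can be sketched in plain Python. This is a minimal illustration and not part of the ArcMap procedure above. It assumes the logistic output is an ASCII grid with "key value" header lines (including NODATA_value) followed by rows of cell values. The function name threshold_ascii_grid and the small demo grid are hypothetical. Like the Raster Calculator expression, it uses a strict "greater than" comparison, so a cell exactly equal to the threshold is classed as non-habitat.

```python
def threshold_ascii_grid(text, threshold):
    """Classify an ASCII grid: 1 = habitat, 0 = non-habitat, None = NODATA.

    Hypothetical helper for illustration; assumes "key value" header lines
    followed by whitespace-separated rows of cell values.
    """
    lines = [ln for ln in text.splitlines() if ln.strip()]
    header = {}
    data_start = 0
    # Header lines look like "ncols 3"; data rows start with a number.
    for i, ln in enumerate(lines):
        parts = ln.split()
        if len(parts) == 2 and parts[0][0].isalpha():
            header[parts[0].lower()] = float(parts[1])
            data_start = i + 1
        else:
            break
    nodata = header.get("nodata_value")
    rows = []
    for ln in lines[data_start:]:
        row = []
        for token in ln.split():
            value = float(token)
            if nodata is not None and value == nodata:
                row.append(None)  # outside the analysis mask
            else:
                row.append(1 if value > threshold else 0)
        rows.append(row)
    return header, rows


# Made-up 3 x 2 grid, thresholded at the 0.139 value used in the text.
demo = """ncols 3
nrows 2
xllcorner 0
yllcorner 0
cellsize 30
NODATA_value -9999
0.05 0.139 0.40
-9999 0.20 0.10
"""
header, habitat = threshold_ascii_grid(demo, 0.139)
for row in habitat:
    print(row)
```

The result mirrors the three cell states described above: 1 for habitat, 0 for non-habitat, and a NODATA marker (None here) for cells outside the analysis mask.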
