# GSMNP:Notebook/Maxent/Creating a Range Map


## Revision as of 22:26, 6 August 2014

A desirable model output that the standard HTML output does not include is a binary range map for the species. This map has only two categories: habitat and non-habitat. The user must choose the threshold that delineates habitat, based on the binomial test results in the Analysis of omission and commission section (5.1). As an example, we chose the "Balance training omission, predicted area and threshold value" threshold for the Hooded Warbler model; its logistic threshold was 0.139. Our task is to create a binary map from the original logistic projection map in which values greater than 0.139 are true (1) and all other values are false (0). The procedure follows:

• Import the output logistic ASCII grid into ArcMap: ArcToolbox $\Rightarrow$ Conversion Tools $\Rightarrow$ To Raster $\Rightarrow$ ASCII to Raster
  • Input ASCII raster file: path to output folder/species name_ASCII.asc
  • Output raster: path to output folder/pred_species abbrev
  • Output data type (optional): FLOAT
• Set the Spatial Analyst workspace to the output folder: Spatial Analyst $\Rightarrow$ Options... $\Rightarrow$ General
  • Working Directory: path to output folder
• Compute the logical comparison: Spatial Analyst $\Rightarrow$ Raster Calculator
  • hab_species abbrev = pred_species abbrev > threshold
  • Click OK
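
For readers without ArcMap, the same thresholding can be sketched in plain Python. This is only an illustration, not part of the procedure above: it assumes the logistic output is an Esri ASCII grid (keyword header lines such as `ncols` and `NODATA_value`, followed by rows of cell values). The function names and the tiny `SAMPLE` grid are hypothetical.

```python
from typing import Dict, List, Optional, Tuple

Grid = List[List[Optional[int]]]


def read_ascii_grid(text: str) -> Tuple[Dict[str, str], List[List[float]]]:
    """Split an Esri ASCII grid into its header (keyword -> text) and data rows."""
    header: Dict[str, str] = {}
    rows: List[List[float]] = []
    for line in text.splitlines():
        parts = line.split()
        if not parts:
            continue
        if parts[0][0].isalpha():
            # Header lines begin with a keyword, e.g. "ncols 3".
            header[parts[0]] = parts[1]
        else:
            rows.append([float(value) for value in parts])
    return header, rows


def nodata_text(header: Dict[str, str], default: str = "-9999") -> str:
    """Return the NODATA value as written in the header (assumed default: -9999)."""
    for key, value in header.items():
        if key.lower() == "nodata_value":
            return value
    return default


def threshold_grid(rows: List[List[float]], threshold: float, nodata: float) -> Grid:
    """Map each cell to 1 (value > threshold), 0 (otherwise), or None (NODATA)."""
    return [
        [None if value == nodata else int(value > threshold) for value in row]
        for row in rows
    ]


def write_ascii_grid(header: Dict[str, str], grid: Grid) -> str:
    """Serialize the binary grid, reusing the input header and NODATA text."""
    missing = nodata_text(header)
    lines = [f"{key} {value}" for key, value in header.items()]
    for row in grid:
        lines.append(" ".join(missing if cell is None else str(cell) for cell in row))
    return "\n".join(lines) + "\n"


# Hypothetical 3 x 2 logistic grid; real Maxent output is far larger.
SAMPLE = """ncols 3
nrows 2
xllcorner 0
yllcorner 0
cellsize 30
NODATA_value -9999
0.05 0.139 0.20
-9999 0.50 0.10
"""

header, pred = read_ascii_grid(SAMPLE)
hab = threshold_grid(pred, threshold=0.139, nodata=float(nodata_text(header)))
print(write_ascii_grid(header, hab))
```

A cell exactly equal to the threshold is classed as non-habitat here, matching the strict `>` comparison used in the Raster Calculator expression.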

After following the above recipe, the resulting grid will be in Arc binary grid format and have a value of 1 for habitat pixels, 0 for non-habitat pixels, and NODATA for pixels outside the analysis mask.

Figure 3: Binary Range Map
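
One quick sanity check on a binary grid like this is to tally its cells and confirm that only the three expected classes occur. The sketch below is a hedged illustration in plain Python: the function name `tally_cells`, the use of `None` to stand for NODATA, and the small example grid are all assumptions made for the example, not part of the ArcMap workflow.

```python
from collections import Counter
from typing import List, Optional


def tally_cells(grid: List[List[Optional[int]]], nodata: Optional[int] = None) -> Counter:
    """Count habitat (1), non-habitat (0), and NODATA cells; reject anything else."""
    counts: Counter = Counter()
    for row in grid:
        for cell in row:
            if cell == nodata:
                counts["nodata"] += 1
            elif cell == 1:
                counts["habitat"] += 1
            elif cell == 0:
                counts["non_habitat"] += 1
            else:
                raise ValueError(f"unexpected cell value: {cell!r}")
    return counts


# Hypothetical binary grid; None marks cells outside the analysis mask.
example = [
    [0, 0, 1],
    [None, 1, 0],
]

counts = tally_cells(example)
inside_mask = counts["habitat"] + counts["non_habitat"]
habitat_fraction = counts["habitat"] / inside_mask
print(f"habitat cells: {counts['habitat']}")
print(f"non-habitat cells: {counts['non_habitat']}")
print(f"NODATA cells: {counts['nodata']}")
print(f"fraction of masked area classed as habitat: {habitat_fraction:.2f}")
```

The habitat fraction is computed over cells inside the mask only, so NODATA cells do not dilute it.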