Policy Gradient Planning for Environmental Decision Making with Existing Simulators

by Mark Crowley, David Poole

Abstract

In environmental and natural resource planning domains, actions are taken at a large number of locations over multiple time periods. These problems have enormous state and action spaces, spatial correlation between actions, uncertainty, and complex utility models. We present an approach for modeling these planning problems as factored Markov decision processes. The reward model can contain local and global components as well as spatial constraints between locations. The transition dynamics can be provided by existing simulators developed by domain experts. We propose a landscape policy defined as the equilibrium distribution of a Markov chain built from many locally-parameterized policies. This policy is optimized using a policy gradient algorithm. Experiments using a forestry simulator demonstrate the algorithm's ability to devise policies for sustainable harvest planning of a forest.
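To make the idea of a landscape policy more concrete, the following is a minimal sketch, not the authors' implementation. It assumes a hypothetical ring of cells, a binary harvest action per cell, and a logistic local policy whose inputs are a bias and the number of neighbouring cells currently marked for harvest. None of these choices comes from the abstract. Repeatedly resampling each cell from its local policy defines a Markov chain over joint actions, and the joint action after many sweeps is an approximate draw from that chain's equilibrium distribution.

```python
import math
import random


def local_harvest_probability(theta, neighbour_actions):
    """Hypothetical local policy: logistic in a bias and the neighbour harvest count."""
    bias, neighbour_weight = theta
    score = bias + neighbour_weight * sum(neighbour_actions)
    return 1.0 / (1.0 + math.exp(-score))


def sample_landscape_action(thetas, sweeps=50, rng=None):
    """Run a Gibbs-style chain over cells on a ring and return the final joint action."""
    rng = rng or random.Random()
    n = len(thetas)
    actions = [0] * n  # 1 = harvest, 0 = leave standing
    for _ in range(sweeps):
        for cell in range(n):
            # Each cell conditions on the current actions of its two ring neighbours.
            neighbours = [actions[(cell - 1) % n], actions[(cell + 1) % n]]
            p = local_harvest_probability(thetas[cell], neighbours)
            actions[cell] = 1 if rng.random() < p else 0
    return actions


# Assumed parameters: every cell shares the same (bias, neighbour_weight) pair.
thetas = [(-1.0, 0.5)] * 8
joint_action = sample_landscape_action(thetas, rng=random.Random(0))
print(joint_action)
```

In a policy gradient setting, the per-cell parameters in `thetas` would be the quantities adjusted from simulated returns. That optimization step is omitted here because the abstract does not give its details.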

Available from Mark Crowley's profile on Mendeley.
