Applies value iteration to learn a policy for a Markov decision process (MDP): a robot in a grid world.
The world consists of free spaces (0) and obstacles (1). Each turn the robot can move in one of 8 directions or stay in place. The reward function gives one free space, the goal location, a high reward. All other free spaces carry a small penalty, and obstacles carry a large negative reward. Value iteration is used to learn an optimal policy, a function that assigns a control input to every possible location.
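The value-iteration step described above can be sketched in a few lines of Python. This is a minimal illustration, not the submission's MATLAB code: the reward values, the discount factor, the treatment of the goal and obstacles as absorbing cells, and all function names (`step`, `value_iteration`, `greedy_policy`) are assumptions chosen for the example.

```python
# Assumed, illustrative constants; the submission's actual values are not stated here.
GOAL_REWARD = 100.0       # high reward at the goal
STEP_PENALTY = -1.0       # small penalty for every other free space
OBSTACLE_REWARD = -100.0  # large negative reward for obstacles
GAMMA = 0.95              # discount factor

# Eight compass moves plus "stay", as (row, col) offsets.
ACTIONS = [(-1, 0), (-1, 1), (0, 1), (1, 1), (1, 0),
           (1, -1), (0, -1), (-1, -1), (0, 0)]


def step(grid, cell, action):
    """Deterministic move; a move that leaves the map keeps the robot in place."""
    r, c = cell[0] + action[0], cell[1] + action[1]
    if 0 <= r < len(grid) and 0 <= c < len(grid[0]):
        return (r, c)
    return cell


def value_iteration(grid, goal, tol=1e-9, max_sweeps=1000):
    """Return a value table V for a grid of 0 (free) and 1 (obstacle)."""
    rows, cols = len(grid), len(grid[0])
    V = [[0.0] * cols for _ in range(rows)]
    # Assumption: goal and obstacle cells are absorbing with fixed values.
    for r in range(rows):
        for c in range(cols):
            if grid[r][c] == 1:
                V[r][c] = OBSTACLE_REWARD
    V[goal[0]][goal[1]] = GOAL_REWARD

    for _ in range(max_sweeps):
        delta = 0.0
        for r in range(rows):
            for c in range(cols):
                if grid[r][c] == 1 or (r, c) == goal:
                    continue
                # Bellman backup: best successor value over all nine actions.
                best = max(V[nr][nc]
                           for nr, nc in (step(grid, (r, c), a) for a in ACTIONS))
                new_value = STEP_PENALTY + GAMMA * best
                delta = max(delta, abs(new_value - V[r][c]))
                V[r][c] = new_value
        if delta < tol:
            break
    return V


def greedy_policy(grid, goal, V):
    """Assign to each free, non-goal cell the action with the best successor value."""
    def successor_value(cell, action):
        nr, nc = step(grid, cell, action)
        return V[nr][nc]

    policy = [[None] * len(grid[0]) for _ in grid]
    for r in range(len(grid)):
        for c in range(len(grid[0])):
            if grid[r][c] == 0 and (r, c) != goal:
                policy[r][c] = max(ACTIONS,
                                   key=lambda a: successor_value((r, c), a))
    return policy


# Small demo world: one obstacle in the centre, goal in the top-right corner.
grid = [[0, 0, 0],
        [0, 1, 0],
        [0, 0, 0]]
goal = (0, 2)
V = value_iteration(grid, goal)
policy = greedy_policy(grid, goal, V)
```

In this sketch the policy is read off the converged value table: each cell picks the move whose destination has the highest value, so the robot routes around the obstacle toward the goal.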
Video: https://youtu.be/gThGerajccM
This function compares a deterministic robot, which always executes movements perfectly, with a stochastic robot, which has a small probability of moving ±45 degrees from the commanded move. The optimal policy for the stochastic robot avoids narrow passages and tries to move toward the center of corridors.
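The stochastic motion model can be illustrated with a short Python sketch. It is an assumption-laden example rather than the submission's implementation: the slip probability of 0.1 per side, the rule that "stay" never slips, the rule that a move off the map leaves the robot in place, and the names `move_distribution` and `expected_value` are all hypothetical.

```python
# Eight compass moves in clockwise order, so list neighbours differ by 45 degrees.
DIRECTIONS = [(-1, 0), (-1, 1), (0, 1), (1, 1),
              (1, 0), (1, -1), (0, -1), (-1, -1)]
STAY = (0, 0)


def move_distribution(action, p_slip=0.1):
    """Return [(actual_move, probability), ...] for a commanded move.

    Assumption: the robot slips 45 degrees to either side with probability
    p_slip each, and "stay" is always executed perfectly.
    """
    if action == STAY:
        return [(STAY, 1.0)]
    i = DIRECTIONS.index(action)
    left = DIRECTIONS[(i - 1) % 8]
    right = DIRECTIONS[(i + 1) % 8]
    return [(action, 1.0 - 2.0 * p_slip), (left, p_slip), (right, p_slip)]


def expected_value(V, cell, action, p_slip=0.1):
    """Expected successor value of commanding `action` from `cell`."""
    rows, cols = len(V), len(V[0])
    total = 0.0
    for (dr, dc), p in move_distribution(action, p_slip):
        r, c = cell[0] + dr, cell[1] + dc
        if not (0 <= r < rows and 0 <= c < cols):
            r, c = cell  # assumed: moves off the map leave the robot in place
        total += p * V[r][c]
    return total


# Hand-made value table: a wall of obstacles (value -100) along the top row.
V = [[-100.0, -100.0, -100.0],
     [50.0, 60.0, 70.0],
     [50.0, 60.0, 70.0]]
east = (0, 1)
beside_wall = expected_value(V, (1, 0), east)   # a slip can hit the wall
away_from_wall = expected_value(V, (2, 0), east)
```

Replacing the single successor value in a Bellman backup with this expectation makes cells next to obstacles less attractive, because a slip can carry the robot into a heavily penalized cell. That is consistent with the behavior described above, where the stochastic policy keeps to the center of corridors.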
From Chapter 14 in 'Probabilistic Robotics', ISBN-13: 978-0262201629, http://www.probabilistic-robotics.org
Aaron Becker, March 11, 2015
Cite As
Aaron T. Becker's Robot Swarm Lab (2026). MDP robot grid-world example (https://fr.mathworks.com/matlabcentral/fileexchange/49992-mdp-robot-grid-world-example), MATLAB Central File Exchange. Retrieved .
Acknowledgements
Inspired: Markov Decision Process (MDP) Algorithm, Kilobot Swarm Control using Matlab + Arduino
General Information
- Version 1.0.0.0 (7.72 KB)
MATLAB Release Compatibility
- Compatible with any release
Platform Compatibility
- Windows
- macOS
- Linux
| Version | Published | Release Notes |
|---|---|---|
| 1.0.0.0 | | Added link to video https://youtu.be/gThGerajccM |
