- Introduction
- Background
- System Implementation
- Experiments
- Results and Analysis
- Conclusions
- A robot may have to enter an unknown environment
- It is difficult for a programmer to consider every eventuality
- Solution: make the robot more adaptable by letting it learn
- A popular learning method is reinforcement learning (RL)
- Nuclear industry characterisation robots (i.e. radiological mapping)
- Battery-powered robots must recharge their batteries
- Robots must find efficient paths to the recharger
- Use RL to find efficient paths
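Agent-environment interaction loop:
  Agent → action a_t → Environment
  Environment → reward r_{t+1}, state s_{t+1} → Agent
  (r_{t+1} and s_{t+1} become the agent's r_t and s_t at the next time step)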
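The interaction loop in the diagram can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `Corridor` environment, its reward values (-1 per move, 0 on reaching the recharger) and the `run_episode` helper are hypothetical and are not taken from the system described in these slides.

```python
import random


class Corridor:
    """Hypothetical 1-D corridor: cells 0..length-1, recharger in the last cell."""

    def __init__(self, length=5):
        self.length = length
        self.state = 0

    def reset(self):
        """Put the robot back in cell 0 and return the initial state s_0."""
        self.state = 0
        return self.state

    def step(self, action):
        """Apply a_t (-1 = left, +1 = right); return (s_{t+1}, r_{t+1}, done)."""
        self.state = min(max(self.state + action, 0), self.length - 1)
        done = self.state == self.length - 1
        # Assumed reward scheme: -1 for every move, 0 on reaching the recharger.
        reward = 0.0 if done else -1.0
        return self.state, reward, done


def run_episode(env, policy, max_steps=1000):
    """Run one episode and return the (s_t, a_t, r_{t+1}, s_{t+1}) transitions."""
    s = env.reset()
    transitions = []
    for _ in range(max_steps):
        a = policy(s)                  # agent chooses a_t from s_t
        s_next, r, done = env.step(a)  # environment answers with s_{t+1}, r_{t+1}
        transitions.append((s, a, r, s_next))
        s = s_next                     # s_{t+1} becomes the next s_t
        if done:
            break
    return transitions


if __name__ == "__main__":
    rng = random.Random(0)
    walk = run_episode(Corridor(5), lambda s: rng.choice((-1, 1)))
    print(len(walk), "steps taken by a random policy")
```

A random policy eventually stumbles on the recharger; the point of RL is to replace it with a policy that gets there in as few steps as possible.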
initialise Qπ(s,a) arbitrarily
repeat (for each episode):
    initialise s
    repeat (for each step of episode):
        choose a from s using policy derived from Q
        take action a, observe r, s’
        Qπ(s,a) ← Qπ(s,a) + α[r + γ max_a’ Qπ(s’,a’) - Qπ(s,a)]
        s ← s’
    until s is terminal
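The update loop above is tabular Q-learning, and a minimal runnable sketch may make it easier to follow. Everything specific here is an assumption for illustration rather than the configuration used in the experiments: the 4x4 grid, the recharger in the corner cell `(3, 3)`, the rewards (+10 at the recharger, -1 per move), the ε-greedy action choice, and the values of `alpha`, `gamma` and `epsilon`.

```python
import random

# Assumed action set: up, down, left, right as (row, column) offsets.
ACTIONS = [(-1, 0), (1, 0), (0, -1), (0, 1)]


def step(state, action, size, goal):
    """Move on the grid (walls clip the move); return (s', r, done)."""
    row = min(max(state[0] + action[0], 0), size - 1)
    col = min(max(state[1] + action[1], 0), size - 1)
    next_state = (row, col)
    if next_state == goal:
        return next_state, 10.0, True   # assumed reward for reaching the recharger
    return next_state, -1.0, False      # assumed cost per move


def q_learning(size=4, goal=(3, 3), episodes=2000, alpha=0.5, gamma=0.9,
               epsilon=0.1, max_steps=200, seed=0):
    """Tabular Q-learning with an epsilon-greedy behaviour policy."""
    rng = random.Random(seed)
    n = len(ACTIONS)
    # Initialise Q(s, a) arbitrarily (here: all zeros).
    Q = {((r, c), a): 0.0 for r in range(size) for c in range(size) for a in range(n)}
    for _ in range(episodes):
        s = (0, 0)                                   # initialise s
        for _ in range(max_steps):
            if rng.random() < epsilon:               # explore
                a = rng.randrange(n)
            else:                                    # exploit current estimate
                a = max(range(n), key=lambda i: Q[(s, i)])
            s_next, r, done = step(s, ACTIONS[a], size, goal)
            # Terminal states have no future value.
            best_next = 0.0 if done else max(Q[(s_next, i)] for i in range(n))
            # Q(s,a) <- Q(s,a) + alpha * [r + gamma * max_a' Q(s',a') - Q(s,a)]
            Q[(s, a)] += alpha * (r + gamma * best_next - Q[(s, a)])
            s = s_next
            if done:
                break
    return Q


def greedy_path(Q, size=4, goal=(3, 3), start=(0, 0), max_steps=50):
    """Follow the greedy policy derived from Q, starting at `start`."""
    s, path = start, [start]
    while s != goal and len(path) <= max_steps:
        a = max(range(len(ACTIONS)), key=lambda i: Q[(s, i)])
        s, _, _ = step(s, ACTIONS[a], size, goal)
        path.append(s)
    return path


if __name__ == "__main__":
    print(greedy_path(q_learning()))
```

Note that the target uses the best action in the next state s’, not the action just taken in s; this is what lets the learned values describe the greedy policy even though the robot explores with random moves.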
- RL makes a robot's behaviour more adaptable (it can learn)
- RL implemented in a MA environment gives a more adaptable, robust, dynamically reconfigurable architecture
- Experimental results show that RL can learn efficient control policies in a range of environments of varying complexity
- Experimental results show that RL provides a more efficient and safer method for guiding a robot back to a recharging station than a simple non-AI method