System Implementation 3/3 - AIA Behaviour Flowchart
Once the AIA is initialised, it is provided with data about the environment in which the robot is located, i.e. the positions of the obstacles, the recharging station and the robot. A model of the environment is constructed from this information. The model describes the current state of the environment, the moves available to the robot, and the reward derived from taking a particular move or from reaching a goal state (i.e. the recharger). The Q-learning algorithm is then initialised, together with a multi-dimensional array that stores the Q-values for all actions in all states. The agent can enter an autonomous control mode in which the Q-learning parameters are set and the algorithm is applied to the model of the environment to simulate a number of learning episodes, updating the values in the Q-value array as it does so. Alternatively, these processes can be controlled from the user control GUI.
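The flow described above can be sketched in Python as follows. This is only an illustrative sketch, not the AIA's actual code: the 4x4 grid, the obstacle and recharger positions, the reward values, the parameter defaults and all names (step, train, greedy_path) are assumptions made for the example. It uses the standard tabular Q-learning update, Q(s,a) <- Q(s,a) + alpha * (r + gamma * max Q(s',a') - Q(s,a)), with epsilon-greedy action selection.

```python
import random

# Hypothetical environment data; every value here is an assumption for illustration.
ROWS, COLS = 4, 4
OBSTACLES = {(1, 1), (2, 3)}
RECHARGER = (3, 3)   # goal state
START = (0, 0)       # robot's initial position
ACTIONS = [(-1, 0), (1, 0), (0, -1), (0, 1)]  # up, down, left, right


def step(state, action_index):
    """Environment model: return (next_state, reward, done) for one move."""
    dr, dc = ACTIONS[action_index]
    nxt = (state[0] + dr, state[1] + dc)
    off_grid = not (0 <= nxt[0] < ROWS and 0 <= nxt[1] < COLS)
    if off_grid or nxt in OBSTACLES:
        return state, -5.0, False   # blocked move: robot stays put, penalty
    if nxt == RECHARGER:
        return nxt, 100.0, True     # goal reached, episode ends
    return nxt, -1.0, False         # ordinary move: small step cost


def train(episodes=500, alpha=0.5, gamma=0.9, epsilon=0.2, max_steps=200, seed=0):
    """Simulate learning episodes on the model and return the Q-value array."""
    rng = random.Random(seed)
    # Multi-dimensional array: q[row][col][action], initialised to zero.
    q = [[[0.0] * len(ACTIONS) for _ in range(COLS)] for _ in range(ROWS)]
    for _ in range(episodes):
        state = START
        for _ in range(max_steps):
            r, c = state
            if rng.random() < epsilon:
                a = rng.randrange(len(ACTIONS))          # explore
            else:
                a = max(range(len(ACTIONS)), key=lambda i: q[r][c][i])  # exploit
            nxt, reward, done = step(state, a)
            best_next = 0.0 if done else max(q[nxt[0]][nxt[1]])
            # Tabular Q-learning update.
            q[r][c][a] += alpha * (reward + gamma * best_next - q[r][c][a])
            state = nxt
            if done:
                break
    return q


def greedy_path(q, max_steps=50):
    """Follow the highest-valued action from START for at most max_steps moves."""
    state, path = START, [START]
    for _ in range(max_steps):
        r, c = state
        a = max(range(len(ACTIONS)), key=lambda i: q[r][c][i])
        state, _, done = step(state, a)
        path.append(state)
        if done:
            break
    return path
```

Calling greedy_path(train()) returns the sequence of cells the robot would visit when following the learned Q-values. With enough episodes this would normally end at the recharger, although that depends on the assumed parameters. In the user-controlled mode, the same train parameters would be supplied from the GUI rather than set by the agent.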