Lifelong Learning for Vision-Based AUV Control

Academic Institution: Heriot-Watt University

Academic Supervisor: Professor Yvan Petillot

Industry Partner: Rovco

PhD Student: Pierre Nicolay

Start Date: September 2019

Abstract

The subsea environment represents an important current and future source of essential resources, from energy (oil & gas, renewables) to food (aquaculture, fisheries, agriculture) and medicine (new molecules). However, the extreme nature of the environment makes it unsuitable for humans, and robotic solutions are commonly used. These underwater vehicles are mostly tele-operated and require an expert pilot, as well as careful manual adjustment of control parameters, before they can be deployed. With the rapid development of autonomous systems and the need for permanent deployment of robots to monitor offshore assets, there is a crucial need for systems that can adapt to changes in the environment and vehicle configuration, and to thruster ageing and faults. This requires a new generation of control algorithms, able to take direct feedback from the vehicle's navigation system to adapt their parameters and learn the best control strategies on the fly.

The goal of this project is to enable an Autonomous Underwater Vehicle (AUV) to control its motion through a learnt control policy using vision sensors. This has the potential to make motion control robust to changes in the environment and to sensor and vehicle degradation. Learning for robotic control is not a new area of research, but the recent rise of powerful applied machine vision techniques, together with recent advances in machine learning, opens up new potential for work combining vision and learnt control in real-world tasks. The new vision system developed by our industrial partner Rovco provides accurate real-time motion and pose estimation. This input can be used to evaluate and tune control systems, improving their accuracy in the presence of disturbances or changes in payload and making the system more adaptable to a variety of real-life scenarios. In a second stage, optimal policies can be learnt from data using deep reinforcement learning, leading to online adaptation in complex scenarios.
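As a minimal illustration of the first stage described above, the sketch below shows how pose feedback alone can be used to tune a controller's parameters without a model of the vehicle. All names, dynamics, and numbers here are illustrative assumptions (a toy 1D mass-damper "vehicle" and a PD controller), not the project's actual vision pipeline or vehicle model; the search is a simple random hill-climb standing in for more principled tuning or reinforcement learning.

```python
import random

def simulate_station_keeping(kp, kd, steps=200, dt=0.1, drag=0.8, mass=10.0):
    """Toy 1D 'hold position' task: the vehicle starts offset from the
    target and a PD controller drives it back using pose feedback.
    Returns the accumulated squared position error (lower is better)."""
    pos, vel, target = 5.0, 0.0, 0.0
    total_sq_error = 0.0
    for _ in range(steps):
        error = target - pos
        thrust = kp * error - kd * vel        # PD control law on pose feedback
        accel = (thrust - drag * vel) / mass  # assumed mass-damper dynamics
        vel += accel * dt
        pos += vel * dt
        total_sq_error += error ** 2
    return total_sq_error

def tune_gains(kp=1.0, kd=1.0, iters=30, delta=0.2, seed=0):
    """Hill-climb the PD gains using only the cost observed from the
    (simulated) pose feedback -- no model of the dynamics is assumed."""
    rng = random.Random(seed)
    best_cost = simulate_station_keeping(kp, kd)
    for _ in range(iters):
        # Random perturbation of the gains; keep it only if it helps.
        dkp, dkd = rng.uniform(-delta, delta), rng.uniform(-delta, delta)
        cost = simulate_station_keeping(kp + dkp, kd + dkd)
        if cost < best_cost:
            kp, kd, best_cost = kp + dkp, kd + dkd, cost
    return kp, kd, best_cost
```

In the real system, the cost would come from the vision-based pose estimate rather than a simulator, and the second-stage deep reinforcement learning would replace the fixed PD structure with a learnt policy.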

The key outputs of this project are envisaged to be a set of algorithms and implementations for the control of AUV/ROV systems from visual inputs, using a learnt approach to bring benefits over a hand-coded control scheme. For example, this could mean enabling accurate, stable motion control that works on a variety of platforms, or on platforms with changing payloads, or control that is robust to different water and current conditions. These could be end-to-end approaches learnt from vision to control given high-level control objectives (e.g. “hold position”, “move right” or even “explore”), or approaches that depend on noisy pose/environment sensing from other existing algorithms. A very successful PhD project would see the development of promising implementations, tested in the real world. These would then provide a clear route to commercialisation by Rovco, leading to eventual deployment.

SRPe