Yair Amir, Johns Hopkins University
Tamim Sookoor, Johns Hopkins Applied Physics Lab

In recent years, Reinforcement Learning (RL) algorithms have brought dramatic improvements to diverse tasks such as defeating world champions in games including Go and StarCraft, controlling robots, managing industrial control systems, optimizing supply chains, calibrating machines, and personalizing marketing and ad recommender systems. One of the main limitations of these techniques is their opaque failure modes: it is difficult to understand exactly how these systems work, or to predict when and how they will fail in a given scenario, short of running the system. For such autonomous systems to be assured, two key issues must be addressed: (1) fault tolerance and (2) controller competence. The goal of the proposed research is to combine information from a Simplex-based system monitor with an RL competence estimator to assure the safety of RL-controlled city-scale critical infrastructure systems.
The research explores the two issues above by producing (i) an easy-to-understand Black-box monitor that avoids system failures by detecting possible breaches of the system's correctness invariants while ignoring the internal details of the autonomous controller, (ii) a White-box monitor that estimates the competence of the autonomous control algorithm, and (iii) a Decision Module that uses input from both monitors to predict decisions that could drive the system into a high-risk state, and that switches accordingly between the optimal RL-based controller and a suboptimal but safe controller.
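As a purely illustrative sketch of how the three components might fit together, the following Python fragment shows a Simplex-style switch. All names, the single competence threshold, and the toy load invariant are assumptions made for illustration; they are not taken from the proposed design.

```python
from dataclasses import dataclass
from typing import Callable, Sequence, Tuple

State = Sequence[float]
Action = float


@dataclass
class DecisionModule:
    """Hypothetical Simplex-style switch between an RL and a safe controller."""
    rl_controller: Callable[[State], Action]          # optimal but unverified
    safe_controller: Callable[[State], Action]        # suboptimal but trusted
    invariant_holds: Callable[[State, Action], bool]  # black-box monitor
    competence: Callable[[State], float]              # white-box monitor, in [0, 1]
    min_competence: float = 0.8                       # assumed threshold

    def act(self, state: State) -> Tuple[Action, str]:
        proposed = self.rl_controller(state)
        competent = self.competence(state) >= self.min_competence
        safe = self.invariant_holds(state, proposed)
        # Use the RL action only when both monitors agree it is acceptable.
        if competent and safe:
            return proposed, "rl"
        return self.safe_controller(state), "safe"


# Toy setup (assumed): state is (current load,), the action is a load
# increase, and the invariant is that load + action stays at or below 100.
module = DecisionModule(
    rl_controller=lambda s: 20.0,
    safe_controller=lambda s: 0.0,
    invariant_holds=lambda s, a: s[0] + a <= 100.0,
    competence=lambda s: 0.9 if s[0] < 90.0 else 0.5,
)
```

In this toy setup, `module.act((50.0,))` keeps the RL action, while a state of `(85.0,)` trips the invariant check and a state of `(95.0,)` trips the competence check; both fall back to the safe controller.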

The research will build on decades of work on the design and implementation of fault-tolerant systems for safety-critical applications that use the Simplex Architecture [1], and on approaches to competence and uncertainty estimation in deep neural networks (DNNs) [2]. Initially, two realistic city-scale applications will be developed: a smart grid and an intelligent traffic control system. We expect this research to have broad impact in ensuring that RL-based autonomous control systems continue to operate safely when the AI fails in unpredictable and unanticipated ways.

[1] D. Seto, B. Krogh, L. Sha and A. Chutinan, “The Simplex architecture for safe online control system upgrades,” ACC, 1998.
[2] V. Rajendran and W. LeVine, “Accurate Layerwise Interpretable Competence Estimation,” Advances in Neural Information Processing Systems, 2019.