Critical infrastructure systems such as the U.S. power grid are essential to maintaining our quality of life. However, many of these systems rely on aging equipment and outdated technology, making it difficult to meet growing demands and challenges such as climate change and nation-state threats. While artificial intelligence has advanced significantly in recent years, deploying it in these systems is difficult because they demand high levels of fault tolerance: the ability to continue operating properly even when some components fail or encounter errors.

Supported by a Challenge Grant from the Institute for Assured Autonomy, a team led by Enrique Mallada, associate professor of electrical and computer engineering at the Whiting School of Engineering, and Tamim Sookoor, a member of the senior professional staff at the Johns Hopkins University Applied Physics Laboratory, is investigating how to overcome fundamental limitations that prevent the use of advanced AI learning techniques in safety-critical systems.

These Challenge Grants aim to support interdisciplinary research teams in developing ideas for high-impact projects that focus on the existential challenges of assured autonomy, positioning them to pursue opportunities for external funding.

Mallada and Sookoor provide insight into their work by answering the following questions.

1. Briefly summarize the key focus and goals of your research through your IAA Challenge Grant. What prompted your team to take on this particular area of study?

Through the IAA Challenge Grant, we are working to enable the application of artificial intelligence approaches, such as reinforcement learning, in critical infrastructure systems such as the U.S. power grid. These systems, which are critical to the quality of life we enjoy, rely on aging equipment and technology that struggles to meet increased demands and stressors such as climate change and nation-state adversaries. Artificial intelligence has demonstrated impressive advances across several domains over the last decade, yet its application in domains that require high degrees of fault tolerance has been hindered by the fact that AI techniques, while performing exceptionally well in most situations, also exhibit unanticipated failure modes. We decided to pursue this topic to enable safety-critical systems to harness AI while continuing to operate in the face of these emergent stressors.

2. Describe some of the challenges associated with doing work on this topic, and how your team is addressing them.

The ability of AI approaches to perform well across most operating conditions yet fail catastrophically in rare situations makes them difficult to assure. Identifying these failure modes, detecting them in time to mitigate them, and still giving the AI agent sufficient freedom to optimize operations are among the challenges of working on this topic. We are addressing these obstacles by taking a multi-modal approach to assurance that combines formal methods, such as barrier functions, with runtime assurance systems that can detect impending failures. We are also incorporating traditional control-theoretic approaches that provide higher degrees of assurance but may perform less well than modern AI techniques in certain situations.
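To make this pattern concrete, the following is a minimal sketch of a runtime assurance loop of the kind described above: a learned policy proposes an action, a control barrier function (CBF) check screens it, and a conservative, verifiable fallback controller takes over when the check fails. The dynamics, barrier function, gains, and the stand-in policy are all illustrative assumptions, not the team's actual models or implementation.

```python
# Sketch of a runtime assurance (simplex-style) safety filter.
# All models and parameters below are toy assumptions for illustration.

import numpy as np

DT = 0.05     # discrete time step (assumed)
ALPHA = 0.1   # barrier decay rate: how fast the safety margin may shrink

def dynamics(x, u):
    """Toy scalar integrator: x_next = x + DT * u."""
    return x + DT * u

def barrier(x):
    """h(x) >= 0 defines the safe set; here, |x| <= 1."""
    return 1.0 - x**2

def satisfies_cbf(x, u):
    """Discrete-time exponential CBF condition:
    h(x_next) >= (1 - ALPHA) * h(x)."""
    return barrier(dynamics(x, u)) >= (1.0 - ALPHA) * barrier(x)

def fallback(x):
    """Verified baseline: proportional pull toward the origin.
    With DT = 0.05 and this gain, the CBF condition holds for every x."""
    return -2.0 * x

def assured_action(x, learned_policy):
    """Runtime assurance switch: accept the learned action only when it
    provably preserves the barrier condition; otherwise fall back."""
    u = learned_policy(x)
    return u if satisfies_cbf(x, u) else fallback(x)

# Stand-in for an RL policy that is usually good but occasionally erratic.
rng = np.random.default_rng(0)
policy = lambda x: float(rng.normal(loc=-x, scale=4.0))

x = 0.9  # start inside the safe set, near its boundary
for _ in range(200):
    x = dynamics(x, assured_action(x, policy))
    assert barrier(x) >= 0.0, "safety violated"
print(f"final state: {x:.3f} (still within |x| <= 1)")
```

The design choice mirrors the multi-modal philosophy described above: the learned policy retains the freedom to optimize whenever its action can be certified step by step, and the more conservative controller intervenes only when that certification fails.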

3. How will your team’s work impact the field of assurance and autonomy? What are the next steps in this work?

Our team’s work will impact the field of autonomy by enabling the higher levels of assurance necessary to automate safety-critical applications. As a next step, we have submitted an NSF Cyber-Physical Systems Frontiers proposal that, if funded, will allow us to further develop these ideas.