
Abstract: Machine learning algorithms are everywhere, ranging from simple data analysis and pattern recognition tools used across the sciences to complex systems that achieve superhuman performance on various tasks. Ensuring that they are safe—that they do not, for example, cause harm to humans or act in a racist or sexist way—is therefore not a hypothetical problem to be dealt with in the future, but a pressing one that we can and should address now.
In this talk I will discuss some of my recent efforts to develop safe machine learning algorithms, particularly safe reinforcement learning algorithms, that can be applied responsibly to high-risk applications. I will focus on the article “Preventing Undesirable Behavior of Intelligent Machines,” recently published in Science, describing its contributions, our subsequent extensions, and important areas of future work.
Bio: Philip Thomas is an assistant professor at UMass. He received his PhD from UMass in 2015 under the supervision of Andy Barto, after which he worked as a postdoctoral research fellow at CMU for two years under the supervision of Emma Brunskill before returning to UMass. His research focuses on creating machine learning algorithms, particularly reinforcement learning algorithms, that provide high-probability guarantees of safety and fairness. He emphasizes that these algorithms are often applied by people who are experts in their own fields, but who may not be experts in machine learning and statistics, and so the algorithms must be easy to apply responsibly. Notable accomplishments include publishing a paper on this topic in Science titled “Preventing Undesirable Behavior of Intelligent Machines” and testifying on this topic to the U.S. House of Representatives Task Force on Artificial Intelligence at a hearing titled “Equitable Algorithms: Examining Ways to Reduce AI Bias in Financial Services.”
ABSTRACT: Let us consider a difficult computer vision challenge. Would you want an algorithm to determine whether you should get a biopsy, based on an X-ray? That’s usually a decision made by a radiologist, based on years of training. We know that algorithms haven’t worked perfectly for a multitude of other computer vision applications, and biopsy decisions are harder than just about any other application of computer vision that we typically consider. The interesting question is whether an algorithm could be a true partner to a physician, rather than making the decision on its own. To do this, at the very least, we would need an interpretable neural network that is as accurate as its black-box counterparts. In this talk, I will discuss two approaches to interpretable neural networks: (1) case-based reasoning, where parts of images are compared to other parts of prototypical images for each class, and (2) neural disentanglement, using a technique called concept whitening. The case-based reasoning technique is strictly better than saliency maps, and the concept whitening technique provides a strict advantage over the post hoc use of concept vectors. Here are the papers I will discuss:
- This Looks Like That: Deep Learning for Interpretable Image Recognition. NeurIPS spotlight, 2019. https://arxiv.org/abs/1806.10574
- IAIA-BL: A Case-based Interpretable Deep Learning Model for Classification of Mass Lesions in Digital Mammography, 2021. https://arxiv.org/abs/2103.12308
- Concept Whitening for Interpretable Image Recognition. Nature Machine Intelligence, 2020. https://rdcu.be/cbOKj
- Stop Explaining Black Box Machine Learning Models for High Stakes Decisions and Use Interpretable Models Instead. Nature Machine Intelligence, 2019. https://rdcu.be/bBCPd
- Interpretable Machine Learning: Fundamental Principles and 10 Grand Challenges, 2021. https://arxiv.org/abs/2103.11251
BIO: Coming Soon


