Modern Experimental Design and Active Learning in the Real World

We are holding an online reading group focusing on modern adaptive experimental design and active learning in the real world. All interested participants are welcome to join!

The reading group will be held on Thursdays at 11am PDT/California, 6pm GMT/UK, 7pm CET/Zurich time. To add this to your calendar, click here. To receive information via email, subscribe to our mailing list.

To join, please use the following Zoom link: https://ethz.zoom.us/j/67585775251

Speaker Schedule

January 12, 2023 Kelly W. Zhang
January 19, 2023 Kevin Jamieson
January 26, 2023 Raul Astudillo
February 2, 2023 Emmanuel Bengio
February 16, 2023 Aldo Pacchiano
February 23, 2023 Haitham Bou Ammar
March 2, 2023 Kevin Tran
March 9, 2023 Zi Wang
March 16, 2023 Viraj Mehta
March 23, 2023 Johannes Kirschner

Past Talks

Johannes Kirschner, March 23, 2023

Title: Data-Driven Optimization of Particle Accelerators with Safety Constraints

Abstract: Tuning machine parameters of particle accelerators is a repetitive and time-consuming task that is challenging to automate. While many off-the-shelf optimization algorithms are available, their use is limited in practice because most methods do not account for safety-critical constraints in each iteration, such as loss signals or step-size limitations. One notable exception is Bayesian optimization, a data-driven tuning approach for global optimization with noisy feedback. We propose a Bayesian optimization algorithm (SafeLineBO) that combines line search techniques and safety constraints, and exhibits favourable properties in high-dimensional settings. We evaluate our method on two particle accelerators of the Paul Scherrer Institute, demonstrating efficacy and efficiency for tuning up to 16 parameters subject to 224 constraints.

Bio: Johannes Kirschner is a postdoctoral fellow with Prof. Csaba Szepesvári at the University of Alberta (supported by an “Early Postdoc Mobility fellowship” of the Swiss National Science Foundation). Johannes’ research is focused on reinforcement learning algorithms, experimental design and data-driven decision-making, and his work spans theoretical foundations to challenging real-world applications. Before joining the University of Alberta, Johannes obtained his PhD at ETH Zurich with Prof. Andreas Krause.

Viraj Mehta, March 16, 2023

Title: Automated Experimental Design of Tokamak Rampdowns using Bayesian Optimization

Abstract: Tokamaks typically operate in discrete shots lasting a short period of time. In these shots, the plasma current must be ramped up to a flat top current at which physics experiments and ultimately fusion reactions will occur. Before the available change in flux of the current drive mechanism is exhausted or if there is an emergency necessitating an early shutdown, the current must be safely ramped down to a low level before the plasma disrupts and deposits its energy on the vessel wall. On the next generation of tokamaks, it is essential that this is done safely, as the device could be damaged by the stored electrical or magnetohydrodynamic energy present in the plasma. Typically, the rampdown phase of a shot consists of a decrease in current and injected power and optionally a change in shape, but there is considerable flexibility in the rate, sequencing, and duration of these changes. In this work, we give a procedure for automatically choosing experimental rampdown designs to rapidly converge to a good design of the rampdown phase using probabilistic machine learning methods and acquisition function techniques taken from Bayesian Optimization. In our experiments at DIII-D over the course of 2022, we found that there is a clear and statistically significant reduction in current and energy at the end of the rampdown in comparison to baseline operations when using rampdown designs generated by the model at the end of shots. We also find that the best action predicted by the model significantly improved as the model was able to explore over the course of the experimental campaign.

Bio: Viraj Mehta is a fourth-year PhD student at the Robotics Institute at Carnegie Mellon University. He is broadly interested in reinforcement learning, generative models, and dynamical systems. In particular, he works on solving control problems in science with machine learning in regimes where the data-generating process is expensive. Much of his work is motivated by the problem of plasma control for nuclear fusion, for which he conducts applied research in collaboration with the Princeton Plasma Physics Lab. His work has appeared in NeurIPS, ICML, ICLR, CDC, and other venues.

Zi Wang, March 9, 2023

Title: Experimental Design and Domain Expertise: The Essential Ingredients for Robot Learning

Abstract: Artificial intelligence researchers have long sought to embed learning abilities in robotic systems. We identify three stages of developing intelligent agents: (1) prior engineering, where knowledge is built in by experts; (2) skill learning by experimenting; and (3) adaptation to new situations. We demonstrate examples of these stages on complex long-horizon robot manipulation tasks. In particular, we show how data-efficient learning can be enabled by combining domain expertise and experimental design. Finally, we discuss the Bayesian alignment problem, which is how to align Bayesian priors in intelligent systems with expert beliefs, so that we can better understand and control the behaviors of these systems.

Bio: Zi Wang is a research scientist at Google Research, Brain Team. Her research focuses on probabilistic methods for decision making, including problems such as Bayesian optimization, active learning and robot learning. Zi obtained her Ph.D. in computer science from the Massachusetts Institute of Technology in 2020, where she was awarded MIT Graduate Women of Excellence, Rising Stars in EECS, and RSS Pioneers.

Kevin Tran, March 2, 2023

Title: Active materials discovery for sustainable energy storage

Abstract: We may be able to slow the effects of climate change by transitioning from fossil fuels to solar energy, but solar energy’s inconsistent availability makes implementation difficult. This could be addressed by storing the energy in solar fuels, which are fuels created from solar energy, CO₂, and H₂O. Unfortunately, solar fuels are hindered by insufficiently mature material technologies. In this talk, we discuss two past studies where we used active learning to discover catalysts for solar fuel production. We then review several other examples of active learning in the materials informatics community. We end by discussing open issues with active learning in the materials informatics community.

Bio: Kevin Tran is a senior research scientist at Toyota Research Institute (TRI). He works on accelerated materials design and discovery in TRI’s Energy & Materials division. His work focuses on developing methods to improve the rate at which we can advance technologies for sustainable energy storage, which often manifests in the form of modeling and automated decision making. Kevin also has experience in software development, process engineering, and biomedical engineering. He received his PhD in chemical engineering at Carnegie Mellon University with Zachary Ulissi, and his bachelor’s in chemical engineering at the University of Delaware with Babatunde Ogunnaike.

Haitham Bou Ammar, February 23, 2023

Title: Combinatorial Bayesian Optimisation with Applications to Antibody discovery and Logic Synthesis EDA

Abstract: Many critical emerging real-world problems are instances of combinatorial optimisation with an expensive-to-evaluate (non-linear) black-box objective. Examples are widespread, including machine learning, robotics, medicine and chip design scenarios. In those cases, standard combinatorial solvers, e.g., simulated annealing and genetic algorithms, face difficulties due to the high sample complexity associated with their heuristics. In this talk, we will elaborate on new combinatorial solvers that extend Bayesian optimisation to discrete spaces and can handle expensive-to-evaluate (non-linear) black-box objectives. We detail Gaussian process kernels that operate in combinatorial spaces and illustrate effective acquisition optimisation techniques over discrete domains. We then apply such solutions to two real-world use cases from antibody design and logic synthesis EDA. Our results demonstrate effective solutions leading us to rank 1 in the EPFL logic synthesis benchmark in 2022.

Bio: Haitham Bou Ammar leads the reinforcement learning team at Huawei Technologies Research and Development UK and is an Honorary Lecturer at UCL. His primary research interests lie in the field of statistical machine learning and artificial intelligence, focusing on Bayesian optimisation, probabilistic modeling, and reinforcement learning. He is also interested in learning using massive amounts of data over extended time horizons, a property common to “Big-Data” problems. His research also spans different areas of control theory, nonlinear dynamical systems, social networks, and distributed optimisation.

Aldo Pacchiano, February 16, 2023

Title: RLHF: Reinforcement Learning with Once-per-Episode Feedback

Abstract: Despite reinforcement learning’s remarkable success in several application and simulation domains, research in the field has barely ventured beyond the typical modeling assumptions underlying the MDP formalism. In this work we aim to reimagine the way in which rewards are produced by moving away from the typical setting of per-step Markovian rewards to a model that instead produces a binary score acting at the trajectory level. While this is an extreme test case for theory, it is also arguably more representative of real-world applications than the traditional requirement in RL practice that the learner receive feedback at every time step. Indeed, in many real-world applications of reinforcement learning, such as self-driving cars and robotics, it is easier for a human labeler to evaluate whether a learner’s complete trajectory was either “good” or “bad,” but harder to provide a reward signal at each step. To show that learning is possible in this more challenging setting, we study the case where trajectory labels are generated by an unknown parametric model and provide a statistically and computationally efficient algorithm that achieves sub-linear regret. We will also comment on how to extend these results to the dueling setting where a human labeler decides which one of two trajectories is better.

Bio: Aldo is a Postdoctoral Researcher at Microsoft Research NYC. He obtained his PhD at UC Berkeley where he was advised by Peter Bartlett and Michael Jordan. His research lies in the areas of Reinforcement Learning, Online Learning, Bandits and Algorithmic Fairness. He is particularly interested in furthering our statistical understanding of learning phenomena in adaptive environments and using these theoretical insights and techniques to design efficient and safe algorithms for scientific, engineering, and large-scale societal applications.

Emmanuel Bengio, February 2, 2023

Title: Introduction to GFlowNet

Abstract: GFlowNet, short for Generative Flow Network, is a new generative modeling framework which we think is particularly suited for discrete, combinatorial objects. The idea behind GFN is to estimate (graph-theoretic) flows in a directed acyclic network. The network represents all possible ways of constructing an object, and so knowing the flow gives us a policy which we can follow to sequentially construct objects in an energy-based fashion (i.e., objects are drawn proportionally to some quantity, like a reward or unnormalized density). In this talk I’ll present the framework, the intuition behind it, some nice properties it has, some cool things we can do with it and some really cool avenues for future work such as drug discovery.
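
As a rough illustration of the flow idea in the abstract above, the sketch below builds a tiny directed acyclic network by hand and checks that following the flow-induced policy draws objects proportionally to their reward. Everything in it is an assumption made for illustration and does not come from the talk: the objects are two-element subsets of three items, the reward values are made up, and the flows are computed exactly by splitting each state's flow evenly among its parents. A real GFlowNet would instead learn the flows with a trained model.

```python
from itertools import combinations

ITEMS = ("a", "b", "c")
SIZE = 2  # objects are subsets of ITEMS with exactly SIZE elements

# Hypothetical rewards for the terminal objects (illustrative numbers only).
REWARD = {
    frozenset("ab"): 1.0,
    frozenset("ac"): 2.0,
    frozenset("bc"): 5.0,
}


def children(state):
    """States reachable by adding one item (empty list for terminal states)."""
    if len(state) == SIZE:
        return []
    return [state | {item} for item in ITEMS if item not in state]


def parents(state):
    """States from which `state` is reached by adding one item."""
    return [state - {item} for item in state]


def compute_flows():
    """Exact flows: terminal flow equals reward, split evenly among parents."""
    state_flow = dict(REWARD)
    edge_flow = {}
    # Visit larger states first so every child's flow is known before its parent's.
    for size in range(SIZE, 0, -1):
        for combo in combinations(ITEMS, size):
            state = frozenset(combo)
            if size < SIZE:
                state_flow[state] = sum(edge_flow[(state, c)] for c in children(state))
            share = state_flow[state] / len(parents(state))
            for parent in parents(state):
                edge_flow[(parent, state)] = share
    root = frozenset()
    state_flow[root] = sum(edge_flow[(root, c)] for c in children(root))
    return state_flow, edge_flow


def terminal_distribution(state_flow, edge_flow):
    """Probability of ending at each object under the policy flow(edge) / flow(state)."""
    reach = {frozenset(): 1.0}
    for _ in range(SIZE):
        next_reach = {}
        for state, prob in reach.items():
            for child in children(state):
                step = edge_flow[(state, child)] / state_flow[state]
                next_reach[child] = next_reach.get(child, 0.0) + prob * step
        reach = next_reach
    return reach


state_flow, edge_flow = compute_flows()
distribution = terminal_distribution(state_flow, edge_flow)
total_reward = sum(REWARD.values())
for obj, prob in sorted(distribution.items(), key=lambda kv: sorted(kv[0])):
    print(sorted(obj), round(prob, 4), round(REWARD[obj] / total_reward, 4))
```

In this toy network the flow at the empty starting state equals the total reward, and each object is reached with probability equal to its reward divided by that total, even though the same subset can be built in two different orders.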

Bio: Emmanuel Bengio is an ML Scientist at Recursion, working at the intersection of GFlowNets and de novo drug design. He did his PhD under Joelle Pineau and Doina Precup at McGill/Mila, focusing on understanding generalization in deep RL.

Raul Astudillo, January 26, 2023

Title: Composite Bayesian Optimization for Efficient and Scalable Adaptive Experimentation

Abstract: Experimentation is ubiquitous in science and a key driver of human progress. Many experimentation tasks can be cast as optimization problems whose objective functions are expensive or time-consuming to evaluate. Bayesian optimization has emerged as a powerful tool for tackling such problems. However, many experimentation tasks arising in high-stakes applications such as materials design and drug discovery are out of the reach of standard approaches. In this talk, I will describe recent advances that aim to address this challenge. In particular, I will focus on how the composite structure of many experimentation tasks can be exploited to improve the efficiency and scalability of Bayesian optimization methods. Finally, I will provide directions for future research toward a general framework for efficient end-to-end adaptive experimental design in complex settings.

Bio: Raul is a Postdoctoral Scholar in the Department of Computing and Mathematical Sciences at Caltech, hosted by Professor Yisong Yue. He obtained his Ph.D. in Operations Research and Information Engineering from Cornell University, working under the supervision of Professor Peter Frazier. Before that, he completed the undergraduate program in Mathematics offered jointly by the University of Guanajuato and the Center for Research in Mathematics in Mexico. In 2021, he was a Visiting Researcher at Meta within the Adaptive Experimentation team led by Eytan Bakshy. Raul’s research interests lie at the intersection between operations research and machine learning, with an emphasis on Bayesian methods for efficient sequential data collection. His work combines principled decision-theoretic foundations with sophisticated machine learning tools to develop frameworks for adaptive experimentation in robotics, materials design, cellular agriculture, and other scientific applications.

Kevin Jamieson, January 19, 2023

Title: Lessons learned in deploying bandit algorithms

Abstract: Bandit algorithms, and adaptive experimentation more generally, promise the same statistically significant guarantees as, say, non-adaptive A/B testing, but require far fewer trials, which saves time and money. However, such promises hold only under assumptions that rarely hold in practice, and for algorithms that may require unrealistic data interaction patterns. This talk explores this tension through two case studies in deploying state-of-the-art algorithms to a large online experimentation platform and a robotics application in an industrial setting. Problems will be discussed, sensible solutions will be proposed, and opinions will be offered.

Bio: Kevin Jamieson is an Assistant Professor in the Paul G. Allen School of Computer Science & Engineering at the University of Washington and is the Guestrin Endowed Professor in Artificial Intelligence and Machine Learning. He received his B.S. from the University of Washington, his M.S. from Columbia University, and his Ph.D. in 2015 from the University of Wisconsin-Madison under the advisement of Robert Nowak, all in electrical engineering. He returned to the University of Washington as faculty in 2017 after a postdoc in the AMP lab at the University of California, Berkeley working with Benjamin Recht. Jamieson’s work has been recognized by an NSF CAREER award and an Amazon Faculty Research award. Jamieson’s research explores how to leverage already-collected data to inform which measurements to make next, in a closed loop.

Kelly W. Zhang, January 12, 2023

Title: Inference after Adaptive Sampling for Longitudinal Data

Abstract: Online algorithms that learn to optimize treatments over time are increasingly used in a variety of digital intervention problems. These algorithms repeatedly update parameter estimates as data accrues; these parameter estimates are used to inform treatment decisions. These algorithms are called “adaptive sampling” algorithms and the resulting data is considered “adaptively collected.” In this work, we focus on data collected by a large class of adaptive sampling algorithms that are designed to optimize treatment decisions online using accruing data from multiple users. Combining or “pooling” data across users allows adaptive sampling algorithms to potentially learn faster. However, by pooling, these algorithms induce dependence between the collected user data trajectories; this makes statistical inference on this type of data especially challenging. We provide methods to perform a variety of statistical analyses on such adaptively collected data, including Z-estimation, off-policy analyses, and inferring excursion effects. This work is motivated by our work in designing experiments in which online reinforcement learning algorithms pool data across users to learn to optimize treatment decisions, yet reliable statistical inference is essential for conducting a variety of statistical analyses after the experiment is over.

Bio: Kelly W. Zhang is a final-year Ph.D. candidate in computer science at Harvard University advised by Susan Murphy and Lucas Janson. Her research focuses on addressing challenges faced when applying reinforcement learning algorithms to real-world problems. She has developed methods for statistical inference for data collected by bandit and reinforcement learning algorithms, i.e., adaptively collected data. She also works on developing the reinforcement learning algorithm to be used in Oralytics, a mobile health app aimed at helping users develop healthy oral hygiene habits, in collaboration with Oral-B and researchers at UCLA and UMichigan. She is supported by an NSF Graduate Research Fellowship.