Stanford RAIN (Research on Artificial Intelligence and INcentives) Seminar

RAIN is a seminar on the theory and practice of AI in strategic and societal settings. Supported by Stanford’s Society & Algorithms Lab (SOAL), it serves as a hub for talks and discussion at the intersection of AI, incentives, and society.

Our talks are on Tuesdays from 4:30-5:30 PM PT in Y2E2 101!
To receive updates on upcoming talks, please join our email list!

Upcoming Talks

Jun 16, 2026

Quanquan Liu

Judging Proofs and Evolving Programs: Expert Feedback for Reliable LLM Systems

Large language models are increasingly capable of producing plausible solutions, but expert domains demand more than plausibility: they require reliable mechanisms for evaluation, correction, and improvement. This talk connects two recent projects that study this challenge from complementary angles. QEDBench examines the reliability of automated evaluation by measuring the alignment gap between LLM-as-a-judge protocols and human expert assessment for university-level mathematical proofs, using dual rubrics and extensive human evaluation. ParEVO moves from evaluation to synthesis, introducing an agentic evolutionary framework for generating high-performance parallel algorithms over irregular data structures such as sparse graphs. It combines curated instruction data, specialized models, and iterative feedback from performance tests to optimize generated programs. Together, these papers argue for a shift from static benchmark performance toward expert-grounded feedback loops: systems that can be rigorously judged, corrected using domain-specific signals, and improved in settings where correctness, reasoning, and efficiency are all non-negotiable.

Bio: Quanquan C. Liu is an Assistant Professor of Computer Science at Yale University. Her research spans algorithms for large-scale data, dynamic and parallel graph algorithms, high-performance computing, differential privacy, and resilient distributed computation. Recently, her work has also explored how large language models can be evaluated and improved in expert technical domains, including mathematical proof assessment through QEDBench and high-performance parallel code synthesis through ParEVO. She received her Ph.D., MEng, and B.S. from MIT.

Previous Talks This Year

Jun 2, 2026

Kevin Leyton-Brown

Algorithm Synthesis with Theoretical Guarantees

Despite massive progress on LLM-driven code synthesis, identifying algorithms that achieve excellent empirical performance requires extensive, extremely expensive empirical testing. Algorithm configuration methods are automated ways of performing such testing: optimizing the performance of parameterized heuristic algorithms on given distributions of problem instances. Such methods can be seen as efficient procedures for extending classical machine learning to hypothesis spaces consisting of algorithm designs. This talk will begin by defining the problem and illustrating its promise via some recent practical success stories. However, all widely used algorithm configuration methods both achieve poor asymptotic runtime performance in the worst case and optimize what I will argue is the wrong objective function. I will begin by explaining why we should leverage decision theory to maximize expected utility instead of minimizing average runtime. Then I will present a new algorithm configuration approach called Continuous, Online Utilitarian Procrastination (COUP), which optimizes this objective while offering strong theoretical guarantees. I will conclude by showing that these guarantees come effectively for free, as COUP achieves state-of-the-art empirical performance.

Bio: Kevin Leyton-Brown is a professor of Computer Science and a Distinguished University Scholar at the University of British Columbia. He holds a Canada CIFAR AI Chair at the Alberta Machine Intelligence Institute and is an associate member of the Vancouver School of Economics. He received a PhD and an M.Sc. from Stanford University (2003; 2001) and a B.Sc. from McMaster University (1998). He studies artificial intelligence, mostly at the intersection of machine learning with either the design and operation of electronic markets or the design of heuristic algorithms.

May 26, 2026

Ashesh Rambachan

From Next-Token Prediction to Automata Induction

Sequence data is ubiquitous in economics: job histories in labor economics, diagnosis and treatment sequences in health economics, strategic interactions in game theory. Generative sequence models can learn to predict these sequences well, but their complexity makes it hard to extract interpretable economic insights from their predictions. We develop a framework for inducing compact state representations (finite automata) that summarize estimated next-token probabilities. This provides a common language between black-box sequence models and the dynamic restrictions imposed by economic models. We illustrate the framework through applications to collusive behavior and cooperation in repeated games.

Bio: Ashesh Rambachan is a Silverman (1968) Family Career Development Assistant Professor of Economics at MIT. His research interests are primarily in econometrics with a focus on applications of machine learning in economics and causal inference.

May 19, 2026

Yiming Yang

What Makes Diffusion Models Work for Combinatorial Optimization?

Recent advances in generative modeling, particularly diffusion models, have shown strong performance beyond traditional domains such as vision and language, extending to combinatorial optimization (CO). However, it remains unclear what fundamentally drives their success in these structured, discrete settings. In this talk, I present a unified perspective on what makes diffusion-based approaches effective for CO, centered on three interconnected aspects: training signal, credit assignment, and stable local updates.

· First, I show that in CO, labeled optimal solutions are sparse and often weak as supervision, while unsupervised approaches that directly leverage objective values provide dense and informative training signals.

· Second, I examine how global objectives are propagated to local updates. Standard diffusion relies on indirect, multi-step signal propagation, whereas methods such as combinatorial adjoint matching (CAM) directly inject endpoint objectives into each step. These represent two extremes (multi-step refinement and direct global supervision), and effective performance requires balancing the two, leading to a notion of problem-dependent effective depth.

· Third, I show that even with proper objectives and credit assignments, diffusion models can fail without stable local updates. In discrete spaces, naive updates can be unstable; structure (e.g., feasibility-preserving operators) and regularization (e.g., controlling update magnitude) are essential for stable training and effective inference.

Overall, the effectiveness of diffusion models in CO arises not from any single component, but from the combination of guided and stabilized local updates, suggesting a broader perspective for generative optimization beyond diffusion alone.

Bio: Yiming Yang is a Professor in the Language Technologies Institute and Machine Learning Department at Carnegie Mellon University. Her recent research focuses on generative machine learning, including diffusion and flow models, agentic large language models, and their applications to combinatorial optimization and scientific discovery. Her work explores a unifying perspective that views generative models as structured reasoning processes, connecting local operations with global objectives through geometric and dynamic views. She has made influential contributions across multiple phases of machine learning, from early work in text classification and information retrieval to widely cited advances in foundation models, including XLNet, Transformer architectures, and neural architecture search (e.g., DARTS). Her recent work further investigates scalable reasoning, alignment, and evaluation in large language models and agentic systems. She has published extensively in top venues such as NeurIPS, ICML, ICLR, ACL, EMNLP, AAAI, and SIGIR, and is a member of the ACM SIGIR Academy.

May 12, 2026

Grant Schoenebeck

Incentivized Alignment for Strategic Agents (Human and Otherwise)

Advances in machine learning enable new forms of human-AI collaboration, but collaborative settings typically involve agents with divergent objectives and private information. This will become increasingly critical in the emerging world of agentic AI, where ML-powered agents act on behalf of individuals or institutions with conflicting goals. I use the term incentivized alignment to describe the approach of combining both machine learning and incentive design to achieve alignment of system outcomes despite misaligned agents. This talk presents two case studies of incentivized alignment showing how machine learning can make mechanism design scalable and practical, and how mechanism design can make machine learning strategically robust. First, I examine the use of LLMs as judges for rating subjective responses. While LLMs perform well on existing datasets, they are highly susceptible to manipulation. I propose adapting peer-prediction mechanisms to create strategically-robust scoring mechanisms that incentivize honest reporting. Beyond ensuring high-quality inputs to AI systems, these mechanisms can potentially eliminate reward hacking in ML training pipelines. Second, I consider collective decision-making where agents hold different objectives and private information. In particular, we examine the Community Notes mechanism in this context. The goal is then to design mechanisms that incentivize strategic agents to select outcomes that would be optimal under full information sharing, according to certain criteria. Both case studies demonstrate solutions for incentivized alignment in multi-agent systems employing the combination of incentive design and machine learning, a theme likely to be central to the future of collaborative AI.

Bio: Grant Schoenebeck is an associate professor at the University of Michigan in the School of Information. His work has recently focused on developing and analyzing systems for eliciting and aggregating information from a diverse group of agents with varying information, interests, and abilities by combining ideas from machine learning and economics (e.g. game theory, mechanism design, and information design). More generally, his recent work has been about incentives and (machine) learning in a variety of contexts. His research is supported by multiple NSF grants including a CAREER award and spans publications in top venues including NeurIPS, ICLR, EC, WINE, the Web Conference, STOC, and FOCS. His former PhD students and postdocs now hold tenure-track positions at the University of Illinois Urbana-Champaign, Peking University, George Mason University, and Shanghai Jiao Tong University. He recently served as Program Committee Co-chair for WINE, Theory Track Co-chair for EC, and Economics and Computation Track co-chair at the Web Conference. Grant received his PhD at UC Berkeley, studied theology at Oxford University, and received his BA in mathematics and computer science from Harvard.

May 5, 2026

Joseph Jay Williams

Using Adaptive Experimentation to Design Real-World AI Systems for Education & Health

AI systems can look promising, helping certain people with mental health and some students with learning, yet fail others. I develop Adaptive A/B Experimentation methods that embed AI into everyday interfaces to discover which actions work, for whom, and when. This has generated AI systems that used text messages for mental health coaching, and created online homework systems that experimented with explanations like an expert teacher. Adaptive Experimentation enabled AI systems that: (1) generate candidate actions using improved LLM interfaces and co-design between LLMs, users, and scientists (e.g. action A, B, C…); and (2) use reinforcement learning algorithms to test, personalize, and deploy interventions in real time. We identified a 3-minute intervention that boosted student grades by as much as 4%. This work received 1st place in the $1M XPRIZE and a $3M NSF grant to provide practitioners and scientists access to Adaptive Experimentation tools. These tools enable cross-domain improvement in how AI systems change beliefs and behavior for the better.

Bio: Joseph Jay Williams is an Assistant Professor at the University of Toronto in Computer Science, with courtesy appointments supervising PhD students in Statistical Sciences, Psychology, and the Vector Institute for Artificial Intelligence. He also has courtesy appointments in Economics and Industrial Engineering. He directs the Adaptive Experimentation and Intelligent Interventions lab. His lab's work is represented in over 85 papers, 2 Best Paper Awards (1 at CHI), 4 Runner-up/Honorable Mention for Best Paper (CHI, EDM, LAS), and 1st place in a $1M XPRIZE competition for the future of experimentation technology in education. He's received over $2M in grant funding, enabling interventions impacting over 500,000 people. His PhD students have spanned HCI (Human-Computer Interaction), Education, Health, Psychology, applied AI (Reinforcement Learning & LLMs), and Statistics. Joseph is originally from Trinidad and Tobago. He was previously an Assistant Professor in Information Systems & Analytics at the National University of Singapore, a Research Scientist at Harvard, and a postdoc at Stanford, and did his PhD at UC Berkeley.

Apr 28, 2026

Cathy Wu

Tackling the Long Tail of Transportation Optimization with Machine Learning

Before changing a bus network, signal timing plan, or autonomy deployment, decision-makers must compare relevant counterfactuals. However, such counterfactual questions induce a long tail of difficult optimization problems for which traditional approaches are prohibitively costly—requiring years of solver development, lengthy solve times, or both. My research asks: How can AI lower the cost of solving transportation optimization problems? In principle, deep reinforcement learning (RL) can be used to solve arbitrary optimization problems. However, RL is far from mature; our work exposes fundamental limitations in current methods: brittleness to even small changes, like network structure or demand. In response, I take two broad approaches. First is to address non-robustness in deep RL: I will present a Bayesian approach that trains an ensemble of RL models to solve contextual control problems with up to 30x improved sample efficiency. Second is to understand how to use AI in conjunction with classical optimization techniques: I will show how AI can help identify and eliminate unproductive decisions within combinatorial optimization solvers, leading to 2-10x faster solve times. Finally, we inform transportation policy by tackling an open optimization problem: we produce the first prospective impact assessment of city-scale eco-driving, showing that optimizing vehicle speeds at intersections can significantly improve energy efficiency without sacrificing throughput or safety. Together, these approaches suggest a principled "middle road" between pure RL and pure classical optimization for scalable decision-making across transportation, logistics, and beyond.

Bio: Cathy Wu is the Class of 1954 Career Development Professor at MIT, holding appointments in LIDS, CEE, and IDSS. She holds a Ph.D. in EECS from UC Berkeley, and B.S. and M.Eng. in EECS from MIT, and completed a Postdoc at Microsoft Research. Her research group studies machine learning for optimization, with a focus on transportation. She is broadly interested in enabling faster, evidence-driven decisions for sociotechnical systems. Cathy is the recipient of the NSF CAREER (2023), the Ole Madsen Mentoring Award (2025), the IEEE ITS Best Dissertation Award (2019), and the CUTC Milton Pikarsky Memorial Award (2018). She serves on the Board of Governors for the IEEE ITSS, is an Associate Editor or Area Chair for ICML, NeurIPS, ICRA, Transportation Research Part C, and Operations Research, and served as Program Co-chair for RLC 2025. She is also the inaugural Chair and Co-founder of the REproducible Research In Transportation Engineering (RERITE) Working Group.

Apr 21, 2026

Juba Ziani

How Differential Privacy Shapes Incentives in Data Sharing

Many data-driven decisions rely on combining data held by multiple independent actors who differ in their willingness to share sensitive information. While differential privacy is often regarded as a gold standard for privacy in statistical analysis and learning tasks, much less is understood about how privacy choices shape incentives to participate and contribute data when participation is voluntary. This talk takes an economic and incentive-centric view of differential privacy in collaborative and federated data-sharing environments. I study settings in which participation is voluntary, and privacy protection affects agents' incentives, trading off learning benefits, which improve as more data are pooled, against privacy disutilities from sharing sensitive information. The talk analyzes both centralized designs, in which a platform commits to a (potentially personalized) privacy policy and agents decide whether to participate, and decentralized designs, in which agents jointly determine participation and form data-sharing coalitions through direct interactions. This talk is based on two recent joint works: one with Rachel Cummings, Hadi Elzayn, Vasilis Gkatzelis, and Manolis Pountourakis, and one with Raef Bassily, Kate Donahue, and student authors Diptangshu Sen and Annuo Zhao.

Bio: Juba Ziani is an Assistant Professor in the School of Industrial and Systems Engineering and an Adjunct Professor in the School of Computer Science at Georgia Tech. He is a recipient of the NSF CAREER Award. His research lies at the intersection of computer science, operations research, and economics. He uses tools from learning theory, game theory, and optimization to address technical and societal challenges arising from AI, machine learning, and data-driven decision-making. Prior to joining Georgia Tech, Juba was a Ph.D. student in Computing and Mathematical Sciences at Caltech, advised by Katrina Ligett and Adam Wierman, and a postdoctoral fellow at the Warren Center for Data Science at the University of Pennsylvania, hosted by Sampath Kannan, Michael Kearns, and Aaron Roth.

Apr 14, 2026

Chara Podimata

Toward a Science of Auditing AI-Mediated Information Ecosystems

AI-mediated systems, from social media recommendation algorithms to LLMs, now curate the information that billions of people worldwide consume at an unprecedented scale. Yet both operate as black boxes: their internal mechanisms are opaque, their biases poorly understood, and their accountability to ethical norms mostly unenforced. In this talk, I present two complementary studies that work toward a science of auditing such systems. In doing so, I will reveal a duality: LLMs can serve as both the methodological tool and the object of the auditing study. In the first study, I introduce a counterfactual auditing framework that uses LLMs as behavioral engines for synthetic user accounts, enabling causal identification of how social media algorithms respond to user demographics, a form of identification that had previously been infeasible. Deploying the framework on X during the 2024 U.S. presidential election, we find that the platform's recommendation algorithm substantially amplifies toxic, polarizing, and right-leaning content, with effects that are highly heterogeneous across user types and political leanings. In the second study, I turn the same auditing lens on LLMs themselves, querying 12 models daily from July through November 2024 on a set of more than 12,000 election-related questions. I find that LLMs exhibit systematic biases in how they represent candidates and electoral issues, are sensitive to demographic steering, and hold implicit (and highly unstable) beliefs about election outcomes. These findings suggest that LLMs are political actors, whether or not they intend to be. Taken together, I argue that auditing AI-mediated information systems requires new methodological frameworks, ones that are counterfactual, large-scale, and sensitive to heterogeneity across user populations. Building this science is one of the most pressing challenges at the intersection of AI and society.

Bio: Chara Podimata is the Class of 1942 Career Development Professor and Assistant Professor of Operations Research and Statistics at MIT's Sloan School of Management. Her research sits at the intersection of theoretical computer science, operations research, and AI, with a focus on AI auditing and incentive-aware AI. She received her PhD in Computer Science from Harvard University, where she was a member of the EconCS group, and completed a FODSI postdoctoral fellowship at UC Berkeley. Her work is supported by an Amazon Research Award, a Google Research Scholar Award, a MacArthur Foundation x-grant, and several internal MIT awards. In her "free" time, she trains for marathons and triathlons and adventures with her pup, Terra.

Mar 31, 2026

Vivek Farias

The Sign Estimator: Preference Modeling for LLM Alignment under Heterogeneity

LLM alignment methods typically learn a single reward model (either implicitly or explicitly) from pairwise comparison data. This approach implicitly assumes homogeneous preferences across human labelers, an assumption that is violated in practice. As a result, the learned reward model is generally misspecified: prior work shows that it is inconsistent with the population-average utility, incurring large distortion, and that recovering the average utility is provably impossible in the worst case. In this work, we show that the average utility is recoverable under a relatively mild assumption. Our accompanying estimator, the Sign Estimator, simply replaces the standard cross-entropy loss function in reward learning pipelines with a notion of binary classification loss and yields a reward model that is ordinally consistent with the population-average utility. We further establish a finite-sample convergence rate of $O(n^{-1/3})$, which provides, to our knowledge, the first consistent estimator for heterogeneous preferences that does not suffer from the curse of dimensionality.

Bio: Vivek is interested in the development of new methodologies and applications for large scale dynamic optimization. He received his Ph.D. in Electrical Engineering from Stanford University in 2007 and is the Patrick J. McGovern (1959) Professor at MIT. Vivek is a recipient of an INFORMS MSOM Student Paper Prize (2006), an INFORMS JFIG paper prize (2009, 2011), the NSF CAREER award (2011), MIT Sloan’s Outstanding Teacher award (2013), the INFORMS Simulation Society Best Publication Award (2014), the INFORMS Pricing and Revenue Management Best Publication Award (2015), the INFORMS MSOM Best Publication award in Management Science (2016), the MSOM Young Scholar Prize (2020), the Wagner prize (2022), the Pierskalla award (2024), and is an INFORMS Fellow (2025). Vivek’s doctoral advisees have on various occasions won the Nicholson, MSOM, APS and RMP student paper prizes. Outside of academia, Vivek was co-founder/CTO at Celect (2014-19; acquired by Nike); was a corresponding author of the technology at Seer (2018-2020; IPO); and is co-founder/CTO at Cimulate (2023-26; acquired by Salesforce).

Apr 7, 2026

Meena Jagadeesan

Power and Limitations of Aggregation in Compound AI Systems

When designing compound AI systems, a common approach is to query multiple copies of the same model and aggregate the responses to produce a synthesized output. Given the homogeneity of these models, this raises the question of whether aggregation unlocks access to a greater set of outputs than querying a single model. In this talk, we investigate the power and limitations of aggregation within a stylized principal-agent framework. This framework models how the system designer can partially steer each agent's output through its reward function specification, but still faces limitations due to prompt engineering ability and model capabilities. Our analysis uncovers three natural mechanisms -- feasibility expansion, support expansion, and binding set contraction -- through which aggregation expands the set of outputs that are elicitable by the system designer. We prove that any aggregation operation must implement one of these mechanisms in order to be elicitability-expanding, and that strengthened versions of these mechanisms provide necessary and sufficient conditions that fully characterize elicitability-expansion. Finally, we provide an empirical illustration of our findings for LLMs deployed in a toy reference-generation task. Altogether, our results take a step towards characterizing when compound AI systems can overcome limitations in model capabilities and in prompt engineering. Based on joint work with Nivasini Ananthakrishnan.

Bio: Meena Jagadeesan is an incoming Assistant Professor in the Computer and Information Science Department at the University of Pennsylvania, starting in Summer 2026. Her research aims to steer multi-agent interactions in machine learning ecosystems. She is currently a Stanford AI Lab postdoctoral fellow advised by Tatsu Hashimoto and Sanmi Koyejo. She received a PhD in Computer Science from UC Berkeley advised by Michael I. Jordan and Jacob Steinhardt.

Mar 10, 2026

Suresh Venkatasubramanian

Frames, Measurements, and Tools: A triple threat for AI governance

2025 felt like the year that we started to throw caution to the winds when it came to AI deployment. AI policy priorities have shifted almost 180 degrees, global cooperation has been replaced by talk of American dominance, and the relentless march of LLMs into every nook and cranny of our lives continues apace. And 2026 seems like more of the same. One would not be faulted for thinking that we've abandoned virtually everything we've learnt about how to deploy AI systems -- in decision making or other critical settings -- responsibly. And yet, the lessons we've learnt from over a decade of thinking about responsible AI seem still relevant and still point us towards new and interesting research questions. These lessons can be roughly organized in an interacting triangle of Framing, Measuring, and Building. How we frame the problems we are concerned about, how we measure the degree to which these problems are real, and how we build tools to help us measure and mitigate, are the basis of success in AI governance thus far, and are the way we can start to tackle the next wave of challenges in this space. In this talk I'll present some recent work that can be classified into one or more of these dimensions, from better ways to think about regulating general purpose AI to proofs of concept for how to do end to end auditing of complex AI supply chains.

Bio: Suresh Venkatasubramanian directs the Center for Technological Responsibility, Reimagination, and Redesign (CNTR) with the Data Science Institute at Brown University, and is a Professor of Computer Science and Data Science. Suresh's background is as a computer scientist and his current research interests lie in algorithmic fairness, and more generally the impact of automated decision-making systems in society. Suresh recently finished a stint in the Biden-Harris administration, where he served as Assistant Director for Science and Justice in the White House Office of Science and Technology Policy. In that capacity, he helped co-author the Blueprint for an AI Bill of Rights. His research on algorithmic fairness has received press coverage across the globe, including NPR’s Science Friday, NBC, and CNN, as well as in other media outlets. He is a past member of the Computing Community Consortium Council of the CRA, spent 4 years (2017-2021) as a member of the board of the ACLU in Utah, and is a past member of New York City’s Failure to Appear Tool (FTA) Research Advisory Council, the Research Advisory Council for the First Judicial District of Pennsylvania and the Utah State Auditor's Commission on protecting privacy and preventing discrimination. He was named in 2023 by Fast Company to their AI20 list of thinkers shaping the world of generative AI, and currently sits on the boards of the Data and Society Institute, the Partnership on AI, and the Ada Lovelace Institute. He is the co-chair of ACM's AI And Algorithms Policy Committee.

Feb 17, 2026

Nico Christianson

End-to-end learning for uncertainty- and risk-aware decision-making

Machine learning can significantly improve average performance for decision-making under uncertainty in a wide range of domains. However, ensuring robust, risk-aware decisions—a critical need in high-stakes settings—requires well-calibrated uncertainty estimates; yet in high-dimensional settings, there can be many valid uncertainty estimates, each with its own performance profile. That is, not all uncertainty is equally valuable for downstream decision-making. In this talk, I will discuss recent work developing an end-to-end learning framework to train machine learning models while enforcing uncertainty calibration and risk constraints through conformal prediction-based methods. Our proposed approach enables provable guarantees on calibration and risk control while providing consistent improvements over existing, two-stage baselines in applications spanning energy systems and medical image classification.

Bio: Nico Christianson is a Stanford Energy Postdoctoral Fellow and an incoming Assistant Professor of Computer Science at Johns Hopkins University (starting Fall 2026). His research lies broadly at the intersection of algorithms, machine learning, and optimization, with a specific emphasis on the development of new, theoretically-grounded algorithms and AI/ML methods for reliable decision-making under uncertainty. Much of his work is motivated by modern energy and sustainability challenges, with applications ranging from energy resource operation to sustainable computing systems. Nico received his PhD in computing and mathematical sciences from Caltech in 2025, where he was supported by an NSF Graduate Research Fellowship and a PIMCO Data Science Fellowship. His PhD dissertation won Caltech’s Ben P.C. Chou Doctoral Prize in Information Science and Technology and Demetriades-Tsafka-Kokkalis Prize in Renewable Energy. Before Caltech, Nico received an AB in applied mathematics from Harvard College.

Feb 10, 2026

Negin Golrezaei

Learning Safe Strategies for Value Maximizing Buyers in Uniform Price Auctions

We study the bidding problem in repeated uniform price multi-unit auctions from the perspective of a value-maximizing buyer. The buyer aims to maximize their cumulative value over T rounds while adhering to per-round return-on-investment (RoI) constraints in a strategic (or adversarial) environment. Using an m-uniform bidding format, the buyer submits m bid-quantity pairs (b_i, q_i) to demand q_i units at bid b_i, with m ≪ M in practice, where M denotes the buyer's maximum demand. We introduce the notion of safe bidding strategies as those that satisfy the RoI constraints irrespective of competing bids. Despite the stringent requirement, we show that these strategies satisfy a mild no-overbidding condition, depend only on the bidder's valuation curve, and the bidder can focus on a finite subset without loss of generality. Though the subset size is exponential in m, we design a polynomial-time learning algorithm that achieves sublinear regret, both in full-information and bandit settings, relative to the hindsight-optimal safe strategy. We assess the robustness of safe strategies against the hindsight-optimal strategy from a richer class. We define the richness ratio α ∈ (0, 1] as the minimum ratio of the value of the optimal safe strategy to that of the optimal strategy from a richer class and construct hard instances showing the tightness of α. Our algorithm achieves α-approximate sublinear regret against these stronger benchmarks. Simulations on semi-synthetic auction data show that empirical richness ratios significantly outperform the theoretical worst-case bounds. The proposed safe strategies and learning algorithm extend naturally to more nuanced buyer and competitor models.

Bio: Negin Golrezaei is the Theresa Seley Associate Professor of Management Science and an Associate Professor of Operations Management at the MIT Sloan School of Management. Her research focuses on advancing digital marketplaces—such as e-commerce, online advertising, and emissions trading systems—through data-driven strategies and algorithmic innovations. She aims to create more resilient, equitable, and sustainable digital ecosystems. In addition to her academic role, Negin has served as a visiting scholar at Google Research and Meta, where she collaborated with research and product teams to design and test new mechanisms for online marketplaces. Before joining MIT, she was a postdoctoral fellow at Google Research in New York, working with the Market Algorithms team. She holds a BSc (2007) and MSc (2009) in Electrical Engineering from Sharif University of Technology, Iran, and a PhD (2017) in Operations Research from the University of Southern California.

Feb 3, 2026

Moshe Tennenholtz

Artificial Social Intelligence

→ Abstract and Bio

Artificial intelligence (AI) has been growing at an unprecedented pace. Many of us have experienced a “ChatGPT moment” — a realization that AI will profoundly transform our lives. While numerous challenges and calls for improvement remain, there is little doubt that AI agents will play a central role in shaping our future. We argue, however, that the prevailing perspective on AI agent design is insufficient for achieving desirable social welfare, not merely due to computational or regulatory constraints. While it is understood that AI agents should be orchestrated in order to be used by an organization, and that system-level outcomes depend not only on the design of individual agents, the far more intricate reality is that the combination of misaligned incentives and incompatible technological designs may lead to poor social outcomes. Our argument is not merely conceptual but constitutes a concrete call to action: to establish a systematic research agenda on Artificial Social Intelligence, tackling multi-agent alignment among incentive-wise and technology-wise diverse AI agents. We illustrate this vision through four complementary research directions: (i) understanding multi-agent alignment in information retrieval (search, RAG, attribution) ecosystems, (ii) analyzing model selection in language-based economics as a strategic choice, (iii) rethinking fairness and regulation through the lens of multi-agent ethics, and (iv) designing hybrid social laws for human-AI coexistence. Together, these directions outline a roadmap toward welfare-maximizing AI societies— an essential step toward socially aligned intelligence.

Bio: Moshe Tennenholtz is a professor at the Technion -- Israel Institute of Technology, which he joined in 1993, and holds the Sondheimer Technion Academic Chair. In 2008 Moshe founded the Microsoft Research activity in Israel, and served as its leader until 2014, with significant contributions and distinguished ROI. He also founded the Technion-Microsoft research center and served as its scientific director, and was co-founder and chief scientist of several startups. In joint work with colleagues and students he introduced several contributions to the interplay between artificial intelligence and game theory / economics, such as the study of artificial social systems, co-learning, non-cooperative computing, distributed games, the axiomatic approach to ranking, reputation, recommendation and trust systems, competitive safety analysis, program equilibrium, mediated equilibrium, learning equilibrium, as well as the first near-optimal algorithm for reinforcement learning in adversarial contexts.

Jan 27, 2026

Jon Kleinberg

Superhuman AI in a Complex Human Ecosystem: Chess as a Model Domain

→ Abstract and Bio

In domains where AI systems have achieved superhuman performance, there is an opportunity to study the similarities and contrasts between human and AI behaviors at the level of granular decisions, not just aggregate performance. This type of analysis can yield several potential sources of insight. First, by studying expert-level human decisions through the lens of systems that far surpass this performance, we can try to characterize the settings in which human errors are most likely to occur. Second, we can try to design systems whose decisions match human ones as closely as possible. And finally, we can ask whether it is possible to adapt superhuman AI so that its decisions can be usefully interleaved with human decisions, making them compatible in a way that allows collaboration. We pursue these goals in a domain with a long history in AI: chess. For our purposes, chess provides a setting with many different levels of human expertise; like other domains where people acquire expertise and mastery, it is a context in which people train over many years, drawing on more than a century of scholarship in the area, and acquire levels of skill far beyond what most practitioners can ever hope to achieve. And yet if we construe the goal of chess to be the winning of chess games, then algorithms have long since surpassed human beings, and by an increasingly enormous margin, allowing us to study what happens when powerful algorithms are introduced into a domain like this. We'll discuss a line of work that predicts human decisions in chess at a move-by-move level much more accurately than existing chess engines, and in a way that is tunable to fine-grained differences in human skill; then we'll talk about extensions that use this framework to create AI chess agents that are simultaneously superhuman but also more compatible with human decision-making.
We'll use these results to reflect on what we can learn from chess as a setting that simultaneously exhibits both very high levels of human skill and AI that has progressed far into superhuman levels of ability. The talk is based on joint work with Ashton Anderson, Solon Barocas, Karim Hamade, Difan Jiao, Reid McIlroy-Young, Sendhil Mullainathan, Siddhartha Sen, Zhenwei Tang, Russell Wang, and Eric Xue.

Bio: Jon Kleinberg is the Tisch University Professor in the Departments of Computer Science and Information Science at Cornell University. His research focuses on the interaction of algorithms and networks, the roles they play in large-scale social and information systems, and their broader societal implications. He is a member of the National Academy of Sciences, the National Academy of Engineering, the American Academy of Arts and Sciences, and the American Philosophical Society, and he has served on advisory groups including the National AI Advisory Committee (NAIAC) and the National Research Council's Computer Science and Telecommunications Board (CSTB) and Committee on Science, Technology, and Law (CSTL). He has received MacArthur, Packard, Simons, Sloan, and Vannevar Bush research fellowships, as well as awards including the Nevanlinna Prize, the World Laureates Association Prize, the ACM/AAAI Allen Newell Award, and the ACM Prize in Computing.

Jan 13, 2026

Sebastien Bubeck

Recent advances in LLMs for Mathematics

→ Abstract and Bio

I will review the progress of large language models for mathematics over the last 3 years, from barely solving high school level mathematics to solving some minor open problems in convex optimization, combinatorics and probability theory. The emphasis will be on trying to identify the shape of the current frontier capabilities, as it stands today, finding out both where it's helpful and where it's still falling short as a research assistant.

Bio: Sebastien Bubeck is currently a research lead at OpenAI. Previously he served as VP AI and Distinguished Scientist at Microsoft, spending 10 years in Microsoft Research, and before that he was an assistant professor at Princeton University. His work on machine learning, convex optimization and online algorithms won several best paper awards, and more recently his work on Large (and Small) Language Models, including their applications to science, were featured in mainstream media such as the New York Times and Wired.

Jan 6, 2026

Christian Borgs

Are travel bans effective in containing the spread of a disease?

→ Abstract and Bio

In this talk, I present a mathematical model for the spread of an epidemic from one community to another via travel. Here each community is modeled by a random network (for simplicity, we assume it is an Erdos-Renyi random graph), with the epidemic spread inside the community given by the SIR model on this graph. Travel is modeled by individuals moving from one community to the other at some rate eta_T, and returning home at another rate eta_H. We assume that the return rate is of the same order as the recovery rate of the epidemic, while eta_T is much smaller. Under this assumption, we rigorously prove that if an epidemic starts in the first community, and the second community enacts a travel ban at the moment the epidemic is large enough to be detectable, such a travel ban is ineffective in preventing a large outbreak in the second community. By contrast, other mitigation measures like masks or vaccinations (modeled by reducing the rate of infections in the second community) are effective.

Bio: Christian Borgs is a professor in the Berkeley AI Research Group (BAIR) in the EECS department at Berkeley, and faculty director of the Bakar Institute of Digital Materials for the Planet. Borgs is a Fellow of the American Mathematical Society and the American Association for the Advancement of Science. Borgs's current research focuses on both AI for science and the science of networks, including mathematical foundations, particularly the theory of graph limits aka Graphons (which he co-invented about 15 years ago), graph processes, graph algorithms, and applications of graph theory from economics to systems biology and epidemics.

Nov 18, 2025

Adam Kalai

Why Language Models Hallucinate

→ Abstract and Bio

Large language models (LLMs) sometimes generate statements that are plausible but factually incorrect—a phenomenon commonly called "hallucination." We argue that these errors are not mysterious failures of architecture or reasoning, but rather predictable consequences of standard training and evaluation incentives. We show (i) that hallucinations can be viewed as classification errors: when pretrained models cannot reliably distinguish a false statement from a true one, they may produce the false option rather than saying "I don't know"; (ii) that optimization of benchmark performance encourages guessing rather than abstaining, since most evaluation metrics penalize expressing uncertainty; and (iii) that a possible mitigation path lies in revising existing benchmarks to reward calibrated abstention, thus realigning incentives in model development. Joint work with Santosh Vempala (Georgia Tech) and Ofir Nachum & Edwin Zhang (OpenAI).

Bio: Adam Tauman Kalai is a Research Scientist at OpenAI, specializing in AI Safety and Ethics. His research interests also include algorithms, AI theory, and game theory. Adam earned his BA from Harvard University and his PhD from Carnegie Mellon University, after which he served as an Assistant Professor at TTIC and Georgia Tech and a Senior Principal Researcher at Microsoft Research New England. He is also a member of Project CETI's science team.

Nov 11, 2025

Sarah Cen

Bridging the Gap Between Research and Policy in AI Safety and Accountability

→ Abstract and Bio

As AI becomes increasingly integrated into both the private and public sectors, challenges around AI safety and policy have arisen. There is a growing, compelling body of work around the legal and societal challenges that come with AI, but there is a gap in our rigorous understanding of these problems. In this talk, I dive deep into a few topics in AI safety and policy. We will discuss AI supply chains (the increasingly complex ecosystem of AI actors and components that contribute to AI products) and study how AI supply chains complicate machine learning objectives. We'll then shift our discussion to AI audits and evidentiary burdens in cases involving AI. Using Pareto frontiers as a tool for assessing performance-fairness tradeoffs, we will show how a closed-form expression for performance-fairness Pareto frontiers can help plaintiffs (or auditors) overcome evidentiary burdens or a lack of access in AI contexts. I'll conclude with a longitudinal study of LLMs during the 2024 US election season. If time permits, we may touch on formal notions of trustworthiness.

Bio: Sarah Cen is a postdoc at Stanford University and incoming Assistant Professor at Carnegie Mellon University's Departments of ECE & EPP. At Stanford, Sarah works with Prof. Percy Liang in Computer Science and Prof. Daniel Ho in the Stanford Law School. Her research is interdisciplinary and inspired by works in machine learning, economics, law, and policy. She has ongoing work on algorithmic auditing, AI supply chains, due process for AI determinations, risk under the EU AI Act, and formalizing trustworthy algorithms. Previously, Sarah received her BSE in Mechanical Engineering from Princeton University and Master's in Engineering Science (Robotics) from Oxford University, where she worked on autonomous vehicles.

Nov 4, 2025

Dylan Hadfield-Menell

Building aligned agents for open-universe tasks

→ Abstract and Bio

As agents move from the lab into real-world settings, designers have a limited ability to anticipate the agent's context and design explicit safeguards. In this talk, I will outline challenges that this raises from the perspective of designing flexible, robust, and aligned agent behaviors. The key to the approach is to design agents that can model and respond to appropriate uncertainty about a user's intended goal and the normative environment they are deployed into. I will begin with a survey of current alignment techniques and AI agents, then outline the theoretical motivation for this approach. Next, I will describe recent work from my lab that attempts to address this problem by 1) designing flexible goal inference mechanisms that can track the set of plausible user goals reliably from context; and 2) integrating these inference tools with efficient agent designs that leverage POMDP solvers in order to train agents that implement belief-constrained behaviors. I will conclude with a discussion of recent work that evaluates collaborative agents and discuss the implications for the design of aligned systems that augment and integrate with human users and intent.

Bio: Dylan Hadfield-Menell is an Associate Professor of EECS at MIT. His research develops methods to ensure that AI systems' behavior aligns with the goals and values of their human users and society as a whole, a concept known as 'AI alignment'. His goal is to enable the safe, beneficial, and trustworthy deployment of AI in real-world settings.

Oct 21, 2025

Connor Lawless

Democratizing Optimization via Generative AI

→ Abstract and Bio

From healthcare delivery to resilient power grid management, optimization has the potential to improve decision-making for some of today's most pressing problems, but its use is often limited by the mathematical expertise required to model and solve complex problems. This talk will showcase the potential of generative AI to lower this barrier and democratize access to advanced optimization tools. Motivated by a collaboration with Microsoft Outlook, the first part of the talk will present a novel framework for interactive decision support for non-expert users that leverages large language models (LLM) to translate user requests into an underlying constraint programming model. We investigate this framework through the lens of meeting scheduling, and showcase its potential via a user study with a prototype system. In the second part of the talk, we demonstrate how LLMs can be used to automatically generate problem-specific optimization solver configurations, a challenging task for even expert optimization users. Our approach achieves up to 70% speed-ups over default solver settings with little-to-no additional compute. We will conclude by discussing broader opportunities for integrating natural language and optimization, moving toward a future where powerful decision-making tools are as accessible for managers at a local food bank as they are for applied scientists at Amazon.

Bio: Connor Lawless is a Postdoctoral Fellow at the Stanford Institute for Human-Centered Artificial Intelligence advised by Ellen Vitercik and Madeleine Udell. His research blends tools from optimization, machine learning, and human-computer interaction to make advanced analytics tools more accessible and trustworthy. He received his PhD in Operations Research from Cornell University where he was advised by Oktay Gunluk, and previously spent time at Microsoft Research, IBM Research, and the Royal Bank of Canada.

Oct 14, 2025

Alice Oh

LLM Evaluation for the Real World

→ Abstract and Bio

Traditional evaluation methods for large language models (LLMs)—often centered on accuracy in static multiple-choice or short-answer questions—fail to capture real-world complexities. As LLMs increasingly serve users in dynamic, multicultural contexts, we must redefine meaningful evaluation. This talk presents our recent research advancing LLM evaluation through culturally aware, socially grounded, and customizable benchmarks. We assess factual consistency across languages, everyday knowledge in underrepresented cultures, and cultural inclusivity. We highlight that biases become evident in generation tasks, reflecting actual LLM use. Central to our approach is BenchHub, a unified benchmark suite categorizing over 300,000 questions across diverse domains and cultures, enabling tailored evaluations. BenchHub underscores domain-specific variations and the critical role benchmark composition plays in LLM performance rankings. These insights demonstrate that accuracy alone is insufficient; comprehensive LLM evaluation must consider culture, context, and customization. This talk advocates a broader evaluation agenda, presenting foundational steps toward robust, inclusive assessments.

Bio: Alice Oh is a Professor in the School of Computing at KAIST. Her major research area is at the intersection of natural language processing (NLP) and computational social science, with a recent focus on multilingual and multicultural aspects of LLMs. She collaborates with scholars in humanities and social sciences such as political science, education, and history. She has served as Program Chair for ICLR 2021 and NeurIPS 2022, General Chair for ACM FAccT 2022 and NeurIPS 2023, and DEI Chair for COLM 2024. She is the current President of SIGDAT which oversees EMNLP.

Oct 7, 2025

Nika Haghtalab

Distortion of Learning and AI Alignment from Heterogenous Human Feedback

→ Abstract and Bio

After pre-training, large language models are aligned with human preferences based on crowdsourced pairwise comparisons. State-of-the-art alignment methods (such as PPO-based RLHF and DPO) are built on the assumption of aligning with a single preference model, despite being deployed in settings where users have diverse preferences. As a result, it is not even clear that these alignment methods produce models that satisfy users in any meaningful way. In this work, we ask a deceptively simple yet foundational question: Do state-of-the-art alignment methods actually produce models that satisfy users on average in the presence of heterogeneous preferences? Drawing on social choice theory, and modeling each user's comparisons via an individual Bradley–Terry (BT) model, we introduce the distortion of an alignment method: the worst-case ratio between the optimal achievable average utility and the average utility of the learned policy. This notion yields concrete insights into alignment with heterogeneous preferences. In particular, we establish an impossibility result for aligning to average user utility — counter to the conventional wisdom that ML methods, even if imperfect for every individual, at least perform well on average. Distortion also highlights sharp differences between alignment methods: we show that widely used approaches such as RLHF and DPO can have exponentially large — or even unbounded — distortion, whereas a constant minimax-optimal distortion is achievable via a method inspired by social choice theory, known as maximal lotteries, or Nash Learning from Human Feedback.

Bio: Nika Haghtalab is an Assistant Professor in the Department of Electrical Engineering and Computer Sciences at UC Berkeley. She works on a broad and versatile set of problems related to machine learning, algorithms, economics, and society. Her work contributes to an emerging mathematical foundation for learning and decision-making systems in the presence of economic and societal forces. Her work has been recognized by a Sloan fellowship (2024), Schmidt Sciences AI2050 award, NSF CAREER (2022), Google Research Scholar award (2023), NeurIPS and ICAPS best paper awards, EC exemplary track paper awards, and several other industry awards and fellowships.

Sep 30, 2025

Andrew Ilyas

Data Attribution, Selection, and Valuation at Scale with Metagradients

→ Abstract and Bio

Training data is now recognized as a key driver of the performance of AI systems. Indeed, AI companies are signing multi-million dollar deals for training data acquisition, raising the question: how "should" this training data be priced? Understanding how to value training data requires us to understand the downstream impact of this data on model behavior---which is made challenging by the complex, uninterpretable nature of large-scale ML models. In the first part of this talk, we present some recent work on tracing back model performance to training data---improving on a long line of prior work in machine learning, our method can optimally (in a natural sense) predict the impact of training data on model performance. In the second part of the talk, we propose a framework for studying data pricing theoretically, inspired by our experimental results in the first part of the talk. We conclude with some open questions and directions.

Bio: Andrew is an incoming Assistant Professor at CMU. Previously, he was a Stein Fellow at Stanford and a PhD student at MIT, where he was supported by an Open Philanthropy AI Fellowship. His interests are currently in understanding and predicting the effects of design choices on downstream machine learning systems.

Previous talks can be found here.

About The Seminar

Seminar Organizers: Amin Saberi, Nikil Selvam, Xizhi Tan, Ellen Vitercik.

Faculty Involved: Itai Ashlagi, Ashish Goel, Ramesh Johari, Amin Saberi, Aaron Sidford, Johan Ugander, Irene Lo, Ellen Vitercik.

Note for Speakers: The talk is 55 minutes including questions (as we often start a couple of minutes late). If you are giving a talk at RAIN, please plan a 45-50 minute talk since the audience usually asks a lot of questions. Also, the audience is fairly knowledgeable, so speakers should not feel obligated to provide basic game-theoretic, algorithmic, societal, industrial, probabilistic, or statistical background.

Website template from the Stanford MLSys Seminar Series.