Stanford RAIN (Research on Algorithms and Incentives in Networks) Seminar

The Internet is a complex network made up of both machines and people; problems in this domain therefore often require techniques from both algorithms and the social sciences. The RAIN (Research on Algorithms and Incentives in Networks) seminar, supported by SOAL, provides a gathering place for talks and social discussion in this area.

Upcoming Talks

Apr. 22, 2024
Sophie Yu
Constant Regret Primal-Dual Policy for Multi-Way Dynamic Matching
→ Abstract and Bio We study a discrete-time dynamic multi-way matching model. There are finitely many agent types that arrive stochastically and wait to be matched. State-of-the-art dynamic matching policies in the literature require knowledge of all system parameters to determine an optimal basis of the fluid relaxation, and focus on controlling the number of waiting agents using only matches within the optimal basis (Kerimov et al., 2021a,b; Gupta, 2021). In this paper, we propose a primal-dual policy that schedules matches for future arrivals based on an estimator of the dual solution. Our policy does not require knowledge of the arrival rates and operates with greater flexibility, as it does not restrict matches to the match types within an optimal basis. We show that our policy is the first to achieve constant regret at all times under unknown arrival rates; when the arrival rates are known, it achieves the optimal scaling matching the lower bound of Kerimov et al. (2021a,b). Furthermore, when the arrival rates are known, the primal-dual policy significantly outperforms alternative dynamic matching policies in several numerical simulations. Paper link: https://papers.ssrn.com/sol3/papers.cfm?abstract_id=4357216
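
For intuition about the fluid-relaxation viewpoint, here is a minimal sketch (not the paper's actual policy) in which dual prices from a fluid LP, computed from estimated arrival rates, gate which matches get executed; the toy instance and all names are our own.

```python
# Illustrative dual-price matching rule (a sketch, not the paper's policy).
# Fluid LP: max sum_m r_m x_m  s.t.  sum_{m containing type i} x_m <= lambda_i.
import numpy as np
from scipy.optimize import linprog

types = ["A", "B", "C"]
rewards = {("A", "B"): 3.0, ("B", "C"): 2.0, ("A", "B", "C"): 4.0}  # toy match types
lam = np.array([0.5, 0.8, 0.4])  # arrival-rate estimates (unknown a priori in the paper)

matches = list(rewards)
A = np.array([[1.0 if t in m else 0.0 for m in matches] for t in types])
res = linprog(c=[-rewards[m] for m in matches], A_ub=A, b_ub=lam, method="highs")
duals = -res.ineqlin.marginals  # shadow price of each agent type

def try_match(queues):
    """Execute a match whose reward covers the dual value of the agents it uses."""
    for m in matches:
        if all(queues.get(t, 0) > 0 for t in m) and \
                rewards[m] >= sum(duals[types.index(t)] for t in m):
            for t in m:
                queues[t] -= 1
            return m
    return None

print(duals, try_match({"A": 1, "B": 1, "C": 1}))
```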

Bio: Sophie Yu is a Postdoctoral Scholar under Prof. Itai Ashlagi and Prof. Amin Saberi in the Department of Management Science and Engineering at Stanford University. She will join the Wharton School of Business as an assistant professor of Operations, Information and Decisions starting in the summer of 2024. She received her Ph.D. in Decision Sciences from the Fuqua School of Business at Duke University, advised by Prof. Jiaming Xu and Prof. Yehua Wei. Her research interests focus on data analysis, algorithm design, and performance evaluation in large-scale networks and stochastic systems.
Apr. 29, 2024
Omar Besbes
TBD
→ Abstract and Bio TBD

Bio: TBD

Previous Talks This Year

Sept. 29, 2023
Elaine Shi
Decentralized Mechanism Design
→ Abstract and Bio In transaction fee mechanism design, users bid to get their transactions confirmed in the block. Classical auctions completely fail in such a decentralized environment, where even the auctioneer (i.e., the miners) can be a strategic player. Further, the miners can collude with a subset of the users, e.g., facilitated by real-world platforms like Flashbots. A line of work has attempted to devise a 'dream' transaction fee mechanism, but all attempts have failed. In this talk, I will first show that this is no coincidence: there is a fundamental mathematical barrier to achieving a 'dream' transaction fee mechanism. Then, I will explain how to overcome these impossibilities with the help of cryptography, leading to practical mechanisms that achieve good social welfare and revenue.

Bio: Elaine Shi is an Associate Professor at Carnegie Mellon University. Prior to joining CMU, she taught at Cornell and the University of Maryland. Her research interests include cryptography, security, algorithms, mechanism design, and the foundations of blockchains. She has won numerous awards, including the Sloan Fellowship, the Packard Fellowship, the ONR YIP award, the NSA Best Science of Cybersecurity Paper award, the CyLab Distinguished Alumni Award, and various best paper awards. Her work on Oblivious RAM and privacy-preserving algorithms has been deployed at large scale by companies such as Signal, Google, and JP Morgan.
Oct. 2, 2023
Alex Teytelboym
Equilibrium Existence and Implementability
→ Abstract and Bio We explore equilibria in markets with money and in pseudomarkets when there are indivisible goods. We show that under a regularity condition, competitive equilibria exist in a transferable utility economy if and only if all random equilibria in a pseudomarket are implementable as a lottery over allocations. Our paper bridges two core models of competitive market designs for indivisible resources and generates new implementability and maximal domain results for pseudomarkets. One consequence is that random equilibria in pseudomarkets are implementable even in the presence of complementarities which feature in existing pseudomarkets, such as course assignment. We provide further equivalence results in the presence of ex-post individual (e.g., technological) constraints, ex-ante individual (e.g., budget) constraints, priorities (e.g., school choice), and ex-post aggregate (e.g., diversity) constraints.

Bio: Alex Teytelboym is a Professor of Economics at the Department of Economics, University of Oxford, a Tutorial Fellow at St. Catherine’s College, and a Senior Research Fellow at the Institute for New Economic Thinking at the Oxford Martin School. He directs the Oxford University Business Economics Programme. His research interests lie mainly in market design and the economics of networks. He is also interested in environmental economics and is part of the Leverhulme Centre for Nature Recovery. He often advises companies, governments and NGOs on auction and market design. He is also co-founder of Refugees.AI, an organisation that is developing new technology for refugee resettlement (originally funded by Skoll Centre for Social Entrepreneurship).
Oct. 23, 2023
Emre Kiciman
Causal inference and LLMs: A new frontier
→ Abstract and Bio The causal capabilities of large language models (LLMs) are a matter of significant debate, with critical implications for the use of LLMs in societally impactful domains such as medicine, science, law, and policy. We further our understanding of LLMs and their causal implications, considering the distinctions between different types of causal reasoning tasks, as well as the entangled threats of construct and measurement validity. LLM-based methods establish new state-of-the-art accuracies on multiple causal benchmarks. Algorithms based on GPT-3.5 and GPT-4 outperform existing algorithms on a pairwise causal discovery task (97%, a 13-point gain), a counterfactual reasoning task (92%, a 20-point gain), and actual causality (86% accuracy in determining necessary and sufficient causes in vignettes). At the same time, LLMs exhibit unpredictable failure modes, and we provide some techniques to interpret their robustness. Crucially, LLMs perform these causal tasks while relying on sources of knowledge and methods distinct from and complementary to non-LLM-based approaches. Specifically, LLMs bring capabilities so far understood to be restricted to humans, such as using collected knowledge to generate causal graphs or identifying background causal context from natural language. We envision LLMs to be used alongside existing causal methods, as a proxy for human domain knowledge and to reduce human effort in setting up a causal analysis, one of the biggest impediments to the widespread adoption of causal methods. We also see existing causal methods as promising tools for LLMs to formalize, validate, and communicate their reasoning, especially in high-stakes scenarios. In capturing common sense and domain knowledge about causal mechanisms and supporting translation between natural language and formal methods, LLMs open new frontiers for advancing the research, practice, and adoption of causality.
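
As a concrete (if simplified) picture of the pairwise causal discovery task mentioned above, here is a sketch of the prompting setup; `query_llm` is a hypothetical placeholder for whichever LLM API one uses, and the prompt wording is ours, not the paper's.

```python
# Sketch of LLM-based pairwise causal discovery: ask the model which
# causal direction between two variables is more plausible.
# `query_llm` is a hypothetical stand-in for any chat-completion API.

def query_llm(prompt: str) -> str:
    raise NotImplementedError("plug in your LLM client here")

def pairwise_causal_direction(var_a: str, var_b: str) -> str:
    prompt = (
        f"Which cause-and-effect relationship is more likely?\n"
        f"(1) {var_a} causes {var_b}\n"
        f"(2) {var_b} causes {var_a}\n"
        f"Answer with 1 or 2 only."
    )
    answer = query_llm(prompt).strip()
    return f"{var_a} -> {var_b}" if answer.startswith("1") else f"{var_b} -> {var_a}"

# Example (Tuebingen-style pair): pairwise_causal_direction("altitude", "temperature")
```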

Bio: Emre Kiciman is a Senior Principal Researcher at Microsoft Research, where his research interests span causal inference, machine learning, the security of AI systems, and AI’s implications for people and society. Emre is a co-founder of the DoWhy library for causal machine learning. For publications and talks, please visit https://kiciman.org.
Oct. 30, 2023
Jason Hartline
Regulation of Algorithmic Collusion
→ Abstract and Bio Consider sellers in a competitive market that use algorithms to adapt their prices from data that they collect. In such a context, it is plausible that algorithms could arrive at prices higher than the competitive prices, benefiting sellers at the expense of consumers (i.e., the buyers in the market). This paper gives a definition of algorithmic non-collusion for pricing algorithms. The definition allows a regulator to empirically audit algorithms by applying a statistical test to the data that they collect. Algorithms that are good, i.e., that approximately optimize prices to market conditions, can be augmented to collect data sufficient to pass the audit. Algorithms that have colluded on, e.g., higher-than-competitive prices, cannot pass the audit. The definition allows for the possession of useful side information which, e.g., may be correlated with supply and demand and should affect the prices used by good algorithms. The paper provides an analysis of the statistical complexity of such an audit, i.e., how much data is sufficient for the test of non-collusion to be accurate. Joint work with Sheng Long and Chenhao Zhang.
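
The following toy sketch conveys the flavor of such an audit under strong simplifying assumptions of our own (a single seller, kernel-smoothed demand estimates, a fixed tolerance); it is not the paper's statistical test. Note how exploration in the logged prices is what makes the off-path revenue estimates trustworthy, echoing the point that good algorithms can collect data sufficient to pass.

```python
# Toy non-collusion audit (our simplification, not the paper's test):
# check from logged data whether posted prices approximately maximize
# revenue against the estimated demand curve.
import numpy as np

def audit_non_collusion(prices, demands, grid, bandwidth=0.1, tol=0.05):
    def est_revenue(p):  # kernel-smoothed revenue estimate at price p
        w = np.exp(-((prices - p) ** 2) / (2 * bandwidth ** 2))
        return p * np.average(demands, weights=w)
    realized = np.mean(prices * demands)
    best = max(est_revenue(p) for p in grid)
    return realized >= (1 - tol) * best  # True = consistent with non-collusion

rng = np.random.default_rng(0)
# Mostly near-optimal prices (optimum is p = 1 for demand 2 - p) plus exploration.
p = np.concatenate([rng.normal(1.0, 0.05, 450), rng.uniform(0.5, 1.5, 50)])
d = np.maximum(0.0, 2.0 - p + rng.normal(0, 0.1, 500))
print(audit_non_collusion(p, d, grid=np.linspace(0.5, 1.5, 21)))  # expect True
```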

Bio: Jason Hartline received his Ph.D. in 2003 from the University of Washington under the supervision of Anna Karlin. He was a postdoctoral fellow at Carnegie Mellon University under the supervision of Avrim Blum, and subsequently a researcher at Microsoft Research in Silicon Valley. He joined Northwestern University in 2008, where he is a professor of computer science. He was on sabbatical in the Economics Department at Harvard University during the 2014 calendar year and visited Microsoft Research New England in the spring of 2015. He is the director of Northwestern’s Online Markets Lab, was a founding co-director of the Institute for Data, Econometrics, Algorithms, and Learning from 2019 to 2022, and is a co-founder of the virtual conference organizing platform Virtual Chair.
Nov. 6, 2023
Robert Kleinberg
U-Calibration: Forecasting for an Unknown Agent
→ Abstract and Bio We consider the problem of evaluating forecasts of binary events when predictions are consumed by rational agents who take an action in response to a prediction, but whose utility is unknown to the forecaster. We show that optimizing forecasts for a single scoring rule (e.g., the Brier score) cannot guarantee low regret for all possible agents. In contrast, forecasts that are well-calibrated guarantee that all agents incur sublinear regret. However, calibration is not a necessary criterion here; it is possible for miscalibrated forecasts to provide good regret guarantees for all possible agents. Moreover, calibrated forecasting procedures have provably worse convergence rates than forecasting procedures targeting a single scoring rule. Motivated by this, we present a new metric for evaluating forecasts that we call U-calibration, equal to the maximal regret of the sequence of forecasts when evaluated under any bounded scoring rule. We show that sublinear U-calibration error is a necessary and sufficient condition for all agents to achieve sublinear regret guarantees. We additionally demonstrate how to compute the U-calibration error efficiently and provide an online algorithm that achieves asymptotically optimal U-calibration error (on par with optimal rates for optimizing for a single scoring rule, and bypassing lower bounds for the traditionally calibrated learning procedures). Finally, we discuss generalizations to the multiclass prediction setting. This talk is joint work with Renato Paes Leme, Jon Schneider, and Yifeng Teng.
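
To make the "unknown agent" framing concrete, here is a small illustration of ours (not the paper's algorithm): it measures the worst-case regret, over a one-parameter family of threshold agents, of acting on a forecast sequence versus the best fixed action in hindsight, the quantity that sublinear U-calibration error controls simultaneously for all agents.

```python
# Worst-case agent regret of a forecast sequence (illustration only).
# Agent with cost c acts iff forecast >= c; payoff 1 - c if the event
# occurs and -c if not; not acting pays 0.
import numpy as np

def worst_agent_regret(forecasts, outcomes, n_agents=99):
    worst = 0.0
    for c in np.linspace(0.01, 0.99, n_agents):
        act = forecasts >= c
        per_step = np.where(outcomes == 1, 1 - c, -c)
        payoff = np.where(act, per_step, 0.0).sum()
        best_fixed = max(0.0, per_step.sum())  # always act vs. never act
        worst = max(worst, best_fixed - payoff)
    return worst

rng = np.random.default_rng(1)
y = rng.binomial(1, 0.7, 1000)
print(worst_agent_regret(np.full(1000, 0.7), y))  # calibrated: small
print(worst_agent_regret(np.full(1000, 0.2), y))  # miscalibrated: order n
```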

Bio: Bobby Kleinberg is a Professor of Computer Science at Cornell University and a part-time Faculty Researcher at Google. His research concerns algorithms and their applications to machine learning, economics, networking, and other areas. Prior to receiving his doctorate from MIT in 2005, Kleinberg spent three years at Akamai Technologies; he and his co-workers received the 2018 SIGCOMM Networking Systems Award for pioneering the first Internet content delivery network. He is a Fellow of the ACM and a recipient of the ACM SIGecom Mid-Career Award for advancing the understanding of on-line learning and decision problems and their application to mechanism design.
Nov. 13, 2023
Bruno Ribeiro
Causal Lifting and Causal Identification in Link Prediction using Neural Networks
→ Abstract and Bio In this talk, we consider a small piece of the larger puzzle of endowing state-of-the-art neural networks with the ability to perform causal reasoning. First, we will cover the rich literature on causal identification under peer effects, both under intrinsic factors and under path-dependent graph evolution models. Then, I will introduce “Causal Lifting”, which generalizes the earlier causal identification methods for both nodes and edges under network effects, also unveiling a fundamental difference between estimating causal effects on nodes and estimating them on edges. Finally, we will explore how Causal Lifting aligns causal identification seamlessly with how we design state-of-the-art neural network models for relational reasoning tasks, which can then be integrated with chatbots such as ChatGPT.

Bio: Bruno Ribeiro is an Associate Professor in the Department of Computer Science at Purdue University, currently a Visiting Associate Professor at Stanford University. Prior to joining Purdue in 2015, he earned his Ph.D. from the University of Massachusetts Amherst and was a postdoctoral fellow at Carnegie Mellon University. Ribeiro is interested in the intersection between relational learning and causality in deep learning. Ribeiro received an NSF CAREER award in 2020, an Amazon Research Award in 2022, and multiple best paper awards.
Dec. 4, 2023
Shuchi Chawla
Revisiting revenue maximization for many buyers: buy-many mechanisms and sequential posted pricing
→ Abstract and Bio A recent line of research has established a novel desideratum for designing approximately-revenue-optimal multi-item mechanisms, namely the buy-many constraint. Under this constraint, prices for different allocations made by the mechanism must be subadditive, implying that the price of a bundle cannot exceed the sum of the prices of the individual items it contains. This natural constraint has enabled several positive results in multi-item mechanism design, bypassing well-established impossibility results. Our work addresses a main open question from this literature: the design of buy-many mechanisms for multiple buyers. We show that a simple sequential item pricing mechanism with buyer-specific prices can achieve an O(log m) approximation to the revenue of any buy-many mechanism when all buyers have unit-demand preferences over m items. This is the best approximation possible, as it directly matches the previous results for the single-buyer setting, where no simple mechanism can obtain a better approximation. En route to proving this result, we define and prove a multi-dimensional online contention resolution scheme (OCRS) for revenue. This is joint work with Rojin Rezvan (UT-Austin), Yifeng Teng (Google Research), and Christos Tzamos (UW-Madison).
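
A minimal simulation of the mechanism family in question, sequential item pricing with buyer-specific prices for unit-demand buyers, appears below; the price values are arbitrary placeholders, since the interesting part of the result is how the prices are chosen.

```python
# Minimal simulation of sequential posted pricing with buyer-specific
# prices for unit-demand buyers (a toy version of the mechanism family
# studied in the talk; the price-setting rule here is arbitrary).
import numpy as np

def sequential_posted_pricing(values, prices):
    """values[i, j]: buyer i's value for item j; prices[i, j]: her posted prices."""
    n, m = values.shape
    available = np.ones(m, dtype=bool)
    revenue, allocation = 0.0, {}
    for i in range(n):                       # buyers arrive in sequence
        utils = np.where(available, values[i] - prices[i], -np.inf)
        j = int(np.argmax(utils))
        if utils[j] > 0:                     # buyer takes her favorite affordable item
            allocation[i] = j
            revenue += prices[i, j]
            available[j] = False
    return revenue, allocation

rng = np.random.default_rng(2)
v = rng.uniform(0, 1, size=(5, 3))
print(sequential_posted_pricing(v, prices=np.full((5, 3), 0.4)))
```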

Bio: Shuchi Chawla holds an Endowed Professorship in Computer Science at UT-Austin and is an Amazon Scholar. Shuchi is a theoretical computer scientist specializing in algorithm design and economics and computation. Prior to joining UT-Austin, she spent 15 years as a professor of CS at the University of Wisconsin-Madison. Shuchi is the recipient of an NSF CAREER award, a Sloan Foundation fellowship, and several awards for her research and teaching at UW-Madison. Shuchi recently served as the PC Chair of SODA'20 and EC'21, and currently serves on the editorial boards of the SIAM Journal on Computing, the ACM Transactions on Algorithms, and the ACM Transactions on Economics and Computation.
Dec. 11, 2023
Grant Schoenebeck
Mechanisms To Procure Information Without Verification
→ Abstract and Bio In a wide variety of contexts, including peer grading, peer review, and crowdsourcing (e.g., evaluating LLM outputs), we would like to design mechanisms that reward agents for producing high-quality responses. Unfortunately, computing rewards by comparing to ground truth or a gold standard is often cumbersome, costly, or impossible. Instead, we would like to compare agent reports. First, we will argue the importance of creating measurements of agent response quality that are both accurate and strategy-proof (agents do not benefit by intentionally reporting falsely). Second, we will illustrate that previous solutions achieve only one of these desiderata: naïve mechanisms fail to be strategy-proof, yet existing strategy-proof mechanisms are much less accurate than the naïve mechanisms in evaluating response quality. Third, we will show how to create mechanisms that are both strategy-proof and (reasonably) accurate. The key theoretical technique is a variational interpretation of mutual information, which permits machine learning to estimate mutual information using only a few samples (and likely has other applications in machine learning beyond strategic settings). Finally, time permitting, we will survey future directions for the field.
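
As a toy illustration of mutual-information-based scoring (using a naive plug-in estimate; the talk's key tool is a variational estimator that works with far fewer samples), consider:

```python
# Toy mutual-information score between two agents' reports on shared tasks
# (plug-in estimate for illustration; the talk's mechanisms use variational
# estimators instead).
import numpy as np

def plugin_mutual_information(reports_a, reports_b, k=2):
    joint = np.zeros((k, k))
    for a, b in zip(reports_a, reports_b):
        joint[a, b] += 1
    joint /= joint.sum()
    pa, pb = joint.sum(axis=1), joint.sum(axis=0)
    mask = joint > 0
    return float(np.sum(joint[mask] * np.log(joint[mask] / np.outer(pa, pb)[mask])))

rng = np.random.default_rng(3)
truth = rng.integers(0, 2, 200)
noisy = lambda x: np.where(rng.random(len(x)) < 0.8, x, 1 - x)  # 80% accurate
print(plugin_mutual_information(noisy(truth), noisy(truth)))    # informative: high
print(plugin_mutual_information(rng.integers(0, 2, 200), noisy(truth)))  # random: ~0
```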

Bio: Grant Schoenebeck is an associate professor in the School of Information at the University of Michigan. His work spans diverse areas in theoretical computer science but has recently focused on combining ideas from theoretical computer science, machine learning, and economics (e.g., game theory, mechanism design, and information design) to develop and analyze systems for eliciting and aggregating information from a diverse group of agents with varying information, interests, and abilities. His research is supported by the NSF, including an NSF CAREER award. Before coming to the University of Michigan in 2012, he was a Postdoctoral Research Fellow at Princeton. Grant received his PhD from UC Berkeley, studied theology at Oxford University, and received his BA in mathematics and computer science from Harvard.
Feb. 12, 2024
Yuqing Kong
Eliciting Information Without Verification from Humans and Machines
→ Abstract and Bio Many application domains rely on eliciting high-quality (subjective) information. This talk discusses how to elicit and aggregate information from both human and machine participants, especially when the information cannot be directly verified. The first part presents DMI-Mechanism, a mechanism designed to incentivize truth-telling in settings where participants are assigned multiple multiple-choice questions (e.g., 'What is the quality of this content? High/Low'). DMI-Mechanism ensures that truthful responses are more rewarding than any less informative strategy. Its implementation is straightforward, requiring no verification or prior knowledge, and involves only two participants and four questions in binary-choice scenarios. When applied to machine learning, DMI-Mechanism yields a loss function that is invariant to label noise. The second part discusses eliciting information not just from humans but also from machines. Recognizing the limitations in time and resources that humans and machines have, the talk introduces a method to elicit and analyze the 'thinking hierarchy' of both. This approach not only facilitates the aggregation of information when the majority of agents are at less sophisticated 'thinking' levels, but also provides a unique way to compare humans and machines. This talk is based on a series of works, including Kong (SODA 2020, ITCS 2022, JACM 2024), Xu, Cao, Kong, Wang (NeurIPS 2019), Kong, Li, Zhang, Huang, Wu (NeurIPS 2022), and Huang, Mei, Kong (2024).
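
Below is a sketch of our reading of the DMI payment rule: split the shared questions into two halves, form the empirical joint answer matrix of the two agents on each half, and pay the product of the two determinants. The instance is synthetic and details may differ from the papers.

```python
# Sketch of a DMI-style payment between two agents (our reading of the
# mechanism; truthful, informative answers make both determinants large).
import numpy as np

def dmi_payment(answers_a, answers_b, num_choices=2):
    n = len(answers_a)
    half = n // 2
    def joint(sl):
        m = np.zeros((num_choices, num_choices))
        for a, b in zip(answers_a[sl], answers_b[sl]):
            m[a, b] += 1.0 / half
        return m
    return np.linalg.det(joint(slice(0, half))) * np.linalg.det(joint(slice(half, n)))

rng = np.random.default_rng(4)
truth = rng.integers(0, 2, 100)
flip = lambda x, acc: np.where(rng.random(len(x)) < acc, x, 1 - x)
print(dmi_payment(flip(truth, 0.9), flip(truth, 0.9)))          # informative answers
print(dmi_payment(rng.integers(0, 2, 100), flip(truth, 0.9)))   # uninformative: ~0
```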

Bio: Yuqing Kong is an assistant professor at the Center on Frontiers of Computing Studies (CFCS), Peking University. She obtained her Ph.D. from the Computer Science and Engineering Department at the University of Michigan in 2018 and her bachelor's degree in mathematics from the University of Science and Technology of China in 2013. Her research interests lie at the intersection of theoretical computer science and economics: information elicitation, prediction markets, mechanism design, and the future applications of these areas to crowdsourcing and machine learning.
Feb. 26, 2024
Panos Toulis
Experimental Designs for Structural Economic Models on Large Networks
→ Abstract and Bio This talk presents an ongoing large field experiment in a South American country that randomizes tax audit notices (the treatment) to firms connected through a large network of VAT transactions. While the ultimate goal is to optimize tax audit policy, the short-term goal is to estimate the causal effects of tax audit notices on firm behavior. Of particular interest are spillovers, that is, the responses of firms that are not treated but are connected to treated firms. First, I will discuss why currently popular approaches to experimenting on networks are limited by the realities of inter-firm networks, such as their size, high interconnectivity, and heavy-tailed degree distributions. I will then describe an approach to experimentation that leverages subtle sub-structures in the network. This approach is specifically designed to allow the application of Fisherian-style permutation tests of causal effects. These testing procedures are computationally efficient and finite-sample valid, qualities that are important for robustly testing the parameters of structural economic models.
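
For readers unfamiliar with Fisherian randomization tests, the generic template is sketched below (a plain re-randomization over all units; the talk's contribution is finding network sub-structures on which such tests remain valid under interference).

```python
# Fisherian permutation test of "no treatment effect" (generic sketch, not
# the talk's sub-structure-based design): re-randomize treatment under the
# experiment's design and compare test statistics.
import numpy as np

def permutation_test(outcomes, treated, statistic, draws=2000, seed=0):
    rng = np.random.default_rng(seed)
    observed = statistic(outcomes, treated)
    count = sum(
        statistic(outcomes, rng.permutation(treated)) >= observed
        for _ in range(draws)
    )
    return (1 + count) / (1 + draws)  # finite-sample valid p-value

rng = np.random.default_rng(5)
z = rng.binomial(1, 0.5, 200).astype(bool)
y = rng.normal(0, 1, 200) + 0.5 * z  # true effect of +0.5
diff_in_means = lambda y, z: y[z].mean() - y[~z].mean()
print(permutation_test(y, z, diff_in_means))
```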

Bio: Panos Toulis studies causal inference in complex settings (e.g., networks) through resampling methods such as permutation tests. These methods are model-agnostic and thus have a degree of robustness not afforded by classical model-based statistical methods. He is also interested in the design of experiments on networks and, more generally, the interface between statistics and optimization. His research has been published in the Journal of the Royal Statistical Society, the Annals of Statistics, Biometrika, the Journal of the American Statistical Association, the Journal of Econometrics, Statistics and Computing, and Games and Economic Behavior, as well as in major machine learning and economics conferences. For his research, Toulis has received the Arthur P. Dempster Award from Harvard University’s Department of Statistics, the LinkedIn Economic Graph Challenge award, and the 2012 Google United States/Canada PhD Fellowship in statistics. Toulis received his PhD in statistics from Harvard University, advised by Edo Airoldi, David Parkes, and Don Rubin. He also holds MS degrees in statistics and computer science from Harvard University, and a BS in electrical and computer engineering from the Aristotle University of Thessaloniki, Greece. Outside of academia, he has corporate experience in software engineering at Google and at startup companies in Greece. He also enjoys science fiction, history, and politics.
Mar. 4, 2024
Yash Kanoria
Simulation is All You Need
→ Abstract and Bio Motivated by online matching markets and network revenue management (NRM) problems with many types (e.g., fulfillment optimization), we study dynamic spatial matching (DSM), in which supply and demand live in d-dimensional space and need to be matched with each other dynamically. If demand and supply have the same spatial distribution, greedy matching suffices and achieves an average match distance of the same order as the distance to the nearest neighbor, *except* in the case of d=1 with both supply and demand arriving dynamically over time. If demand and supply have different spatial distributions, the matching constraint has bite and greedy matching fails. We introduce a unifying and practical algorithmic principle for NRM and DSM dubbed SOAR: Simulate, Optimize, Assign, Repeat, which repeatedly simulates the future to enable good matching decisions. Simulating one sample path at each stage already enables SOAR to achieve near-optimal regret for the majority of NRM models in the literature, and for DSM with non-atomic demand and supply distributions. For more challenging NRM and DSM models, SOAR with multiple simulated sample paths at each stage achieves near-optimal regret.
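
A minimal SOAR-style loop for a toy one-dimensional instance is sketched below (our illustration, with one simulated sample path per stage): at each arrival, simulate the remaining demand, solve the offline assignment problem on the real-plus-simulated demand, and commit only the current arrival's match.

```python
# Toy SOAR loop (Simulate, Optimize, Assign, Repeat) for matching demand
# points to fixed supply on [0, 1]; all modeling choices are illustrative.
import numpy as np
from scipy.optimize import linear_sum_assignment

def soar_match(arrivals, supply, sample_future, rng):
    supply = list(supply)
    matches = []
    for t, demand in enumerate(arrivals):
        future = sample_future(rng, len(arrivals) - t - 1)       # Simulate
        all_demand = np.concatenate([[demand], future])
        cost = np.abs(all_demand[:, None] - np.asarray(supply)[None, :])
        rows, cols = linear_sum_assignment(cost)                 # Optimize
        s = cols[0]                                              # Assign (row 0 = current arrival)
        matches.append((demand, supply.pop(s)))                  # Repeat
    return matches

rng = np.random.default_rng(6)
supply = np.sort(rng.uniform(0, 1, 10))
arrivals = rng.uniform(0, 1, 10)
print(soar_match(arrivals, supply, lambda r, n: r.uniform(0, 1, n), rng))
```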

Bio: Yash Kanoria is an Associate Professor of Business in the Decision, Risk and Operations division at Columbia Business School, specializing in the design and optimization of marketplaces, especially matching markets. He obtained a BTech from IIT Bombay and a PhD from Stanford in Electrical Engineering. He has received a National Science Foundation CAREER Award, an ACM SIGecom Test of Time Award, and an Operations Research Best Paper award for optimizing Amazon's outbound supply chain.
Apr. 4, 2024
David Shmoys
Algorithmic Tools for Redistricting: Fairness via Analytics
→ Abstract and Bio The American winner-take-all congressional district system empowers politicians to engineer electoral outcomes by manipulating district boundaries. To date, computational solutions have mostly focused on drawing unbiased maps by ignoring political and demographic input and simply optimizing for compactness and related metrics. However, we maintain that this is a flawed approach, because compactness and fairness are orthogonal qualities; to achieve a meaningful notion of fairness, one needs to model political and demographic considerations using historical data. We will discuss a series of papers that explore and develop this perspective. In the first (joint with Wes Gurnee), we present a scalable approach to explicitly optimize for arbitrary piecewise-linear definitions of fairness; it employs a stochastic hierarchical decomposition approach to produce an exponential number of distinct district plans that can be optimized via a standard set-partitioning integer programming formulation. This enables a large-scale ensemble study of congressional districts, providing insights into the range of possible expected outcomes and the implications of this range for potential definitions of fairness. Further work extending this (joint with Julia Allen and Wes Gurnee) shows that many additional real-world constraints can easily be accommodated in this framework (such as minimal county splits, as recently required by Alabama legislation in response to the US Supreme Court decision Allen v. Milligan). In another paper (joint with Nikhil Garg, Wes Gurnee, and David Rothschild), we study the design of multi-member districts (MMDs), in which each district elects multiple representatives, potentially through a non-winner-takes-all voting rule (as was proposed in H.R. 4000). We carry out large-scale analyses for the U.S. House of Representatives under MMDs with different social choice functions, using algorithmically generated maps optimized for either partisan benefit or proportionality. We find that with three-member districts using the Single Transferable Vote, fairness-minded independent commissions can achieve proportional outcomes in every state (up to rounding), and this would significantly curtail the power of advantage-seeking partisans to gerrymander.
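
The set-partitioning backbone mentioned above can be stated in a few lines; the sketch below is a toy instance of ours (real instances generate exponentially many candidate districts hierarchically, and the "unfairness" scores stand in for whatever piecewise-linear fairness objective is chosen).

```python
# Toy set-partitioning formulation for districting (sketch of the
# optimization layer only; candidate districts and scores are made up).
import numpy as np
from scipy.optimize import milp, LinearConstraint, Bounds

units = range(6)
candidates = [  # candidate districts (subsets of units) with an unfairness score
    ({0, 1, 2}, 1.0), ({3, 4, 5}, 2.0), ({0, 1}, 0.5),
    ({2, 3}, 0.7), ({4, 5}, 0.4), ({0, 3, 4}, 1.5), ({1, 2, 5}, 1.1),
]
k = 3  # number of districts to draw

A_cover = np.array([[1.0 if u in d else 0.0 for d, _ in candidates] for u in units])
A_count = np.ones((1, len(candidates)))
res = milp(
    c=np.array([score for _, score in candidates]),
    constraints=[
        LinearConstraint(A_cover, 1, 1),   # each unit in exactly one district
        LinearConstraint(A_count, k, k),   # exactly k districts
    ],
    integrality=np.ones(len(candidates)),
    bounds=Bounds(0, 1),
)
print([candidates[j][0] for j in np.flatnonzero(res.x > 0.5)])
```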

Bio: David Shmoys is the Laibe/Acheson Professor and Director of the Center for Data Science for Enterprise & Society at Cornell University. He obtained his Ph.D. in computer science from the University of California, Berkeley in 1984, held postdoctoral positions at MSRI in Berkeley and at Harvard University, and was on the faculty at MIT before joining Cornell. He is a Fellow of the ACM, INFORMS, and SIAM, and was an NSF Presidential Young Investigator. He has advised 30 graduated Ph.D. students, and his former students are on the faculties of many leading universities and in industrial research labs, including MIT, Waterloo, Brown, Maryland, Georgetown, and Google. Shmoys' research has focused on the design and analysis of efficient algorithms for discrete optimization problems, with applications including scheduling, inventory theory, computational biology, and, most recently, computational sustainability. He has been working on data-driven models in a broad cross-section of areas, including COVID epidemiological modeling, congressional districting, and IoT network design.
Apr. 8, 2024
Tim Roughgarden
The Computer in the Sky
→ Abstract and Bio Turing-complete blockchain protocols approximate the idealized abstraction of a computer in the sky that is open access, runs in plain view, and, in effect, has no owner or operator. This technology can, among other things, enable stronger notions of ownership of digital possessions than we have ever had before. Building the computer in the sky is hard (and scientifically fascinating), and in this talk I'll highlight three threads in my recent research on this challenge: (1) possibility and impossibility results for permissionless consensus protocols (i.e., implementing an “ownerless” computer); (2) incentive-compatible transaction fee mechanism design (i.e., making an “open-access” computer sustainable and welfare-maximizing); and (3) a Black-Scholes-type formula for quantifying adverse selection in automated market makers (some of the most popular 'programs' running on the computer in the sky). The talk will emphasize the diversity of mathematical tools necessary for understanding blockchain protocols and their applications (e.g., distributed computing, game theory, mechanism design, and continuous-time stochastic processes) and the immediate practical impact that mathematical work on this topic has had (e.g., Ethereum's EIP-1559 and LVR for automated market makers).

Bio: Tim Roughgarden is a Professor in the Computer Science Department at Columbia University and the Founding Head of Research at a16z crypto. Prior to joining Columbia, he spent 15 years on the computer science faculty at Stanford, following a PhD at Cornell and a postdoc at UC Berkeley. His research interests include the many connections between computer science and economics, as well as the design, analysis, applications, and limitations of algorithms. For his research, he has been awarded the ACM Grace Murray Hopper Award, the Presidential Early Career Award for Scientists and Engineers (PECASE), the Kalai Prize in Computer Science and Game Theory, the Social Choice and Welfare Prize, the Mathematical Programming Society's Tucker Prize, the INFORMS Lanchester Prize, and the EATCS-SIGACT Gödel Prize. He was an invited speaker at the 2006 International Congress of Mathematicians and the Shapley Lecturer at the 2008 World Congress of the Game Theory Society. He is a Fellow of the Guggenheim Foundation, the ACM, the Game Theory Society, and the Society for the Advancement of Economic Theory. He has written or edited ten books and monographs, including Twenty Lectures on Algorithmic Game Theory (2016), Beyond the Worst-Case Analysis of Algorithms (2020), and the Algorithms Illuminated book series (2017-2020).
Apr. 15, 2024
Kirthevasan Kandasamy
Mechanism Design for Collaborative Normal Mean Estimation
→ Abstract and Bio Due to the popularity of machine learning, many organizations view data as an invaluable resource, likening it to the "new oil/gold". However, unlike other types of resources, data can be freely replicated and used by many. Hence, data produced by one organization can, in principle, generate limitless value for many others. This can accelerate economic, social, and scientific breakthroughs and benefit society at large. However, strategic considerations such as free-riding, competition, and data monetization may prevent such open sharing of data between organizations. In this talk, I will focus on free-riding and show that in naive mechanisms for data sharing, agents may attempt to benefit from the data that others have collected without contributing any themselves. Free-riding can manifest in strategic behavior such as under-collecting data or submitting fabricated datasets. I will present some of our recent work studying these challenges in one of the most foundational statistical problems: normal mean estimation. Here, a set of strategic agents collect i.i.d. samples from a normal distribution at a cost and wish to estimate the mean of this distribution. To facilitate collaboration, we design mechanisms that incentivize agents to collect a sufficient amount of data and share it truthfully, so that they are all better off than working alone. Our approach prevents under-collection and data fabrication via two key techniques: first, when sharing the others’ data with an agent, the mechanism corrupts this dataset in proportion to how much the data reported by the agent differs from the others’; second, we design minimax-optimal estimators for the corrupted dataset. Our mechanism, which is Nash incentive compatible and individually rational, achieves a social penalty (the sum of all agents’ estimation errors and data collection costs) within a factor of 2 of the global minimum. When we extend this work to high-dimensional (non-Gaussian) distributions with bounded variance, where agents have heterogeneous data collection capabilities, our mechanisms retain these properties, with slightly weaker guarantees. This is joint work with Yiding Chen, Alex Clinton, and Jerry Zhu.
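
The corruption idea is easy to sketch (our simplification; the actual mechanism also supplies minimax-optimal estimators for the corrupted data and accounts for collection costs): the noise an agent receives scales with how far her report deviates from everyone else's.

```python
# Sketch of the corruption technique described above (our simplification
# of the mechanism): each agent receives the others' data with noise that
# scales with how far her reported mean is from the others'.
import numpy as np

def corrupted_release(own_report, others_data, alpha=1.0, rng=None):
    rng = rng or np.random.default_rng()
    discrepancy = abs(np.mean(own_report) - np.mean(others_data))
    noise_sd = alpha * discrepancy  # honest reports => little corruption
    return others_data + rng.normal(0, noise_sd, size=len(others_data))

rng = np.random.default_rng(7)
true_mean = 3.0
honest = rng.normal(true_mean, 1, 50)
others = rng.normal(true_mean, 1, 200)
fabricated = np.full(50, 10.0)  # a fabricated, self-serving report
print(np.std(corrupted_release(honest, others, rng=rng) - others))      # small
print(np.std(corrupted_release(fabricated, others, rng=rng) - others))  # large
```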

Bio: Kirthevasan Kandasamy is an assistant professor in the Department of Computer Sciences at the University of Wisconsin-Madison, working on topics in machine learning and game theory. He was a postdoctoral scholar at the University of California, Berkeley. He completed his PhD in Machine Learning at Carnegie Mellon University.

Previous talks can be found here.

About The Seminar

Seminar Organizers: Itai Ashlagi, Mohak Goyal, Tristan Pollner.

Faculty Involved: Itai Ashlagi, Ashish Goel, Ramesh Johari, Amin Saberi, Aaron Sidford, Johan Ugander, Irene Lo, Ellen Vitercik.

Note for Speakers: The talk slot is 55 minutes including questions (we often start a couple of minutes late). If you are giving a talk at RAIN, please plan for a 45-50 minute talk, since the audience usually asks a lot of questions. Also, the audience is fairly knowledgeable, so speakers should not feel obligated to provide basic game-theoretic, algorithmic, societal, industrial, probabilistic, or statistical background.

Website template from the Stanford MLSys Seminar Series.