The RAIN seminar is held on Wednesdays from 12:00-1:00pm in Y2E2 101 . And yes, lunch is provided!
RAIN schedule for Winter Quarter 2015-16
Near Feasible Stable Matchings
Optimization in Online Content Recommendation Services: Beyond Click-Through Rates
||Irad Ben-Gal||Modeling of Personal Mobility Behavioral Patterns
| March 2
RAIN schedule for Spring Quarter 2015-16
Google Calendar for RAIN
Previous year's talks
Archived talks can be accessed here.
Talk AbstractsNear Feasible Stable Matchings
Rakesh Vohra, University of Pennsylvania
Stable matchings when there are complementarities in the preferences of one side need not exist. One instance of where this might matter is in the National Resident Matching Program (NRMP). With the participation of couples, who have preferences over pairs of hospital slots, stable matchings need not exist. Indeed, the problem of determining whether a stable matching exists in the presence of couples is NP-complete. For any student preferences, we show that each instance of a matching problem has a `nearby' instance with a stable matching. The nearby instance is obtained by perturbing the capacities of the hospitals. Specifically, the capacity of each hospital will change (up or down) by no more than 4 and the total capacity of all hospitals will not go down and not down or increase by more than 9. These `perturbation’ bounds are independent of the size of the instance. We think the technique used to arrive at this result of independent interest and should be useful elsewhere. It involves a combination of Scarf’s lemma and randomized rounding.
Joint work with Thanh Nguyen
Bio: Professor Vohra is known for his application of mathematical programming techniques to the study of Game Theory and Mechanism Design. He formerly taught at Northwestern University, where he was the John L. and Helen Kellogg Professor of Managerial Economics and Decision Sciences in the Kellogg School of Management, with additional appointments in the Department of Economics and the Department of Electrical Engineering and Computer Science. He came to Penn as part as of the Penn Integrates Knowledge program that President Amy Guttmann established in 2005 as a University-wide initiative to recruit exceptional faculty members whose research and teaching exemplify the integration of knowledge across disciplines. His appointment is shared between the Department of Economics in the School of Arts and Sciences and the Department of Electrical and Systems Engineering in the School of Engineering and Applied Science.
Optimization in Online Content Recommendation Services: Beyond Click-Through Rates
Yonatan Gur, Stanford Graduate School of Business
A new class of online services allows internet media sites to direct users from articles they are currently reading to other content they may be interested in. This process creates a “browsing path” along which there is potential for repeated interaction between the user and the provider, giving rise to a dynamic optimization problem. A key metric that often underlies this recommendation process is the click-through rate (CTR) of candidate articles. While CTR is a measure of instantaneous click likelihood, we analyze the performance improvement that one may achieve by some lookahead that accounts for the potential future path of users. To that end, using a large data set of user path history at major media sites, we introduce and derive a representation of content along two key dimensions: clickability, the likelihood to click to an article when it is recommended; and engageability, the likelihood to click from an article when it hosts a recommendation. We then propose a class of heuristics that leverage both clickability and engageability, and provide theoretical support for favoring such path-focused heuristics over myopic heuristics that focus only on clickability (no lookahead). We conduct a live pilot experiment that measures the performance of a practical proxy of our proposed class, when integrated into the operating system of a worldwide leading provider of content recommendations. We estimate the aggregate improvement in clicks-per-visit relative to the CTR-driven current practice. The documented improvement highlights the importance and the practicality of efficiently incorporating for the future path of users in real time.
Joint work with Omar Besbes and Assaf Zeevi.
Bio: Yonatan Gur is an Assistant Professor of Operations, Information, and Technology at Stanford University’s Graduate School of Business. His research interests include stochastic modeling and optimization, data-driven decision-making, machine learning, and game theory with applications to revenue management, pricing, operations management, and service systems, and with a particular emphasis on dynamic, online environments. Yonatan received his PhD in Decision, Risk, and Operations from Columbia Business School. He also holds a B.Sc. from the school of Physics and Astronomy and an M.Sc. from the School of Mathematical Sciences, Tel Aviv University.
Modeling of Personal Mobility Behavioral Patterns
Irad Ben-Gal, Tel Aviv University
In recent years, personal location data is continuously captured by mobile devices, GPS chips and other sensors. Such data provides a unique learning opportunity on individuals’ mobility behavior that can be used for various applications in transportation, marketing, homeland security, smart cities and more. Nonetheless, modeling such data poses new challenges related to data volume, diversity, inhomogeneity as well as the required granularity level. In this talk, we will address a real ‘smart city’ use-case and cover some of the associated opportunities and challenges. We will present a new set of mobility-behavior models that generalizes Markov Chains and Variable-Order Bayesian Networks. We will discuss how they can be used in different applications such as pattern recognition, anomaly detection, clustering and classification.
Crowdsourcing Societal Tradeoffs
Vincent Conitzer, Duke University
It would be desirable if, as a society, we could reduce the amount of
landfill trash we create, the amount of carbon dioxide we emit, the
amount of forest we clear, etc. Since we cannot (or are in any case
not willing to) simultaneously pursue all these objectives to their
maximum extent, we must prioritize among them. Currently, this is
done mostly in an ad-hoc manner, with people, companies, local
governments, and other entities deciding on an individual basis which
of these objectives to pursue, and to what extent.
A more systematic approach would be to set, at a global level, exact numerical tradeoffs: using one gallon of gasoline is as bad as creating x bags of landfill trash. Having such tradeoffs available would greatly facilitate decision making, and reduce inefficiencies resulting from inconsistent decisions across agents. But how could we arrive at a reasonable value for x? I will discuss this from the viewpoint of computational social choice.
Joint work with Rupert Freeman, Markus Brill, and Yuqian Li
Bio: Vincent Conitzer is the Kimberly J. Jenkins University Professor of New Technologies and Professor of Computer Science, Professor of Economics, and Professor of Philosophy at Duke University. He received Ph.D. (2006) and M.S. (2003) degrees in Computer Science from Carnegie Mellon University, and an A.B. (2001) degree in Applied Mathematics from Harvard University. His research focuses on computational aspects of microeconomics, in particular game theory, mechanism design, voting/social choice, and auctions. This work uses techniques from, and includes applications to, artificial intelligence and multiagent systems. Conitzer has received the Social Choice and Welfare Prize (2014), a Presidential Early Career Award for Scientists and Engineers (PECASE), the IJCAI Computers and Thought Award, an NSF CAREER award, the inaugural Victor Lesser dissertation award, an honorable mention for the ACM dissertation award, and several awards for papers and service at the AAAI and AAMAS conferences. He has also been named a Guggenheim Fellow, a Kavli Fellow, a Bass Fellow, a Sloan Fellow, and one of AI's Ten to Watch. Conitzer and Preston McAfee are the founding Editors-in-Chief of the ACM Transactions on Economics and Computation (TEAC).
Design and Analysis for Experiments in Networks
Johan Ugander, Stanford University
A/B testing is a standard approach for evaluating the effect of online experiments; the goal is to estimate the Average Treatment Effect (ATE) of a new feature or condition by exposing a sample of the overall population to it. A drawback with A/B testing is that it is poorly suited for experiments involving social interference, when the treatment of individuals spills over to neighboring individuals along an underlying social network. This talk will describe a novel experimental framework, graph cluster randomization, for identifying average treatment effects under social interference. Given this framework, we analyze the variance of the average treatment effect as a property of the graph cluster design, and bias/variance trade-offs under exposure model misspecifications. This talk will discuss joint work with Lars Backstrom, Dean Eckles, Brian Karrer, Jon Kleinberg, and Joel Nishimura.
Bio: Johan Ugander is an Assistant Professor of Management Science &ame; Engineering at Stanford University. His research develops algorithmic and statistical frameworks for analyzing social networks, social systems, and other large-scale social data. Prior to joining Stanford he was a post-doctoral researcher at Microsoft Research, Redmond from 2014-2015, held an affiliation with the Facebook Data Science team from 2010-2014, and obtained his Ph.D. in Applied Mathematics from Cornell University in 2014.
Scalable Large Near-Clique Detection in Large-Scale Networks
Charalampos E. Tsourakakis, Harvard School of Engineering and Applied Sciences
Finding large near-(bi)cliques in massive networks is a notoriously hard
problem of great importance to many applications, including anomaly
detection and security, and community detection in social networks, and the Web graph.
But can we exploit idiosyncrasies of real-world networks to attack
this NP-hard problem successfully?
In this talk I will answer this question in the affirmative. Specifically,
I will present a significant advance in routines
with rigorous theoretical guarantees for scalable extraction
of large near-cliques from large-scale networks, the k-clique densest subgraph problem.
I will present exact, approximation, and MapReduce algorithms that allow us
to scale our computations to large-scale networks. At the heart of our scalable
algorithm lies a densest subgraph sparsifier theorem.
Our approach allows us to extract large near-(bi)cliques from a diverse collection of networks with
millions of edges within few seconds. Finally, I will present interesting graph mining findings
such as anomaly detection in real-world networks. I will conclude my talk with some interesting open problems.
Bio: Dr. Charalampos Tsourakakis is currently a CRCS Postdoctoral Fellow in the School of Engineering and Applied Sciences (SEAS) at Harvard University. He received his Ph.D. from the Algorithms, Combinatorics and Optimization (ACO) program at Carnegie Mellon University (CMU). He also holds a Master of Science from the Machine Learning Department at CMU. He did his undergraduate studies in the School of Electrical and Computer Engineering (ECE) at the National Technical University of Athens (NTUA). He is the recipient of a best paper award in IEEE Data Mining and has designed two graph mining libraries for tera-scale graphs. The former has been officially included in Windows Azure, and the latter was a research highlight of Microsoft Research. His main research interest lies in designing scalable and data-aware algorithms for massive networks that recover hidden structure and solve complex data-driven problems.
Racial Discrimination in the Sharing Economy: Evidence from a Field Experiment
Michael Luca, Harvard Business School
Online marketplaces increasingly choose to reduce the anonymity of buyers and sellers in order to facilitate trust. We demonstrate that this common market design choice results in an important unintended consequence: racial discrimination. In a large-scale field experiment on Airbnb, we find that requests from guests with distinctively African-American names are roughly 16% less likely to be accepted than identical guests with distinctively White names. The difference persists whether the host is African American or White, male or female. The difference also persists whether the host shares the property with the guest or not, and whether the property is cheap or expensive. Discrimination is costly for hosts who indulge in it: hosts who reject African-American guests are able to find a replacement guest only 35% of the time. On the whole, our analysis suggests a need for caution: while information can facilitate transactions, it also facilitates discrimination
Joint work with Benjamin Edelman and Dan Svirsky.
Bio: Michael Luca is an assistant professor of business administration at Harvard Business School. Professor Luca’s research applies econometric methods to field data in order to study the impact of information in market settings. He works closely with governments and companies to identify, design, implement, and experiment with different approaches to information disclosure, taking into account the behavioral foundations of how people make decisions. He has done work on rankings, expert reviews, online consumer reviews, and quality disclosure laws, among other types of information provision. Professor Luca has ongoing field research with Facebook, Yelp, several colleges, the UK government, and several U.S. cities, in addition to other partners. His current work focuses on crowdsourced reviews, analyzing a variety of companies including Yelp, Amazon, and ZocDoc. His findings have been written and blogged about in such media outlets as The Wall Street Journal, The New York Times, The Washington Post, The Huffington Post, Chicago Tribune, Harvard Business Review, PC World Magazine, and Salon. Professor Luca received his Ph.D. in economics from Boston University and a bachelor’s degree in economics and mathematics from SUNY Albany.
Bobby Kleinberg, Cornell
An important recent theme in the development of on-line social systems is the potential of crowdsourced effort to solve large problems — defining tasks in which many people can each contribute a small amount of time to the overall goal. In some cases the arrangement is based on a direct compensation scheme, in which a (low) rate is paid for each unit of work performed. But in many settings one only has access to a crowd "in the wild", as they go about their everyday activities. Examples include product recommendations, social news readers, and scientific activities ranging from crowdsourced "citizen science" to the funding of research by national agencies.
In all of these domains, there is a problem of misaligned incentives: the designer's goal is to carry out exploration (of the space of news stories, products, bird habitats, etc.) as efficiently as possible, but for reasons of scale they must implement the exploration via a crowd composed of members who each derive their own utility from participating in the exploration. We model this as a multi-armed bandit problem in which selfish, myopic agents pull arms with publicly observable outcomes, and a principal seeking to maximize the cumulative time-discounted reward may influence the agents by offering monetary (or other) rewards contingent on choosing particular actions. Our main result is a full characterization of the trade-off between the expected payments the principal must make and the total reward that can be achieved.
Joint work with Peter Frazier, David Kempe, and Jon Kleinberg.
Bio: Bobby Kleinberg is an Associate Professor of Computer Science at Cornell University. His research studies the design and analysis of algorithms, and their relations to economics, learning theory, and networks. Prior to receiving his doctorate from MIT in 2005, he spent three years at Akamai Technologies, where he assisted in designing the world's largest Internet Content Delivery Network. He is the recipient of a Microsoft Research Faculty Fellowship, an Alfred P. Sloan Foundation Fellowship, and an NSF CAREER Award.
Distributed Algorithms for Big Graph Analytics
Alex Dimakis, UT Austin
Many datasets from the web, social networks, bioinformatics and neuroscience are naturally graph-structured. Graph algorithms are difficult to implement in distributed computation frameworks like Hadoop MapReduce and for this reason several in-memory graph engines like Pregel, GraphLab and GraphX are being currently developed. Our goal is to design parallelizable algorithms that discover graph structural properties and theoretically analyze their performance. We give an overview of the emerging problems in designing distributed algorithms over such graph engines and discuss future directions.
Bio: Alex Dimakis is an Assistant Professor at the Electrical and Computer Engineering department, University of Texas at Austin. From 2009 until 2012 he was with the Viterbi School of Engineering, University of Southern California. He received his Ph.D. in 2008 and M.S. degree in 2005 in electrical engineering and computer sciences from UC Berkeley and the Diploma degree from the National Technical University of Athens in 2003. During 2009 he was a CMI postdoctoral scholar at Caltech. He received an NSF Career award in 2011, a Google faculty research award in 2012 and the Eli Jury dissertation award in 2008. He is the co-recipient of several best paper awards including the joint Information Theory and Communications Society Best Paper Award in 2012. He is currently serving as an associate editor for IEEE Signal Processing letters. His research interests include information theory, coding theory, signal processing, and networking, with a current focus on distributed storage and machine learning.
Estimating the Causal Impact of Recommendation Systems from Observational Data
Jake Hofman, MSR NYC
Recommendation systems are an increasingly prominent part of the web, accounting for up to a third of all traffic on several of the world's most popular sites. Nevertheless, little is known about how much activity such systems actually cause over and above activity that would have occurred via other means (e.g., search) if recommendations were absent. Although the ideal way to estimate the causal impact of recommendations is via randomized experiments, such experiments are costly and may inconvenience users. In this paper, therefore, we present a method for estimating causal effects from purely observational data. Specifically, we show that causal identification through an instrumental variable is possible when a product experiences an instantaneous shock in direct traffic and the products recommended next to it do not. We then apply our method to browsing logs containing anonymized activity for 2.1 million users on Amazon.com over a 9 month period and analyze over 4,000 unique products that experience such shocks. We find that although recommendation click-throughs do account for a large fraction of traffic among these products, at least 75% of this activity would likely occur in the absence of recommendations. We conclude with a discussion about the assumptions under which the method is appropriate and caveats around extrapolating results to other products, sites, or settings.
Bio: Jake Hofman is a Researcher at Microsoft Research in New York City, where his work in computational social science involves applications of statistics and machine learning to large-scale social data. Prior to joining Microsoft, he was a member of the Microeconomics and Social Systems group at Yahoo! Research. Jake is also an Adjunct Assistant Professor of Applied Mathematics at Columbia University, where he has designed and taught classes on a number of topics ranging from biological physics to applied machine learning. He holds a B.S. in Electrical Engineering from Boston University and a Ph.D. in Physics from Columbia University.
Competitive Algorithms from Competitive Equilibria
Kamesh Munagala, Duke University
Scheduling jobs is a fundamental problem that arises in numerous forms
and in various situations. In a typical scheduling problem, jobs of
unknown duration arrive in an online fashion. The rate at which jobs
get processed at any time instant depends on the resources allocated
by the scheduler to them. We introduce and study a general scheduling
problem that we term the Packing Scheduling problem (PSP), where these
rates are subject to arbitrary packing constraints. The PSP framework
captures a variety of scheduling problems that arise in data center
computing and multi-core architectures. More concretely, PSP models
multidimensional resource requirements and parallelizability, as well
as network bandwidth requirements arising in scheduling jobs on shared
clusters. It also captures many classical scheduling problems such as
heterogeneous machine scheduling, broadcast scheduling, and scheduling
jobs of different parallelizability.
For the PSP problem we show a constant competitive algorithm for minimizing total weighted completion time, when job arrivals and durations are unknown and adversarially chosen. Surprisingly, we achieve this result by applying the well-known Proportional Fairness algorithm from economics literature to allocate resources to jobs each time instant. Proportional Fairness has been extensively studied in the context of maximizing fairness in resource allocation; however, we present one of the first analyses in the context of scheduling. We also show that this algorithm has good theoretical guarantees for minimizing total delay (latency or flow time) in several special cases. Our results showcase the advantage of viewing complex scheduling problems as sequential resource allocation problems, and applying ideas from economics for both algorithm design and analysis.
Joint work with Sungjin Im (UC Merced) and Janardhan Kulkarni (Duke).
Bio: Kamesh Munagala is an Associate Professor of Computer Science at Duke University. He obtained his Ph.D. from Stanford University in 2003 and B.Tech. from IIT Bombay in 1998. He is primarily interested in approximation algorithms, sequential decision theory, and computational economics, as well as applications of these techniques to e-commerce, databases, and social networks. He is a recipient of the NSF CAREER Award and the Alfred P. Sloan Research Fellowship, and is a co-author of the best paper at WWW 2009. He was a Visiting Research Professor at Twitter in 2012, and currently serves as the Director of Graduate Studies for the Duke CS department.
Physics-inspired Algorithms and Phase Transitions in Community Detection
Cris Moore, Santa Fe Institute
Detecting communities, and labeling nodes according to which community they belong to, is an essential problem in the study of social and biological networks. One of the best and fastest algorithms for this problem is Belief Propagation, which updates probability distributions of node labels until they reach a fixed point. I'll give a friendly introduction to Belief Propagation, and the analogy with the cavity method of statistical physics. Don't worry, no physics is required!
In addition to being practically useful, our Belief Propagation algorithm can be studied analytically, revealing phase transitions in the ability of any algorithm to find communities. It also gives us principled ways to tell when communities are statistically significant, unlike algorithms that "overfit" and find illusory structure even in random networks. It can be extended to dynamic networks where nodes switch communities over time. Finally, I'll argue that the consensus of many good solutions is often better than the "best" single solution — and that this is true for many real-world optimization problems.
This is joint work with Aurelien Decelle, Florent Krzakala, Elchanan Mossel, Joe Neeman, Allan Sly, Lenka Zdeborova, and Pan Zhang.
Bio: Cristopher Moore received his B.A. in Physics, Mathematics, and Integrated Science from Northwestern University, and his Ph.D. in Physics from Cornell. He has published over 120 papers at the boundary between physics and computer science, ranging from quantum computing, to phase transitions in NP-complete problems, to the theory of social networks and efficient algorithms for analyzing their structure. With Stephan Mertens, he is the author of The Nature of Computation, published by Oxford University Press. He is a Professor at the Santa Fe Institute.
In 2014, Cris Moore was named a fellow of the American Physical Society for fundamental contributions to the at the interface between nonlinear physics, statistical physics, and computer science, including complex network analysis, phase transitions in NP-complete problems, and the computational complexity of physical simulation.
Optimal Contracts for Intermediaries in Online Advertising
Ozan Candogan, Duke Fuqua Business School
In online display advertising, the prevalent method advertisers employ to acquire impressions is to contract with an intermediary. These contracts involve upfront payments made by the advertisers to the intermediary, in exchange for running campaigns on their behalf. This paper studies the optimal contract offered by the intermediary in a setting where advertisers’ budgets and targeting criteria are private. This problem can naturally be formulated as a multi-dimensional dynamic mechanism design problem, which in general is hard to solve. We tackle this problem by employing a novel performance space characterization technique, which relies on delineating the expected cost and value achievable by any feasible (dynamic) bidding policy. This technique provides a convex optimization formulation of the optimal contract design problem. Using this formulation, we obtain a closed-form characterization of the optimal contract, when advertisers have identical value distributions. Conversely, when advertisers are heterogeneous, we provide a duality-based approach, which reduces the optimal contract design problem to a simpler convex optimization problem. The intermediary’s bid in the optimal contract is obtained by first using the optimal dual solution to compute a weighted average of the values associated with different types (to guarantee that the advertiser reports her type truthfully), and then shading this quantity (to account for budget constraints). Our results indicate that an intermediary can profitably provide bidding service to a budget-constrained advertiser, and at the same time increase the overall market efficiency.
Joint work with Santiago Balseiro (Duke University).
Bio: Ozan Candogan is an assistant professor at the Fuqua School of Business, where he is a member of the Decision Sciences area. Prior to joining Fuqua, he received his Ph.D. and M.S. degrees in Electrical Engineering and Computer Science from the Massachusetts Institute of Technology. His research interests are in mechanism design, decision making in social and economic networks, and analysis of dynamic strategic interactions. His most recent work focuses on the design of novel iterative auction mechanisms for efficient allocation of resources, and he was a finalist in the 2013 George Nicholson Student Paper Competition. Potential applications of his research include novel auction formats that can be employed for procurement, pricing and advertising mechanisms that optimally utilize available social network information, and pricing mechanisms for selling cloud computing service. His research has appeared in journals such as Management Science, Operations Research, Mathematics of Operations Research, and Games and Economic Behavior. He is also a recipient of the 2012 Microsoft Research Ph.D. fellowship, and 2009 Siebel Scholarship.
Randomized Field Experiments in Mobile Marketing
Anindya Ghose, NYU Stern
The explosive growth of smartphones and location-based services (LBS) has contributed to the rise of mobile advertising. In this talk, we present results from multiple randomized field experiments that are designed to measure the effectiveness of mobile marketing and promotions. In the first experiment, using data from a location based app for smartphones, we measure the effectiveness of mobile coupons. The aim is to analyze the causal impact of geographical distance between a user and retail store, the display rank, and coupon discounts on consumers’ response to mobile coupons. In a second large scale field study we examine the role of contextual crowdedness on the redemption rates of mobile coupons. We find that people become increasingly engaged with their mobile devices as trains get more crowded, and in turn become more likely to respond to targeted mobile messages. The change in behavior can be accounted for by the phenomenon of “mobile immersion”: to psychologically cope with the loss of personal space in crowded trains and to avoid accidental gazes, commuters can escape into their personal mobile space. In turn, they become more involved with targeted mobile messages they receive, and, consequently, are more likely to make a purchase in crowded trains. These studies causally show that mobile advertisements based on real-time static geographical location and contextual information can significantly increase consumers’ likelihood of redeeming a geo-targeted mobile coupon. However, beyond the location and contextual information, the overall mobile trajectory of each individual consumer can provide even richer information about consumer preferences. In a third study, we propose a new mobile advertising strategy that leverages full information on consumers’ offline moving trajectories. To examine the effectiveness of this new mobile trajectory-based advertising strategy, we designed a large-scale randomized field experiment in one of the largest shopping malls in the world. We find that mobile trajectory-based advertising can lead to highest redemption probability, fastest redemption behavior, and highest satisfaction rate from customers at the focal advertising store. Various practical implications for mobile marketing are discussed.
Bio: Anindya Ghose is a Professor of IT and a Professor of Marketing at New York University's Leonard N. Stern School of Business. He is the co-Director of the Center for Business Analytics at NYU Stern. He is the Robert L.& Dale Atkins Rosen Faculty Fellow and a Daniel P. Paduano Fellow of Business Ethics at NYU Stern. He has been a Visiting Associate Professor at the Wharton School of Business. He also serves as the Scientific Advisor to 3TI China. He was recently named by Business Week as one of the "Top 40 Professors Under 40 Worldwide" and by Analytics Week as one the "Top 200 Thought Leaders in Big Data and Business Analytics". His research has been recognized by 10 best paper awards and nominations. He has published more than 75 papers in premier scientific journals and peer reviewed conferences, and has given more than 200 talks internationally. His research has been profiled and he has been interviewed numerous times in the BBC, Bloomberg TV, CNBC, China Daily, Financial Times, Fox News, Forbes, Knowledge@Wharton, Korean Broadcasting News Company, Los Angeles Times, MSNBC, NBC, New York Times, New York Daily, National Public Radio, NHK Japan Broadcasting, Reuters, Time Magazine, Washington Post, Wall Street Journal, Xinhua, and elsewhere.