Skip to content Skip to navigation

Archive of RAIN talks

Spring 08-09


Date Speaker Topic
Apr 3 Mayur Datar
Google
Internet search advertising: Challenges in targeting and quality considerations
Apr 10
-
No meeting this week
Apr 17 D Sivakumar
Google
Affiliation Networks
Apr 24
-
No meeting -- WWW 2009
May 1
-
No meeting -- BAGT
May 8 Sandeep Pandey
Yahoo! Research
Using Bandit Models in Search Engines
May 14 Richard Chatwin
VP Research, Adchemy
Research Challenges in Online Advertising
May 22 Robert Johnson
Facebook
Scaling Social Data
May 29 Yury Lifshits
Yahoo! Research
Data Cloud: Integrating Structured Information into Web Search
June 5 Robbie Yan Manipulation Robustness of Collaborative Filtering Systems


Winter 08-09


Date Speaker Topic
Jan 16 - Organizational meeting
Jan 23 G.D. Ramkumar
SnapTell, Inc.
Scalable image recognition algorithms for camera phone applications
Jan 30 Luca de Alfaro Content-Driven Reputation and Trust for Collaborative Content
Feb 6 Ioannis Antonellis Tagging with Queries: How and Why?
(To appear in WSDM 2009)
Feb 13 Scott Brave
Baynote, Inc.
Content Targeting and Recommendation Systems
Feb 20 Ravi Kumar Online Social Networks: Modeling and Mining
Feb 27 Michael Mahoney Community Structure in Large Social and Information Networks
Mar 6 Kostas Kollias An Incentive-Based Architecture for Social Recommendations
Mar 13 Ali Dasdan Web Search Engine Metrics: Direct Metrics to Measure User Satisfaction


Autumn 08-09


Date Speaker Topic
Oct 3 - Talk by Jure Leskovec and organizational meeting
Oct 10 Hady Lauw Trusts among Users of Online Communities -- an Epinions Case Study
Oct 17 Canceled -
Oct 24 Bahman Bahmani Online Budgeted Matching in Random Input Models with
Applications to Adwords
Oct 31 Sergei Vassilvitskii Top-k Aggregation Using Intersection of Ranked Inputs
Nov 7 - No meeting - Bay Area Theory Seminar (BATS)
Nov 14 Mukund Sundararajan Utility-Maximizing Privacy Mechanisms
Nov 21 Hamid Nazerzadeh Online advertising
Nov 28 - No meeting - Happy Thanksgiving!
Dec 5 Tim Roughgarden From Bayesian to Worst-Case Optimal Auction Design


Spring 07-08


Date Speaker Topic
Apr 11BAGT
Apr 18Robbie Yan Reputation Markets
Apr 25Cancelled for WWW
May 2Bobji Mungamuru Click Fraud
May 9Ying Xu Oral Examination
May 16 Esteban Arcaute Network Formation Games
Mar 23Christina Aperjis Reputation Systems
May 30Evimaria Terzi Inferring Links and Initiators
June 6Yury Lifshits Social Design


Winter 07-08


Date Speaker Topic
Jan 25Alon Altman An Axiomatic Approach to Personalized Ranking Systems
Feb 1Ashish Goel Open Problems
Feb 8Cancelled
Feb 15Brian Babcock Adchemy
Feb 22Rajat Bhattacharjee Oral Examination
Feb 29Panagiotis Papadimitriou Web Graph Similarity for Anomaly Detection
Mar 7Brad Null Ratio Index for Budgeted Learning, with Applications
Mar 14Mukund Sundararajan Optimal Marketing Strategies over Social Networks


Autumn 07-08


Date Speaker Topic
Oct 15Michael Schapira Interdomain Routing and Games
Oct 22Esteban Arcaute Decentralized Search of Random Graphs
Oct 29Orkut Buyukkokten Social Network Revolution
Nov 5Aranyak Mehta Random input models for Adwords
Nov 12Phil Long Boosting the area under the ROC curve
Nov 19Thanks-giving
Nov 26cancelled
Dec 3Arash Asadpour Ad auctions with reserved price


Spring 06-07


Date Speaker Topic
Apr 13Ying Xu a discussion on WWW'07 papers
Apr 20BAGT
Apr 27Anish Das Sarma Detecting Near-Duplicates for Web Crawling
May 4Mayur Datar Google News Personalization
May 11Adam Meyerson Pricing Partially Compatible Products
May 18Sergei Vassilvitskii Click Fraud Resistant Methods for Learning Click-Through Rates
May 25Dilys Thomas Streaming Algorithms and Worm Fingerprinting
June 1Neil Daswani Anatomy of Clickbot.A


Winter 06-07


Date Speaker Topic
Jan 26Nicolas Lambert Asymptotically Optimal Repeated Auctions
Feb 2Hamid Nazerzadeh Mechanism design based on CPA
Feb 9Ashish Goel Reusable and multiple goods
Feb 16Ying Xu Estimating Search Engine Index and Web size
Feb 23Mukund Sundararajan Is efficiency expensive?
March 2Paul Heymann Collaborative tagging systems
March 9Mihaela Enachescu Multi-path Routing
March 16Aleksandra Korolova / Dilys Thomas Clickthrough of and Bidding on keywords


Autumn 06-07


Date Speaker Topic
Oct 25 Ying Xu The Netflix Challenge
Nov 1 Ashish/Alexandra Social Networks
Nov 8 Arash/Shubha PageRank
Nov 15 TBA TBA
Nov 22 - Netflix follow-up
Nov 29 Zoltan Gyongyi TBA
Dec 6 Sreenivas Gollapudi TBA


Spring 05-06


Date Speaker Topic
Apr 7 Ashish Goel Axiomatic PageRank
Apr 14 - Organizational Meeting
Apr 21 Sergei Vassilvitskii Relevance Feedback and Web Search
Apr 28 Hamid Nazerzadeh AdWords with Unreliable Estimates
May 5 Mihaela Enachescu Compact Routing
May 12 Ying Xu Streaming Algorithms for Graph Problems
May 19 - -
May 26 Udi Manber New Problems in Search
Jun 2 Rajat Bhattacharjee Incentive Based Ranking Mechanisms


Winter 05-06


Date Speaker Topic
Jan 20 - Organizational Meeting
Jan 27 Brad Null Multiarmed Bandit Problem
Feb 3 Edith Cohen Spatially-Decaying Aggregation
Feb 10 Andrew Tomkins Routing and Growth Models for Social Networks
Feb 17 - BAGT
Feb 24 Amin Saberi Open Problems
Mar 3 Mayur Datar, Ashutosh Garg Recommendation Systems
Mar 10 Amruta Joshi Googlebase


Fall 05-06


Date Speaker Topic
Oct 7 Roee Engelberg Equilibria in Online Games
Oct 14 Ying Xu, Sergei Vassilvitskii Models for the Web Graph
Oct 21 Gagan Aggarwal AdWords
Oct 28 Omar Benjelloun SERF
Nov 4 Zoltan Gyongyi Link Spam
Nov 11 Mihaela Enachescu Query Incentive Networks
Nov 18 - BATS
Dec 2 Rajat Bhattacharjee Survey of Recommendation Systems


Spring 04-05


Date Speaker Topic
Apr 4 Shubha Nabar Impact of Search Engines on Evolution of Page Popularity
Apr 11 Svetlozar Nestorov Mining Structure from Semistructured Data
Apr 18 Sepandar Kamvar Reputation Management in P2P Networks
Apr 25 Rina Panigrahy PageRank Axioms
May 2 Amin Saberi Spread of Viruses on the Internet
May 9 Ramesh Johari Stable Policy Routing with Provider Independence
May 16 Rajat Bhattacharjee Reputation Systems
May 23 Mehran Sahami Matching Text on the Web
Jun 6 Amruta Joshi Modeling Documents


Winter 04-05


Date Speaker Topic
Feb 2 Ying Xu Recommendation Systems/Collaborative Filtering
Feb 9 Ashish Goel Spam and Page Rank
Feb 16 Gagan Aggarwal AdWords
Feb 23 David Arthur Gossip and File Sharing
Mar 2 Anil Kamath AdWords
Mar 10 Sergei Vassilvitskii
Mar 16 Dilys Thomas Social Networks

Abstracts

Internet search advertising: Challenges in targeting and quality considerations
Mayur Datar


In this informal talk, I will touch upon two topics related to search advertising:

Targeting: How do you match advertiser keywords to user queries? What are the underlying challenges in bridging this semantic gap?

How do you measure the quality of search ads? Is it in the best interest of all parties involved, users, advertisers and the search engine, to always show ads for queries? What are the costs and benefits to showing ads and how do you decide on when is it appropriate to show ads?

Look forward to more questions than answers! Hopefully these will lead to good research topics.

Bio:
--------------
Mayur Datar works as a Research Scientist with Google Inc. His research interests are in datamining, algorithms, databases and computer science theory. Prior to joining Google, Mayur obtained his doctorate degree in computer science from Stanford university and a Bachelor of Technology degree from I.I.T. Bombay. He was awarded the President of India, Gold Medal for being the most outstanding student of his graduating batch from I.I.T. Bombay. He has published several papers in renowned conferences like SIGMOD, VLDB, KDD, FOCS, SODA, WWW.





Affiliation Networks
D. Sivakumar


In the last decade, structural properties of several naturally arising networks (the Internet, socialnetworks, the web graph, etc.) have been studied intensively with a view to understanding their evolution. In recent empirical work, Leskovec, Kleinberg, and Faloutsos identify two new and surprising properties of the evolution of many real-world networks: densification (the ratio of edges to vertices grows over time), and shrinking diameter (the diameter reduces over time to a constant). These properties run counter to conventional wisdom, and are certainly inconsistent with graph models prior to their work.

In this work, we present the first model that provides a simple, realistic, and mathematically tractable generative model that intrinsically explains all the well-known properties of the social networks, as well as densification and shrinking diameter. Our model is based on ideas studied empirically in the social sciences, primarily on the groundbreaking work of Breiger (1973) on bipartite models of social networksthat capture the affiliation of agents to societies.

We also present algorithms that harness the structural consequences of our model. Specifically, we show how to overcome the bottleneck of densification in computing shortest paths between vertices by producing sparse subgraphs that preserve or approximate shortest distances to all or a distinguished subset of vertices. This is a rare example of an algorithmic benefit derived from a realistic graph model.

The talk will review the history of relevant random graph models, and attempt to illustrate the co-evolution of theoretical modeling and empirical insights in this fascinating slice of graph theory.

The talk reports work done jointly with Silvio Lattanzi of Sapienza University, Roma, Italy, and is to appear in STOC 2009.





Using Bandit Models in Search Engines
Sandeep Pandey


Web search engines perform two basic tasks: they acquire Web pages via crawling, and present lists of pages and advertisements in response to user search queries. Given the gigantic size of the Web these tasks can be viewed as constrained optimization problems. On the acquisition side the constraint is on the crawling resources, while on the presentation side, the constraint is on user attention: users cannot be expected to sift through large amounts of content. Furthermore, some parameters needed to solve these optimization problems may or may not be known leading to the offline and online versions of the problems. In this talk we will look at these optimization problems.

In particular, we will focus on the online presentation of advertisements where the unknown parameters are the ad quality values. I will present a method of simultaneously estimating unknown parameters and exploiting current estimates. Our method extends and adapts well-known "Bandit" models from Machine Learning.

Bio:
--------------
Sandeep Pandey is a Research Scientist in Search and Advertising group at Yahoo! Research. Before joining Yahoo!, he received his Ph.D. from Carnegie Mellon and his B.Tech. from IIT Bombay. His research interests include Web crawling and advertising. He also likes to work on online learning problems such as the multi-armed bandit problem. In his spare time Sandeep enjoys watching cricket and playing tennis.





Research Challenges in Online Advertising
Richard Chatwin
VP Research, Adchemy


Adchemy is leading a revolution in online performance marketing, both as a top-25 U.S. online advertiser and through the development of the Adchemy Digital Marketing Platform, an integrated suite of products that allows online marketers to maximize their customers' end-to-end experience. I will discuss some of the research challenges that we are tackling in developing our technology solutions, including problems related to online matching, multi-armed bandits, and the optimizer's curse.

Bio:
--------------
Richard has been VP Research at Adchemy since its inception in late 2004. He has over nine years of experience in the online advertising industry. He received his PhD from the Operations Research department at Stanford, where he focused on airline revenue management. He spent several years in the consulting world pioneering the development and implementation of real option valuation methodologies for asset valuation.





Scaling Social Data
Robert Johnson
Director of Engineering, Facebook


Facebook serves billions of pages a day based on hundreds of terabytes of dynamic data, which presents many scaling challenges for our storage and caching systems. In particular the social nature of the data gives it some interesting properties - it is not easily clustered or partitioned by users or applications, it updates frequently, and many queries require massive network fan-in. I'll be discussing the problems that arise in scaling such a system and some of the solutions we've used to get to our current scale, as well as our plans for infrastructure to solve these problems at even larger scale and in a more general way for future web 2.0 applications with similar properties.

Bio:
--------------
Robert Johnson is Director of Engineering at Facebook, where he leads the software development efforts to cost-effectively scale Facebook's infrastructure and optimize performance for its many millions of users. During his time with the company, the number of users has expanded by more than twenty-fold and Facebook now handles billions of page views a day. Robert was previously at ActiveVideo Networks where he led the distributed systems and set-top software development teams. He has worked in a wide variety of engineering roles from robotics to embedded systems to web software. He received a B.S. In Engineering and Applied Science from Caltech.





Data Cloud: Integrating Structured Information into Web Search
Yury Lifshits


In this talk we address two questions: (1) How to use structured data in web search? (2) How to gather structured data? For the first question we identify valuable classes of data, present query classes that can benefit from structured data and describe architecture that combines keyword search with structured search. For the second question we present Data Cloud: An ecosystem of data publishers, search engine (data cloud) and data consumers.
We show connection from Data Cloud strategy to classic notion in economics: network effect in two-sided markets. At the end of the talk an early demo implementation will be presented.

Bio:
--------------
Yury Lifshits obtained his PhD degree from Steklov Institute of Mathematics at St.Petersburg (2007). He spent a year as a postdoc at Caltech before joining Yahoo! Research in 2008. He won two gold medals at International Mathematical Olympiads, received the Best Paper Award in Application Track of CSR'07 and the Yandex Teaching Award (2006) for his course "Algorithms for Internet". Yury is a maintainer of "The Homepage of Nearest Neighbors": http://simsearch.yury.name





Manipulation Robustness of Collaborative Filtering Systems
Robbie Yan


While the expanding universe of products available via Internet commerce provides consumers with valuable options, sifting through the numerous alternatives to identify desirable choices can be challenging. Collaborative filtering systems aid this process by recommending to users products desired by similar individuals. Because purchase decisions are influenced by these systems, however, they have become targets of manipulation by unscrupulous vendors. In this talk, we will discuss our findings on the robustness of collaborative filtering systems to manipulation. In particular, we provide theoretical and empirical results demonstrating that while nearest neighbor algorithms, which are widely used in commercial systems, can be highly susceptible to manipulation, two classes of collaborative filtering algorithms which we refer to as linear and asymptotically linear are relatively robust. In particular, we find that as a user rates an increasing number of products, the average accuracy of predictions made by these algorithms becomes insensitive to manipulated data. Our results provide guidance for the design of future collaborative filtering systems.





Scalable image recognition algorithms for camera phone applications
G.D. Ramkumar, SnapTell, Inc.


SnapTell has created technology that allows a mobile phone user to snap a picture of a product and instantly receive reviews, prices, and information. The service has been scaled to support more than ten million products including all books, DVDs, CDs, video games, and other product categories. We present an outline of the algorithms used to power the service emphasizing scaling and locality sensitive indexing. SnapTell's popular iPhone application has been downloaded by over half a million users over the last few weeks.

The technology can also be used to make print advertisements in a magazine, newspaper, poster, or billboard interactive without adding barcodes or otherwise altering the content. Consumers snap a picture of the object and send it to a shortcode to enter into a contest, access a mobile website, or download ringtones, wallpaper, video clips, or other information. SnapTell is the leading vendor in this space and offers a solution with unprecedented scale and robustness.

We present a synopsis of the image matching algorithms used to power the service and discuss potential research topics in the space.





Content-Driven Reputation and Trust for Collaborative Content
Luca de Alfaro (UCSC, visiting at Google)


Open collaboration is one of the most effective ways to create content and aggregate information, as witnessed by the Wikipedia. In this talk, we propose a reputation systems for the authors of collaborative information, and a trust systems for the aggregated information.

The goal of the reputation system is to provide an incentive to collaboration, and to provide information about the quality of authors. We propose a content-driven notion of reputation, where authors of long-lasting contributions gain reputation, while authors of contributions that are not preserved lose reputation. We show that, on the Wikipedia, the content-driven reputation of authors is a good predictor of the quality of their future contributions. We then describe a trust system for content, where the trust in a portion of content is computed according to the reputations of the authors who created and revised it. Again, we show that on the Wikipedia content trust is a good predictor of future content longevity. Finally, we describe how the reputation and trust systems can be made robust to Sybil attacks, in which people use multiple identities to try boost their reputation and raise the trust of content of their choice.

The reputation and trust systems have been implemented in the WikiTrust system/.

Bio:
--------------
Luca de Alfaro received his Ph.D. degree from Stanford University, where he worked on system specification and verification under the supervision of Zohar Manna. After graduation, he was for three years at UC Berkeley as a postdoctoral researcher, working with Thomas A. Henzinger on verification and on game theory. In 2001, Luca de Alfaro joined the faculty of UC Santa Cruz. His research interests involve system specification and verification, embedded software, game theory, incentive systems for on-line collaboration, and trust and reputation on the web. Luca de Alfaro is currently at Google as a visiting scientist, on leave from UC Santa Cruz.

Luca de Alfaro has long been an enthusiastic user of on-line collaborative tools, including wikis. One evening in 2005, after working at the wiki he uses to facilitate information sharing among members of his research group, he decided to start a wiki on cooking, which is one of his passions. The wiki was soon peppered with spam and less than useful contributions. Rather than cleaning up, a most tedious job, Luca de Alfaro shut off the wiki temporarily, and he started pondering how to provide incentives to collaboration, and compute quality metrics for content. As he found the job of rating authors or contributions by hand most tedious, he was interested in methods for computing reputation and quality metrics purely from content analysis. The research led to the WikiTrust system/. The wiki on cooking has been brought back to life (http://www.cookiwiki.org), even though it receives little traffic. You are welcome to register, add recipes, and see WikiTrust in action.





Tagging with Queries: How and Why? (To appear in WSDM 2009)
Ioannis Antonellis


Web search queries capture the information need of search engine users. Search engines store these queries in their logs and analyze them to guide their search results.

In this work, we argue that not only a search engine can benefit from data stored in these logs, but also the web users. We first show how clickthrough logs can be collected in a distributed fashion using the http referer field in web server access logs. We then perform a set of experiments to study the information value of search engine queries when treated as "tags" or "labels" for the web pages that both appear as a result and the user actually clicks on. We ask how much extra information these query tags provide for web pages by comparing them to tags from the del.icio.us bookmarking site and to the pagetext. We find that query tags can provide substantially many (on average 250 tags per URL), new tags (on average 125 tags per URL are not present in the pagetext) for a large fraction of the Web.

Joint work with Hector Garcia-Molina and Jawed Karim





Content Targeting and Recommendation Systems
Scott Brave (Baynote, Inc)


Baynote delivers on-demand recommendation technology and social search for over 200 websites, including Motorola, NASA, Intuit, and Expedia. By tapping into the wisdom of the online community and automatically displaying the best content and products, Baynote's Collective Intelligence Platform (CIP) improves the user experience while increasing online revenue, leads and impressions. In this talk, I will present Baynote's unified approach to content targeting and discuss some of the opportunities and challenges in building next-generation recommendation systems.

Bio:
--------------
Scott Brave is a co-founder and CTO at Baynote, Inc. Prior to Baynote, he was a postdoctoral scholar in the Department of Communication at Stanford University where he served as lab manager for the CHIMe (Communication between Humans and Interactive Media) Lab. Scott received his Ph.D. in Human-Computer Interaction (Communication Dept.), and B.S. in Computer Systems Engineering (AI focus) from Stanford University, and his M.S. from the MIT Media Lab. Scott is also a former Associate Editor of the "International Journal of Human-Computer Studies" and co-author of Wired for speech: How voice activates and advances the human-computer relationship.





Online Social Networks: Modeling and Mining
Ravi Kumar (Yahoo! Research)


Online social networks have become major and driving phenomena on the web. In this talk we will address key modeling and algorithmic questions related to large online social networks. From the modeling perspective, we raise the question of whether there is a generative model for network evolution. The availability of time-stamped data makes it possible to study this question at an extremely fine granularity. We exhibit a simple, natural model that leads to synthetic networks with properties similar to the online ones. From an algorithmic viewpoint, we focus on data mining challenges posed by the magnitude of data in these networks. In particular, we examine topics related to influence and correlation in user activities in such networks.

Bio:
--------------
Ravi Kumar joined Yahoo! Research in July 2005. Prior to this, he was a research staff member at the IBM Almaden Research Center in the Computer Science Principles and Methodologies group. He obtained his PhD in Computer Science from Cornell University in December 1997. His primary interests are web algorithms, algorithms for large data sets, and theory of computation.





Community Structure in Large Social and Information Networks
Michael Mahoney (Stanford University)


The concept of a community is central to social network analysis, and thus a large body of work has been devoted to identifying community structure. For example, a community may be thought of as a set of web pages on related topics, a set of people who share common interests, or more generally as a set of nodes in a network more similar amongst themselves than with the remainder of the network. Motivated by difficulties we experienced at actually finding meaningful communities in large real-world networks, we have performed a large scale analysis of a wide range of social and information networks. Our main methodology uses local spectral methods and involves computing isoperimetric properties of the networks at various size scales -- a novel application of ideas from statistics and scientific computation to internet data analysis. Our empirical results suggest a significantly more refined picture of community structure than has been appreciated previously. Our most striking finding is that in nearly every network dataset we examined, we observe tight but almost trivial communities at very small size scales, and at larger size scales, the best possible communities gradually ``blend in'' with the rest of the network and thus become less ``community-like.'' This behavior is not explained, even at a qualitative level, by any of the commonly-used network generation models. Moreover, this behavior is exactly the opposite of what one would expect based on experience with and intuition from expander graphs, from graphs that are well-embeddable in a low-dimensional structure, and from small social networks that have served as testbeds of community detection algorithms. Possible mechanisms for reproducing our empirical observations will be discussed, as will implications of these findings for clustering, classification, and more general data analysis in modern large social and information networks.

Bio:
--------------
Michael Mahoney is currently a research scientist in the math department at Stanford University. He is currently working on geometric network analysis; developing approximate computation and regularization methods for large informatics graphs; and applications to community detection, clustering, and information dynamics in large social and information networks. His research interests also include randomized algorithms for matrices and regression problems, applications to DNA single nucleotide polymorphism data, and large-scale statistical data analysis more generally. He has been a faculty member at Yale University and a researcher at Yahoo.





An Incentive-Based Architecture for Social Recommendations
Kostas Kollias


Recommender systems and social networking services are important Internet monetization tools and their popularity greatly facilitates web-based commercial activity. In this talk we will focus on how these tools can be coupled in order to form an architecture that incentivizes users to provide honest recommendations in a social network. The architecture consists of one distinct reputation system for each user, a recommendation language which allows users to place their reviews, a product ranking algorithm, and a revenue sharing mechanism. We will discuss the details of the architecture, the properties that allow users to profit by placing honest recommendations and how fast these recommendations lead to an equilibrium state.

Joint work with Rajat Bhattacharjee and Ashish Goel





Web Search Engine Metrics: Direct Metrics to Measure User Satisfaction
Ali Dasdan


Web search is an indispensable resource for users to search for information on the Web. Web search is also an important service for publishers and advertisers to present their content to users. Thus, user satisfaction is the key and must be quantified. In this talk, our goal is to give a practical review of web search metrics from a user satisfaction point of view. This viewpoint is intuitive enough for a typical user to express his or her web search experience using it. The metrics that we are planning to cover fall into the following areas: relevance, coverage, comprehensiveness, diversity, discovery freshness, content freshness, and presentation. We will also describe how these metrics can be mapped to proxy metrics for the stages of a generic search engine pipeline. Our hope is that practitioners can apply these metrics readily and that researchers can get problems to work on, especially in formalizing and refining metrics.

This short talk is based on a half-day tutorial that the presenter with Kostas Tsioutsiouliklis and Emre Velipasaoglu will present at the WWW'09 conference.

Bio:
--------------
Ali Dasdan is a director managing the metrics, analysis, and algorithms group of Yahoo! Web Search. His group is responsible for defining and implementing white-box metrics for production web search systems, production web search data, and the Web. His group is also responsible for monitoring the metrics and diagnosing the causes for anomalies detected. His research interests are in web search and advertising since 2006. He obtained his PhD in Computer Science from the University of Illinois at Urbana-Champaign in 1999.





Trusts among Users of Online Communities -- an Epinions Case Study
Hady Lauw


Trust between a pair of users is an important piece of information for users in an online community (such as electronic commerce websites and product review websites) where users may rely on trust information to make decisions. In this paper, we address the problem of predicting whether a user trusts another user. Most prior work infers unknown trust ratings from known trust ratings. The effectiveness of this approach depends on the connectivity of the known web of trust and can be quite poor when the connectivity is very sparse which is often the case in an online community. In this paper, we therefore propose a classification approach to address the trust prediction problem. We develop a taxonomy to obtain an extensive set of relevant features derived from user attributes and user interactions in an online community. As a test case, we apply the approach to data collected from Epinions, a large product review community that supports various types of interactions as well as a web of trust that can be used for training and evaluation. Empirical results show that the trust among users can be effectively predicted using pre-trained classifiers.



Bio

Hady is currently a Visiting Researcher at Microsoft Search Labs. Prior to that, he earned his Ph.D. at Nanyang Technological University, Singapore under a fellowship funded by A*STAR (Singapore's NSF-equivalent). His research interests are in social networks and Web mining, covering such topics as rating behaviors, trust, and mass collaboration. His work has been published in various conferences and journals such as KDD, SDM, CIKM, WSDM, EC, and TKDE.



Top-k Aggregation Using Intersection of Ranked Inputs
Sergei Vassilvitskii


There has been considerable past work on efficiently computing top k objects by aggregating information from multiple ranked lists of these objects. An important instance of this problem is query processing in search engines: one has to combine information from several different posting lists (rankings) of web pages (objects) to obtain the top k web pages to answer user queries. Two particularly well-studied approaches to achieve efficiency in top-k aggregation include early-termination algorithms (e.g., TA and NRA) and pre-aggregation of some of the input lists. However, there has been little work on a rigorous treatment of combining these approaches.

We generalize the TA and NRA algorithms to the case when pre-aggregated intersection lists are available in addition to the original lists. We show that our versions of TA and NRA continue to remain ``instance optimal,'' a very strong optimality notion that is a highlight of the original TA and NRA algorithms. Using an index of millions of web pages and real-world search engine queries, we empirically characterize the performance gains offered by our new algorithms. We show that the practical benefits of intersection lists can be fully realized only with an early-termination algorithm.

Joint work with Ravi Kumar, Kunal Punera and Torsten Suel



Utility-Maximizing Privacy Mechanisms
Mukund Sundararajan


A mechanism for releasing information about a statistical database with sensitive data must resolve a trade-off between utility and privacy. Informally, publishing fully accurate information maximizes utility while minimizing privacy, while publishing random noise accomplishes the opposite. Formally, privacy can be rigorously quantified using the framework of {\em differential privacy}, which requires that a mechanism's output distribution is nearly the same (in a strong sense) whether or not a given database row is included or excluded. Thus far, (dis)utility has typically been quantified via the expected error of a (randomly perturbed) response to a query.

We pursue much stronger and more general utility guarantees. Just as differential privacy guarantees protection against every potential attacker, independent of its side information, we seek a mechanism that guarantees near-optimal utility to every potential user, independent of its side information. Formally, we model the side information of a potential user as a prior distribution over query results. An interaction between such a user and a mechanism induces a posterior distribution -- the utility of the mechanism for this user is defined as the accuracy of this posterior as quantified via a user-specific loss function. A differentially private mechanism $M$ is (near-)optimal for a given user $u$ if $u$ derives (almost) as much utility from $M$ as from any other differentially private mechanism.

We prove that for each fixed counting query and differential privacy level, there is a {\em geometric mechanism} $M^*$ --- a discrete variant of the well-studied Laplace mechanism --- that is {\em simultaneously utility-maximizing} for every possible user, subject to the differential privacy constraint. This is an extremely strong utility guarantee: {\em every} potential user $u$, no matter what its side information and loss function, derives as much utility from $M^*$ as from interacting with a differentially private mechanism $M_u$ that is optimally tailored to $u$. The first part of our proof characterizes the utility-maximizing differentially private mechanism for a fixed but arbitrary user in terms of a basic feasible solution to a linear program with constraints that encode differential privacy. The second part shows that all of the relevant vertices of this polytope (ranging over all possible users) are derivable from the geometric mechanism via suitable remappings of its range.

Joint work with Arpita Ghosh and Tim Roughgarden.



Online Advertising
Hamid Nazerzadeh


Online advertising is a multi-billion dollar growing market with very interesting properties. I briefly explain some of these properties, and talk about ad allocation problems in the context of sponsored search and display advertising.

References:

  • Dynamic Cost-Per-Action Mechanisms and Applications to Online Advertising, with Amin Saberi, and Rakesh Vohra.
    Proceedings of the 17th International World Wide Web Conference (WWW), 179-188, 2008.
  • A Combinatorial Allocation Mechanism with Penalties For Banner Advertising, with Uriel Feige, Nicole Immorlica, and Vahab S. Mirrokni.
    Proceedings of the 17th International World Wide Web Conference (WWW), 169-178, 2008.
  • Advertisement Allocation for Generalized Second Pricing Schemes, with Ashish Goel, Mohammad Mahdian, and Amin Saberi
    Fourth Workshop on Ad Auctions, 2008
  • Allocating Online Advertisement Space with Unreliable Estimates, with Mohammad Mahdian and Amin Saberi.
    Proceedings of the 8th ACM Conference on Electronic Commerce (EC), 288-294, 2007.



From Bayesian to Worst-Case Optimal Auction Design
Tim Roughgarden


We define a general template for auction design that explicitly connects Bayesian optimal mechanism design, the dominant paradigm in economics, with worst-case analysis. In particular, we establish a general and principled way to identify appropriate performance benchmarks for prior-free optimal auction design. This framework enables, for the first time, meaningful competitive analysis for auction problems with allocation asymmetries, bidder asymmetries, and costly payments.

The above is joint work with Jason Hartline (from STOC 2008 and ongoing). Time permitting, we will also briefly discuss another recent application of Bayesian optimal auction design: rigorous efficiency-revenue approximation trade-offs (joint work with Shaddin Dughmi and Mukund Sundararajan).

Social Design
Yury Lifshits
, Caltech, http://yury.name

Social Design is an art and science of setting and managing incentives in web applications. So far, machine learning is the dominant approach to all central algorithmic problems in the Web: search, advertising, recommendation systems. Social design can become an important counterpart to ML, if we find the ways to incentivise users to give us more and better input data and even the answers themselves.
This talk is an outline of a new research field of Social Design. We start with several stories including Internet press conference of president Putin, reputation crash in largest Russian social network, and Yahoo! SearchMonkey project. Then we attempt to extract, characterize and classify the commonly used patterns in social design. Finaly, we set up the research agenda in the area with focus on social aspects of semantic web and advertising technologies.
Slides available at http://yury.name/talks.html
Bio: Yury Lifshits obtained his PhD degree from Steklov Institute of Mathematics at St.Petersburg (2007). He is currently a postdoc at Caltech. He won two gold medals at International Mathematical Olympiads, received the Best Paper Award in Application Track of CSR'07 and the Yandex Teaching Award (2006) for his course "Algorithms for Internet". Yury is a maintainer of "The Homepage of Nearest Neighbors": http://simsearch.yury.name

Inferring Links and Initiators
Evimaria Terzi
,IBM

Consider a 0-1 observation matrix M, where rows correspond to entities and columns correspond to signals; a value of 1 (or 0) in cell (i; j) of M indicates that signal j has been observed (or not observed) in entity i. Given such a matrix we study the problem of inferring the underlying directed links between entities (rows) and finding which entities act as initiators. I will formally define this problem and propose an MCMC framework for estimating the links and the initiators given the matrix of observations M. I will also show how this framework can be extended to incorporate a temporal aspect; instead of considering a single observation matrix M we consider a sequence of observation matrices M1; : : : ;Mt over time. Finally I will show that this problem is a generalization of several problems studied in the field of social-network analysis.
Bio: Evimaria Terzi is a Research Staff Member at IBM Almaden since June 2007. She obtained her PhD from the University of Helsinki (Finland) in January 2007, MSc from Purdue University (USA) in 2002 and BSc from the University of Thessaloniki (Greece) in 2000.

Reputation Mechanisms for Online Markets
Christina Aperjis

In online marketplaces potential buyers have to decide whether to buy an item and how much to pay for it without being able to observe it and without knowing whether the seller is trustworthy. Reputation mechanisms promote trust by giving information about past transactions. With the goal of incentivizing sellers to advertise truthfully, we study how the marketplace is affected by (i) the reputation mechanism, and (ii) the way buyers interpret the seller?s reputation score.

Two-Stage Myopic Dynamics in Network Formation Games
Esteban Arcaute

We consider two-stage myopic dynamics for network formation games, where utility to a node is based on a function of the distance to all other nodes. Since our network formation game is a generalization of Myerson's announcement game, we use pairwise Nash stability as the solution concept.
We prove that, although the price of anarchy of the static game is unbounded (in the number of nodes in the network), our dynamics converge to a subset of pairwise Nash stable networks with a constant efficiency ratio. For some specific utility functions, we prove that our dynamics converge to efficient networks.
This is joint work with Ramesh Johari and Shie Mannor.

Measuring and Modeling the Web
Ying Xu

The last couple of decades have witnessed a phenomenal growth in the World Wide Web. The Web has become an ubiquitous channel for information sharing and dissemination. This talk describes several research contributions in an endeavor towards a better understanding of the Web.
One of the basic questions about the Web is its size, i.e., how many webpages there are on the Web. A closely related problem is the sizes of search engine indexes, as search engine index defines an important subset of the Web that is most easily accessible to users. In the last ten years both the scientific literature and the popular press dealt at length with methodologies and estimates for the sizes of the Web and public search engines. In the first part of the talk, we propose algorithms for measuring the Web size, including both empirical and theoretical results.
The Web is so large that even estimating its size is a hard problem. To better understand the Web, people study random web graph models -- a stochastic process that generates random graphs that with high probability resemble the Web graph. Random web graph models are useful theoretical tools to provide insights to the evolution of the Web graph, to predict future behaviors, and to generate synthetic data sets for testing and simulation. In the second half of the talk, we briefly present several results in analyzing web graph models and applying the models to real world applications.

Should ad networks bother fighting click fraud?
Bobji Mungamuru

Suppose an ad network decides that a given click-through is invalid (or "fraudulent"). The implication is that the ad network will not charge the advertiser for that click-through. Therefore, arguably, the ad network is "taking money out of his own pocket" by marking clicks invalid. As such, should ad networks even bother fighting fraud? We analyze a simple economic model of the online advertising market, and conclude that the answer is "yes".

Reputation systems
Robbie Yan

Effective reputation systems should incentivize users to acquire and reveal information about content quality. They should also counter distortions induced by strategic agents that benefit from manipulating reputations. Mechanisms have been proposed in the recent literature with the aim to provide such incentives. These mechanisms, which we will refer to as reputation markets, can be viewed as information markets designed to assess the quality of online content. In this paper we develop equilibrium models to study how incentives created by a reputation market should influence community behavior and the accuracy of assessments.

Optimal Marketing Strategies over Social Networks
Mukund Sundararajan

Abstract: We study revenue maximization in social networks. We identify a family of strategies called 'influence and exploit' strategies that are easy to implement, easy to optimize over and approximately optimal.
Based on a paper that will appear in WWW 08. Jointly with Jason Hartline (Northwestern University) and Vahab Mirrokni (MSR Redmond).i

Ratio Index for Budgeted Learning, with Applications
Brad Null

In the budgeted learning problem, we are allowed to experiment on a set of alternatives (given a fixed experimentation budget) with the goal of picking a single alternative with the largest possible expected payoff. Approximation algorithms for this problem were developed by Guha and Munagala by rounding a linear program that couples the various alternatives together. We present an index for this problem, which we call the ratio index, which also guarantees a constant factor approximation. Index based policies have the advantage that a single number (i.e. the index) can be computed for each alternative irrespective of all other alternatives, and the alternative with the highest index is experimented upon. This is analogous to the famous Gittins index for the discounted multi-armed bandit problem.
The ratio index has several interesting structural properties. First, it can be computed in strongly polynomial time; the techniques we develop may be useful for deriving strongly polynomial algorithms for other Markov Decision Problems. Second, with the appropriate discount factor, the Gittins index and the ratio index are constant factor approximations of each other, and hence the Gittins index also gives a constant factor approximation to the budgeted learning problem. Finally, the ratio index can be used to obtain a "discount oblivious" solution to the multi-armed bandit problem, which in turn gives constant factor approximations to a wide class of problems that we call "list ranking problems".
Joint work with Rajat Bhattacharjee, Ashish Goel, and Sanjeev Khanna

Web Graph Similarity for Anomaly Detection
Panagiotis Papadimitriou

Web graphs are approximate snapshots of the web, created by search engines. Their creation is an error-prone procedure that relies on the availability of Internet nodes and the faultless operation of multiple software and hardware units. Checking the validity of a web graph requires a notion of graph similarity. Web graph similarity helps measure the amount and significance of changes in consecutive web graphs. These measurements validate how well search engines acquire content from the web. In this work we study five similarity schemes: three of them adapted from existing graph similarity measures and two adapted from well-known document and vector similarity methods. We compare and evaluate all five schemes using a sequence of web graphs for Yahoo! and study if the schemes can identify anomalies that may occur due to hardware or other problems.

Adchemy
Brian Babcock
,Adchemy

I'll talk about a number of computational problems that arise in online advertising, primarily from the perspective of the buyer of advertising.
Topics to be discussed include:
* When buying sponsored search links from search engines, how does one manage a portfolio of millions of keywords with minimal human involvement? Challenges include determining how much to bid for each keyword and what ad text to display to attract the most customers.
* How can one learn what properties of a web site or advertisement will lead to the greatest number of purchases or clicks? The problem is made more difficult by the fact that different classes of users respond well to different marketing messages, so it's best to use different content for different user segments.
* Consider an auction where the items to be sold are not known in advance, but rather arrive online, and buyers are willing to pay different amount for each item (revealed as the items arrive). Each buyer has a maximum limit on the total number of items that they will buy. How should the seller assign items to buyers so as to maximize the seller's revenue?

An Axiomatic Approach to Personalized Ranking Systems
Alon Altman

Personalized ranking systems and trust systems are an essential tool for collaboration in a multi-agent environment. In these systems, trust relations between many agents are aggregated to produce a personalized trust rating of the agents. In this talk I introduce the first extensive axiomatic study of this setting, and explore a wide array of well-known and new personalized ranking systems. We adapt several axioms (basic criteria) from the literature on global ranking systems to the context of personalized ranking systems, and fully classify the set of systems that satisfy all of these axioms. We further show that all these axioms are necessary for this result.

Ad Auctions with Reserved Price
Arash Asadpour

I will talk about the Ad Auctions when the auctioneer enjoys a relevant ad that she can show in one of the slots if she wants to. The setting is to some extent similar to the Ad Auctions with reserved price. I will explain a randomized auction which is truthful in expectation for both the bidders and the auctioneer. Also our mechanism is individual rational and budget balanced. I will discus the revenue of this auction and also will talk about some other settings in which the technique used to design the auction can be helpful.
This is a joint work with Kamal Jain (Microsoft Research) and David Abraham (CMU).

Boosting the area under ROC curve
Phil Long
, Google

We show that any weak ranker that can achieve an area under the ROC curve slightly better than 1/2 (which can be achieved by random guessing) can be efficiently boosted to achieve an area under the ROC curve arbitrarily close to 1. This boosting can be performed even in the presence of independent misclassification noise, given access to a noise-tolerant weak ranker.
(This is joint work with Rocco Servedio.)

Random input models for Adwords
Aranyak Mehta
, Google

I will talk about a new result on Random Input models for the Adwords Problem: What are good algorithms for allocation when the query stream is modeled as a random process, rather than as a worst case input. How does the natural "highest-bidder-wins" greedy algorithm perform? This question leads to an interesting generalization of a classic online matching algorithm. I will also talk about a related result on extended bidding models for Ad auctions.

Who do you know: The Social Network Revolution
Orkut Buyukkokten
, Google

Online social networks fundamentally change the way we get connected. The people we cross paths with have the biggest influence in our lives. Now it's easier to cross paths than ever as we are much closer and so much more connected. In this talk I will discuss the motivation behind the development of orkut.com, touch on the social and technical aspects of implementing and maintaining a system that has over 60 million users and reflect on the lessons we learned.

Deterministic Decentralized Search in Random Graphs
Esteban Arcaute

We study a general framework for decentralized search in random graphs. Our main focus is on deterministic memoryless search algorithms that use only local information to reach their destination in a bounded number of steps in expectation. This class includes (with small modifications) the search algorithms used in Kleinberg's pioneering work on long-range percolation graphs and hierarchical network models. We give a characterization of searchable graphs in this model, and use this characterization to prove a monotonicity property for searchability.
Joint work with Ning Chen, Ravi Kumar, David Liben-Nowell, Mohammad Mahdian, Hamid Nazerzadeh and Ying Xu.

Interdomain Routing and Games
Michael Schapira
, Hebrew University of Jerusalem ( http://www.cs.huji.ac.il/~mikesch/)

We present a game-theoretic model that captures many of the intricacies of interdomain routing in today's Internet. In this model, the strategic agents are source-nodes located on a network, who aim to send traffic to a unique destination node . The model enables complex dynamic interactions between the agents (asynchronous, sequential, and based on partial-information).
We provide realistic conditions that guarantee that executing the Border gateway Protocol (BGP), the only protocol currently used for interdomain routing, is in the best interest of each of the players. Moreover, we show that even coalitions of players of any size cannot improve their routing outcomes by jointly deviating from BGP. Unlike the vast majority of works in mechanism design, these results do not require any monetary transfers (to or by the agents).
Based on joint work with Hagay Levin and Aviv Zohar. The Anatomy of Clickbot.A
Neil Daswani
, Google

In this talk, I will present a detailed case study of the architecture of the Clickbot.A botnet that attempted a low-noise click fraud attack against syndicated search engines. The botnet of over 100,000 machines was controlled using a HTTP-based botmaster. Google identified all clicks on its ads exhibiting Clickbot.A-like patterns and marked them as invalid. We disclose the results of our investigation of this botnet to educate the security research community and provide information regarding the novelties of the attack.

Data Stream Algorithms with Applications to Automatic Worm Fingerprinting
Dilys Thomas

We will first go over a few network worms and contemporary algorithms to detect these network worms. We will then go over some recent techniques to detect these worms using "vunlerability signatures". We will then use this to review two important problems in data stream computation -- distinct value estimation and frequent element computation. We show how these algorithms can be used for automatic worm fingerprinting.

Click Fraud Resistant Methods for Learning Click-Through Rates
Sergei Vassilvitskii

Presenting the paper "Click Fraud Resistant Methods for Learning Click-Through Rates" by N. Immorlica, K. Jain, M. Mahdian, and K. Talwar from WINE 2005.

Pricing Partially Compatible Products
Adam Meyerson
, UCLA

Suppose two competing companies provide roughly equivalent suites of software products. Each version of a product has a possibly distinct price and quality. Customers would like to purchase a low- price, high-quality set of products. The issue is that products made by different companies need not be fully compatible. We reduce the customer's decision process to a minimum cut, and provide approximation and hardness results for the problem of computing the company's best-response strategy in several different models. We also introduce a range of questions for further work.
These results will appear in the ACM Conference on Electronic Commerce (EC) 2007, and are joint work with David Kempe, Nainesh Solanki, and Ramnath Chellappa.

Google News Personalization: Scalable Online Collaborative Filtering
Mayur Datar
, Google

Several approaches to collaborative filtering have been studied but seldom have the studies been reported for large (several millions of users and items) and dynamic (the underlying item set is continually changing) settings. In this paper we describe our approach to collaborative filtering for generating personalized recommendations for users of Google News. We generate recommendations using three approaches: collaborative filtering using MinHash clustering, Probabilistic Latent Semantic Indexing (PLSI), and covisitation counts. We combine recommendations from different algorithms using a linear model. Our approach is content agnostic and consequently domain independent, making it easily adaptible for other applications and languages with minimal effort. This paper will describe our algorithms and system setup in detail, and report results of running the recommendations engine on Google News.

Detecting Near-Duplicates for Web Crawling
Anish Das Sarma

This is joint work with Gurmeet Singh Manku and Arvind Jain in Google.
Near-duplicate web documents are abundant. Two such documents differ from each other in a very small portion that displays advertisements, for example. Such differences are irrelevant for web search. So the quality of a web crawler increases if it can assess whether a newly crawled web page is a near-duplicate of a previously crawled web page or not.
In my talk I will address the problem of efficiently detecting near-duplicates in the web crawl setting. First I shall review Charikar's fingerprinting technique used in our approach. Then, I shall formally define the problem and present algorithmic solutions for two versions of the problem: online queries (single new document) and batch queries (multiple new documents). Finally, I shall briefly talk about our experimental analysis of the techniques.

Clickthrough of and Bidding on keywords
Aleksandra Korolova / Dilys Thomas

Aleksandra will talk about the paper "Predicting ClickThrough Rate Using Keyword Clusters" by Regelson and Fain from a Workshop on Sponsored Search Auctions. The results in the paper are mostly experimental. Dilys will briefly talk on "how to bid on keywords".

Multi-path Routing
Mihaela Enachescu

I will present an approach for enabling multi-path routing in large networks. This is based on a project at NEC Labs, and is joint work with Ravi Kokku. Our approach is to create multiple paths between various source-destination pairs by selecting a special subset of relay nodes. We give hardness results and provide some approximation algorithms.
This work is motivated by the observation that given two end-host in the Internet there exist potentially many independent or partially overlapping paths between them. However, only one of these is usually used by the underlay routing, leading to many inefficiencies (the single path routings are usually slow to adapt in the presence of fluctuating network conditions). Multi-path routing provides a more adaptive framework, and it also raises a variety of (still open) problems for our enjoyment.

Collaborative tagging systems
Paul Heymann

Collaborative tagging systems have recently become prevalent on the web. These systems are similar to classical keyword-based systems, but harness the input of many users to label large datasets like URLs on the web, videos, and photos. In my talk, I will give a brief description of tagging systems, and then talk about two recent projects. The first is work to predict tags annotating URLs based on the page text of those URLs. The outcome of this task has important implications for web search and for the usefulness of data created by social bookmarking systems. The second is an earlier (continuing) project on tag hierarchies.

Is Efficiency Expensive?
Mukund Sundararajan

We study the simultaneous optimization of efficiency and revenue in pay-per-click keyword auctions in a Bayesian setting. Our main result is that the efficient keyword auction yields near-optimal revenue even under modest competition. In the process, we build on classical results in auction theory to prove that increasing the number of bidders by the number of slots outweighs the benefit of employing the optimal reserve price.
This is joint work with Tim Roughgarden.

Estimating Search Engine Index and Web Sizes
Ying Xu

I will talk about estimating the index size of search engines, both empirical and theoretical results. The talk is based on our CIKM paper "Estimating Corpus Size via Queries" and ICALP paper "Estimating Sum by Weighted Sampling".

Reusable and multiple goods
Ashish Goel

The talk is based on
Online auctions with re-usable goods by Mohammad Hajiaghayi, Robert Kleinberg, Mohammad Mahdian, and David Parkes, EC'05
The Sequential Auction Problem on eBay: An Empirical Analysis and a Solution by Adam I. Juda and David C. Parkes, EC'06

Mechanism design based on Cost-Per-Acquisition (CPA)
Hamid Nazerzadeh

The Internet advertising has changed dramatically by moving from Cost-Per-Impression charging mechanism to Cost-Per-Click (CPC). I am going to talk about the possibility of going one step further and using CPA instead of CPC, and some of our results about this problem.

Asymptotically Optimal Repeated Auctions for Sponsored Search
Nicolas Lambert

We investigate asymptotically optimal keyword auctions, that is, auctions which maximize revenue as the number of bidders grows. We do so under two alternative behavioral assumptions. The first explicitly models the repeated nature of keyword auctions. It introduces a novel (and, we argue, plausible) assumption on individual bidding, namely that bidders never overbid their value, and bid their actual value if shut out for long enough. Under these conditions we present a broad class of repeated auctions that are asymptotically optimal among all sequential auctions (a superset of repeated auctions). Those auctions have varying payment schemes but share the ranking method. The Google auction belongs to this class, but not the Yahoo auction, and indeed we show that the latter is not asymptotically optimal. (Nonetheless, with some additional distributional assumptions, the Yahoo auction can be shown to belong to a broad category of auctions that are asymptotically optimal among all auction mechanisms that do not rely on ad relevance.) We then look at the one-shot keyword auction, which can be taken to model repeated auctions in which relatively myopic bidders converge on the equilibrium of the full-information stage game. In this case we show that the Google auction remains asymptotically optimal and the Yahoo auction suboptimal. The distributional assumptions under which our theorems hold are quite general, and arguably hold in all real-world situations of interest. We do however show that the Google auction is not asymptotically revenue maximizing for general distributions.