Skip to content Skip to navigation

RAIN schedule for Fall Quarter 2009-10

RAIN will be held on Wednesdays from 12:30-1:30pm in Gates 400 (the location might change though, so please be sure to check the schedule below). And yes, lunch will be served!

Date Speaker Topic
Sep 30 Arvind Narayanan De-anonymizing Social Networks
Oct 7 Nicolas Lambert Eliciting Information on the Distribution of Future Outcomes
Oct 14 Lars Backstrom Optimizing Web Traffic via the Media Scheduling Problem
Oct 21 Michael Kearns Behavioral Experiments in Strategic Networks
Note special location: Terman Engineering Center, Room 453
Oct 28 Matthew Jackson How Social Network Structure affects Diffusion and Learning
Nov 4 Paul Valiant How to Design Profitable Auctions
Nov 11 Mohammad Mahdian Sort me if you can (or, Algorithms on Dynamic Data)
Nov 18 DJ Patil
Canceled!
Nov 25
-
No meeting -- Happy Thanksgiving!
Dec 2 Abdur Chowdhury
TBA


Google Calendar for RAIN


Previous year's talks

Archived talks can be accessed here.

Talk Abstracts

De-anonymizing Social Networks
Arvind Narayanan


Operators of online social networks are increasingly sharing potentially sensitive information about users and their relationships with advertisers, application developers, and data-mining researchers. Privacy is typically protected by anonymization, i.e., removing names, addresses, etc.

We present a framework for analyzing privacy and anonymity in social networks and develop a new re-identification algorithm targeting anonymized social-network graphs. To demonstrate its effectiveness on real-world networks, we show that a third of the users who can be verified to have accounts on both Twitter, a popular microblogging service, and Flickr, an online photo-sharing site, can be re-identified in the anonymous Twitter graph with only a 12% error rate.

Our de-anonymization algorithm is based purely on the network topology, does not require creation of a large number of dummy "sybil" nodes, is robust to noise and all existing defenses, and works even when the overlap between the target network and the adversary's auxiliary information is small.


Bio:
--------------
Arvind Narayanan is a post-doctoral researcher working with Dan Boneh. He recently finished his Ph.D at the University of Texas at Austin, advised by Vitaly Shmatikov. His research is on the privacy and anonymity issues involved in publishing large-scale datasets about people. His thesis, in a sentence, is that the level of anonymity that society expects -- and companies claim to provide -- in published databases is fundamentally unrealizable. He blogs about his anonymity-breaking efforts and other research at http://33bits.org/.





Eliciting Information on the Distribution of Future Outcomes
Nicolas Lambert


This paper studies the problem of inducing a presumably knowledgeable expert to reveal true information regarding the probability distribution of a future random outcome. The information being considered is quite general, and includes in particular the statistics of the probability distribution (such as mean, median, variance), and categorical information (such as the most correlated pair of variables). I examine two types of incentive schemes: Those that reward the expert for being truthful, and, for the case of numerical information, those that reward the expert increasingly with the accuracy of the prediction. For both cases, I establish simple criteria to determine the information that can be elicited with such schemes, and offer a complete characterization of the associated schedule fee.





Optimizing Web Traffic via the Media Scheduling Problem
Lars Backstrom


Website traffic varies through time in consistent and predictable ways, with highest traffic in the middle of the day. When providing media content to visitors, it is important to present repeat visitors with new content so that they keep coming back. In this paper we present an algorithm to balance the need to keep a website fresh with new content with the desire to present the best content to the most visitors at times of peak traffic. We formulate this as the me- dia scheduling problem, where we attempt to maximize total clicks, given the overall traffic pattern and the time varying clickthrough rates of available media content. We present an efficient algorithm to perform this scheduling under certain conditions and apply this algorithm to real data obtained from server logs, showing evidence of significant improvements in traffic from our algorithmic sched- ules. Finally, we analyze the click data, presenting models for why and how the clickthrough rate for new content declines as it ages.





Behavioral Experiments in Strategic Networks
Michael Kearns


For four years now, we have been conducting "medium-scale" experiments in how human subjects behave in strategic and economic settings mediated by an underlying network structure. We have explored a wide range of networks inspired by generative models from the literature, and a diverse set of collective strategic problems, including biased voting, graph coloring, consensus, and networked trading. These experiments have yielded a wealth of both specific findings and emerging general themes about how populations of human subjects interact in strategic networks. I will review these findings and themes, with an emphasis on the many more questions they raise than answer.





How Social Network Structure affects Diffusion and Learning
Matthew Jackson


We examine how diffusion and learning processes operating through social networks are affected by homophily (the tendency of individuals to associate with others similar to themselves) and other network characteristics. Homophily has no effect in pure diffusion processes where only shortest paths matter; only connection density matters. In contrast, homophily substantially slows learning based on repeated averaging of neighbors' information, while connection density has no effect.





How to Design Profitable Auctions
Paul Valiant


This talk is a high-level overview of results from two years of work with Silvio Micali and Jing Chen on a question that may best be phrased as: what is the best way to sell many items to many players? The standard formalizations of this question were fleshed out in the 1960's and led to answers that are unsatisfactory in practice, leading to today's Ebay that sells items by the auction method known to the ancient Greeks. In this talk we present new formulations of this question, elicited by fundamental notions from computer science: competitive analysis, collusion, and "knowledge". We present several concrete (and near-optimal) auctions in these models, along with many open questions.





Sort me if you can (or, Algorithms on Dynamic Data)
Mohammad Mahdian


We formulate and study a new computational model for dynamic data. In this model the data changes gradually and the goal of an algorithm is to compute the solution to some problem on the data at each time step, under the constraint that it only has a limited access to the data each time. As the data is constantly changing and the algorithm might be unaware of these changes, it cannot be expected to always output the exact right solution; we are interested in algorithms that guarantee to output an approximate solution. In particular, we focus on the fundamental problems of sorting and selection, where the true ordering of the elements changes slowly. We provide algorithms with performance close to the optimal in expectation and with high probability. If time permits, we will discuss new results on graph algorithms in our framework.

This is based on joint work with Aris Anagnostopoulos, Ravi Kumar, and Eli Upfal.