Mapping the World's Photos

Dr. David Crandall, Department of Computer Science, Cornell University


The rapid rise of social photo-sharing websites has created immense collections of photographs online, with Flickr and Facebook alone now hosting over 20 billion images. The sheer size of these sites raises the question of how to organize large photo collections effectively. Current photo-sharing sites rely on relatively primitive technology, such as keyword tags provided by the photographer, so untagged or poorly tagged photographs are essentially impossible to find. Ideally, we'd like to build systems that organize photos based primarily on the visual content of the images themselves, while also taking advantage of any other information produced by the social photo-sharing process.

In this talk, I'll describe how we combine visual information with non-visual metadata (including geospatial coordinates, text tags, and timestamps) to organize photo collections automatically, using a set of nearly 100 million images downloaded from Flickr. We use geotagged photographs to automatically discover and build models of highly photographed places. We then use those models to predict where unlabeled photos were taken, using combinations of visual, textual, and temporal features. To our knowledge, these are among the largest-scale experiments in the visual object recognition literature to date. They demonstrate how relatively simple computer vision techniques can be combined with massive amounts of training data to produce scalable, practical object recognition systems. I'll also show how this process lets us discover interesting things about the world and human behavior, underscoring how Flickr and other social computing systems offer a rich source of data for interdisciplinary studies in fields as diverse as sociology, economics, and ecology.

Host: Professor J. Griffioen