Computer Science Colloquium

September 6, 2006
Time: 4:00 p.m.
Place: CB 122

Managing Imprecisions with Probabilistic Databases

Dan Suciu
University of Washington

ABSTRACT


Traditional databases are deterministic: every item either is in the database or is not. But many applications today need to manage imprecise data, for example in fuzzy object matching, in managing data extracted from unstructured text, or in dealing with imprecise schema alignments. I will argue that probabilistic databases can be used in some of these applications. Each tuple has some probability of being in the database, and so does each row returned by a SQL query. I will then discuss the query evaluation problem on probabilistic databases: while the query complexity ranges between PTIME and #P, it is possible to answer queries efficiently if one focuses on finding and ranking the top k rows of a query rather than computing all output probabilities exactly.

Joint work with Nilesh Dalvi and Chris Re
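
As a rough illustration of the semantics described in the abstract (each tuple is present with some probability, and so is each row of a query answer), here is a minimal Python sketch. It assumes tuples are independent and computes answer probabilities by enumerating every possible world, which takes exponential time and is shown only to make the definition concrete; it is not the evaluation method presented in the talk. The table LIVES_IN, its contents, and all function names are hypothetical.

```python
from itertools import product

# Hypothetical table: (person, city) -> probability that the tuple is present.
# Assumption: tuples are present independently of one another.
LIVES_IN = {
    ("alice", "seattle"): 0.9,
    ("bob", "seattle"): 0.5,
    ("carol", "portland"): 0.4,
}


def possible_worlds(table):
    """Yield (world, probability) for every subset of the table's tuples."""
    tuples = list(table)
    for present in product([True, False], repeat=len(tuples)):
        world = {t for t, keep in zip(tuples, present) if keep}
        prob = 1.0
        for t, keep in zip(tuples, present):
            prob *= table[t] if keep else 1.0 - table[t]
        yield world, prob


def answer_probabilities(table, query):
    """Probability of a row = total probability of the worlds that return it."""
    result = {}
    for world, prob in possible_worlds(table):
        for row in query(world):
            result[row] = result.get(row, 0.0) + prob
    return result


def cities(world):
    """Stand-in for: SELECT DISTINCT city FROM LivesIn."""
    return {city for _person, city in world}


def top_k(probs, k):
    """Return the k answer rows with the highest probability."""
    return sorted(probs.items(), key=lambda kv: kv[1], reverse=True)[:k]


probs = answer_probabilities(LIVES_IN, cities)
print(top_k(probs, 1))
```

In this toy table, "seattle" is returned in every world that contains at least one of the two Seattle tuples, so its probability is 1 - (1 - 0.9)(1 - 0.5) = 0.95, while "portland" has probability 0.4. Ranking the top k rows needs only the relative order of these values, which is the observation the abstract builds on.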

Bio: Dan Suciu is an associate professor in Computer Science at the University of Washington. He received his Ph.D. from the University of Pennsylvania, was a principal member of the technical staff at Bell Labs and then at AT&T Labs, and in 2000 joined the University of Washington in Seattle. Suciu conducts research in data management, with an emphasis on topics that arise from sharing data on the Internet, such as management of semistructured and heterogeneous data, data security, and managing imprecisions in data. He is a co-author of the book Data on the Web: From Relations to Semistructured Data and XML, holds six US patents, received the 2000 ACM SIGMOD Best Paper Award, and is a recipient of the NSF Career Award and of an Alfred P. Sloan Fellowship. He likes to work on problems that require nontrivial theoretical solutions, but he also likes to contribute (through his students) to practical tools in the public domain, such as XMill (an XML compressor) and SilkRoute (an XML publishing system with a comprehensive translator from XQuery to SQL).



Host: Prof. Alex Dekhtyar


Refreshments Served at 3:30 p.m. in
Room 763, Anderson Hall
