From: Jimmy Lin
Date: Mon, 31 Aug 2009 09:21:53 -0400
To: common-user@hadoop.apache.org
Subject: Talk in DC area: MapReduce and Parallel DBMSs: A Comparison of Approaches to Large-Scale Data Analysis

Dear Hadoopers,

For those of you in the DC area, you might be interested in this talk at the University of Maryland this week...

Best,
Jimmy

------

MapReduce and Parallel DBMSs: A Comparison of Approaches to Large-Scale Data Analysis

Complete info at http://tinyurl.com/knh83k

Andy Pavlo (Brown University)
(http://www.cs.brown.edu/~pavlo/)

Thursday, September 3, 2009
4pm, AVW 3258
(Directions: http://www.umiacs.umd.edu/about/directions.htm)

= Abstract

The MapReduce (MR) paradigm has been heralded as a revolutionary new platform for large-scale, massively parallel data access. Some proponents claim that the extreme scalability of MR will relegate relational database management systems (DBMS) to the status of legacy technology. In this talk, however, we discuss the results from our recent benchmark study, which suggest that using MR systems to perform tasks that are best suited for DBMSs yields less than satisfactory results. This leads us to conclude that MR is more akin to an Extract-Transform-Load (ETL) system than a DBMS, as it can quickly load and analyze large amounts of data in an ad hoc manner. As such, it is complementary to DBMS technology, rather than a competitor. We also discuss the various differences in the architectural decisions of MR systems and database systems, and provide insight into how the two systems should complement one another.

= About the Speaker

Andrew Pavlo is a third-year Computer Science PhD student in Brown University's Data Management Group, under the guidance of Dr. Stanley Zdonik.