From: Marcos Ortiz
Date: Wed, 25 May 2011 18:52:17 -0430
To: mapreduce-user@hadoop.apache.org
CC: Giridhar Addepalli
Subject: Re: DBOutputFormat with one reducer

On 05/25/2011 04:27 PM, Giridhar Addepalli wrote:

> Hi,
>
> We have a MapReduce program which writes data to a MySQL database using DBOutputFormat.
>
> Our program has one reducer.
>
> I understand that all the inserts happen during the close() operation of the reducer.
>
> Is it guaranteed that this operation is atomic? That is, what happens if the writes fail in the middle of the operation?
>
> Does it mean that only a partial set of rows gets into the MySQL database?
>
> What does it take to make the write operation atomic?
>
> Any suggestions for our situation (alternative solutions) are welcome.
>
> Thanks,
>
> Giridhar.

Sqoop is designed for this kind of task.

Definition
==========
 Sqoop is an open-source tool that allows users to extract data from a relational database into Hadoop for further processing.
 This processing can be done with MapReduce programs or other higher-level tools such as Hive. When the final results of an analytic pipeline are available, Sqoop can export
 these results back to the database for consumption by other clients.
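
On the atomicity question itself, a minimal sketch of an all-or-nothing batch write may help. It is only an illustration of the general pattern (one transaction around the whole batch, rolled back on any failure), not a description of what DBOutputFormat or Sqoop does internally. It uses Python's built-in sqlite3 module as a stand-in for MySQL; the table `results`, its columns, and the helper `write_all_or_nothing` are hypothetical. With MySQL the same pattern would assume a transactional storage engine such as InnoDB, since MyISAM tables do not support rollback.

```python
import sqlite3


def write_all_or_nothing(conn, rows):
    """Insert every row inside one transaction.

    Returns True if all rows were committed, False if any insert failed
    and the whole batch was rolled back.
    """
    try:
        # The connection context manager commits on success and
        # rolls back if an exception escapes the block.
        with conn:
            conn.executemany(
                "INSERT INTO results (id, total) VALUES (?, ?)", rows
            )
    except sqlite3.Error:
        return False
    return True


# In-memory database standing in for the real target database (assumption).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE results (id INTEGER PRIMARY KEY, total INTEGER NOT NULL)"
)

# First batch succeeds and is committed as a unit.
ok = write_all_or_nothing(conn, [(1, 10), (2, 20), (3, 30)])

# Second batch fails on its last row (duplicate primary key 1),
# so rows 4 and 5 are rolled back along with it.
failed = write_all_or_nothing(conn, [(4, 40), (5, 50), (1, 99)])

count = conn.execute("SELECT COUNT(*) FROM results").fetchone()[0]
print(ok, failed, count)
```

Another common variation is to load the rows into a staging table first and move them into the final table in a single transaction, so readers of the final table never see a partial result.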

Source code
===========
 http://github.com/cloudera/sqoop

Regards

-- 
Marcos Luis Ortiz Valmaseda
 Software Engineer (Distributed Systems)
 http://uncubanitolinuxero.blogspot.com