hadoop-common-user mailing list archives

From Brian Bockelman <bbock...@cse.unl.edu>
Subject Re: risks of using Hadoop
Date Sun, 18 Sep 2011 01:10:12 GMT
Data loss in a batch-oriented environment is different than data loss in an online/production
environment.  It's a trade-off, and I personally think many folks don't weigh the costs well.

As you mention, Hadoop is becoming more production oriented in utilization.  *In those cases*,
you definitely don't want to shrug off data loss or downtime.  However, there are many people
who simply don't need this.

If I'm told that I can buy a 10% larger cluster by accepting up to 15 minutes of data loss,
I'd do it in a heartbeat where I work.


On Sep 17, 2011, at 6:38 PM, Tom Deutsch wrote:

> I disagree, Brian - data loss and system downtime (both potentially non-trivial) should
not be taken lightly. Use cases, and thus availability requirements, do vary, but I would not
encourage anyone to shrug them off as "overblown", especially as Hadoop becomes more production
oriented in utilization.
> ---------------------------------------
> Sent from my Blackberry so please excuse typing and spelling errors.
> ----- Original Message -----
> From: Brian Bockelman [bbockelm@cse.unl.edu]
> Sent: 09/17/2011 05:11 PM EST
> To: common-user@hadoop.apache.org
> Subject: Re: risks of using Hadoop
> On Sep 16, 2011, at 11:08 PM, Uma Maheswara Rao G 72686 wrote:
>> Hi Kobina,
>> Some experiences which may be helpful for you with respect to DFS.
>> 1. Selecting the correct version.
>>   I recommend using a 0.20.x version. It is a fairly stable version, it is well tested, and
other organizations prefer it.
>> Don't go for the 21 version. It is not a stable version, so it is a risk.
>> 2. You should test thoroughly with your customer operations.
>> (Of course you will do this :-))
>> 3. The 0.20.x versions have a single point of failure (SPOF).
>>  If the NameNode goes down you will lose the data. One way of recovering is by using
the SecondaryNameNode. You can recover the data up to the last checkpoint, but manual intervention
is required.
>> In the latest trunk, the SPOF will be addressed by HDFS-1623.
>> 4. 0.20.x NameNodes cannot scale. Federation changes are included in later versions
(I think in 22). This may not be a problem for your cluster, but please consider this aspect
as well.
> With respect to (3) and (4) - these are often completely overblown for many Hadoop use
cases.  If you use Hadoop as originally designed (large scale batch data processing), these
likely don't matter.
> If you're looking at some of the newer use cases (low latency stuff or time-critical
processing), or if you architect your solution poorly (lots of small files), these issues
become relevant.  Another case where I see folks get frustrated is using Hadoop as a "plain
old batch system"; for non-data workflows, it doesn't measure up against specialized systems.
> You really want to make sure that Hadoop is the best tool for your job.
> Brian
