Date: Fri, 07 Oct 2011 10:17:19 +0100
From: Steve Loughran <stevel@apache.org>
To: general@hadoop.apache.org
Subject: Re: Which proposed distro of Hadoop, 0.20.206 or 0.22, will be better for HBase?

On 06/10/2011 17:49, Milind.Bhandarkar@emc.com wrote:
> Steve,
>
>> Summary: I'm not sure that HDFS is the right FS in this world, as it
>> contains a lot of assumptions about system stability and HDD persistence
>> that aren't valid any more. With the ability to plug in new placers you
>> could do tricks like ensure 1 replica lives in a persistent blockstore
>> (and rely on it always being there), and add other replicas in transient
>> storage if the data is about to be needed in jobs.
>
> Can you please shed more light on the statement "... as it
> contains a lot of assumptions about system stability and HDD persistence
> that aren't valid any more..."?
>
> I know that you were doing some analysis of disk failure modes some time
> ago. Is this the result of that research? I am very interested.

No, it's unrelated; it comes from experience hosting virtual Hadoop
infrastructures. This is how my short-lived clusters exist today:

- You don't know the hostnames of the master nodes until they are
  allocated, so you need to allocate them first and then dynamically push
  out configurations to the workers.
- The Datanodes spin forever when the namenode goes down, rather than
  checking somewhere to see if its address has changed. HDFS HA may fix
  that.
- It's dangerously easy to end up with more than one Datanode on the same
  physical host, losing the independence of those replicas.
- It's possible for the entire cluster to go down without warning.

MR-layer issues:

- Again, the TaskTrackers spin when the JobTracker goes down, rather than
  looking to see if it has moved.
- Blacklisting isn't the right way to deal with TaskTracker failures:
  terminating the VM is.
- If the TaskTrackers are idle, VM termination may be the best action.

Hadoop is optimised for large physical clusters. If you look at the
Stratosphere work at TU Berlin, they've designed something that includes
VM allocation in the execution plan. You can improve Hadoop to make it
more agile; my defunct Hadoop lifecycle branch did a lot of that, but you
have to get everyone else using Hadoop to be willing to let the changes go
in, and those changes mustn't impose a cost or risk on the physical
cluster model.
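To make the "dynamically push out configs" point concrete, here is a minimal sketch of the idea. None of this is Hadoop code; the template and helper name are hypothetical, and only the `fs.default.name` property name comes from the 0.20-era configuration. It just shows rendering a worker's core-site.xml once the allocated namenode hostname is finally known:

```python
# Hypothetical sketch: a virtual cluster cannot bake the namenode
# address into worker configs ahead of time, because the hostname is
# only known after the master VM is allocated. So the config is
# rendered late and then pushed out.

CORE_SITE_TEMPLATE = """<?xml version="1.0"?>
<configuration>
  <property>
    <name>fs.default.name</name>
    <value>hdfs://{namenode_host}:8020</value>
  </property>
</configuration>
"""

def render_core_site(namenode_host: str) -> str:
    """Fill in the namenode hostname allocated at cluster start-up."""
    return CORE_SITE_TEMPLATE.format(namenode_host=namenode_host)

if __name__ == "__main__":
    # In a real deployment this string would then be copied to every
    # worker (scp, a config service, etc.) before the daemons start.
    print(render_core_site("nn-3141.internal"))
```

The port and the hostname here are invented for illustration; the point is only that the value is a runtime input, not a static file.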
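The "spin forever" complaint above, for Datanodes against the namenode and TaskTrackers against the JobTracker alike, amounts to a retry loop that never re-consults anything: the workers keep retrying one fixed address. A hedged sketch of the alternative behaviour follows; the registry and lookup function are assumptions for illustration, not an existing Hadoop API:

```python
import time

def lookup_master(registry: dict, service: str) -> str:
    """Hypothetical registry lookup. In Hadoop of this era no such
    mechanism existed: workers retried a single configured address."""
    return registry[service]

def connect_with_relookup(registry, service, try_connect, max_attempts=5):
    """Instead of spinning forever on one address, re-resolve the
    master's location between attempts, so a moved NN/JT is found."""
    for attempt in range(max_attempts):
        addr = lookup_master(registry, service)
        if try_connect(addr):
            return addr
        time.sleep(0)  # real back-off elided in this sketch
    raise RuntimeError(f"{service} unreachable after {max_attempts} attempts")

# Simulated failover: the namenode moves between attempts, and the
# registry is updated to point at its new address.
registry = {"namenode": "old-host:8020"}
attempts = []

def try_connect(addr):
    attempts.append(addr)
    if addr == "old-host:8020":
        registry["namenode"] = "new-host:8020"  # master has moved
        return False
    return True

print(connect_with_relookup(registry, "namenode", try_connect))
```

The same shape would apply to a TaskTracker finding a moved JobTracker; the design question is only where the authoritative "where is the master now?" record lives.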