hadoop-general mailing list archives

From Scott Carey <sc...@richrelevance.com>
Subject Re: Hadoop Java Versions
Date Fri, 01 Jul 2011 17:22:42 GMT
Although this thread is wandering a bit, I disagree strongly that it is
inappropriate to discuss other vendor-specific features (or competing
compute platform features) on general@.  The topic has become the factors
that influence hardware purchase choices, and one of those is how the
system deals with disk failure.  Comparing and contrasting with other
platforms is healthy for the Hadoop project!

On 6/30/11 9:47 PM, "Ian Holsman" <hadoop@holsman.net> wrote:

>On Jul 1, 2011, at 2:08 PM, M. C. Srivas wrote:
>> On Thu, Jun 30, 2011 at 5:24 PM, Todd Lipcon <todd@cloudera.com> wrote:
>>> I'd advise you to look at "stock hadoop" again. This used to be true,
>>> but was fixed a long while back by HDFS-457 and several followup JIRAs.
>>> If MapR does something fancier, I'm sure we'd be interested to hear it
>>> so we can compare the approaches.
>>> -Todd
>> MapR tracks disk responsiveness. In other words, a moving histogram of
>> IO-completion times is maintained internally, and if a disk starts
>> getting really slow, it is pre-emptively taken offline so it does not create
>> tails for running jobs (and the data on the disk is re-replicated using
>> whatever re-replication policy is in place).  One of the benefits of
>> managing the disks directly instead of through ext3 / xfs / or other ...
>> All these stats can be fed into Ganglia (or pushed out centrally via a
>> file that can be pulled out using NFS)  if historical info about disk
>> behavior (and failures) needs to be preserved.
>> - Srivas.
>While I am intrigued about how MapR performs internally, I don't think
>this is the forum for it.
>Please keep MapR (and other vendor-specific discussions) on their
>respective support forums.
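
For readers following the comparison, here is a rough sketch of the idea
Srivas describes in the quoted text: keep a moving histogram of
IO-completion times per disk and flag the disk once too many recent IOs
are slow. This is not MapR's or HDFS's implementation. The class name,
bucket bounds, window size, and thresholds are all assumptions made up
for illustration.

```python
from bisect import bisect_left
from collections import deque


class DiskLatencyTracker:
    """Moving histogram of I/O completion times for one disk (illustrative only)."""

    # Hypothetical bucket upper bounds in milliseconds.
    # Bucket i holds latencies in (bound[i-1], bound[i]]; the last bucket is open-ended.
    BUCKET_BOUNDS_MS = (1, 5, 10, 50, 100, 500)

    def __init__(self, window=1000, slow_ms=100, slow_fraction=0.2, min_samples=50):
        self.window = window                # number of recent I/Os to remember
        self.slow_fraction = slow_fraction  # share of slow I/Os that flags the disk
        self.min_samples = min_samples      # do not judge a disk on too little data
        self.samples = deque()              # bucket index of each remembered I/O
        self.counts = [0] * (len(self.BUCKET_BOUNDS_MS) + 1)
        # First bucket counted as "slow". If slow_ms is not itself a bucket
        # bound, this rounds the threshold up to the next bound.
        self.first_slow_bucket = bisect_left(self.BUCKET_BOUNDS_MS, slow_ms) + 1

    def record(self, latency_ms):
        """Add one completed I/O and evict the oldest one once the window is full."""
        bucket = bisect_left(self.BUCKET_BOUNDS_MS, latency_ms)
        self.samples.append(bucket)
        self.counts[bucket] += 1
        if len(self.samples) > self.window:
            oldest = self.samples.popleft()
            self.counts[oldest] -= 1

    def is_slow(self):
        """True if enough of the recent I/Os landed in the slow buckets."""
        total = len(self.samples)
        if total < self.min_samples:
            return False
        slow = sum(self.counts[self.first_slow_bucket:])
        return slow / total >= self.slow_fraction


# Usage: a healthy disk that starts to degrade.
tracker = DiskLatencyTracker(window=100, slow_ms=100, slow_fraction=0.2, min_samples=10)
for _ in range(100):
    tracker.record(5)        # fast I/Os
healthy_before = tracker.is_slow()
for _ in range(30):
    tracker.record(250)      # disk begins responding slowly
flagged_after = tracker.is_slow()
```

A caller that sees is_slow() return True would then take the disk offline
and trigger re-replication under whatever policy is in place. Keeping
bucket counts rather than raw latencies means the same histogram could
also be exported as a metric, along the lines of the Ganglia feed
mentioned in the quoted text.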
