hadoop-mapreduce-user mailing list archives

From daemeon reiydelle <daeme...@gmail.com>
Subject Re: How will Hadoop handle it when a datanode server with total hardware failure?
Date Sun, 05 Apr 2015 05:54:33 GMT
With a replication factor of 3 you would have to lose 3 entire nodes to lose
data. The replication factor is 3 nodes, not 3 spindles. The number of disks
(roughly) determines how HDFS spreads I/O across the spindles for the single
copy of the data (one of the 3 replicas) that the node owns. Note that things
get slightly more complicated when the FIRST block is written to the cluster
(by default the first replica lands on the writer's own datanode when the
client runs on one). (But that was not your question ; {)
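
If you want to see the placement for yourself, here is a minimal sketch using
the HDFS FileSystem API. The path /user/arthur/sample.txt is just an assumed
example; point it at any real file on your cluster. It prints the file's
replication factor and the datanodes holding each of its blocks:

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.BlockLocation;
import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class ShowBlockPlacement {
  public static void main(String[] args) throws Exception {
    // Picks up core-site.xml / hdfs-site.xml from the classpath
    Configuration conf = new Configuration();
    FileSystem fs = FileSystem.get(conf);

    // Assumed example path -- replace with a real file on your cluster
    Path file = new Path("/user/arthur/sample.txt");
    FileStatus status = fs.getFileStatus(file);
    System.out.println("Replication factor: " + status.getReplication());

    // With a replication factor of 3, each block should list 3 distinct hosts
    BlockLocation[] blocks = fs.getFileBlockLocations(status, 0, status.getLen());
    for (BlockLocation block : blocks) {
      System.out.println("Block at offset " + block.getOffset()
          + " is on: " + String.join(", ", block.getHosts()));
    }
  }
}

Every block should list 3 different hosts, so losing one node (all 8 of its
spindles at once) still leaves 2 live copies of each of its blocks.
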
On Apr 4, 2015 10:39 PM, "Arthur Chan" <arthur.hk.chan@gmail.com> wrote:

> Hi,
>
> I use the default replication factor of 3 here; the cluster has 10 nodes
> and each of my datanodes has 8 hard disks.  If one of the nodes goes down
> because of a hardware failure, i.e. its 8 hard disks all become unavailable
> for the duration of the outage, does that mean I will have data loss?
> (8 hard disks > 3 replicas)
>
> Or what is the maximum number of servers that can be down at the same time
> without data loss here?
>
> Regards
> Arthur
>
> On Wednesday, December 17, 2014, Harshit Mathur <mathursharp@gmail.com>
> wrote:
>
>> Hi Arthur,
>>
>> HDFS replicates data at the block level. If a datanode fails completely,
>> its blocks become under-replicated, so the namenode will create new copies
>> of those under-replicated blocks on other datanodes.
>>
>> BR,
>> Harshit
>>
>> On Wed, Dec 17, 2014 at 11:35 AM, Arthur.hk.chan@gmail.com <
>> arthur.hk.chan@gmail.com> wrote:
>>>
>>> Hi,
>>>
>>> If each of my datanode servers has 8 hard disks (a 10-node cluster) and
>>> I use the default replication factor of 3, how will Hadoop handle it when
>>> a datanode suddenly suffers a total hardware failure?
>>>
>>> Regards
>>> Arthur
>>>
>>
>>
>>
>> --
>> Harshit Mathur
>>
>
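
To tie this back to Harshit's earlier point about under-replicated blocks:
after a node drops, you can watch the namenode recover. A rough sketch,
assuming the cluster filesystem is a DistributedFileSystem (the two counter
methods are worth double-checking against your Hadoop version):

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.hdfs.DistributedFileSystem;

public class WatchReplicationHealth {
  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    FileSystem fs = FileSystem.get(conf);

    if (fs instanceof DistributedFileSystem) {
      DistributedFileSystem dfs = (DistributedFileSystem) fs;
      // Blocks below their target replication; the namenode schedules new copies
      System.out.println("Under-replicated blocks: " + dfs.getUnderReplicatedBlocksCount());
      // Blocks with no live replica left anywhere -- this is actual data loss
      System.out.println("Missing blocks: " + dfs.getMissingBlocksCount());
    }
  }
}

As long as the missing count stays at zero nothing was lost, and the
under-replicated count should drain back to zero once re-replication
finishes. Running hdfs fsck / gives you the same picture without writing
any code.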
