hadoop-mapreduce-user mailing list archives

From Chen Song <chen.song...@gmail.com>
Subject Re: missing data blocks after active name node crashes
Date Wed, 11 Feb 2015 15:14:41 GMT
Thanks guys.

On Wed, Feb 11, 2015 at 8:03 AM, dlmarion <dlmarion@comcast.net> wrote:

> https://issues.apache.org/jira/browse/HDFS-7097
>
>
>
>
> -------- Original message --------
> From: Chen Song <chen.song.82@gmail.com>
> Date:02/11/2015 7:48 AM (GMT-05:00)
> To: user@hadoop.apache.org
> Cc:
> Subject: Re: missing data blocks after active name node crashes
>
> Thanks David.
>
> Do you have the relevant Jira ticket number handy?
>
> Chen
>
> On Tue, Feb 10, 2015 at 5:54 PM, david marion <dlmarion@hotmail.com>
> wrote:
>
>>  I believe there was an issue fixed in 2.5 or 2.6 where the standby NN
>> would not process block reports from the DNs when it was dealing with the
>> checkpoint process. The missing blocks will get reported eventually.
>>
>>
>> -------- Original message --------
>> From: Chen Song <chen.song.82@gmail.com>
>> Date:02/10/2015 2:44 PM (GMT-05:00)
>> To: user@hadoop.apache.org, Ravi Prakash <ravihoo@ymail.com>
>> Cc:
>> Subject: Re: missing data blocks after active name node crashes
>>
>>  Thanks for the reply, Ravi.
>>
>>  In my case, I consistently see missing blocks every time the active name
>> node crashes. The active name node crashes because of a timeout on the
>> journal nodes.
>>
>>  Could this be a specific case which could lead to missing blocks?
>>
>>  Chen
>>
>> On Tue, Feb 10, 2015 at 2:20 PM, Ravi Prakash <ravihoo@ymail.com> wrote:
>>
>>  Hi Chen!
>>
>>  From my understanding, every operation on the Namenode is logged (and
>> flushed) to disk / QJM / shared storage. This includes the addBlock
>> operation. So when a client requests to write a new block, the metadata is
>> logged by the active NN, so even if it crashes later on, the new active NN
>> would still see the creation of the block.
>>
>>  HTH
>> Ravi
>>
>>
>>   On Tuesday, February 10, 2015 9:38 AM, Chen Song <
>> chen.song.82@gmail.com> wrote:
>>
>>
>>   When the active name node crashes, it seems there is always a chance
>> that the data blocks in flight will be missing.
>>  My understanding is that when the active name node crashes, the metadata
>> of data blocks in transition which exist in active name node memory is not
>> successfully captured by journal nodes and thus not available on standby
>> name node when it is promoted to active by zkfc.
>>  Is my understanding correct? Any way to mitigate this problem or race
>> condition?
>>
>>  --
>> Chen Song
>>
>>
>>
>>
>>
>>
>>  --
>> Chen Song
>>
>>
>
>
> --
> Chen Song
>
>


-- 
Chen Song
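
To make the two explanations in the thread concrete, here is a toy model (not HDFS code; every class and method name below is hypothetical). It assumes, as Ravi describes, that addBlock is written to the shared journal before the client is acknowledged, and, as David describes, that a newly promoted NameNode may not yet have processed block reports from the DataNodes. Under those assumptions the block metadata survives the crash, but the block looks "missing" until a block report arrives.

```python
# Toy model only: none of these classes are HDFS APIs.

class Journal:
    """Stands in for the shared edit log (QJM / shared storage)."""
    def __init__(self):
        self.edits = []

    def append(self, op, block_id):
        self.edits.append((op, block_id))


class ToyNameNode:
    def __init__(self, journal):
        self.journal = journal
        self.blocks = set()     # block metadata known to this NameNode
        self.locations = {}     # block_id -> set of DataNode names

    def add_block(self, block_id):
        # Log to the journal first, then update in-memory state.
        self.journal.append("addBlock", block_id)
        self.blocks.add(block_id)

    def replay(self):
        # A standby catches up on metadata by replaying the journal.
        for op, block_id in self.journal.edits:
            if op == "addBlock":
                self.blocks.add(block_id)

    def process_block_report(self, datanode, block_ids):
        # Replica locations come from DataNodes, not from the journal.
        for block_id in block_ids:
            if block_id in self.blocks:
                self.locations.setdefault(block_id, set()).add(datanode)

    def missing_blocks(self):
        # Known blocks for which no replica location has been reported.
        return sorted(b for b in self.blocks if not self.locations.get(b))


journal = Journal()
active = ToyNameNode(journal)
active.add_block("blk_1")
active.process_block_report("dn1", ["blk_1"])

# The active crashes; a standby that has not processed block reports takes over.
standby = ToyNameNode(journal)
standby.replay()
before = standby.missing_blocks()   # metadata is present, locations are not
standby.process_block_report("dn1", ["blk_1"])
after = standby.missing_blocks()    # cleared once the report is processed
print(before, after)
```

In this sketch the "missing" state is only a gap in location information on the new active node, which matches the observation that the missing blocks get reported eventually.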
