hadoop-hdfs-user mailing list archives

From Chen Song <chen.song...@gmail.com>
Subject Re: missing data blocks after active name node crashes
Date Wed, 11 Feb 2015 12:48:50 GMT
Thanks David.

Do you have the relevant Jira ticket number handy?

Chen

On Tue, Feb 10, 2015 at 5:54 PM, david marion <dlmarion@hotmail.com> wrote:

>  I believe there was an issue fixed in 2.5 or 2.6 where the standby NN
> would not process block reports from the DNs when it was dealing with the
> checkpoint process. The missing blocks will get reported eventually.
>
>
> -------- Original message --------
> From: Chen Song <chen.song.82@gmail.com>
> Date: 02/10/2015 2:44 PM (GMT-05:00)
> To: user@hadoop.apache.org, Ravi Prakash <ravihoo@ymail.com>
> Cc:
> Subject: Re: missing data blocks after active name node crashes
>
>  Thanks for the reply, Ravi.
>
>  In my case, I consistently see missing blocks every time the active name
> node crashes. The active name node crashes because of timeouts on the
> journal nodes.
>
>  Could this be a specific case that leads to missing blocks?
>
>  Chen
>
> On Tue, Feb 10, 2015 at 2:20 PM, Ravi Prakash <ravihoo@ymail.com> wrote:
>
>  Hi Chen!
>
>  From my understanding, every operation on the Namenode is logged (and
> flushed) to disk / QJM / shared storage. This includes the addBlock
> operation. So when a client requests to write a new block, the metadata is
> logged by the active NN, so even if it crashes later on, the new active NN
> would still see the creation of the block.
>
>  HTH
> Ravi
>
>
>   On Tuesday, February 10, 2015 9:38 AM, Chen Song <chen.song.82@gmail.com>
> wrote:
>
>
>   When the active name node crashes, it seems there is always a chance
> that the data blocks in flight will go missing.
>  My understanding is that when the active name node crashes, the metadata
> of in-flight data blocks that exists only in the active name node's memory
> has not been captured by the journal nodes, and so is not available on the
> standby name node when it is promoted to active by zkfc.
>  Is my understanding correct? Is there any way to mitigate this problem or
> race condition?
>
>  --
> Chen Song
>
>  --
> Chen Song
>
>


-- 
Chen Song
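
For readers following the thread, the two explanations quoted above can be combined in a toy model. This is a minimal sketch, not Hadoop code: `Journal`, `ToyNameNode`, and the block and node names are all hypothetical. It assumes, as Ravi describes, that addBlock is written to the shared journal before the NameNode's in-memory state changes. It also assumes that replica locations are not part of the journal and reach each NameNode only through DataNode block reports. Under those assumptions, a standby that has replayed the journal but has not yet processed block reports knows about every block yet counts them as missing until the next report arrives, which is consistent with David's remark that the missing blocks get reported eventually.

```python
from dataclasses import dataclass, field


@dataclass
class Journal:
    """Hypothetical stand-in for the shared edit log (QJM / shared storage)."""
    edits: list = field(default_factory=list)

    def append(self, op, block_id):
        # In this toy model an edit counts as durable once appended here.
        self.edits.append((op, block_id))


@dataclass
class ToyNameNode:
    """Hypothetical, heavily simplified NameNode; not the real Hadoop class."""
    name: str
    blocks: set = field(default_factory=set)       # block IDs learned from the journal
    locations: dict = field(default_factory=dict)  # block ID -> set of DataNode names
    applied: int = 0                               # number of journal edits replayed

    def add_block(self, journal, block_id):
        # Active NN: log the edit first, then apply it to in-memory state.
        journal.append("addBlock", block_id)
        self.replay(journal)

    def replay(self, journal):
        # Standby NN (or a newly active one): apply edits not yet seen.
        for op, block_id in journal.edits[self.applied:]:
            if op == "addBlock":
                self.blocks.add(block_id)
        self.applied = len(journal.edits)

    def process_block_report(self, datanode, block_ids):
        # Replica locations come only from DataNode reports, never the journal.
        for block_id in block_ids:
            self.locations.setdefault(block_id, set()).add(datanode)

    def missing_blocks(self):
        # A block is "missing" if it is known but has no reported replica.
        return sorted(b for b in self.blocks if not self.locations.get(b))


journal = Journal()
active = ToyNameNode("nn1")
standby = ToyNameNode("nn2")

# Two blocks are allocated; only the active NN processes the block report
# (the standby is assumed to be busy, e.g. with a checkpoint).
active.add_block(journal, "blk_1")
active.add_block(journal, "blk_2")
active.process_block_report("dn1", ["blk_1", "blk_2"])

# The active NN crashes; the standby catches up from the journal and takes over.
standby.replay(journal)
missing_after_failover = standby.missing_blocks()

# The DataNode's next block report clears the missing-block count.
standby.process_block_report("dn1", ["blk_1", "blk_2"])
missing_after_report = standby.missing_blocks()

print(missing_after_failover, missing_after_report)
```

In this model no block metadata is lost at failover. The blocks look missing only because the new active NameNode has no replica locations for them yet.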
