hadoop-common-dev mailing list archives

From Sangmin Lee <sangmin....@gmail.com>
Subject Re: Question regarding HDFS Recovery
Date Thu, 21 May 2009 06:09:57 GMT
On Wed, May 20, 2009 at 5:43 PM, Dhruba Borthakur <dhruba@gmail.com> wrote:

> > What if all datanodes in INodeFileUnderConstruction targets are dead ?
>
> If all datanodes in a pipeline are dead, then that file cannot be recovered
> at all. This is expected and most file-systems behave this way when the
> underlying storage goes bad.


Yeah, I understand that. But I don't see how the lease will be removed.
That is, when the client and all datanodes are dead, I don't see any code to
handle this.
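
A toy model of this case, not actual HDFS code. It assumes the 1 hour hard limit mentioned in the quoted reply below, and it assumes the primary is simply the first live entry in the targets list. The class and method names (LeaseRecoverySketch, Target, choosePrimary, hardLimitExpired) are hypothetical and exist only for illustration.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Optional;

public class LeaseRecoverySketch {
    // Hard limit of 1 hour, as stated in the quoted reply in this thread.
    static final long HARD_LIMIT_MS = 60L * 60L * 1000L;

    // Hypothetical stand-in for one datanode entry in the targets list.
    static final class Target {
        final String name;
        final boolean alive;

        Target(String name, boolean alive) {
            this.name = name;
            this.alive = alive;
        }
    }

    // True once the lease has gone unrenewed for longer than the hard limit.
    static boolean hardLimitExpired(long lastRenewalMs, long nowMs) {
        return nowMs - lastRenewalMs > HARD_LIMIT_MS;
    }

    // Assumption: the primary is the first live target, if any.
    static Optional<Target> choosePrimary(List<Target> targets) {
        for (Target t : targets) {
            if (t.alive) {
                return Optional.of(t);
            }
        }
        return Optional.empty();
    }

    public static void main(String[] args) {
        List<Target> allDead = Arrays.asList(
            new Target("dn1", false),
            new Target("dn2", false),
            new Target("dn3", false));

        long lastRenewalMs = 0L;
        long nowMs = HARD_LIMIT_MS + 1;

        if (hardLimitExpired(lastRenewalMs, nowMs)) {
            Optional<Target> primary = choosePrimary(allDead);
            if (primary.isPresent()) {
                System.out.println("ask " + primary.get().name + " to recover the lease");
            } else {
                // The case in question: nobody is left to run the recovery.
                System.out.println("no live target, lease stays in place");
            }
        }
    }
}
```

In this model the empty result from choosePrimary is the branch I cannot find a counterpart for in the NN code.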

Apart from this, I have another question regarding append.
Suppose that you are trying to append to a file.
And your replication factor is 3 and the NN's minReplication is also 3.
As part of the append, the client asks the datanodes that store the last block
to sync, but one of them fails.
The primary DN will call commitBlockSynchronization with only two DNs.
(I believe the NN should do something at this point, since it will never
receive enough blockReceived messages.)
The client also proceeds with two DNs.
Later, when the client wants to allocate another block, it will get a
NotReplicatedYetException.
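
The arithmetic of this scenario as a toy sketch, not actual HDFS code. It assumes the NN refuses to allocate a new block until the previous block has at least minReplication reported replicas. The names MinReplicationSketch, NotReplicatedYet, hasEnoughReplicas and allocateNextBlock are hypothetical.

```java
public class MinReplicationSketch {
    // Hypothetical stand-in for the exception the client would see.
    static final class NotReplicatedYet extends Exception {
        NotReplicatedYet(String message) {
            super(message);
        }
    }

    // Assumption: a block counts as complete once it has at least
    // minReplication reported replicas.
    static boolean hasEnoughReplicas(int reportedReplicas, int minReplication) {
        return reportedReplicas >= minReplication;
    }

    // Models the check made before handing out the next block.
    static void allocateNextBlock(int reportedForLastBlock, int minReplication)
            throws NotReplicatedYet {
        if (!hasEnoughReplicas(reportedForLastBlock, minReplication)) {
            throw new NotReplicatedYet("last block has " + reportedForLastBlock
                + " replicas, need " + minReplication);
        }
    }

    public static void main(String[] args) {
        int minReplication = 3;   // NN setting from the scenario
        int pipelineSize = 3;     // replication factor of the file
        int failedDuringSync = 1; // one DN fails during the sync
        int reported = pipelineSize - failedDuringSync;

        try {
            allocateNextBlock(reported, minReplication);
            System.out.println("allocated next block");
        } catch (NotReplicatedYet e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```

With two surviving DNs and a minimum of three, the check can never pass unless something re-replicates the block or lowers the requirement.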

Thanks,
Sangmin





>
>
> > I thought generationStamp should be checked when the NN processes
> > block reports from DN,
>
> The generation stamp is used to compute the hashCode for a Block object.
>
> thanks,
> dhruba
>
>
> On Wed, May 20, 2009 at 11:58 AM, Sangmin Lee <sangmin.dev@gmail.com>
> wrote:
>
> > Dhruba,
> >
> > Thanks for the response.
> > What if all datanodes in INodeFileUnderConstruction.targets are dead ?
> > I don't see any code to handle this case.
> >
> > One other thing I wonder is when the generationStamp is used by the NN.
> > I thought generationStamp should be checked when the NN processes block
> > reports from DN, but I can only see that it checks block lengths. Am I
> > missing something here?
> >
> > Thanks,
> > Sangmin
> >
> >
> > On Wed, May 20, 2009 at 12:24 PM, Dhruba Borthakur <dhruba@gmail.com>
> > wrote:
> >
> > > The NN has a timer for dead clients. When the HARD_LIMIT (1 hour)
> > > expires, the NN extracts the primary datanode from
> > > INodeFileUnderConstruction.targets and asks the primary datanode to
> > > recover the lease. At the end of the lease recovery, the primary
> > > datanode invokes the NameNode.commitBlockSynchronization method, and
> > > the lease recovery is complete.
> > >
> > > hope this helps,
> > > thanks,
> > > dhruba
> > >
> > >
> > >
> > > On Wed, May 20, 2009 at 9:14 AM, Sangmin Lee <sangmin.dev@gmail.com>
> > > wrote:
> > >
> > > > I am looking at 0.19.0 (or maybe 0.19.1) and 0.20.0.
> > > > In fact, I am still curious about the case (maybe too extreme a case)
> > > > where a client opens a file, requests a block, and dies prematurely,
> > > > and all datanodes also go dead.
> > > > I don't see how the lease will be recovered or reaped in this case.
> > > > Don't we need some mechanism that discards the block and removes the
> > > > lease after several attempts at lease recovery?
> > > >
> > > > Thanks,
> > > > Sangmin
> > > >
> > > > On Wed, May 20, 2009 at 10:40 AM, Edward J. Yoon <edwardyoon@apache.org>
> > > > wrote:
> > > >
> > > > > Can I ask which version you are reading? You seem to be digging
> > > > > deeply into the architecture of the system...
> > > > >
> > > > > On Thu, May 21, 2009 at 12:28 AM, Sangmin Lee <sangmin.dev@gmail.com>
> > > > > wrote:
> > > > > > Okay, I was being dumb and misread some source code.
> > > > > > Please ignore my question regarding this.
> > > > > > Sorry about this.
> > > > > >
> > > > > > Sangmin
> > > > > >
> > > > > > On Tue, May 19, 2009 at 11:59 PM, Sangmin Lee <sangmin.dev@gmail.com>
> > > > > > wrote:
> > > > > >
> > > > > >> Hi all,
> > > > > >>
> > > > > > >> I have a question regarding the HDFS recovery mechanism.
> > > > > > >>
> > > > > > >> I see that INodeFileUnderConstruction has a "targets" field that
> > > > > > >> stores the list of datanodes which store its last block.
> > > > > > >> However, I don't see them being used at all, except that the
> > > > > > >> "internalReleaseLease" function uses the length of the datanode
> > > > > > >> list.
> > > > > > >> Is there any other use of the "targets" field other than checking
> > > > > > >> its length?
> > > > > >>
> > > > > >> Could anyone shed some light on this?
> > > > > >>
> > > > > >> Thanks,
> > > > > >> Sangmin
> > > > > >>
> > > > > >
> > > > >
> > > > >
> > > > >
> > > > > --
> > > > > Best Regards, Edward J. Yoon @ NHN, corp.
> > > > > edwardyoon@apache.org
> > > > > http://blog.udanax.org
> > > > >
> > > >
> > >
> >
>
