hbase-dev mailing list archives

From Stack <st...@duboce.net>
Subject Re: hbase mttr vs. hdfs
Date Wed, 18 Jul 2012 10:00:13 GMT
The proposal seems good to me.  It's minimally intrusive.

See also below...

On Mon, Jul 16, 2012 at 7:08 PM, N Keywal <nkeywal@gmail.com> wrote:
> And to continue on this, for the files still opened (i.e. our wal
> files), we've got two calls to the dead DN:
>
> one, during the input stream opening, from DFSClient#updateBlockInfo.
> This call fails, but the exception is swallowed without being logged.
> The node info is not updated, but there is no error, so we continue
> without the right info. The timeout will be 60 seconds. This call is
> on port 50020.
>
> the second will be the one already mentioned for the data transfer,
> with the timeout of 69 seconds. The dead nodes list is not updated by
> the first failure, leading to a total wait time >2 minutes if we got
> directed to the bad location.
>

Avoiding this second timeout is worth a bit of work on our part.

The NN is like the federal government.  It has general high-level
policies and knows about 'conditions' at the macro level: network
topologies, placement policies.  The DFSInput/OutputStream is like
local government.  It reacts to local conditions, reordering the
node list if it just timed out the node in position zero.  What's
missing is state government: smarts in DFSClient, a means of
informing adjacent local governments about conditions that might
affect their operation, such as dead or lagging DNs.
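
To make the 'state government' idea concrete, here is a rough sketch of a
client-wide registry that every stream could report to and consult.  It
is only an illustration under stated assumptions: SharedDeadNodes and its
methods are hypothetical names, not existing DFSClient API, and datanodes
are identified by plain strings.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

/**
 * Hypothetical registry shared by all streams of one client.
 * Not part of HDFS; names and behavior are assumptions for illustration.
 */
class SharedDeadNodes {
    // datanode id -> time (ms) until which the node is treated as suspect
    private final Map<String, Long> suspectUntil = new ConcurrentHashMap<>();
    private final long quarantineMillis;

    SharedDeadNodes(long quarantineMillis) {
        this.quarantineMillis = quarantineMillis;
    }

    /** A stream that timed out against a datanode reports it here. */
    void reportFailure(String datanode, long nowMillis) {
        suspectUntil.put(datanode, nowMillis + quarantineMillis);
    }

    /** True while the node is still inside its quarantine window. */
    boolean isSuspect(String datanode, long nowMillis) {
        Long until = suspectUntil.get(datanode);
        if (until == null) {
            return false;
        }
        if (nowMillis >= until) {
            // Quarantine expired: forget the node so it gets retried.
            suspectUntil.remove(datanode, until);
            return false;
        }
        return true;
    }

    /** Keeps the given order but moves suspect nodes to the end. */
    List<String> reorder(List<String> locations, long nowMillis) {
        List<String> healthy = new ArrayList<>();
        List<String> suspect = new ArrayList<>();
        for (String dn : locations) {
            if (isSuspect(dn, nowMillis)) {
                suspect.add(dn);
            } else {
                healthy.add(dn);
            }
        }
        healthy.addAll(suspect);
        return healthy;
    }
}
```

With something like this, the stream that hits the first timeout would
call reportFailure, and any other stream opened by the same client would
call reorder on its block locations before connecting, so it would not
have to pay the same timeout again.  Suspect nodes are moved to the end
rather than dropped, so a read can still fall back to them if every other
replica fails.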

St.Ack
