hbase-dev mailing list archives

From N Keywal <nkey...@gmail.com>
Subject Re: hbase mttr vs. hdfs
Date Tue, 17 Jul 2012 17:14:20 GMT
> Adding this list of configs to the manual in a table would be
> generally useful I think (these and the lease ones below), especially
> if it had a note on what happens if you change the configs.
> The 69s timeout is a bit rough, especially if a read on another open
> file already figured the DN dead; ditto on the write.

Agreed. I'm currently doing that. I also have a set of log
analyses that could make it into the ref book. I will create a Jira to
propose them.

>> On paper, it would be great to set "dfs.socket.timeout" to a minimal
>> value during a log split, as we know we will get a dead DN 33% of the
>> time. It may be more complicated in real life as the connections are
>> shared per process. And we could still have the issue with the
>> ipc.Client.
>>
>
> Seems like we read the DFSClient.this.socketTimeout opening
> connections to blocks.
>
>> As a conclusion, I think it could be interesting to have a third
>> status for DN in HDFS: between live and dead as today, we could have
>> "sick". We would have:
>> 1) Dead, known as such => As today: Start to replicate the blocks to
>> other nodes. You enter this state after 10 minutes. We could even wait
>> more.
>> 2) Likely to be dead: don't propose it for write blocks, put it with a
>> lower priority for read blocks. We would enter this state in two
>> conditions:
>>   2.1) No heartbeat for 30 seconds (configurable of course). As there
>> is an existing heartbeat of 3 seconds, we could even be more
>> aggressive here.
>>   2.2) We could have a shutdown hook in hdfs such as when a DN dies
>> 'properly' it says to the NN, and the NN can put it in this 'half dead
>> state'.
>>   => In all cases, the node stays in the second state until the
>> 10 min 30 s timeout is reached or until a heartbeat is received.
>
> I suppose as Todd suggests, we could do this client side.  The extra
> state would complicate NN (making it difficult to get such a change

After some iterations I came to a solution close to his proposal,
mentioned in my mail from yesterday.
In my view we should fix this, and that includes HBASE-6401. The main
question is which HDFS branch HBase would need it on, as the HDFS code
changed between the 1.0.3 release and branch 2. HADOOP-8144 is
also important for people configuring the topology, imho.

> in).  The API to mark a DN dead seems like a nice-to-have.   Master or
> client could pull on it when it knows a server dead (not just the RS).

Yes, there is a mechanism today to tell the NN to decommission a DN,
but it's complex: we need to write a file listing the 'unwanted' nodes
and then tell the NN to reload it. Not really a 'mark as dead'
function.
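
To make the three-state proposal quoted above more concrete, here is a
minimal sketch in plain Java. It is not HDFS code: the class, enum and
method names are hypothetical, and the 30 second "sick" threshold is only
the value suggested in the thread. The dead timeout is derived the way I
understand the HDFS default to be computed (2 x the 5 minute recheck
interval + 10 x the 3 second heartbeat = 10 min 30 s); treat that
decomposition as an assumption.

```java
import java.time.Duration;

// Hypothetical sketch of the proposed live / sick / dead classification.
// None of these names exist in HDFS; they are for illustration only.
final class DataNodeHealth {

    enum State { LIVE, SICK, DEAD }

    // Heartbeat period mentioned in the thread (3 seconds).
    static final Duration HEARTBEAT_INTERVAL = Duration.ofSeconds(3);

    // Proposed threshold for "likely to be dead" (30 seconds, configurable).
    static final Duration SICK_AFTER = Duration.ofSeconds(30);

    // Assumed recheck interval (5 minutes) used to derive the dead timeout.
    static final Duration RECHECK_INTERVAL = Duration.ofMinutes(5);

    // 2 * 5 min + 10 * 3 s = 630 s, i.e. 10 min 30 s.
    static final Duration DEAD_AFTER =
            RECHECK_INTERVAL.multipliedBy(2).plus(HEARTBEAT_INTERVAL.multipliedBy(10));

    // announcedShutdown models case 2.2: the DN told the NN it is going down.
    // The caller is expected to clear that flag when a new heartbeat arrives.
    static State classify(Duration sinceLastHeartbeat, boolean announcedShutdown) {
        if (sinceLastHeartbeat.compareTo(DEAD_AFTER) >= 0) {
            return State.DEAD;   // case 1: start re-replicating its blocks
        }
        if (announcedShutdown || sinceLastHeartbeat.compareTo(SICK_AFTER) >= 0) {
            return State.SICK;   // case 2: avoid for writes, deprioritize for reads
        }
        return State.LIVE;
    }

    // Only live nodes are proposed as targets for new write blocks.
    static boolean eligibleForWrite(State state) {
        return state == State.LIVE;
    }

    // Lower rank is tried first when ordering replicas for a read.
    static int readRank(State state) {
        switch (state) {
            case LIVE: return 0;
            case SICK: return 1;
            default:   return 2;
        }
    }
}
```

With such a classification, a client or the NN could sort the replica
locations of a block by readRank before opening a connection, so that a
node that stopped heartbeating is only tried last instead of costing a
full socket timeout up front.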
