hadoop-common-dev mailing list archives

From Raghu Angadi <rang...@yahoo-inc.com>
Subject Re: [jira] Commented: (HADOOP-4584) Slow generation of blockReport at DataNode causes delay of sending heartbeat to NameNode
Date Tue, 24 Feb 2009 19:53:58 GMT
Raghu Angadi wrote:
> jason hadoop wrote:
>> Any reason for not using an internal or external agent that receives
>> notification from the operating system about filesystem operations in the
>> block storage subtree?
> Lack of a patch to do so, maybe?

Please let us know if there is a Java prototype implementation of this.
I think NIO2 has an interface for this, but I am not sure whether there is
an equivalent solution for JDK 1.6. Once one is available, it could be
optionally enabled.
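
For reference, the NIO2 interface mentioned above shipped in Java 7 as
java.nio.file.WatchService. Below is a minimal sketch of how an optional
watcher over a block directory might look, assuming Java 7 or later. The
class name BlockDirWatcher and its helper methods are hypothetical, not
Hadoop code. A WatchService registration covers a single directory only,
so each subdirectory of the block storage tree would need its own
registration.

```java
import java.io.IOException;
import java.nio.file.Path;
import java.nio.file.StandardWatchEventKinds;
import java.nio.file.WatchEvent;
import java.nio.file.WatchKey;
import java.nio.file.WatchService;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.TimeUnit;

// Hypothetical helper, not part of Hadoop: watches one block directory.
public class BlockDirWatcher {

    // Registers dir for create and delete notifications (non-recursive).
    public static WatchService watch(Path dir) throws IOException {
        WatchService watcher = dir.getFileSystem().newWatchService();
        dir.register(watcher,
                StandardWatchEventKinds.ENTRY_CREATE,
                StandardWatchEventKinds.ENTRY_DELETE);
        return watcher;
    }

    // Waits up to timeoutMs for one batch of events and returns them
    // as "KIND fileName" strings; returns an empty list on timeout.
    public static List<String> pollOnce(WatchService watcher, long timeoutMs)
            throws InterruptedException {
        List<String> seen = new ArrayList<>();
        WatchKey key = watcher.poll(timeoutMs, TimeUnit.MILLISECONDS);
        if (key == null) {
            return seen;
        }
        for (WatchEvent<?> event : key.pollEvents()) {
            if (event.kind() == StandardWatchEventKinds.OVERFLOW) {
                // Events were lost; a caller would have to fall back to a scan.
                seen.add("OVERFLOW");
                continue;
            }
            seen.add(event.kind().name() + " " + event.context());
        }
        key.reset(); // re-arm the key so later events are delivered
        return seen;
    }
}
```

The OVERFLOW case matters for this discussion: notifications can be
dropped under load, so a watcher like this could reduce the need for disk
scans but probably not remove every fallback.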


> Raghu.
>> On Tue, Feb 24, 2009 at 9:36 AM, Raghu Angadi (JIRA) <jira@apache.org> wrote:
>>> [ https://issues.apache.org/jira/browse/HADOOP-4584?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12676344#action_12676344 ]

>>> Raghu Angadi commented on HADOOP-4584:
>>> --------------------------------------
>>> Ideally there is no requirement for block reports. They are essentially
>>> used as a 'catch all' for various bugs and errors. (Of course, they are now
>>> overloaded with the job of informing the NameNode about deletions; this
>>> should be removed.)
>>> Yes, it specifically removes the disk scan without fundamentally changing
>>> the meaning of block reports. Now the DN informs the NameNode about the
>>> blocks that it thinks it has, because:
>>>  * 'rm -r' by an admin is just one of the many things that can go wrong
>>>    with blocks on a datanode. There is no particular reason we should have
>>>    this very costly disk scan (with a global lock held) just for this.
>>>    ** In fact, 'rm -r' is probably the least probable error (I haven't
>>>       seen it even once in practice).
>>>  * We have periodic block verification that does handle various things
>>>    that can go wrong with a block (it can improve further).
>>>    ** So 'rm -r' will be handled, just at the rate of the rest of the
>>>       block problems.
>>>  * On the other hand, many users have complained about datanode scans
>>>    taking tens of minutes and making datanodes lose heartbeats.
>>>    ** This makes the system pretty unusable and is a major obstruction to
>>>       graceful degradation under load and to scalability.
>>>    ** One can argue that those users should not have so many blocks. But
>>>       I think the DN should still handle it to the best of its abilities
>>>       and not die on them.
>>>    ** Disks might be slow for many other reasons (other tasks on the
>>>       machine, etc.).
>>>  * I think this is orthogonal to HADOOP-1079, which addresses the RPC and
>>>    NameNode overhead of block reports. This jira is only about the
>>>    DataNode side.
>>> Yes, this is a bigger change in semantics than what we proposed earlier
>>> (to scan the directories slowly, without holding the global lock), but an
>>> offline scan looks like a workaround for a problem that does not need to
>>> be solved. Not scanning is much simpler than handling an offline scan.
>>> Eventually we need to reduce the frequency of block reports. This can be
>>> done as soon as we add acks for block deletions. This JIRA is a major
>>> step in that direction.
>>>> Slow generation of blockReport at DataNode causes delay of sending heartbeat to NameNode
>>>> ----------------------------------------------------------------------------------------

>>>>                 Key: HADOOP-4584
>>>>                 URL: https://issues.apache.org/jira/browse/HADOOP-4584
>>>>             Project: Hadoop Core
>>>>          Issue Type: Bug
>>>>          Components: dfs
>>>>            Reporter: Hairong Kuang
>>>>            Assignee: Suresh Srinivas
>>>>             Fix For: 0.20.0
>>>>         Attachments: 4584.patch, 4584.patch, 4584.patch, 4584.patch, 4584.patch, 4584.patch
>>>> Sometimes, due to disk or some other problems, a datanode takes minutes
>>>> or tens of minutes to generate a block report. This leaves the datanode
>>>> unable to send a heartbeat to the NameNode every 3 seconds. In the worst
>>>> case, the NameNode detects a lost heartbeat and wrongly decides that the
>>>> datanode is dead.
>>>> It would be nice to have two threads instead. One thread scans the data
>>>> directories, generates the block report, and executes the requests sent
>>>> by the NameNode; the other thread sends heartbeats and block reports and
>>>> picks up the requests from the NameNode. With these two threads, the
>>>> sending of heartbeats will not be delayed by any slow block report or
>>>> slow execution of NameNode requests.
>>> -- 
>>> This message is automatically generated by JIRA.
>>> -
>>> You can reply to this email to add a comment to the issue online.
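
The two-thread arrangement proposed in the quoted issue description can be
sketched roughly as follows. This is only an illustration under stated
assumptions, not the attached 4584.patch: the class HeartbeatSketch and
its method are hypothetical, the slow disk scan is simulated with a
sleep, and a heartbeat is reduced to a counter increment. It shows only
that a heartbeat scheduled on its own thread keeps firing while the scan
thread is busy.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical illustration, not Hadoop code.
public class HeartbeatSketch {

    // Runs a simulated slow scan on one thread while a second thread
    // sends heartbeats at a fixed rate. Returns how many heartbeats
    // had been sent by the time the scan finished.
    public static int heartbeatsDuringScan(long scanMillis, long heartbeatMillis)
            throws Exception {
        AtomicInteger heartbeats = new AtomicInteger();
        ScheduledExecutorService heartbeatThread =
                Executors.newSingleThreadScheduledExecutor();
        ExecutorService scanThread = Executors.newSingleThreadExecutor();
        try {
            // Thread 1: heartbeats, independent of any scan in progress.
            heartbeatThread.scheduleAtFixedRate(
                    heartbeats::incrementAndGet,
                    0, heartbeatMillis, TimeUnit.MILLISECONDS);

            // Thread 2: stand-in for scanning directories and building a report.
            Future<?> scan = scanThread.submit(() -> {
                try {
                    Thread.sleep(scanMillis);
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            });

            scan.get(); // wait for the scan; heartbeats continue meanwhile
            return heartbeats.get();
        } finally {
            heartbeatThread.shutdownNow();
            scanThread.shutdownNow();
        }
    }
}
```

In a real DataNode the two threads would also have to avoid sharing a
global lock during the scan, which is the problem the comment above
discusses; the sketch does not model that.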
