hadoop-hdfs-issues mailing list archives

From "Brahma Reddy Battula (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HDFS-9917) IBR accumulate more objects when SNN was down for sometime.
Date Wed, 16 Mar 2016 02:25:33 GMT

    [ https://issues.apache.org/jira/browse/HDFS-9917?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15196635#comment-15196635
] 

Brahma Reddy Battula commented on HDFS-9917:
--------------------------------------------

bq. I suggest that NN could just ignore the pending IBRs before the first full BR. Would it
fix the problem?

Yes, I think it's the same as clearing the pending IBRs in reRegister() on the datanode itself.
The advantage of clearing in reRegister() on the DN itself is that
the unnecessary RPCs never go to the namenode, so the namenode does not have to GC these IBRs needlessly.

We may also need to limit how many IBRs the DN can keep accumulating, so that it does not use a lot of memory.
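
To make the two ideas above concrete, here is a minimal sketch of a bounded pending-IBR buffer that is cleared on re-registration. It is not the actual DataNode code: the class name PendingIbrBuffer, its methods, and the use of plain strings for block events are all hypothetical. Dropping the oldest entries when the cap is hit also assumes that a later full block report reconciles the namenode's view, which this sketch does not enforce.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

/** Hypothetical illustration only; not part of Hadoop. */
class PendingIbrBuffer {
    private final int maxPending;                       // cap on queued entries
    private final Deque<String> pending = new ArrayDeque<>();
    private long dropped = 0;                           // entries discarded because of the cap

    PendingIbrBuffer(int maxPending) {
        if (maxPending <= 0) {
            throw new IllegalArgumentException("maxPending must be positive");
        }
        this.maxPending = maxPending;
    }

    /** Queue one block event; evict the oldest entry if the cap is reached. */
    synchronized void add(String blockEvent) {
        if (pending.size() >= maxPending) {
            pending.pollFirst();
            dropped++;
        }
        pending.addLast(blockEvent);
    }

    /** Called when the DN re-registers: stale entries are discarded, not sent. */
    synchronized void clearOnReRegister() {
        pending.clear();
    }

    /** Hand over everything queued so far (for the next report) and reset. */
    synchronized List<String> drain() {
        List<String> batch = new ArrayList<>(pending);
        pending.clear();
        return batch;
    }

    synchronized int size() {
        return pending.size();
    }

    synchronized long droppedCount() {
        return dropped;
    }
}
```

With a cap in place, a long SNN outage costs the DN at most maxPending queued entries, and clearing on re-register means nothing stale is sent over RPC afterwards.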

> IBR accumulate more objects when SNN was down for sometime.
> -----------------------------------------------------------
>
>                 Key: HDFS-9917
>                 URL: https://issues.apache.org/jira/browse/HDFS-9917
>             Project: Hadoop HDFS
>          Issue Type: Bug
>            Reporter: Brahma Reddy Battula
>            Assignee: Brahma Reddy Battula
>
> SNN was down for some time. After restarting the SNN, it became unresponsive because:
> - 29 DNs were each sending 5 million IBRs (most of them delete IBRs), whereas each
> datanode had only ~2.5 million blocks.
> - GC couldn't reclaim these objects, since all of them were held in the RPC queue.
> To recover from this (to clear these objects), all the DNs were restarted one by one. This issue
> happened in 2.4.1, where splitting of block reports was not available.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
