hadoop-hdfs-issues mailing list archives

From "Uma Maheswara Rao G (JIRA)" <j...@apache.org>
Subject [jira] [Comment Edited] (HDFS-9198) Coalesce IBR processing in the NN
Date Thu, 22 Oct 2015 01:23:27 GMT

    [ https://issues.apache.org/jira/browse/HDFS-9198?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14968317#comment-14968317
] 

Uma Maheswara Rao G edited comment on HDFS-9198 at 10/22/15 1:22 AM:
---------------------------------------------------------------------

Thanks, Daryn, for the nice work here. This is interesting to me.
I have just reviewed the patch. Following are my comments:

# runBlockOp: how about naming it runBlockReportOp?
# nit: {code}
while (namesystem.isRunning()) {
+        NameNodeMetrics metrics = NameNode.getNameNodeMetrics();
{code}
Maybe we can fetch metrics once outside the loop and reuse it?
# I think we need to handle Throwable in this BR processing thread. In case of any unexpected
errors, this thread should not die silently, as it is one of the important processing threads.
We may have to terminate the system in such cases.
Minor suggestion: the method names in BM could be something like runBlockReportOpSync and
runBlockReportAsync?
# Code formatting was missed for these lines:
{code}
metrics.setBlockOpsQueued(queue.size()+1);
metrics.addBlockOpsBatched(processed-1);
{code}
# Currently the DN sets a flag to trigger sendImmediateIBR when IBR processing fails. But
now the exceptions are handled in the NN itself and cannot be passed back to the DN, because
the processing is async. So sendImmediateIBR now happens only for IPC-level exceptions. Have
you thought about this? Any info missed this way would have to wait until the next BR, right?
# The tests look great to me. Minor suggestion: could you please add javadoc for the tests?
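
To illustrate the comment about handling Throwable in the processing thread, here is a minimal sketch of what I have in mind. It is not code from the patch: the class BlockReportProcessorSketch, its queue, and the fatalHandler callback are all hypothetical names, and the handler only stands in for whatever the NN would actually do to terminate.

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.function.Consumer;

// Hypothetical stand-in for the BR processing thread; not the patch's code.
class BlockReportProcessorSketch extends Thread {
  private final BlockingQueue<Runnable> queue = new LinkedBlockingQueue<>();
  private final AtomicBoolean running = new AtomicBoolean(true);
  // Assumed hook: invoked once when an unexpected error escapes an op.
  private final Consumer<Throwable> fatalHandler;

  BlockReportProcessorSketch(Consumer<Throwable> fatalHandler) {
    this.fatalHandler = fatalHandler;
    setDaemon(true);
  }

  void enqueue(Runnable op) {
    queue.add(op);
  }

  void shutdown() {
    running.set(false);
    interrupt();
  }

  @Override
  public void run() {
    try {
      while (running.get()) {
        // Poll with a timeout so a shutdown request is noticed promptly.
        Runnable op = queue.poll(100, TimeUnit.MILLISECONDS);
        if (op != null) {
          op.run();
        }
      }
    } catch (InterruptedException ie) {
      // Treated as a shutdown request: restore the flag and exit quietly.
      Thread.currentThread().interrupt();
    } catch (Throwable t) {
      // Do not die silently: stop the loop and report the failure.
      running.set(false);
      fatalHandler.accept(t);
    }
  }
}
```

With this shape, an op that throws is reported to the handler exactly once and the loop stops. Whether the right reaction is to terminate the NN or only to log and continue is the open question in the comment above.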




> Coalesce IBR processing in the NN
> ---------------------------------
>
>                 Key: HDFS-9198
>                 URL: https://issues.apache.org/jira/browse/HDFS-9198
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>          Components: namenode
>    Affects Versions: 2.0.0-alpha
>            Reporter: Daryn Sharp
>            Assignee: Daryn Sharp
>         Attachments: HDFS-9198-branch2.patch, HDFS-9198-trunk.patch, HDFS-9198-trunk.patch, HDFS-9198-trunk.patch
>
>
> IBRs from thousands of DNs under load will degrade NN performance due to excessive write-lock
> contention from multiple IPC handler threads.  The IBR processing is quick, so the lock contention
> may be reduced by coalescing multiple IBRs into a single write-lock transaction.  The handlers
> will also be freed up faster for other operations.
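
The coalescing idea in the description can be sketched roughly as follows. This is only an illustration under assumptions, not the patch itself: CoalescingSketch, MAX_BATCH, and the plain ReentrantReadWriteLock standing in for the namesystem lock are all hypothetical. Handlers enqueue their IBR work and return; a single consumer drains whatever has accumulated and applies it under one write-lock acquisition.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.locks.ReentrantReadWriteLock;

// Hypothetical illustration of coalescing queued IBR work; names are not from the patch.
class CoalescingSketch {
  // Assumed cap so one batch cannot hold the write lock indefinitely.
  static final int MAX_BATCH = 1000;

  private final BlockingQueue<Runnable> queue = new LinkedBlockingQueue<>();
  // Stand-in for the namesystem lock.
  private final ReentrantReadWriteLock lock = new ReentrantReadWriteLock();
  private int lockAcquisitions = 0;

  // Called by an IPC handler: queue the work and return immediately.
  void enqueue(Runnable ibrOp) {
    queue.add(ibrOp);
  }

  // Drains up to MAX_BATCH queued ops and runs them under a single
  // write-lock acquisition. Returns the number of ops processed.
  int processBatch() {
    List<Runnable> batch = new ArrayList<>();
    queue.drainTo(batch, MAX_BATCH);
    if (batch.isEmpty()) {
      return 0;
    }
    lock.writeLock().lock();
    try {
      lockAcquisitions++;
      for (Runnable op : batch) {
        op.run();
      }
    } finally {
      lock.writeLock().unlock();
    }
    return batch.size();
  }

  int lockAcquisitions() {
    return lockAcquisitions;
  }
}
```

For example, five IBRs queued before the consumer runs cost one write-lock acquisition instead of five, and the handlers that queued them are free as soon as enqueue returns.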



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
