hadoop-hdfs-issues mailing list archives

From "liang xie (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HDFS-4359) remove an unnecessary synchronized keyword in BPOfferService.java
Date Wed, 16 Jan 2013 01:51:11 GMT

    [ https://issues.apache.org/jira/browse/HDFS-4359?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13554629#comment-13554629 ]

liang xie commented on HDFS-4359:

Yes, the root cause is on the namenode side; this issue is only against the datanode. Per the
thread dump and the source code, we can safely remove the "synchronized" keyword, though doing
so would not have prevented the overall hang :)

bq. the thread holding the lock is stuck in a 'versionRequest()' RPC. Any idea why this RPC
is taking a long time hearing back from the NN?
Yes, we figured it out several days ago: one of the DNS servers had failed. The
thread dump is really interesting, though, so I've uploaded the NN thread dump for you to look at :) By the way,
with a JUCL (java.util.concurrent.locks) lock it is not easy to find the lock holder, which made the analysis difficult for us.
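
For reference, one way to look for the holder of a java.util.concurrent lock is to ask for "ownable synchronizers" explicitly: `jstack -l <pid>` prints a "Locked ownable synchronizers" section per thread, and the same data is available in-process through ThreadMXBean. The sketch below is only an illustration, not code from HDFS: the class name LockHolderDemo, the helper holdersOf and the thread name "slow-rpc-thread" are made up for the example. Note that only exclusively held synchronizers are reported, so a thread holding the read lock of a ReentrantReadWriteLock would not show up this way.

```java
import java.lang.management.LockInfo;
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.locks.ReentrantLock;

public class LockHolderDemo {
    // Hypothetical helper: names of live threads that exclusively own a
    // synchronizer whose class name contains the given fragment.
    public static List<String> holdersOf(String syncClassFragment) {
        ThreadMXBean mx = ManagementFactory.getThreadMXBean();
        List<String> holders = new ArrayList<String>();
        if (!mx.isSynchronizerUsageSupported()) {
            return holders; // this JVM cannot report ownable synchronizers
        }
        // lockedMonitors=false, lockedSynchronizers=true
        for (ThreadInfo info : mx.dumpAllThreads(false, true)) {
            for (LockInfo lock : info.getLockedSynchronizers()) {
                if (lock.getClassName().contains(syncClassFragment)) {
                    holders.add(info.getThreadName());
                }
            }
        }
        return holders;
    }

    public static void main(String[] args) throws InterruptedException {
        final ReentrantLock lock = new ReentrantLock();
        final CountDownLatch acquired = new CountDownLatch(1);
        final CountDownLatch done = new CountDownLatch(1);

        // Simulates a thread that takes the lock and then blocks for a long time.
        Thread holder = new Thread(() -> {
            lock.lock();
            try {
                acquired.countDown();
                done.await();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            } finally {
                lock.unlock();
            }
        }, "slow-rpc-thread");
        holder.start();
        acquired.await();

        // Prints the names of threads that currently own a ReentrantLock.
        System.out.println(holdersOf("ReentrantLock"));

        done.countDown();
        holder.join();
    }
}
```
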
> remove an unnecessary synchronized keyword in BPOfferService.java
> -----------------------------------------------------------------
>                 Key: HDFS-4359
>                 URL: https://issues.apache.org/jira/browse/HDFS-4359
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: datanode
>    Affects Versions: 3.0.0, 2.0.2-alpha
>            Reporter: liang xie
>            Assignee: liang xie
>         Attachments: dn.jstack, HDFS-4359.txt, nn_dns_broken.jstack
> We encountered a hang of both the NN and the DN; the DN hang was caused by the NN not responding to heartbeats.
> Per the DN thread dump, I think we can make a small improvement to this piece of code:
>   synchronized List<BPServiceActor> getBPServiceActors() {
>     return Lists.newArrayList(bpServices);
>   }
> bpServices is declared as:
>   private List<BPServiceActor> bpServices =
>     new CopyOnWriteArrayList<BPServiceActor>();
> CopyOnWriteArrayList is indeed a thread-safe variant, so we can safely remove the synchronized keyword above.
> Here is a simple statistic from the thread dump:
> xieliang@xieliang:/tmp$ grep 0x00000007b00289f0 dn.jstack |wc -l
> 252
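
The reasoning in the description can be sketched in isolation as follows. This is a minimal illustration, not the actual BPOfferService code: the class SnapshotDemo, the field actors and the String element type are stand-ins for bpServices and BPServiceActor, and plain `new ArrayList<>(...)` is used in place of Guava's Lists.newArrayList so that the example has no dependencies. It assumes the keyword was only there to protect the copy itself; if other state were guarded by the same monitor, that would need a separate check.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;

public class SnapshotDemo {
    // Stand-in for the bpServices field; String replaces BPServiceActor.
    private final List<String> actors = new CopyOnWriteArrayList<String>();

    // No synchronized keyword: the copy reads one consistent snapshot of the
    // CopyOnWriteArrayList's backing array, even while writers are active.
    public List<String> getActors() {
        return new ArrayList<String>(actors);
    }

    public void addActor(String name) {
        actors.add(name);
    }

    public static void main(String[] args) throws InterruptedException {
        final SnapshotDemo demo = new SnapshotDemo();

        // A writer thread mutates the list while the main thread copies it.
        Thread writer = new Thread(() -> {
            for (int i = 0; i < 1000; i++) {
                demo.addActor("actor-" + i);
            }
        });
        writer.start();

        for (int i = 0; i < 100; i++) {
            // Each copy is private to the caller; later adds do not affect it.
            List<String> copy = demo.getActors();
            copy.clear();
        }

        writer.join();
        System.out.println(demo.getActors().size());
    }
}
```

The point of dropping the keyword is that callers no longer queue on the object's monitor just to take a copy, which matters when another thread holds that monitor during a slow RPC.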

