hadoop-common-dev mailing list archives

From "Hadoop QA (JIRA)" <j...@apache.org>
Subject [jira] Commented: (HADOOP-3673) Deadlock in Datanode RPC servers
Date Thu, 03 Jul 2008 03:26:45 GMT

    [ https://issues.apache.org/jira/browse/HADOOP-3673?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12610112#action_12610112 ]

Hadoop QA commented on HADOOP-3673:
-----------------------------------

-1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12385160/3673_20080702e.patch
  against trunk revision 673517.

    +1 @author.  The patch does not contain any @author tags.

    +1 tests included.  The patch appears to include 3 new or modified tests.

    +1 javadoc.  The javadoc tool did not generate any warning messages.

    +1 javac.  The applied patch does not increase the total number of javac compiler warnings.

    +1 findbugs.  The patch does not introduce any new Findbugs warnings.

    +1 release audit.  The applied patch does not increase the total number of release audit warnings.

    +1 core tests.  The patch passed core unit tests.

    -1 contrib tests.  The patch failed contrib unit tests.

Test results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/2788/testReport/
Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/2788/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
Checkstyle results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/2788/artifact/trunk/build/test/checkstyle-errors.html
Console output: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/2788/console

This message is automatically generated.

> Deadlock in Datanode RPC servers
> --------------------------------
>
>                 Key: HADOOP-3673
>                 URL: https://issues.apache.org/jira/browse/HADOOP-3673
>             Project: Hadoop Core
>          Issue Type: Bug
>          Components: dfs
>    Affects Versions: 0.18.0
>            Reporter: dhruba borthakur
>            Assignee: Tsz Wo (Nicholas), SZE
>            Priority: Blocker
>             Fix For: 0.18.0
>
>         Attachments: 3673_20080702.patch, 3673_20080702b.patch, 3673_20080702c.patch,
>                      3673_20080702d.patch, 3673_20080702e.patch
>
>
> There is a deadlock scenario in the way Lease Recovery is triggered using the Datanode RPC server via HADOOP-3310.
> Each Datanode has dfs.datanode.handler.count handler threads (default of 3). These handler threads are used to support the generation-stamp-dance protocol as described in HADOOP-1700.
> Let me try to explain the scenario with an example. Suppose a cluster has two datanodes, and that dfs.datanode.handler.count is set to 1. Suppose that there are two clients, each writing to a separate file with a replication factor of 2. Let's assume that both clients encounter an IO error and trigger the generation-stamp-dance protocol. The first client may invoke recoverBlock on the first datanode while the second client may invoke recoverBlock on the second datanode. Now each datanode will try to make a getBlockMetaDataInfo() call to the other datanode. But since each datanode has only 1 server handler thread, and that thread is already busy executing recoverBlock, neither getBlockMetaDataInfo() call can be served and both threads will block forever. Deadlock!
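
The scenario described above can be reproduced in miniature. The following is only a sketch under stated assumptions, not the Datanode code: ToyDatanode, HandlerDeadlockSketch, and deadlocks() are hypothetical names, a fixed thread pool stands in for the RPC handler threads, and a 500 ms timeout is added purely so that the demonstration terminates instead of hanging. Only the method names recoverBlock and getBlockMetaDataInfo are taken from the description.

```java
import java.util.concurrent.*;

public class HandlerDeadlockSketch {

    /** Hypothetical stand-in for a datanode RPC server: a fixed pool of handler threads. */
    static final class ToyDatanode {
        final ExecutorService handlers;
        ToyDatanode peer;

        ToyDatanode(int handlerCount) {
            handlers = Executors.newFixedThreadPool(handlerCount);
        }

        // Needs a free handler thread on this node before it can be served.
        Future<String> getBlockMetaDataInfo() {
            return handlers.submit(() -> "meta");
        }

        // Occupies one handler thread on this node while waiting on the peer.
        Future<String> recoverBlock(CountDownLatch bothStarted, long timeoutMs) {
            return handlers.submit(() -> {
                bothStarted.countDown();
                bothStarted.await(); // both recoveries now hold a handler thread
                // The real call has no timeout; it is added here so the sketch ends.
                return peer.getBlockMetaDataInfo().get(timeoutMs, TimeUnit.MILLISECONDS);
            });
        }
    }

    /** Returns true if a cross call could not be served within the timeout. */
    static boolean deadlocks(int handlerCount) throws InterruptedException {
        ToyDatanode a = new ToyDatanode(handlerCount);
        ToyDatanode b = new ToyDatanode(handlerCount);
        a.peer = b;
        b.peer = a;
        CountDownLatch bothStarted = new CountDownLatch(2);
        Future<String> ra = a.recoverBlock(bothStarted, 500);
        Future<String> rb = b.recoverBlock(bothStarted, 500);
        boolean stuck = false;
        try {
            ra.get();
            rb.get();
        } catch (ExecutionException e) {
            // A timeout inside recoverBlock means the peer had no free handler.
            stuck = e.getCause() instanceof TimeoutException;
        } finally {
            a.handlers.shutdownNow();
            b.handlers.shutdownNow();
        }
        return stuck;
    }

    public static void main(String[] args) throws Exception {
        System.out.println("1 handler:  deadlock=" + deadlocks(1));
        System.out.println("2 handlers: deadlock=" + deadlocks(2));
    }
}
```

In this toy model, one handler per node gets stuck while two handlers per node do not. A larger handler count only avoids the problem here because there are exactly two concurrent recoveries; with as many concurrent recoverBlock calls as handler threads, the same cycle can form again.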

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

