hadoop-common-issues mailing list archives

From "Todd Lipcon (JIRA)" <j...@apache.org>
Subject [jira] Commented: (HADOOP-5859) FindBugs : fix "wait() or sleep() with locks held" warnings in hdfs
Date Tue, 27 Jul 2010 01:16:25 GMT

    [ https://issues.apache.org/jira/browse/HADOOP-5859?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12892592#action_12892592 ]

Todd Lipcon commented on HADOOP-5859:
-------------------------------------

Note that this patch doesn't address the root issue in most cases (I've seen confusion from
several users on this front).

Usually the root issue here is that a thread gets stuck in processDatanodeError, particularly
in the RPC to the recovering DN. I've seen this in a number of situations, usually due to other
IPC bugs that cause hung clients in various error scenarios, and sometimes due to bugs in
recovery on the DN side.

So, if you see a stack trace that looks like this issue, it's probably a symptom of some other
bug, and this patch would just rejigger the stack traces, but not actually solve the underlying
problem.

Luke: do you have the stack trace from the DataStreamer thread? Is it stuck in an IPC call?
If so, do you have the jstack from the recovery datanode?
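
When running jstack against the process is not convenient, a similar per-thread view can be
collected from inside the JVM. The following is only a rough sketch using the standard
java.lang.management API; the class name ThreadDumpSketch and its dump() method are made up
for illustration and are not part of Hadoop.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;

// Hypothetical helper, not Hadoop code: a small in-process analogue of jstack.
class ThreadDumpSketch {

    /** Returns thread name, state, contended lock (if any) and stack for every live thread. */
    static String dump() {
        ThreadMXBean mx = ManagementFactory.getThreadMXBean();
        StringBuilder sb = new StringBuilder();
        // false, false: skip the optional locked-monitor / locked-synchronizer details.
        for (ThreadInfo info : mx.dumpAllThreads(false, false)) {
            sb.append('"').append(info.getThreadName()).append("\" ")
              .append(info.getThreadState());
            if (info.getLockName() != null) {
                // The object this thread is blocked on or waiting for.
                sb.append(" on ").append(info.getLockName());
                if (info.getLockOwnerName() != null) {
                    sb.append(" held by \"").append(info.getLockOwnerName()).append('"');
                }
            }
            sb.append('\n');
            for (StackTraceElement frame : info.getStackTrace()) {
                sb.append("    at ").append(frame).append('\n');
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.print(dump());
    }
}
```

In a dump like this, a DataStreamer thread sitting in an IPC call would be the thing to look
for, along with any threads reported as blocked on a lock that it holds.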

> FindBugs : fix "wait() or sleep() with locks held" warnings in hdfs
> -------------------------------------------------------------------
>
>                 Key: HADOOP-5859
>                 URL: https://issues.apache.org/jira/browse/HADOOP-5859
>             Project: Hadoop Common
>          Issue Type: Bug
>            Reporter: Kan Zhang
>            Assignee: Kan Zhang
>             Fix For: 0.21.0
>
>         Attachments: 5859-21.patch, 5859-22.patch, 5859-26.patch, 5859-33.patch, 5859-35.patch,
>                      5859-36.patch, 5859-38.patch, 5859-4.patch, 5859-40.patch, 5859-41.patch,
>                      5859-5.patch, 5859-8.patch
>
>
> This JIRA fixes the following warnings:
> SWL	org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.closeInternal() calls Thread.sleep() with a lock held
> TLW	wait() with two locks held in org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.flushInternal()
> TLW	wait() with two locks held in org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.flushInternal()
> TLW	wait() with two locks held in org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.writeChunk(byte[], int, int, byte[])
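
For readers unfamiliar with these warning classes, here is a minimal sketch of why FindBugs
flags them. Thread.sleep() keeps every monitor the thread holds, while Object.wait() releases
only the monitor it is called on (so with two locks held, the other one stays locked). The
class LockHeldDemo and all of its method names are hypothetical and are not taken from
DFSClient or from the attached patches.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;

// Hypothetical demo class, not Hadoop code.
class LockHeldDemo {
    private final Object lock = new Object();

    /** The SWL pattern: the monitor stays held for the whole sleep. */
    void pauseWithSleep(long millis, CountDownLatch entered) throws InterruptedException {
        synchronized (lock) {
            entered.countDown();
            Thread.sleep(millis);
        }
    }

    /** Alternative: wait() releases this monitor while paused (real code would loop on a condition). */
    void pauseWithWait(long millis, CountDownLatch entered) throws InterruptedException {
        synchronized (lock) {
            entered.countDown();
            lock.wait(millis);
        }
    }

    /** True if a second thread gets the monitor within probeMillis while the first is paused. */
    static boolean otherThreadCanEnter(boolean useWait, long pauseMillis, long probeMillis)
            throws InterruptedException {
        LockHeldDemo demo = new LockHeldDemo();
        CountDownLatch entered = new CountDownLatch(1);
        CountDownLatch acquired = new CountDownLatch(1);

        Thread pauser = new Thread(() -> {
            try {
                if (useWait) {
                    demo.pauseWithWait(pauseMillis, entered);
                } else {
                    demo.pauseWithSleep(pauseMillis, entered);
                }
            } catch (InterruptedException e) {
                // Interrupted below to end the pause early; nothing else to do.
            }
        });
        Thread probe = new Thread(() -> {
            synchronized (demo.lock) {
                acquired.countDown();
            }
        });

        pauser.start();
        entered.await();            // pauser is now inside the synchronized block
        probe.start();
        boolean result = acquired.await(probeMillis, TimeUnit.MILLISECONDS);
        pauser.interrupt();         // cut the pause short so the demo finishes quickly
        pauser.join();
        probe.join();
        return result;
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println("sleep: other thread can enter = " + otherThreadCanEnter(false, 2000, 200));
        System.out.println("wait:  other thread can enter = " + otherThreadCanEnter(true, 2000, 200));
    }
}
```

On an unloaded machine one would expect the sleep case to report that the second thread could
not enter and the wait case to report that it could. The sketch only shows the locking
behavior; it does not make a call that hangs inside the lock return any sooner.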

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

