hadoop-common-dev mailing list archives

From "Raghu Angadi (JIRA)" <j...@apache.org>
Subject [jira] Commented: (HADOOP-4679) Datanode prints tons of log messages: Waiting for threadgroup to exit, active theads is XX
Date Tue, 25 Nov 2008 19:55:44 GMT

    [ https://issues.apache.org/jira/browse/HADOOP-4679?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12650704#action_12650704
] 

Raghu Angadi commented on HADOOP-4679:
--------------------------------------

After talking to Hairong:

  # DataXceiverServer should handle SocketTimeoutException. Right now an idle DN prints an exception
every 10 seconds.
  # The timeout for the server socket could be lower, so that the test will finish faster.
  # The unit test need not create files in a tight loop.
  # immedateShutdown is not really necessary. The way shutdown() works, it should only be
called from the offerService() thread. I think the javadoc should state this explicitly.
  # The reason the log was printed in a tight infinite loop (without sleep) is that the thread interrupts
itself before calling sleep(), so sleep() returns immediately!
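
To illustrate the first and last points, here is a minimal standalone sketch. It is not the DataNode code: the class and method names are made up for illustration, and only standard JDK calls are used. The first method shows an accept loop that treats SocketTimeoutException as the normal idle case instead of logging it. The second shows that Thread.sleep() throws right away when the calling thread has already interrupted itself, which is how a "wait and retry" loop turns into a busy loop.

```java
import java.io.IOException;
import java.net.ServerSocket;
import java.net.Socket;
import java.net.SocketTimeoutException;

// Hypothetical demo class; not part of Hadoop.
public class IdleServerAndInterruptDemo {

    /** Accept loop that swallows idle timeouts quietly; returns how many occurred. */
    static int countIdleTimeouts(int timeoutMillis, int rounds) throws IOException {
        int timeouts = 0;
        try (ServerSocket server = new ServerSocket(0)) {   // any free port
            server.setSoTimeout(timeoutMillis);
            for (int i = 0; i < rounds; i++) {
                try (Socket s = server.accept()) {
                    // a real server would hand the socket to a worker thread here
                } catch (SocketTimeoutException e) {
                    timeouts++;   // idle period: expected, so no stack trace in the log
                }
            }
        }
        return timeouts;
    }

    /** Milliseconds actually spent in sleep() when the interrupt flag is already set. */
    static long sleepAfterSelfInterrupt(long sleepMillis) {
        Thread.currentThread().interrupt();   // sets this thread's interrupt flag
        long start = System.nanoTime();
        try {
            Thread.sleep(sleepMillis);        // throws immediately because the flag is set
        } catch (InterruptedException e) {
            // the interrupt flag is cleared when the exception is thrown
        }
        return (System.nanoTime() - start) / 1_000_000;
    }

    public static void main(String[] args) throws IOException {
        System.out.println("idle timeouts handled quietly: " + countIdleTimeouts(100, 3));
        System.out.println("sleep(1000) returned after about "
                + sleepAfterSelfInterrupt(1000) + " ms");
    }
}
```

If a loop re-sets the interrupt flag on every iteration before sleeping, each sleep() call returns at once, so the log statement in that loop runs back to back.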

I think this should go into 0.18. No one likes disks filling up with these log messages.
  

> Datanode prints tons of log messages: Waiting for threadgroup to exit, active theads
is XX
> ------------------------------------------------------------------------------------------
>
>                 Key: HADOOP-4679
>                 URL: https://issues.apache.org/jira/browse/HADOOP-4679
>             Project: Hadoop Core
>          Issue Type: Bug
>          Components: dfs
>            Reporter: Hairong Kuang
>            Assignee: Hairong Kuang
>         Attachments: diskError.patch, diskError1.patch
>
>
> When a data receiver thread sees a disk error, it immediately calls shutdown to shut down
the DataNode. But the shutdown method does not return until all data receiver threads have exited, which
will never happen because the calling thread is itself a data receiver. Therefore the DataNode gets into a deadlock/livelock state, emitting tons
of log messages: Waiting for threadgroup to exit, active threads is XX.
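
The self-wait described above can be reproduced in a few lines. This is only a sketch with hypothetical names, not the DataNode implementation: a thread that belongs to a ThreadGroup asks for the group's active count, and since the caller is itself alive, the count it sees is at least one. A loop of the form "wait until the active count is zero" therefore cannot finish when it runs on a member of the group.

```java
// Hypothetical demo class; not part of Hadoop.
public class SelfWaitDemo {

    /** Starts one worker in a thread group and returns the active count that worker observes. */
    static int activeCountSeenByWorker() throws InterruptedException {
        ThreadGroup group = new ThreadGroup("receivers");   // illustrative group name
        int[] seen = new int[1];
        Thread worker = new Thread(group, () -> {
            // The worker is alive while this runs, so it counts itself.
            seen[0] = group.activeCount();
        }, "receiver-0");
        worker.start();
        worker.join();   // join() makes the worker's write to seen[0] visible here
        return seen[0];
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println("active threads seen from inside the group: "
                + activeCountSeenByWorker());
    }
}
```

Calling the wait from a thread outside the group, or excluding the caller from the count, avoids the problem; which of these the attached patches use is not stated in this message.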

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

