hadoop-common-dev mailing list archives

From "Doug Cutting (JIRA)" <j...@apache.org>
Subject [jira] Commented: (HADOOP-1486) ReplicationMonitor thread goes away
Date Mon, 25 Jun 2007 23:09:26 GMT

    [ https://issues.apache.org/jira/browse/HADOOP-1486?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12508015 ]

Doug Cutting commented on HADOOP-1486:
--------------------------------------

Like Hairong, I am not completely comfortable with this patch.  Wouldn't it be safer, in
the added catch clause, to set fsRunning to false so that the namenode exits when an unexpected
exception is encountered?  Also, shouldn't we explicitly try to fix the IllegalArgumentException
problem that caused this?
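
To make the suggestion concrete, here is a minimal, self-contained sketch of a monitor loop whose catch clause clears a running flag instead of letting the thread die silently. It is not the code from catchThrowable2.patch: the class names `MonitorSketch` and `Monitor`, the `AtomicBoolean` stand-in for fsRunning, and the `work` callback standing in for computeDatanodeWork() are all assumptions made for illustration.

```java
import java.util.concurrent.atomic.AtomicBoolean;

public class MonitorSketch {
    // Hypothetical stand-in for the namenode's fsRunning flag.
    static final AtomicBoolean fsRunning = new AtomicBoolean(true);

    /** Hypothetical monitor loop; `work` stands in for computeDatanodeWork(). */
    static class Monitor implements Runnable {
        private final Runnable work;

        Monitor(Runnable work) {
            this.work = work;
        }

        @Override
        public void run() {
            while (fsRunning.get()) {
                try {
                    work.run();
                    Thread.sleep(10); // short pause between passes (arbitrary value)
                } catch (InterruptedException ie) {
                    // Normal shutdown path: restore the interrupt flag and stop.
                    Thread.currentThread().interrupt();
                    return;
                } catch (Throwable t) {
                    // Unexpected failure: report it and stop the whole service
                    // rather than continuing without a working monitor.
                    System.err.println("Monitor hit unexpected exception, shutting down: " + t);
                    fsRunning.set(false);
                }
            }
        }
    }

    public static void main(String[] args) throws InterruptedException {
        Thread t = new Thread(new Monitor(() -> {
            throw new IllegalArgumentException("Unexpected non-existing data node");
        }));
        t.start();
        t.join();
        System.out.println("fsRunning = " + fsRunning.get()); // prints "fsRunning = false"
    }
}
```

In this sketch the failure is visible and fatal: the loop exits because the flag is cleared, and any other component polling the same flag would stop as well. Whether shutting down is preferable to logging and continuing is exactly the trade-off raised above.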

> ReplicationMonitor thread goes away 
> ------------------------------------
>
>                 Key: HADOOP-1486
>                 URL: https://issues.apache.org/jira/browse/HADOOP-1486
>             Project: Hadoop
>          Issue Type: Bug
>          Components: dfs
>    Affects Versions: 0.12.3
>            Reporter: Koji Noguchi
>            Assignee: dhruba borthakur
>            Priority: Blocker
>             Fix For: 0.14.0
>
>         Attachments: catchThrowable2.patch
>
>
> Saw many over/under replicated blocks in fsck output.
> .out file showed
> Exception in thread "org.apache.hadoop.dfs.FSNamesystem$ReplicationMonitor@2785982c" java.lang.IllegalArgumentException: Unexpected non-existing data node: /99.9.99.0/99.9.99.42:99999
>   at org.apache.hadoop.net.NetworkTopology.checkArgument(NetworkTopology.java:379)
>   at org.apache.hadoop.net.NetworkTopology.isOnSameRack(NetworkTopology.java:424)
>   at org.apache.hadoop.dfs.FSNamesystem$ReplicationTargetChooser.chooseTarget(FSNamesystem.java:2853)
>   at org.apache.hadoop.dfs.FSNamesystem$ReplicationTargetChooser.chooseTarget(FSNamesystem.java:2816)
>   at org.apache.hadoop.dfs.FSNamesystem.pendingTransfers(FSNamesystem.java:2658)
>   at org.apache.hadoop.dfs.FSNamesystem.computeDatanodeWork(FSNamesystem.java:1774)
>   at org.apache.hadoop.dfs.FSNamesystem$ReplicationMonitor.run(FSNamesystem.java:1723)
>   at java.lang.Thread.run(Thread.java:619)
> (same as HADOOP-1232)
> And, jstack showed no ReplicationMonitor thread.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

