hadoop-common-dev mailing list archives

From "dhruba borthakur (JIRA)" <j...@apache.org>
Subject [jira] Commented: (HADOOP-3998) Got an exception from ClientFinalizer when the JT is terminated
Date Tue, 24 Feb 2009 21:51:02 GMT

    [ https://issues.apache.org/jira/browse/HADOOP-3998?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12676425#action_12676425 ]

dhruba borthakur commented on HADOOP-3998:
------------------------------------------

I think the client can get a RemoteException even if the primary is alive. The client makes
an RPC to the primary. Now, if the primary is unable to contact any of the secondary datanode(s),
it throws a RemoteException to the client. In this case, the primary is as good as dead
because it has been partitioned off from the secondary datanodes. The client should then retry
recoverBlock with the remaining datanode(s).
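
A minimal sketch of that retry loop, with hypothetical proxy and type names rather than the
real DFSClient code: the client tries each candidate datanode as the recovery primary and, on
a RemoteException, drops that node and retries with the remaining ones.

{noformat}
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Illustrative sketch only; DatanodeProxy and its recoverBlock signature are
// hypothetical stand-ins, not the real HDFS ClientDatanodeProtocol.
interface DatanodeProxy {
  void recoverBlock(String blockId, List<String> targets) throws IOException;
}

public class RecoverBlockRetrySketch {
  // Hypothetical helper: maps a datanode name to an RPC proxy for it.
  private final Map<String, DatanodeProxy> proxies;

  public RecoverBlockRetrySketch(Map<String, DatanodeProxy> proxies) {
    this.proxies = proxies;
  }

  // Try each candidate datanode as the recovery primary in turn. If the
  // current primary fails the RPC (e.g. a RemoteException because it cannot
  // reach the secondaries), treat it as dead and retry with the rest.
  public boolean recoverBlockWithRetry(String blockId, List<String> datanodes) {
    List<String> remaining = new ArrayList<String>(datanodes);
    while (!remaining.isEmpty()) {
      String primary = remaining.get(0);
      try {
        proxies.get(primary).recoverBlock(blockId, remaining);
        return true;  // recovery succeeded on this primary
      } catch (IOException e) {  // RemoteException is an IOException subclass
        remaining.remove(primary);  // primary is as good as dead; drop it
      }
    }
    return false;  // every candidate failed
  }
}
{noformat}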

> Dhruba, would it be better to discuss lease recovery at HADOOP-5311? This jira seems
> to be about client close order.

I agree. However, this fix cannot be separated into two parts; otherwise, unit tests will
(sometimes) fail.

> Got an exception from ClientFinalizer when the JT is terminated
> ---------------------------------------------------------------
>
>                 Key: HADOOP-3998
>                 URL: https://issues.apache.org/jira/browse/HADOOP-3998
>             Project: Hadoop Core
>          Issue Type: Bug
>          Components: dfs
>    Affects Versions: 0.19.0
>            Reporter: Amar Kamat
>            Assignee: dhruba borthakur
>             Fix For: 0.19.2
>
>         Attachments: closeAll.patch, closeAll.patch
>
>
> This happens when we terminate the JT using _control-C_. It throws the following exception:
> {noformat}
> Exception closing file my-file
> java.io.IOException: Filesystem closed
>         at org.apache.hadoop.hdfs.DFSClient.checkOpen(DFSClient.java:193)
>         at org.apache.hadoop.hdfs.DFSClient.access$700(DFSClient.java:64)
>         at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.closeInternal(DFSClient.java:2868)
>         at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.close(DFSClient.java:2837)
>         at org.apache.hadoop.hdfs.DFSClient$LeaseChecker.close(DFSClient.java:808)
>         at org.apache.hadoop.hdfs.DFSClient.close(DFSClient.java:205)
>         at org.apache.hadoop.hdfs.DistributedFileSystem.close(DistributedFileSystem.java:253)
>         at org.apache.hadoop.fs.FileSystem$Cache.closeAll(FileSystem.java:1367)
>         at org.apache.hadoop.fs.FileSystem.closeAll(FileSystem.java:234)
>         at org.apache.hadoop.fs.FileSystem$ClientFinalizer.run(FileSystem.java:219)
> {noformat}
> Note that _my-file_ is some file used by the JT.
> Also, if some file renaming was done, the exception states that the earlier
> file does not exist. I am not sure whether this is an MR issue or a DFS issue. Opening this
> issue for investigation.
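
For context, the stack trace above comes down to close ordering: the ClientFinalizer shutdown
hook marks the DFSClient closed before the still-open output streams are closed, so each
stream's close() then fails the checkOpen() test with "Filesystem closed". A minimal sketch of
that pattern, with illustrative names rather than the real DFSClient code:

{noformat}
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

// Illustrative sketch of the close-ordering bug; names are hypothetical.
public class CloseOrderSketch {
  private volatile boolean clientRunning = true;
  private final List<OutStream> openStreams = new ArrayList<OutStream>();

  class OutStream {
    void close() throws IOException {
      // Mirrors DFSClient.checkOpen(): fails once the client is marked closed.
      if (!clientRunning) {
        throw new IOException("Filesystem closed");
      }
      // ... flush remaining packets and complete the file ...
    }
  }

  // Buggy order (what the shutdown hook effectively did): mark the client
  // closed first, then close the streams -- every close() now throws.
  void closeBuggy() throws IOException {
    clientRunning = false;
    for (OutStream s : openStreams) {
      s.close();
    }
  }

  // Fixed order: close (and flush) every outstanding stream while the client
  // is still open, and only then shut the client itself down.
  void closeFixed() throws IOException {
    for (OutStream s : openStreams) {
      s.close();
    }
    clientRunning = false;
  }
}
{noformat}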

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

