hadoop-hdfs-issues mailing list archives

From "Ravi Prakash (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HDFS-2011) Removal and restoration of storage directories on checkpointing failure doesn't work properly
Date Fri, 03 Jun 2011 14:31:47 GMT

    [ https://issues.apache.org/jira/browse/HDFS-2011?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13043370#comment-13043370 ]

Ravi Prakash commented on HDFS-2011:
------------------------------------

Hi Matt,

Thanks a ton for your review! I learned a lot from your detailed explanations. :) I followed
all of your suggestions. A couple of things to note:

1. To be able to throw exceptions like you suggested, I had to make my two functions individual
JUnit tests. I hope that is fine. (Earlier they were called from testCheckpoint(), which is
declared as throws IOException.)
2. Thanks for the tip to use toURI. :) However, when I used new Path(System.getProperty("test.build.data","/tmp"),
"storageDirToCheck").toUri(), the test failed saying 
{noformat} 
Testcase: testSetCheckpointTimeInStorageHandlesIOException took 0.077 sec
        Caused an ERROR
Undefined scheme for /home/raviprak/Code/hadoop/hadoop-hdfs/build/test/data/storageDirToCheck
java.io.IOException: Undefined scheme for /home/raviprak/Code/hadoop/hadoop-hdfs/build/test/data/storageDirToCheck
        at org.apache.hadoop.hdfs.server.namenode.NNStorage.checkSchemeConsistency(NNStorage.java:348)
        at org.apache.hadoop.hdfs.server.namenode.NNStorage.setStorageDirectories(NNStorage.java:306)
        at org.apache.hadoop.hdfs.server.namenode.TestCheckpoint.testSetCheckpointTimeInStorageHandlesIOException(TestCheckpoint.java:179)
{noformat} 
So I changed it to use new File(...).toURI(). I hope that is fine too. 
3. In the comment, I meant to convey that the block of code was for when writeCheckpointTime
incurred an IOException. I've removed the comment, since the comment above it already says
the same thing. Sorry for the ambiguity.
4. When I separated the bufCurrent and bufReady cases, the test failed saying 
{noformat} 
Testcase: testEditLogFileOutputStreamCloses took 0.042 sec
        Caused an ERROR
Bad file descriptor
java.io.IOException: Bad file descriptor
        at sun.nio.ch.FileChannelImpl.position0(Native Method)
        at sun.nio.ch.FileChannelImpl.position(FileChannelImpl.java:284)
        at org.apache.hadoop.hdfs.server.namenode.EditLogFileOutputStream.close(EditLogFileOutputStream.java:141)
        at org.apache.hadoop.hdfs.server.namenode.TestCheckpoint.testEditLogFileOutputStreamCloses(TestCheckpoint.java:154)
{noformat} 
This was because these lines (more specifically, the first) were still being executed.
{noformat} 
    // remove the last INVALID marker from transaction log.
    fc.truncate(fc.position());
    fp.close();
{noformat} 
I've let it remain the same. 
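
To illustrate point 2, here is a minimal sketch using only the JDK, not Hadoop's Path class. It assumes the URI from the scheme-less Path behaved like a bare path URI with no scheme, which is what the "Undefined scheme" error suggests. The class and method names (SchemeDemo, bareUri, fileUri) are made up for this example.

```java
import java.io.File;
import java.net.URI;
import java.net.URISyntaxException;

public class SchemeDemo {
    // Builds a URI that carries only a path: no scheme, no host.
    // This stands in for a scheme-less path (an assumption, see above).
    public static URI bareUri(String path) {
        try {
            return new URI(null, null, path, null);
        } catch (URISyntaxException e) {
            throw new IllegalArgumentException(e);
        }
    }

    // java.io.File.toURI() always produces an absolute URI with the "file" scheme.
    public static URI fileUri(String path) {
        return new File(path).toURI();
    }

    public static void main(String[] args) {
        String dir = new File(System.getProperty("test.build.data", "/tmp"),
                              "storageDirToCheck").getPath();
        System.out.println("bare path scheme: " + bareUri(dir).getScheme());
        System.out.println("File.toURI scheme: " + fileUri(dir).getScheme());
    }
}
```

A check that rejects URIs without a scheme would fail on the first form and accept the second, which matches what I saw in the test.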

> Removal and restoration of storage directories on checkpointing failure doesn't work properly
> ---------------------------------------------------------------------------------------------
>
>                 Key: HDFS-2011
>                 URL: https://issues.apache.org/jira/browse/HDFS-2011
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: name-node
>    Affects Versions: 0.23.0
>            Reporter: Ravi Prakash
>            Assignee: Ravi Prakash
>         Attachments: HDFS-2011.3.patch, HDFS-2011.4.patch, HDFS-2011.patch, HDFS-2011.patch, HDFS-2011.patch
>
>
> Removal and restoration of storage directories on checkpointing failure doesn't work
> properly. Sometimes it throws a NullPointerException and sometimes it doesn't take off a failed
> storage directory.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira
