From "Tsz Wo (Nicholas), SZE (JIRA)" <j...@apache.org>
Subject [jira] Commented: (HDFS-668) TestFileAppend3#TC7 sometimes hangs
Date Fri, 16 Oct 2009 19:01:32 GMT

    [ https://issues.apache.org/jira/browse/HDFS-668?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12766660#action_12766660 ]

Tsz Wo (Nicholas), SZE commented on HDFS-668:
---------------------------------------------

- In BlocksMap,
{code}
+  /**
+   * Update the old block with the new block.
+   * 
+   * The new block has a newer generation stamp so it requires remove
+   * the old entry first and reinsert the new entry
+   * 
+   * @return the removed stored block in the map
+   */
+  BlockInfo updateBlock(Block oldBlock, Block newBlock) {
+    BlockInfo blockInfo = map.remove(oldBlock);
+    blockInfo.setGenerationStamp(newBlock.getGenerationStamp());
+    blockInfo.setNumBytes(newBlock.getNumBytes());
+    map.put(blockInfo, blockInfo);
+    return blockInfo;
+  }
{code}
-* It is better to check oldBlock.getBlockId() == newBlock.getBlockId(), or to change updateBlock(..) to updateBlock(Block b, long newGenerationStamp, long newLength); see the sketch after this list.
-* The stored block is added back, so the javadoc "@return the removed stored block in the map" sounds incorrect.

- In FSNamesystem,
{code}
@@ -1399,6 +1399,9 @@
       //
       for (BlockInfo block: v.getBlocks()) {
         if (!blockManager.checkMinReplication(block)) {
+          NameNode.stateChangeLog.info("BLOCK* NameSystem.checkFileProgress: "
+              + "block " + block + " has not reached minimal replication "
+              + blockManager.minReplication);
           return false;
         }
       }
@@ -1408,6 +1411,9 @@
       //
       BlockInfo b = v.getPenultimateBlock();
       if (b != null && !blockManager.checkMinReplication(b)) {
+        NameNode.stateChangeLog.info("BLOCK* NameSystem.checkFileProgress: "
+            + "block " + b + " has not reached minimal replication "
+            + blockManager.minReplication);
         return false;
       }
     }
{code}
-* These two log messages do not look like "state changes". Should we use FSNamesystem.LOG instead? A sketch follows.
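
For illustration, the same message routed through FSNamesystem.LOG, the class's existing logger (a sketch of the suggestion only):
{code}
// Sketch of the suggested change: log via FSNamesystem.LOG, since nothing
// in checkFileProgress actually modifies namespace state.
FSNamesystem.LOG.info("BLOCK* NameSystem.checkFileProgress: "
    + "block " + block + " has not reached minimal replication "
    + blockManager.minReplication);
{code}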

- In FSNamesystem,
{code}
-    final BlockInfo oldblockinfo = pendingFile.getLastBlock();
+    final BlockInfoUnderConstruction blockinfo = pendingFile.getLastBlock();
{code}
-* Could blockinfo be null?
-* Is it the case that the last block must be a BlockInfoUnderConstruction? I am afraid that an IOException caused by a ClassCastException may be thrown by getLastBlock(). The existing code shown below looks incorrect: it first suppresses unchecked warnings and then converts the ClassCastException to an IOException. This makes it very hard to use. How can the caller handle such an IOException? An alternative is sketched after the quoted code.
{code}
//INodeFile
  <T extends BlockInfo> T getLastBlock() throws IOException {
    if (blocks == null || blocks.length == 0)
      return null;
    T returnBlock = null;
    try {
      @SuppressWarnings("unchecked")  // ClassCastException is caught below
      T tBlock = (T)blocks[blocks.length - 1];
      returnBlock = tBlock;
    } catch(ClassCastException cce) {
      throw new IOException("Unexpected last block type: " 
          + blocks[blocks.length - 1].getClass().getSimpleName());
    }
    return returnBlock;
  }
{code}
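
For illustration, one possible alternative (a sketch under the assumption that callers know which subtype they expect; this is not the committed fix): return the raw BlockInfo and make the type check explicit at the call site:
{code}
// INodeFile: return the raw BlockInfo; no generics, no hidden cast.
BlockInfo getLastBlock() {
  return (blocks == null || blocks.length == 0)? null
      : blocks[blocks.length - 1];
}

// Call site in FSNamesystem: an explicit instanceof check gives a clear
// error message without the ClassCastException-to-IOException round trip.
BlockInfo last = pendingFile.getLastBlock();
if (!(last instanceof BlockInfoUnderConstruction)) {
  throw new IOException("Unexpected last block type: "
      + (last == null? "null" : last.getClass().getSimpleName()));
}
BlockInfoUnderConstruction blockinfo = (BlockInfoUnderConstruction)last;
{code}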

> TestFileAppend3#TC7 sometimes hangs
> -----------------------------------
>
>                 Key: HDFS-668
>                 URL: https://issues.apache.org/jira/browse/HDFS-668
>             Project: Hadoop HDFS
>          Issue Type: Sub-task
>    Affects Versions: 0.21.0
>            Reporter: Hairong Kuang
>            Assignee: Hairong Kuang
>             Fix For: 0.21.0
>
>         Attachments: hdfs-668.patch, loop.patch
>
>
> TestFileAppend3 hangs because it fails to close the file. The following log snippet shows the cause of the problem:
>     [junit] 2009-10-01 07:00:00,719 WARN  hdfs.DFSClient (DFSClient.java:setupPipelineForAppendOrRecovery(3004)) - Error Recovery for block blk_-4098350497078465335_1007 in pipeline 127.0.0.1:58375, 127.0.0.1:36982: bad datanode 127.0.0.1:36982
>     [junit] 2009-10-01 07:00:00,721 INFO  datanode.DataNode (DataXceiver.java:opWriteBlock(224)) - Receiving block blk_-4098350497078465335_1007 src: /127.0.0.1:40252 dest: /127.0.0.1:58375
>     [junit] 2009-10-01 07:00:00,721 INFO  datanode.DataNode (FSDataset.java:recoverClose(1248)) - Recover failed close blk_-4098350497078465335_1007
>     [junit] 2009-10-01 07:00:00,723 INFO  datanode.DataNode (DataXceiver.java:opWriteBlock(369)) - Received block blk_-4098350497078465335_1008 src: /127.0.0.1:40252 dest: /127.0.0.1:58375 of size 65536
>     [junit] 2009-10-01 07:00:00,724 INFO  hdfs.StateChange (BlockManager.java:addStoredBlock(1006)) - BLOCK* NameSystem.addStoredBlock: addStoredBlock request received for blk_-4098350497078465335_1008 on 127.0.0.1:58375 size 65536 But it does not belong to any file.
>     [junit] 2009-10-01 07:00:00,724 INFO  namenode.FSNamesystem (FSNamesystem.java:updatePipeline(3946)) - updatePipeline(block=blk_-4098350497078465335_1007, newGenerationStamp=1008, newLength=65536, newNodes=[127.0.0.1:58375], clientName=DFSClient_995688145)
>

