hadoop-hdfs-issues mailing list archives

From "Eli Collins (JIRA)" <j...@apache.org>
Subject [jira] [Updated] (HDFS-3883) DFSClient NPE due to missing block when opening a file
Date Sat, 01 Sep 2012 01:53:07 GMT

     [ https://issues.apache.org/jira/browse/HDFS-3883?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Eli Collins updated HDFS-3883:
------------------------------

    Description: 
I saw the following NPE on a branch-1 client that accessed a block missing from the
volume map, probably because the block had already been deleted (otherwise the primary
would have a block file). It looks like a create is racing with a delete.

In DFSClient.java, updateBlockInfo dereferences the returned block without a null check:
{code}
Block newBlock = primary.getBlockInfo(last.getBlock());
long newBlockSize = newBlock.getNumBytes();    // <-- NPE here when newBlock is null
{code}
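A defensive fix would be to fail with an IOException instead of letting the dereference blow up. A minimal sketch of that guard, using a hypothetical simplified Block class (not the actual HDFS type; only getNumBytes() matters here):

```java
import java.io.IOException;

class BlockInfoGuard {
    // Hypothetical stand-in for the HDFS Block class.
    static class Block {
        private final long numBytes;
        Block(long numBytes) { this.numBytes = numBytes; }
        long getNumBytes() { return numBytes; }
    }

    // Sketch of the guard: surface "block not found" as an IOException
    // instead of an NPE when the primary no longer has the block.
    static long safeBlockSize(Block newBlock) throws IOException {
        if (newBlock == null) {
            throw new IOException("Block not found on the primary datanode");
        }
        return newBlock.getNumBytes();
    }

    public static void main(String[] args) throws IOException {
        System.out.println(safeBlockSize(new Block(42)));  // prints 42
        try {
            safeBlockSize(null);
        } catch (IOException e) {
            System.out.println("caught: " + e.getMessage());
        }
    }
}
```

An IOException here tells the caller the block is gone, which a retry or error path can handle, whereas the NPE gives no hint of the cause.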

From getBlockInfo to getStoredBlock..
{code}
  public synchronized Block getStoredBlock(long blkid) throws IOException {
    File blockfile = findBlockFile(blkid);
    if (blockfile == null) {
      return null;
    }
{code}

Digging into findBlockFile..
{code}
  public synchronized File findBlockFile(long blockId) {
    final Block b = new Block(blockId);
    File blockfile = null;
    ActiveFile activefile = ongoingCreates.get(b);
    if (activefile != null) {
      blockfile = activefile.file;
    }
    if (blockfile == null) {
      blockfile = getFile(b);
    }
    if (blockfile == null) {
      if (DataNode.LOG.isDebugEnabled()) {
        DataNode.LOG.debug("ongoingCreates=" + ongoingCreates);
        DataNode.LOG.debug("volumeMap=" + volumeMap);
      }
    }
    return blockfile;
  }
{code}
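Note that findBlockFile probes both maps with a fresh `new Block(blockId)`, which only works because Block's equals and hashCode are keyed on the block id. A minimal sketch with a hypothetical simplified Block (not the real HDFS class) showing why the probe finds a live block and misses a deleted one:

```java
import java.util.HashMap;
import java.util.Map;

class BlockLookupSketch {
    // Simplified Block: equality by blockId only, mirroring why
    // new Block(blockId) can be used as a map probe in findBlockFile.
    static class Block {
        final long blockId;
        Block(long blockId) { this.blockId = blockId; }
        @Override public boolean equals(Object o) {
            return o instanceof Block && ((Block) o).blockId == blockId;
        }
        @Override public int hashCode() { return Long.hashCode(blockId); }
    }

    public static void main(String[] args) {
        Map<Block, String> volumeMap = new HashMap<>();
        volumeMap.put(new Block(7), "/data/blk_7");
        // A fresh Block with the same id finds the entry...
        System.out.println(volumeMap.get(new Block(7)));   // prints /data/blk_7
        // ...while a deleted (never-present) id comes back null, which is
        // exactly the null that propagates up to the client in this bug.
        System.out.println(volumeMap.get(new Block(8)));   // prints null
    }
}
```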

Into getFile..
{code}
  public synchronized File getFile(Block b) {
    DatanodeBlockInfo info = volumeMap.get(b);
    if (info != null) {
      return info.getFile();
    }
    return null;
  }
{code}
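Putting the chain together: when the block is in neither ongoingCreates nor volumeMap, each layer returns null rather than throwing, and the dereference only fails back in the client. A minimal sketch of that propagation, with hypothetical simplified signatures (Long-keyed maps instead of the real HDFS types):

```java
import java.io.File;
import java.util.HashMap;
import java.util.Map;

class NullPropagationSketch {
    static final Map<Long, File> ongoingCreates = new HashMap<>();
    static final Map<Long, File> volumeMap = new HashMap<>();

    // Mirrors findBlockFile: null when neither map has the block.
    static File findBlockFile(long blockId) {
        File f = ongoingCreates.get(blockId);
        return (f != null) ? f : volumeMap.get(blockId);
    }

    // Mirrors getStoredBlock: silently converts "no file" into null.
    static Long getStoredBlockSize(long blockId) {
        File f = findBlockFile(blockId);
        return (f == null) ? null : f.length();
    }

    public static void main(String[] args) {
        Long size = getStoredBlockSize(123L);  // block already deleted
        System.out.println(size);              // prints null
        try {
            long n = size;  // unboxing null: the client-side equivalent
                            // of newBlock.getNumBytes()
        } catch (NullPointerException e) {
            System.out.println("NPE, as seen in DFSClient.updateBlockInfo");
        }
    }
}
```

The sketch shows why the stack trace points at the client: every datanode-side layer treats "missing" as a legal null return, so the first dereference is the client's.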

  was:
I saw the following NPE for a client on branch-1 that looks like it accessed a block not in
the volume map, probably because the block was already deleted (otherwise the primary should
have a block file). We should throw an IOE in this case.

DFSClient.java..
{code}
Block newBlock = primary.getBlockInfo(last.getBlock());
long newBlockSize = newBlock.getNumBytes();    <--------
{code}

From getBlockInfo to getStoredBlock..
{code}
  public synchronized Block getStoredBlock(long blkid) throws IOException {
    File blockfile = findBlockFile(blkid);
    if (blockfile == null) {
      return null;
    }
{code}

Digging into findBlockFile..
{code}
  public synchronized File findBlockFile(long blockId) {
    final Block b = new Block(blockId);
    File blockfile = null;
    ActiveFile activefile = ongoingCreates.get(b);
    if (activefile != null) {
      blockfile = activefile.file;
    }
    if (blockfile == null) {
      blockfile = getFile(b);
    }
    if (blockfile == null) {
      if (DataNode.LOG.isDebugEnabled()) {
        DataNode.LOG.debug("ongoingCreates=" + ongoingCreates);
        DataNode.LOG.debug("volumeMap=" + volumeMap);
      }
    }
    return blockfile;
{code}

Into getFile..
{code}
  public synchronized File getFile(Block b) {
    DatanodeBlockInfo info = volumeMap.get(b);
    if (info != null) {
      return info.getFile();
    }
    return null;
{code}

    
> DFSClient NPE due to missing block when opening a file
> ------------------------------------------------------
>
>                 Key: HDFS-3883
>                 URL: https://issues.apache.org/jira/browse/HDFS-3883
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: hdfs client
>    Affects Versions: 1.0.0
>            Reporter: Eli Collins
>            Assignee: Eli Collins
>            Priority: Minor
>
--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
