hadoop-common-issues mailing list archives

From "Chris Nauroth (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HADOOP-12678) Handle empty rename pending metadata file during atomic rename in redo path
Date Wed, 06 Jan 2016 19:33:39 GMT

    [ https://issues.apache.org/jira/browse/HADOOP-12678?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15086136#comment-15086136 ]

Chris Nauroth commented on HADOOP-12678:
----------------------------------------

Thank you, [~madhuch-ms].  The error handling in {{deleteRenamePendingFile}} still needs some
work.  Here is the code from your v005 patch.

{code}
      } catch (IOException e) {
        // If the rename metadata was not found then somebody probably
        // raced with us and finished the delete first
        Throwable t = e.getCause();
        if (t != null && t instanceof StorageException) {
          StorageException se = (StorageException) t;
          if (se.getErrorCode().equals(("BlobNotFound"))) {
            LOG.warn("rename pending file " + redoFile + " is already deleted");
          } else {
            throw e;
          }
        }
      }
{code}

If there is a general {{IOException}} not caused by an Azure {{StorageException}}, then this
logic would stifle the exception without either throwing it or logging it.  An example of
this could be loss of network connectivity to the Azure Storage backend, which Java would
report as an {{IOException}} with no cause and a message describing the network error.  We'd
want to make sure errors like this propagate to the caller, so please stick with the code
I gave in my last comment:

{code}
      } catch (IOException e) {
        Throwable cause = e.getCause();
        if (cause != null && cause instanceof StorageException &&
            "BlobNotFound".equals(((StorageException)cause).getErrorCode())) {
          LOG.warn("rename pending file " + redoFile + " is already deleted");
        } else {
          throw e;
        }
      }
{code}

This ensures that only the BlobNotFound error would get swallowed, and any other {{IOException}},
whether or not its root cause is in Azure Storage, would propagate to the caller.  It also
clarifies that there are really only two cases for this code: swallow BlobNotFound, else rethrow.
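
As a rough illustration of the two cases described above, here is a minimal, self-contained sketch. It is not the actual patch: the nested {{StorageException}} is a simplified stand-in for the Azure SDK class so the example compiles on its own, and {{DeleteAction}}, {{deleteIgnoringNotFound}}, {{propagates}} and the three sample failures are hypothetical names introduced only for this example.

```java
import java.io.IOException;

public class RenamePendingCleanup {

  /** Simplified stand-in for the Azure SDK's StorageException (assumption). */
  public static class StorageException extends Exception {
    private final String errorCode;

    public StorageException(String errorCode) {
      super(errorCode);
      this.errorCode = errorCode;
    }

    public String getErrorCode() {
      return errorCode;
    }
  }

  /** Hypothetical delete operation that may fail with an IOException. */
  public interface DeleteAction {
    void run() throws IOException;
  }

  /**
   * Returns true if the delete ran, false if the blob was already gone.
   * Only BlobNotFound is swallowed; every other IOException is rethrown.
   */
  public static boolean deleteIgnoringNotFound(DeleteAction action)
      throws IOException {
    try {
      action.run();
      return true;
    } catch (IOException e) {
      Throwable cause = e.getCause();
      if (cause instanceof StorageException
          && "BlobNotFound".equals(((StorageException) cause).getErrorCode())) {
        return false; // somebody else finished the delete first
      }
      throw e; // no cause, or any other storage error: let the caller see it
    }
  }

  /** Helper: true if the failure reaches the caller. */
  public static boolean propagates(DeleteAction action) {
    try {
      deleteIgnoringNotFound(action);
      return false;
    } catch (IOException e) {
      return true;
    }
  }

  /** IOException caused by a BlobNotFound storage error. */
  public static DeleteAction blobNotFound() {
    return () -> {
      throw new IOException("delete failed", new StorageException("BlobNotFound"));
    };
  }

  /** IOException with no cause, e.g. lost network connectivity. */
  public static DeleteAction networkFailure() {
    return () -> {
      throw new IOException("connection reset");
    };
  }

  /** IOException caused by some other storage error code. */
  public static DeleteAction otherStorageError() {
    return () -> {
      throw new IOException("delete failed", new StorageException("ServerBusy"));
    };
  }

  public static void main(String[] args) {
    System.out.println("BlobNotFound propagates: " + propagates(blobNotFound()));
    System.out.println("network failure propagates: " + propagates(networkFailure()));
    System.out.println("other storage error propagates: " + propagates(otherStorageError()));
  }
}
```

The point of the single {{if}}/{{else}} shape is that the cause-less {{IOException}} falls into the rethrow branch. In the nested version from the v005 patch, that case matches neither branch and is silently dropped.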

The JavaDoc warnings from the last pre-commit run don't require any action.  These are pre-existing
warnings unrelated to this patch.  The patch is shifting the line numbers and therefore making
it appear that new warnings were introduced.

> Handle empty rename pending metadata file during atomic rename in redo path
> ---------------------------------------------------------------------------
>
>                 Key: HADOOP-12678
>                 URL: https://issues.apache.org/jira/browse/HADOOP-12678
>             Project: Hadoop Common
>          Issue Type: Bug
>          Components: fs/azure
>            Reporter: madhumita chakraborty
>            Assignee: madhumita chakraborty
>            Priority: Critical
>         Attachments: HADOOP-12678.001.patch, HADOOP-12678.002.patch, HADOOP-12678.003.patch, HADOOP-12678.004.patch, HADOOP-12678.005.patch
>
>
> Handle empty rename pending metadata file during atomic rename in redo path
> During atomic rename we create a metadata file for the rename (-renamePending.json). We create it in 2 steps:
> 1. We create an empty blob corresponding to the .json file in its real location.
> 2. We create a scratch file to which we write the contents of the rename pending metadata, which is then copied over into the blob described in 1.
> If a process crash occurs after step 1 and before step 2 is complete, we will be left with a zero-size blob corresponding to the pending rename metadata file.
> This kind of scenario can happen in the /hbase/.tmp folder because it is considered a candidate folder for atomic rename. When HMaster starts up, it executes listStatus on the .tmp folder to clean up pending data. At this stage, due to the lazy pending rename completion process, we look for these json files. On seeing an empty file, the process simply throws a fatal exception, assuming something went wrong.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
