hadoop-common-issues mailing list archives

From "Chris Nauroth (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HADOOP-12334) Change Mode Of Copy Operation of HBase WAL Archiving to bypass Azure Storage Throttling after retries
Date Fri, 16 Oct 2015 22:37:05 GMT

    [ https://issues.apache.org/jira/browse/HADOOP-12334?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14961464#comment-14961464 ]

Chris Nauroth commented on HADOOP-12334:

[~gouravk], thank you for taking a look.

I was hopeful that we might be able to hook in some failure simulation in a test, similar
to the {{TestAzureFileSystemErrorConditions#injectTransientError}} method.  I just spent some
time experimenting with this and trying to hook on to the various event listeners exposed
by the Azure Storage SDK.  Unfortunately, these don't appear to give us a deep enough hook
for the failure injection I had in mind.  I don't see a way to rewrite the whole HTTP
response to make it look like a server-busy error.  I think your manual testing will
have to suffice for this patch.

There is still one more piece of unresolved feedback on patch v06.  Please see my comment
from 21/Sep/15 about guaranteeing that the streams get closed.  After that is addressed, I
expect this patch will be ready to go.  Thanks!
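
For illustration only, here is a minimal sketch of one common way to guarantee that both streams get closed during a client-side copy, using try-with-resources.  This is not the code from the patch; the class name {{StreamCopySketch}} and the method {{copyAndClose}} are hypothetical, and plain {{java.io}} streams stand in for whatever blob streams the real code uses.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;

// Hypothetical helper, not part of the HADOOP-12334 patch.
final class StreamCopySketch {

    /**
     * Copies all bytes from in to out and returns the number of bytes copied.
     * Both streams are closed on every exit path, including when a read or
     * write throws.
     */
    static long copyAndClose(InputStream in, OutputStream out) throws IOException {
        // try-with-resources closes resources in reverse declaration order:
        // 'o' first, then 'i'. If close() itself throws after an earlier
        // failure, that exception is attached as a suppressed exception.
        try (InputStream i = in; OutputStream o = out) {
            byte[] buf = new byte[8192];
            long total = 0;
            int n;
            while ((n = i.read(buf)) != -1) {
                o.write(buf, 0, n);
                total += n;
            }
            return total;
        }
    }
}
```

A caller would pass the freshly opened source and destination streams straight into {{copyAndClose}} and would not need its own finally block.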

> Change Mode Of Copy Operation of HBase WAL Archiving to bypass Azure Storage Throttling after retries
> -----------------------------------------------------------------------------------------------------
>                 Key: HADOOP-12334
>                 URL: https://issues.apache.org/jira/browse/HADOOP-12334
>             Project: Hadoop Common
>          Issue Type: Improvement
>          Components: tools
>            Reporter: Gaurav Kanade
>            Assignee: Gaurav Kanade
>         Attachments: HADOOP-12334.01.patch, HADOOP-12334.02.patch, HADOOP-12334.03.patch,
>                      HADOOP-12334.04.patch, HADOOP-12334.05.patch, HADOOP-12334.06.patch
> HADOOP-11693 mitigated the problem of HMaster aborting the region server due to an Azure Storage
> throttling event during HBase WAL archival. This was achieved by applying an intensive
> exponential retry when throttling occurred.
> As a second level of mitigation, we will change the mode of the copy operation if the operation
> fails even after all retries, i.e. we will do a client-side copy of the blob and then copy
> it back to the destination. This operation will not be subject to throttling and hence should
> provide a stronger mitigation. However, it is more expensive, hence we do it only in the case
> that we fail after all retries.
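
The retry-then-fallback flow described in the issue can be sketched roughly as follows.  This is a hedged illustration, not the patch itself: {{CopyWithFallbackSketch}}, {{CopyOp}}, and the parameters {{maxAttempts}} and {{initialBackoffMs}} are hypothetical names, and the sketch treats every {{IOException}} as retryable, whereas the real code presumably reacts only to throttling errors.

```java
import java.io.IOException;

// Hypothetical sketch of "retry with exponential backoff, then fall back".
final class CopyWithFallbackSketch {

    /** A copy attempt that may fail, e.g. because the storage service is throttling. */
    interface CopyOp {
        void run() throws IOException;
    }

    /**
     * Tries serverSideCopy up to maxAttempts times, doubling the wait between
     * attempts. If every attempt fails, runs the more expensive clientSideCopy
     * once. Returns true if the fallback path was used.
     */
    static boolean copy(CopyOp serverSideCopy, CopyOp clientSideCopy,
                        int maxAttempts, long initialBackoffMs)
            throws IOException, InterruptedException {
        long backoff = initialBackoffMs;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                serverSideCopy.run();
                return false; // cheap path succeeded
            } catch (IOException e) {
                // Assumption: any IOException counts as a retryable failure here.
                if (attempt < maxAttempts) {
                    Thread.sleep(backoff);
                    backoff *= 2; // exponential backoff
                }
            }
        }
        // All retries exhausted: switch to the client-side copy.
        // A failure here propagates to the caller.
        clientSideCopy.run();
        return true;
    }
}
```

Keeping the fallback outside the retry loop matches the intent stated in the issue: the expensive client-side copy runs only after the cheaper operation has failed on every retry.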

This message was sent by Atlassian JIRA