hadoop-common-issues mailing list archives

From "Thomas (JIRA)" <j...@apache.org>
Subject [jira] [Created] (HADOOP-14660) wasb: improve throughput by 34% when account limit exceeded
Date Fri, 14 Jul 2017 03:00:01 GMT
Thomas created HADOOP-14660:

             Summary: wasb: improve throughput by 34% when account limit exceeded
                 Key: HADOOP-14660
                 URL: https://issues.apache.org/jira/browse/HADOOP-14660
             Project: Hadoop Common
          Issue Type: Improvement
          Components: fs/azure
            Reporter: Thomas
            Assignee: Thomas

Big data workloads frequently exceed the Azure Storage max ingress and egress limits
(https://docs.microsoft.com/en-us/azure/azure-subscription-service-limits). For example, the max
ingress limit for a GRS account in the United States is currently 10 Gbps. When the limit is
exceeded, the Azure Storage service fails a percentage of incoming requests, and this causes the
client to invoke its retry policy. The retry policy delays requests by sleeping, but the sleep
duration is independent of the client throughput and the account limit. This results in low
throughput, due to the high number of failed requests and the thrashing caused by the retry policy.

To fix this, we introduce a client-side throttle which minimizes failed requests and maximizes
throughput. Tests have shown that this improves throughput by ~34% when the storage account
max ingress and/or egress limits are exceeded.
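
To illustrate the general idea of a client-side throttle, here is a minimal sketch in Java. It is
not the code proposed in this issue: the class name AdaptiveThrottle, its methods, the 1% failure
threshold, the 10 ms starting delay, and the doubling/halving rule are all assumptions chosen for
illustration. The sketch counts server-rejected requests over a fixed window and tunes a
pre-request delay from the observed failure rate, so the delay tracks what the account can absorb
instead of being a fixed retry sleep.

```java
/**
 * Hypothetical sketch of a client-side throttle (not the HADOOP-14660 patch).
 * It adapts a pre-request delay to the failure rate seen in the last window.
 */
final class AdaptiveThrottle {
    // Assumed tuning constants, chosen only for illustration.
    private static final double FAILURE_THRESHOLD = 0.01;
    private static final long INITIAL_SLEEP_MS = 10;

    private final int windowSize;
    private final long maxSleepMs;
    private long sleepMs = 0;
    private int requests = 0;
    private int failures = 0;

    AdaptiveThrottle(int windowSize, long maxSleepMs) {
        if (windowSize <= 0 || maxSleepMs < 0) {
            throw new IllegalArgumentException("windowSize must be > 0 and maxSleepMs >= 0");
        }
        this.windowSize = windowSize;
        this.maxSleepMs = maxSleepMs;
    }

    /** Records one request outcome and re-tunes the delay once per window. */
    synchronized void record(boolean rejectedByServer) {
        requests++;
        if (rejectedByServer) {
            failures++;
        }
        if (requests >= windowSize) {
            double failureRate = (double) failures / requests;
            if (failureRate > FAILURE_THRESHOLD) {
                // Too many rejections: double the delay, capped at maxSleepMs.
                sleepMs = Math.min(maxSleepMs, Math.max(INITIAL_SLEEP_MS, sleepMs * 2));
            } else {
                // Healthy window: halve the delay to probe for more throughput.
                sleepMs = sleepMs / 2;
            }
            requests = 0;
            failures = 0;
        }
    }

    /** Returns the delay currently applied before each request. */
    synchronized long currentSleepMs() {
        return sleepMs;
    }

    /** Call before issuing a storage request; sleeps for the current delay. */
    void beforeRequest() throws InterruptedException {
        long delay = currentSleepMs();
        if (delay > 0) {
            Thread.sleep(delay);
        }
    }
}
```

A caller would invoke beforeRequest() ahead of each storage operation and record(...) once the
response is known, passing true when the service rejected the request for exceeding the limit.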

This message was sent by Atlassian JIRA

To unsubscribe, e-mail: common-issues-unsubscribe@hadoop.apache.org
For additional commands, e-mail: common-issues-help@hadoop.apache.org
