hadoop-common-issues mailing list archives

From "Thomas Marquardt (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HADOOP-15446) WASB: PageBlobInputStream.skip breaks HBASE replication
Date Fri, 04 May 2018 23:02:00 GMT

    [ https://issues.apache.org/jira/browse/HADOOP-15446?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16464483#comment-16464483 ]

Thomas Marquardt commented on HADOOP-15446:

Hi Steve,

I have attached HADOOP-15446-003.patch.  I renamed the test class ITestPageBlobInputStream,
since tests that connect to Azure are supposed to begin with ITest.  However, since i) these
are not scale tests and ii) scale tests are not required to be run before committing changes
(according to testing_azure.md), I have not marked them as scale tests.  These tests cover
functional behavior, not performance or scalability, and should be run before every commit
in my opinion.  The slowest of these tests takes me 3 seconds to run, and all of the tests
in ITestPageBlobInputStream together take me between 7 and 9 seconds to run.  I really do not consider
them to be scale tests by any means.  I know it takes longer for you over the internet to
Ireland, but hopefully these new tests can still be run in under 60 seconds.  The file that
they create is about 6 MB, so it is tiny in the big data world.  Let me know if I'm missing


> WASB: PageBlobInputStream.skip breaks HBASE replication
> -------------------------------------------------------
>                 Key: HADOOP-15446
>                 URL: https://issues.apache.org/jira/browse/HADOOP-15446
>             Project: Hadoop Common
>          Issue Type: Bug
>          Components: fs/azure
>    Affects Versions: 2.9.0, 3.0.2
>            Reporter: Thomas Marquardt
>            Assignee: Thomas Marquardt
>            Priority: Major
>         Attachments: HADOOP-15446-001.patch, HADOOP-15446-002.patch, HADOOP-15446-003.patch
> Page Blobs are primarily used by HBASE.  HBASE replication, which apparently has not
> been used with WASB until recently, performs non-sequential reads on log files using PageBlobInputStream.
> There are bugs in this stream implementation which prevent skip and seek from working properly, and
> eventually the stream state becomes corrupt and unusable.
> I believe this bug affects all releases of WASB/HADOOP.  It appears to be a day-0 bug
> in PageBlobInputStream.  There were similar bugs opened in the past (HADOOP-15042) but the
> issue was not properly fixed, and no test coverage was added.
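
For illustration only, here is a minimal sketch of the kind of skip-then-read check
that the description says was missing.  It is not the code from the attached patches.
The class name SkipContractSketch and its helper methods are hypothetical, and
java.io.ByteArrayInputStream stands in for PageBlobInputStream so the example runs
without an Azure account.  The idea is that after skip(n) reports k bytes skipped,
the next read() must return the byte at the old position plus k.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;

// Hypothetical helper, not part of Hadoop or of the HADOOP-15446 patches.
public class SkipContractSketch {

    /** Builds a deterministic payload so every offset has a predictable byte. */
    public static byte[] payload(int size) {
        byte[] data = new byte[size];
        for (int i = 0; i < size; i++) {
            data[i] = (byte) (i % 251);
        }
        return data;
    }

    /**
     * Alternates skip() and read() until end of stream.  Each byte read is
     * compared with the expected payload at the position implied by the
     * values skip() returned.  Returns the number of bytes verified.
     */
    public static int verifySkipThenRead(InputStream in, byte[] expected, int skipSize)
            throws IOException {
        long pos = 0;
        int verified = 0;
        while (pos < expected.length) {
            long skipped = in.skip(skipSize);
            if (skipped < 0 || skipped > skipSize) {
                throw new AssertionError("skip returned " + skipped);
            }
            pos += skipped;
            int b = in.read();
            if (b == -1) {
                // End of stream is only valid if we really are at the end.
                if (pos != expected.length) {
                    throw new AssertionError("premature EOF at " + pos);
                }
                break;
            }
            if (b != (expected[(int) pos] & 0xFF)) {
                throw new AssertionError("wrong byte at offset " + pos);
            }
            pos++;
            verified++;
        }
        return verified;
    }

    public static void main(String[] args) throws IOException {
        byte[] data = payload(4096);
        // ByteArrayInputStream is a stand-in for the stream under test.
        int n = verifySkipThenRead(new ByteArrayInputStream(data), data, 500);
        System.out.println("verified " + n + " reads");
    }
}
```

A stream whose skip() reports more bytes than it actually advanced would fail the
byte comparison on the following read, which is the sort of state corruption the
description refers to.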

