hbase-issues mailing list archives

From "Hadoop QA (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HBASE-14283) Reverse scan doesn’t work with HFile inline index/bloom blocks
Date Tue, 25 Aug 2015 08:50:48 GMT

    [ https://issues.apache.org/jira/browse/HBASE-14283?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14710911#comment-14710911 ]

Hadoop QA commented on HBASE-14283:
-----------------------------------

{color:red}-1 overall{color}.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12752147/HBASE-14283.patch
  against master branch at commit d0873f5a8cc060adbc5a1ae0ed52b84a8942a868.
  ATTACHMENT ID: 12752147

    {color:green}+1 @author{color}.  The patch does not contain any @author tags.

    {color:green}+1 tests included{color}.  The patch appears to include 11 new or modified
tests.

    {color:green}+1 hadoop versions{color}. The patch compiles with all supported hadoop versions
(2.4.0 2.4.1 2.5.0 2.5.1 2.5.2 2.6.0 2.7.0)

    {color:green}+1 javac{color}.  The applied patch does not increase the total number of
javac compiler warnings.

    {color:green}+1 protoc{color}.  The applied patch does not increase the total number of
protoc compiler warnings.

    {color:green}+1 javadoc{color}.  The javadoc tool did not generate any warning messages.

    {color:green}+1 checkstyle{color}.  The applied patch does not increase the total number
of checkstyle errors.

    {color:green}+1 findbugs{color}.  The patch does not introduce any new Findbugs (version
2.0.3) warnings.

    {color:green}+1 release audit{color}.  The applied patch does not increase the total number
of release audit warnings.

    {color:red}-1 lineLengths{color}.  The patch introduces the following lines longer than
100:
+      HTableDescriptor htd = new HTableDescriptor(TableName.valueOf("testReverseScanRandomTable"+encoding.name()));
+    return String.format("Failed to match on iteration %s between %s and %s", iteration, Bytes.toStringBinary(a), Bytes.toStringBinary(b));
+        + Bytes.toStringBinary(keys[i]) + ")", true, scanner.seekBefore(new KeyValue.KeyOnlyKeyValue(keys[i])));
+        + Bytes.toStringBinary(keys[i]) + ")", false, scanner.seekBefore(new KeyValue.KeyOnlyKeyValue(keys[i])));

  {color:green}+1 site{color}.  The mvn post-site goal succeeds with this patch.

    {color:green}+1 core tests{color}.  The patch passed unit tests in .

Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/15241//testReport/
Release Findbugs (version 2.0.3) warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/15241//artifact/patchprocess/newFindbugsWarnings.html
Checkstyle Errors: https://builds.apache.org/job/PreCommit-HBASE-Build/15241//artifact/patchprocess/checkstyle-aggregate.html

  Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/15241//console

This message is automatically generated.

> Reverse scan doesn’t work with HFile inline index/bloom blocks
> --------------------------------------------------------------
>
>                 Key: HBASE-14283
>                 URL: https://issues.apache.org/jira/browse/HBASE-14283
>             Project: HBase
>          Issue Type: Bug
>            Reporter: Ben Lau
>            Assignee: Ben Lau
>         Attachments: HBASE-14283.patch, hfile-seek-before.patch
>
>
> Reverse scans do not work if an HFile contains inline bloom blocks or leaf-level index
blocks.  This is because the seekBefore() call calculates the previous data block’s
size by assuming data blocks are contiguous, which is not the case in HFile V2 and beyond.
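
To illustrate the failure mode described above, here is a minimal, self-contained sketch. It is not HBase code: the Block class, the Kind enum, and the byte sizes are hypothetical and exist only to show how "current offset minus previous data block offset" overestimates the block size once an inline block sits between two data blocks.

```java
import java.util.Arrays;
import java.util.List;

public class ContiguityAssumptionSketch {
    // Hypothetical block kinds; real HFiles have more block types.
    enum Kind { DATA, LEAF_INDEX, BLOOM_CHUNK }

    // Hypothetical stand-in for a block's position and size in a file.
    static final class Block {
        final Kind kind;
        final long offset;
        final int onDiskSize;

        Block(Kind kind, long offset, int onDiskSize) {
            this.kind = kind;
            this.offset = offset;
            this.onDiskSize = onDiskSize;
        }
    }

    // Size of the previous data block, computed under the (wrong) assumption
    // that it ends exactly where the current data block begins.
    static long sizeAssumingContiguous(Block prevData, Block current) {
        return current.offset - prevData.offset;
    }

    // Made-up layout: a leaf index block is written between two data blocks.
    static List<Block> layout() {
        return Arrays.asList(
            new Block(Kind.DATA, 0, 100),
            new Block(Kind.LEAF_INDEX, 100, 40),
            new Block(Kind.DATA, 140, 100));
    }

    public static void main(String[] args) {
        List<Block> blocks = layout();
        Block prevData = blocks.get(0);
        Block current = blocks.get(2);
        long guessed = sizeAssumingContiguous(prevData, current);
        // prints "guessed=140 actual=100"
        System.out.println("guessed=" + guessed + " actual=" + prevData.onDiskSize);
    }
}
```

In this made-up layout the guessed size also covers the 40-byte inline index block, so a reader trusting it would read past the end of the previous data block.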
> Attached is a first-cut patch (targeting bcef28eefaf192b0ad48c8011f98b8e944340da5 on
trunk) which includes:
> (1) a unit test which exposes the bug and demonstrates failures for both inline bloom
blocks and inline index blocks
> (2) a proposed fix for inline index blocks that does not require an HFile version
change, but performs well only for 1- and 2-level indexes, not 3+.  3+ requires an HFile
format update for optimal performance.
> This patch does not fix the bloom filter blocks bug, but the fix should be similar to
the case of inline index blocks.  I haven’t made the change yet because I want to
confirm that you guys would be fine with me revising the HFile.Reader interface.
> Specifically, these 2 functions (getGeneralBloomFilterMetadata and getDeleteBloomFilterMetadata)
need to return the BloomFilter.  Right now the HFileReader class only constructs the IO streams
and holds no reference to the bloom filters (and hence their indices), so it has no way to know
where the bloom blocks are in the HFile.  The HFile.Reader bloom method comments state that they
“know nothing about how that metadata is structured”, but I do not know whether that is a
requirement of the abstraction (why?) or just an incidental property of the current code.
> We would like to do 3 things with community approval:
> (1) Update the HFile.Reader interface and implementation to contain and return BloomFilters
directly rather than unstructured IO streams
> (2) Merge the fixes for index blocks and bloom blocks into open source
> (3) Create a new Jira ticket for open source HBase to add a ‘prevBlockSize’ field
to the block header in the next HFile version, so that seekBefore() calls can be not only
correct but also performant in all cases.
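
As a rough sketch of the idea behind item (3), the toy writer below stores each block's predecessor size in its header so a reader can step backwards without guessing. The two-int header layout, the class name, and all method names are assumptions made up for illustration; this is not the real HFile block header and not the proposed patch.

```java
import java.nio.ByteBuffer;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class PrevBlockSizeSketch {
    // Hypothetical header: [int payloadSize][int prevBlockSize].
    // NOT the real HFile block header layout.
    static final int HEADER_SIZE = 2 * Integer.BYTES;

    /** Appends blocks back to back and returns the offset of the last block (-1 if none). */
    static int writeBlocks(ByteBuffer file, List<byte[]> payloads) {
        int prevBlockSize = 0; // 0 marks the first block in the file
        int lastOffset = -1;
        for (byte[] payload : payloads) {
            lastOffset = file.position();
            file.putInt(payload.length);
            file.putInt(prevBlockSize);
            file.put(payload);
            prevBlockSize = HEADER_SIZE + payload.length;
        }
        return lastOffset;
    }

    /** Returns the offset of the block before the one at {@code offset}, or -1 if it is the first. */
    static int previousBlockOffset(ByteBuffer file, int offset) {
        int prevBlockSize = file.getInt(offset + Integer.BYTES);
        return prevBlockSize == 0 ? -1 : offset - prevBlockSize;
    }

    /** Walks from the last block to the first using only the stored sizes. */
    static List<Integer> offsetsInReverse(ByteBuffer file, int lastOffset) {
        List<Integer> offsets = new ArrayList<>();
        for (int off = lastOffset; off >= 0; off = previousBlockOffset(file, off)) {
            offsets.add(off);
        }
        return offsets;
    }

    public static void main(String[] args) {
        ByteBuffer file = ByteBuffer.allocate(256);
        int last = writeBlocks(file, Arrays.asList(new byte[10], new byte[3], new byte[20]));
        // prints "[29, 18, 0]"
        System.out.println(offsetsInReverse(file, last));
    }
}
```

Each backward hop needs only the current block's header, whatever kind of block precedes it. In a real file a reader would presumably still have to skip over inline index or bloom blocks it lands on until it reaches a data block.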



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
