Date: Tue, 27 Oct 2015 00:47:27 +0000 (UTC)
From: "Hudson (JIRA)"
To: issues@hbase.apache.org
Subject: [jira] [Commented] (HBASE-14283) Reverse scan doesn’t work with HFile inline index/bloom blocks


    [ https://issues.apache.org/jira/browse/HBASE-14283?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14975440#comment-14975440 ]

Hudson commented on HBASE-14283:
--------------------------------

FAILURE: Integrated in HBase-1.1 #724 (See [https://builds.apache.org/job/HBase-1.1/724/])
HBASE-14283 Reverse scan doesn’t work with HFile inline index/bloom (apurtell: rev 0db04a1705e5e8cc04cc9c010ddfc5612f60cfec)
* hbase-server/src/test/java/org/apache/hadoop/hbase/io/hfile/TestSeekBeforeWithInlineBlocks.java
* hbase-server/src/main/java/org/apache/hadoop/hbase/io/hfile/HFileReaderV2.java
* hbase-common/src/main/java/org/apache/hadoop/hbase/CellUtil.java


> Reverse scan doesn’t work with HFile inline index/bloom blocks
> --------------------------------------------------------------
>
>                 Key: HBASE-14283
>                 URL: https://issues.apache.org/jira/browse/HBASE-14283
>             Project: HBase
>          Issue Type: Bug
>            Reporter: Ben Lau
>            Assignee: Ben Lau
>             Fix For: 2.0.0, 1.2.0, 1.3.0, 1.0.3, 1.1.3, 0.98.16
>
>         Attachments: HBASE-14283-0.98.patch, HBASE-14283-branch-1.0.patch, HBASE-14283-branch-1.1.patch, HBASE-14283-branch-1.2.patch, HBASE-14283-branch-1.patch, HBASE-14283-master.patch, HBASE-14283-reupload-master.patch, HBASE-14283-v2.patch, HBASE-14283.patch, hfile-seek-before.patch
>
>
> Reverse scans do not work if an HFile contains inline bloom blocks or leaf level index blocks.  The reason is because the seekBefore() call calculates the previous data block’s size by assuming data blocks are contiguous which is not the case in HFile V2 and beyond.
> Attached is a first cut patch (targeting bcef28eefaf192b0ad48c8011f98b8e944340da5 on trunk) which includes:
> (1) a unit test which exposes the bug and demonstrates failures for both inline bloom blocks and inline index blocks
> (2) a proposed fix for inline index blocks that does not require a new HFile version change, but is only performant for 1 and 2-level indexes and not 3+.  3+ requires an HFile format update for optimal performance.
> This patch does not fix the bloom filter blocks bug.  But the fix should be similar to the case of inline index blocks.  The reason I haven’t made the change yet is I want to confirm that you guys would be fine with me revising the HFile.Reader interface.
> Specifically, these 2 functions (getGeneralBloomFilterMetadata and getDeleteBloomFilterMetadata) need to return the BloomFilter.  Right now the HFileReader class doesn’t have a reference to the bloom filters (and hence their indices) and only constructs the IO streams and hence has no way to know where the bloom blocks are in the HFile.  It seems that the HFile.Reader bloom method comments state that they “know nothing about how that metadata is structured” but I do not know if that is a requirement of the abstraction (why?) or just an incidental current property.
> We would like to do 3 things with community approval:
> (1) Update the HFile.Reader interface and implementation to contain and return BloomFilters directly rather than unstructured IO streams
> (2) Merge the fixes for index blocks and bloom blocks into open source
> (3) Create a new Jira ticket for open source HBase to add a ‘prevBlockSize’ field in the block header in the next HFile version, so that seekBefore() calls can not only be correct but performant in all cases.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
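
The contiguity assumption described in the issue can be illustrated with a toy model. This is only a sketch and not HBase code: the names Kind, Block, assumedPrevSize and actualSize are hypothetical, and the offsets and sizes are invented for illustration. It shows how subtracting the previous data block's offset from the current data block's offset overstates the previous block's size once an inline block sits between the two.

```java
import java.util.Arrays;
import java.util.List;

class SeekBeforeSketch {
    // Hypothetical block kinds for this sketch; not HBase's BlockType enum.
    enum Kind { DATA, LEAF_INDEX, BLOOM_CHUNK }

    // Hypothetical block descriptor: kind, start offset and size in bytes.
    static final class Block {
        final Kind kind;
        final long offset;
        final int size;

        Block(Kind kind, long offset, int size) {
            this.kind = kind;
            this.offset = offset;
            this.size = size;
        }
    }

    // Contiguity assumption: the previous data block is taken to end exactly
    // where the current data block starts.
    static long assumedPrevSize(long prevDataOffset, long curDataOffset) {
        return curDataOffset - prevDataOffset;
    }

    // Real size of the data block that starts at the given offset.
    static int actualSize(List<Block> layout, long offset) {
        for (Block b : layout) {
            if (b.kind == Kind.DATA && b.offset == offset) {
                return b.size;
            }
        }
        throw new IllegalArgumentException("no data block at offset " + offset);
    }

    // Invented layout: a bloom chunk is written inline between two data blocks.
    static List<Block> sampleLayout() {
        return Arrays.asList(
            new Block(Kind.DATA, 0L, 100),
            new Block(Kind.BLOOM_CHUNK, 100L, 40),
            new Block(Kind.DATA, 140L, 100));
    }

    public static void main(String[] args) {
        List<Block> layout = sampleLayout();
        long assumed = assumedPrevSize(0L, 140L);
        int actual = actualSize(layout, 0L);
        // Prints: assumed=140 actual=100
        System.out.println("assumed=" + assumed + " actual=" + actual);
    }
}
```

In this model the assumed size is 40 bytes too large, which is exactly the size of the inline block. A per-block field such as the 'prevBlockSize' proposed in item (3) would make the subtraction unnecessary.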