Date: Mon, 7 Mar 2016 11:50:40 +0000 (UTC)
From: "Phil Yang (JIRA)"
To: issues@hbase.apache.org
Subject: [jira] [Updated] (HBASE-15398) Cells loss or disorder when using family essential filter and partial scanning protocol

     [ https://issues.apache.org/jira/browse/HBASE-15398?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Phil Yang updated HBASE-15398:
------------------------------
    Attachment: HBASE-15398.v1.txt

TestPartialResultsFromClientSide passes locally. Let's run the test suite to see whether the other tests pass.

> Cells loss or disorder when using family essential filter and partial scanning protocol
> ---------------------------------------------------------------------------------------
>
>                 Key: HBASE-15398
>                 URL: https://issues.apache.org/jira/browse/HBASE-15398
>             Project: HBase
>          Issue Type: Bug
>          Components: dataloss, Scanners
>    Affects Versions: 1.2.0, 1.1.3
>            Reporter: Phil Yang
>            Assignee: Phil Yang
>            Priority: Critical
>         Attachments: 15398-test.txt, HBASE-15398.v1.txt
>
>
> In RegionScannerImpl we have two heaps, storeHeap and joinedHeap. If a filter does not apply to every column family (cf), the stores whose families need not be filtered go into joinedHeap. We scan storeHeap first, then joinedHeap, merge the results, sort them, and return them to the client. The sort is needed because Cells are ordered by rowkey/cf/cq/ts, and a smaller cf may be in joinedHeap.
> However, after HBASE-11544 we may transfer partial results when we hit SIZE_LIMIT_REACHED_MID_ROW or a similar state. We may return a larger cf first because it is in storeHeap, and then a smaller cf because it is in joinedHeap. The server does not hold all the cells of a row, and the client has no sorting logic, so the order of cf in the Result the user sees is wrong.
> A more critical bug: if we get a LIMIT_REACHED_MID_ROW on the last cell of a row in storeHeap, we break out of scanning in RegionScannerImpl, and populateResult changes the state to SIZE_LIMIT_REACHED because the next peeked cell belongs to the next row. But this is only the last cell of one heap, and we have two... SIZE_LIMIT_REACHED means this Result is not partial (per ScannerContext.partialResultFormed), so the client will see it, merge the results, and return them to the user without the data from joinedHeap. On the next scan we read the next row of storeHeap, and the joinedHeap cells of the current row are forgotten and never read...

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
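
To make the ordering problem in the description concrete, here is a minimal toy model in plain Java. It is a sketch under stated assumptions, not HBase code: ToyCell, wholeRow, partialThenConcatenate and isSorted are hypothetical names invented for this illustration, timestamps are left out, and the two "heaps" are plain lists. It only shows why sorting the merged row hides the problem, while shipping partial results heap by heap exposes it.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

// Toy model only: none of these types or methods exist in HBase.
class PartialScanOrderSketch {

    /** Hypothetical stand-in for a Cell: row key, column family, qualifier (no timestamp). */
    static final class ToyCell {
        final String row;
        final String family;
        final String qualifier;

        ToyCell(String row, String family, String qualifier) {
            this.row = row;
            this.family = family;
            this.qualifier = qualifier;
        }

        @Override
        public String toString() {
            return row + "/" + family + ":" + qualifier;
        }
    }

    /** Cell order from the description, rowkey/cf/cq, with ts omitted for brevity. */
    static final Comparator<ToyCell> ORDER = Comparator
            .comparing((ToyCell c) -> c.row)
            .thenComparing(c -> c.family)
            .thenComparing(c -> c.qualifier);

    static ToyCell cell(String row, String family, String qualifier) {
        return new ToyCell(row, family, qualifier);
    }

    /** Whole-row path: drain storeHeap, then joinedHeap, then sort before returning. */
    static List<ToyCell> wholeRow(List<ToyCell> storeHeap, List<ToyCell> joinedHeap) {
        List<ToyCell> merged = new ArrayList<>(storeHeap);
        merged.addAll(joinedHeap);
        merged.sort(ORDER);
        return merged;
    }

    /** Partial path: each chunk is shipped as formed; the client concatenates without sorting. */
    static List<ToyCell> partialThenConcatenate(List<ToyCell> storeHeap, List<ToyCell> joinedHeap) {
        List<ToyCell> clientSide = new ArrayList<>();
        clientSide.addAll(storeHeap);   // first partial result, from storeHeap
        clientSide.addAll(joinedHeap);  // later partial result, from joinedHeap
        return clientSide;
    }

    /** True if every adjacent pair respects ORDER. */
    static boolean isSorted(List<ToyCell> cells) {
        for (int i = 1; i < cells.size(); i++) {
            if (ORDER.compare(cells.get(i - 1), cells.get(i)) > 0) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        // Family "b" is filtered (storeHeap); family "a" is not (joinedHeap).
        List<ToyCell> storeHeap = Arrays.asList(cell("r1", "b", "q1"), cell("r1", "b", "q2"));
        List<ToyCell> joinedHeap = Arrays.asList(cell("r1", "a", "q1"));

        System.out.println(wholeRow(storeHeap, joinedHeap));               // [r1/a:q1, r1/b:q1, r1/b:q2]
        System.out.println(partialThenConcatenate(storeHeap, joinedHeap)); // [r1/b:q1, r1/b:q2, r1/a:q1]
        System.out.println(isSorted(partialThenConcatenate(storeHeap, joinedHeap))); // false
    }
}
```

The same model also hints at the data-loss case: if the first partial result were marked as complete, a client would stop after the storeHeap chunk and the r1/a:q1 cell from joinedHeap would never be delivered.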