Date: Fri, 8 Jan 2016 00:27:40 +0000 (UTC)
From: "Andrew Purtell (JIRA)"
To: issues@hbase.apache.org
Subject: [jira] [Updated] (HBASE-14177) Full GC on client may lead to missing scan results

     [ https://issues.apache.org/jira/browse/HBASE-14177?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Andrew Purtell updated HBASE-14177:
-----------------------------------
    Fix Version/s:     (was: 0.98.17)
                       0.98.18

> Full GC on client may lead to missing scan results
> --------------------------------------------------
>
>                 Key: HBASE-14177
>                 URL: https://issues.apache.org/jira/browse/HBASE-14177
>             Project: HBase
>          Issue Type: Bug
>          Components: Client
>    Affects Versions: 0.98.12, 0.98.13, 1.0.2
>            Reporter: James Estes
>            Priority: Critical
>              Labels: dataloss
>             Fix For: 1.0.4, 0.98.18
>
>
> After adding a large row, a scan of that row comes back empty. After a few attempts the scan succeeds (all attempts run over the same data, on an HBase instance receiving no other writes).
> Looking at the logs, it seems this happens when there is memory pressure on the client and several Full GCs occur. Messages then appear indicating that region locations are being removed from the local client cache:
> 2015-07-31 12:50:24,647 [main] DEBUG org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation - Removed 192.168.1.131:50981 as a location of big_row_1438368609944,,1438368610048.880c849594807bdc7412f4f982337d6c. for tableName=big_row_1438368609944 from cache
> Blaming the GC may sound fanciful, but if the test is run with -Xms4g -Xmx4g then it always passes on the first scan attempt. Maybe the pause is enough to remove something from the cache, or the client is using weak references somewhere?
> More info: http://mail-archives.apache.org/mod_mbox/hbase-user/201507.mbox/%3CCAE8tVdnFf%3Dob569%3DfJkpw1ndVWOVTkihYj9eo6qt0FrzihYHgw%40mail.gmail.com%3E
> Test used to reproduce:
> https://github.com/housejester/hbase-debugging#fullgctest
> I tested and had failures in:
> 0.98.12 client/server
> 0.98.13 client and 0.98.12 server
> 0.98.13 client/server
> 1.1.0 client and 0.98.13 server
> 0.98.13 client and 1.1.0 server
> 0.98.12 client and 1.1.0 server
> I tested without failure in:
> 1.1.0 client/server



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)