Date: Mon, 1 Dec 2014 04:33:13 +0000 (UTC)
From: "Andrew Purtell (JIRA)"
To: issues@hbase.apache.org
Subject: [jira] [Commented] (HBASE-10844) Coprocessor failure during batchmutation leaves the memstore datastructs in an inconsistent state

    [ https://issues.apache.org/jira/browse/HBASE-10844?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14229404#comment-14229404 ]

Andrew Purtell commented on HBASE-10844:
----------------------------------------

So with this patch we'd remove the assert and replace it with a warning that memstore datastructures have been only partially updated?
{code}
--- a/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/HRegion.java
+++ b/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/HRegion.java
@@ -1109,7 +1109,13 @@ public class HRegion implements HeapSize { // , Writable{
       // close each store in parallel
       for (final Store store : stores.values()) {
-        assert abort? true: store.getFlushableSize() == 0;
+        if (store.getFlushableSize() != 0) {
+          LOG.warn("store.getFlushableSize for " + store + " is not zero! It's " +
+              store.getFlushableSize() + ". Maybe a coprocessor " +
+              "operation failed and " +
+              "left the memstore datastructures in a partially updated state. " +
+              "Current memstoreSize " + this.getMemstoreSize().get());
+        }
         completionService
             .submit(new Callable<Pair<byte[], Collection<StoreFile>>>() {
               @Override
{code}

Shouldn't we be aborting in that case anyway? Or replace the assert with an abort()?

> Coprocessor failure during batchmutation leaves the memstore datastructs in an inconsistent state
> -------------------------------------------------------------------------------------------------
>
>                 Key: HBASE-10844
>                 URL: https://issues.apache.org/jira/browse/HBASE-10844
>             Project: HBase
>          Issue Type: Bug
>          Components: regionserver
>            Reporter: Devaraj Das
>            Assignee: Devaraj Das
>         Attachments: 10844-1-0.98.txt, 10844-1.txt
>
>
> Observed this in the testing with Phoenix. The test in Phoenix - MutableIndexFailureIT deliberately fails the batchmutation call via the installed coprocessor. But the update is not rolled back. That leaves the memstore inconsistent. In particular, I observed that getFlushableSize is updated before the coprocessor was called but the update is not rolled back. When the region is being closed at some later point, the assert introduced in HBASE-10514 in the HRegion.doClose() causes the RegionServer to shutdown abnormally.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)