hbase-issues mailing list archives

From "Andrew Purtell (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HBASE-10844) Coprocessor failure during batchmutation leaves the memstore datastructs in an inconsistent state
Date Mon, 01 Dec 2014 04:33:13 GMT

    [ https://issues.apache.org/jira/browse/HBASE-10844?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14229404#comment-14229404 ]

Andrew Purtell commented on HBASE-10844:
----------------------------------------

So with this patch we'd remove the assert and replace it with a warning that the memstore
data structures have been only partially updated?
{code}
--- a/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/HRegion.java
+++ b/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/HRegion.java
@@ -1109,7 +1109,13 @@ public class HRegion implements HeapSize { // , Writable{
 
         // close each store in parallel
         for (final Store store : stores.values()) {
-          assert abort? true: store.getFlushableSize() == 0;
+          if (store.getFlushableSize() != 0) {
+            LOG.warn("store.getFlushableSize for " + store + " is not zero! It's " 
+                + store.getFlushableSize() + ". Maybe a coprocessor "
+                + "operation failed and "
+                + "left the memstore datastructures in a partially updated state. "
+                + "Current memstoreSize " + this.getMemstoreSize().get());
+          }
           completionService
               .submit(new Callable<Pair<byte[], Collection<StoreFile>>>() {
                 @Override
{code}
Shouldn't we be aborting in that case anyway? Or replace the assert with an abort()?
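
To make the two options in this question concrete, here is a minimal sketch outside HBase. It is an illustration only: `CloseCheckSketch`, `Policy`, `Outcome`, and `checkOnClose` are hypothetical names, not HBase APIs, and the `flushableSize` argument merely stands in for `store.getFlushableSize()`. One relevant Java fact: an `assert` is evaluated only when the JVM runs with assertions enabled (`-ea`), whereas a warning or an abort call runs unconditionally.

```java
// Hypothetical illustration only; none of these names are HBase APIs.
final class CloseCheckSketch {
    enum Policy { WARN, ABORT }
    enum Outcome { CLEAN, WARNED, ABORTED }

    /**
     * Decides what to do at close time when a store still reports
     * unflushed bytes. flushableSize stands in for store.getFlushableSize().
     */
    static Outcome checkOnClose(long flushableSize, Policy policy) {
        if (flushableSize == 0) {
            return Outcome.CLEAN; // nothing left over, close normally
        }
        if (policy == Policy.ABORT) {
            // A real server would trigger its abort path here.
            return Outcome.ABORTED;
        }
        // Same idea as the patch: log the leftover size and carry on closing.
        System.err.println("WARN: flushable size is " + flushableSize + ", expected 0");
        return Outcome.WARNED;
    }
}
```

With `Policy.WARN` the close proceeds and the inconsistency is only logged; with `Policy.ABORT` the caller is told to stop, which is the alternative being asked about.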

> Coprocessor failure during batchmutation leaves the memstore datastructs in an inconsistent state
> -------------------------------------------------------------------------------------------------
>
>                 Key: HBASE-10844
>                 URL: https://issues.apache.org/jira/browse/HBASE-10844
>             Project: HBase
>          Issue Type: Bug
>          Components: regionserver
>            Reporter: Devaraj Das
>            Assignee: Devaraj Das
>         Attachments: 10844-1-0.98.txt, 10844-1.txt
>
>
> Observed this while testing with Phoenix. The Phoenix test MutableIndexFailureIT
> deliberately fails the batchmutation call via the installed coprocessor, but the update is
> not rolled back. That leaves the memstore inconsistent. In particular, I observed that the
> value reported by getFlushableSize is updated before the coprocessor is called, and that
> update is not rolled back when the call fails. When the region is closed at some later
> point, the assert introduced in HBASE-10514 in HRegion.doClose() causes the RegionServer
> to shut down abnormally.
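
The failure mode described above (a size counter bumped before a hook runs and never undone when the hook fails) can be sketched in isolation. This is not the attached patch and does not use HBase classes: `SizeAccountingSketch`, `Hook`, and `applyWithRollback` are hypothetical names chosen for the example, and swallowing the exception is a simplification.

```java
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical stand-in for a store's size accounting; not HBase code.
final class SizeAccountingSketch {
    private final AtomicLong flushableSize = new AtomicLong();

    /** Hypothetical hook standing in for a coprocessor callback that may throw. */
    interface Hook {
        void run() throws Exception;
    }

    /**
     * Adds delta to the counter, runs the hook, and undoes the add if the
     * hook throws. Returns true when the update was kept.
     */
    boolean applyWithRollback(long delta, Hook hook) {
        flushableSize.addAndGet(delta); // counter is updated before the hook runs
        boolean success = false;
        try {
            hook.run();
            success = true;
        } catch (Exception e) {
            // Swallowed in this sketch; real code would propagate or record it.
        } finally {
            if (!success) {
                flushableSize.addAndGet(-delta); // roll the counter back
            }
        }
        return success;
    }

    long getFlushableSize() {
        return flushableSize.get();
    }
}
```

Without the `finally` block, a failing hook would leave the counter at a nonzero value even though no data was kept, which is the kind of leftover that a close-time check on the flushable size would then trip over.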



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
