hbase-dev mailing list archives

From "Josh Elser (JIRA)" <j...@apache.org>
Subject [jira] [Created] (HBASE-15837) More gracefully handle a negative memstoreSize
Date Mon, 16 May 2016 18:14:12 GMT
Josh Elser created HBASE-15837:

             Summary: More gracefully handle a negative memstoreSize
                 Key: HBASE-15837
                 URL: https://issues.apache.org/jira/browse/HBASE-15837
             Project: HBase
          Issue Type: Improvement
          Components: regionserver
            Reporter: Josh Elser
            Assignee: Josh Elser
             Fix For: 2.0.0

Over in PHOENIX-2883, I've been trying to figure out how to track down the root cause of an
issue we were seeing where a negative memstoreSize was ultimately causing an RS to abort.
The tl;dr version is:

* Something causes memstoreSize to be negative (not sure what is doing this yet)
* All subsequent flushes short-circuit and don't run because they think there is no data to flush.
* The region is eventually closed (commonly, for a move).
* A final flush is attempted on each store before closing (which also short-circuits for the
same reason), leaving unflushed data in each store.
* The sanity check that each store's size is zero fails and the RS aborts.

I have a little patch which I think should improve our failure case around this: it avoids
the RS abort safely (by forcing a flush when memstoreSize is negative) and logs a call trace
whenever an update to memstoreSize makes it negative (to find culprits in the future).
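
To make the two changes concrete, here is a minimal sketch in Java using a toy model of a
region. It is not the actual patch and does not use HBase's classes: MemstoreSizeSketch,
write, flushNaive, flushGuarded, and bufferedBytes are hypothetical names invented for
illustration. It assumes a flush is skipped whenever the counter is not positive, as the
list above describes.

```java
import java.util.concurrent.atomic.AtomicLong;
import java.util.logging.Level;
import java.util.logging.Logger;

/** Toy model of region-level memstore accounting; all names here are hypothetical. */
class MemstoreSizeSketch {
    private static final Logger LOG = Logger.getLogger(MemstoreSizeSketch.class.getName());

    // Counter that is supposed to mirror the bytes buffered in the stores.
    private final AtomicLong memstoreSize = new AtomicLong();
    // Bytes actually buffered (stands in for the per-store sizes).
    private long bufferedBytes;

    /** Applies a delta and logs a call trace if the update leaves the counter negative. */
    long addAndGetMemstoreSize(long delta) {
        long updated = memstoreSize.addAndGet(delta);
        if (updated < 0) {
            // Passing an exception to the logger records the caller's stack.
            LOG.log(Level.WARNING,
                "memstoreSize is negative (" + updated + ") after applying delta " + delta,
                new Exception("call trace"));
        }
        return updated;
    }

    void write(long bytes) {
        bufferedBytes += bytes;
        addAndGetMemstoreSize(bytes);
    }

    /** Failure mode described above: a non-positive counter means "nothing to flush". */
    boolean flushNaive() {
        if (memstoreSize.get() <= 0) {
            return false; // short-circuits even though data may still be buffered
        }
        doFlush();
        return true;
    }

    /** Guarded variant: a negative counter is not trusted, so the flush is forced. */
    boolean flushGuarded() {
        if (memstoreSize.get() == 0) {
            return false; // genuinely nothing recorded
        }
        doFlush();
        return true;
    }

    private void doFlush() {
        bufferedBytes = 0;
        memstoreSize.set(0); // reset the accounting along with the data
    }

    long memstoreSize() { return memstoreSize.get(); }
    long bufferedBytes() { return bufferedBytes; }

    public static void main(String[] args) {
        MemstoreSizeSketch region = new MemstoreSizeSketch();
        region.write(100);
        region.addAndGetMemstoreSize(-250); // simulated mis-accounting; logs a call trace
        System.out.println("naive flush ran: " + region.flushNaive());
        System.out.println("guarded flush ran: " + region.flushGuarded());
        System.out.println("bytes left buffered: " + region.bufferedBytes());
    }
}
```

In this model the naive flush leaves the 100 buffered bytes in place, which is the state that
would trip the close-time sanity check, while the guarded flush drains them and resets the
counter to zero.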

