hbase-issues mailing list archives

From "ramkrishna.s.vasudevan (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HBASE-12457) Regions in transition for a long time when CLOSE interleaves with a slow compaction
Date Thu, 13 Nov 2014 12:47:33 GMT

    [ https://issues.apache.org/jira/browse/HBASE-12457?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14209704#comment-14209704 ]

ramkrishna.s.vasudevan commented on HBASE-12457:
------------------------------------------------

[~larsh]
{code}
            writestate.wait(millis);
            if (millis > 0 && EnvironmentEdgeManager.currentTime() - start >= millis) {
              // if we waited once for compactions to finish, interrupt them, and try again
              if (LOG.isDebugEnabled()) {
                LOG.debug("Waited for " + millis
                  + " ms for compactions to finish on close. Interrupting "
                  + currentCompactions.size() + " compactions.");
              }
              for (Thread t : currentCompactions.keySet()) {
                // interrupt any current IO in the currently running compactions.
                t.interrupt();
              }
              millis = 0;
            }
{code}
In this code we interrupt all the compaction threads and set millis = 0. Control then returns
to the outer loop and calls writestate.wait(0), which waits indefinitely for a notify.
But what if, by that time, every interrupted thread has already finished and notifyAll has already been called?
{code}
finally {
  if (wasStateSet) {
    synchronized (writestate) {
      --writestate.compacting;
      if (writestate.compacting <= 0) {
        writestate.notifyAll();
      }
    }
  }
}
{code}
Would we then end up waiting forever?
I may be wrong here; please correct me if so.
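
To make the question concrete, here is a minimal standalone sketch of the scenario. It is not HBase code: `GuardedWaitSketch`, `WriteState`, `finishCompaction`, `waitForCompactions` and `closerReturnsAfterEarlyNotify` are hypothetical names. It assumes the wait(0) call sits inside `synchronized (writestate)` and inside a loop that re-checks `compacting` before each wait; the snippet above does not show the enclosing loop, so that is an assumption. Under that assumption, a notifyAll that fires before the closer starts waiting is not lost, because the closer sees `compacting == 0` and never calls wait(0). Without such a re-check, the concern above would apply.

```java
public class GuardedWaitSketch {
    /** Hypothetical stand-in for the region's write state; not the HBase class. */
    static final class WriteState {
        int compacting;
    }

    /** Mirrors the finally block above: decrement and notify while holding the monitor. */
    static void finishCompaction(WriteState ws) {
        synchronized (ws) {
            --ws.compacting;
            if (ws.compacting <= 0) {
                ws.notifyAll();
            }
        }
    }

    /**
     * Waits until no compaction is running. The counter is re-checked under the
     * monitor before every wait, so an earlier notifyAll cannot be missed.
     * Returns false if the waiting thread is interrupted.
     */
    static boolean waitForCompactions(WriteState ws) {
        synchronized (ws) {
            while (ws.compacting > 0) {
                try {
                    ws.wait(0); // 0 means wait with no timeout
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                    return false;
                }
            }
            return true;
        }
    }

    /**
     * Runs the worrying order of events: the compaction finishes and notifies
     * first, and only afterwards does the closer try to wait. Returns true if
     * the closer thread completes within timeoutMs.
     */
    static boolean closerReturnsAfterEarlyNotify(long timeoutMs) {
        WriteState ws = new WriteState();
        ws.compacting = 1;
        finishCompaction(ws); // notifyAll fires before anyone is waiting

        Thread closer = new Thread(() -> waitForCompactions(ws));
        closer.setDaemon(true); // do not keep the JVM alive if it did block
        closer.start();
        try {
            closer.join(timeoutMs);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return !closer.isAlive();
    }

    public static void main(String[] args) {
        System.out.println("closer returned: " + closerReturnsAfterEarlyNotify(2000));
    }
}
```

Whether the real close path is safe depends on whether the actual loop condition is evaluated under the same monitor before wait(0) is reached.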

> Regions in transition for a long time when CLOSE interleaves with a slow compaction
> -----------------------------------------------------------------------------------
>
>                 Key: HBASE-12457
>                 URL: https://issues.apache.org/jira/browse/HBASE-12457
>             Project: HBase
>          Issue Type: Bug
>    Affects Versions: 0.98.7
>            Reporter: Lars Hofhansl
>            Assignee: Lars Hofhansl
>             Fix For: 2.0.0, 0.98.8, 0.99.2
>
>         Attachments: 12457-combined-0.98-v2.txt, 12457-combined-0.98.txt, 12457-combined-trunk.txt, 12457-minifix.txt, 12457.interrupt-v2.txt, 12457.interrupt.txt, HBASE-12457.patch
>
>
> Under heavy load we have observed regions remaining in transition for 20 minutes when the master requests a close while a slow compaction is running.
> The pattern is always something like this:
> # RS starts a compaction
> # HM requests the region to be closed on this RS
> # Compaction is not aborted for another 20 minutes
> # The region is in transition and not usable.
> In every case I tracked down so far, the time between the requested CLOSE and abort of the compaction is almost exactly 20 minutes, which is suspicious.
> Of course part of the issue is having compactions that take over 20 minutes, but maybe we can do better here.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
