hbase-issues mailing list archives

From "Yu Li (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HBASE-16960) RegionServer hang when aborting
Date Tue, 01 Nov 2016 08:43:58 GMT

    [ https://issues.apache.org/jira/browse/HBASE-16960?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15624799#comment-15624799 ]

Yu Li commented on HBASE-16960:
-------------------------------

Wow, clever method to reproduce the issue [~aoxiang]!

Skimmed the patch, overall LGTM, some minor comments:
1. Add some comments about the steps of the case, something like:
{code}
  /**
   * Reproduce the lockup that happens when there are no further syncs after an append fails,
   * leaving an isolated sync that waits forever. See HBASE-16960. If the below is broken, we
   * will see this test time out because it is locked up.
   * <p/>
   * Steps to reproduce:<br/>
   * 1. Trigger server abort through dodgyWAL1<br/>
   * 2. Add a {@link DummyWALActionsListener} to dodgyWAL2 to make the ringbuffer event handler
   * thread sleep for a while, thus keeping {@code endOfBatch} false<br/>
   * 3. Publish a sync then an append which will throw an exception, and check whether the sync
   * can return
   */
  @Test(timeout = 20000)
  public void testLockup16960() throws IOException {
{code}

2. Add some comments around {{DummyWALActionsListener}} for better understanding, like
{code}
    // Add a listener to force the ringbuffer event handler to sleep for a while
    dodgyWAL2.registerWALActionsListener(new DummyWALActionsListener());
{code}
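
For readers without the patch at hand, the idea behind such a listener can be sketched in isolation. This is only an illustration and not the code from the patch: {{AppendListener}}, {{SleepingListener}} and {{runHandlerOnce}} are hypothetical names invented here, and nothing below uses the HBase WAL API. It only shows how a callback that sleeps holds up the thread that invokes it.

```java
import java.util.List;
import java.util.concurrent.TimeUnit;

public class SleepingListenerSketch {

  /** Hypothetical stand-in for a listener callback; NOT the HBase WALActionsListener. */
  public interface AppendListener {
    void beforeAppend() throws InterruptedException;
  }

  /** Sleeps on every callback, so the calling (handler) thread is held up. */
  public static class SleepingListener implements AppendListener {
    private final long sleepMillis;

    public SleepingListener(long sleepMillis) {
      this.sleepMillis = sleepMillis;
    }

    @Override
    public void beforeAppend() throws InterruptedException {
      Thread.sleep(sleepMillis);
    }
  }

  /**
   * Simulates a handler thread processing one event: it invokes every listener,
   * then finishes. Returns the elapsed wall-clock time in milliseconds.
   */
  public static long runHandlerOnce(List<AppendListener> listeners)
      throws InterruptedException {
    long start = System.nanoTime();
    Thread handler = new Thread(() -> {
      try {
        for (AppendListener listener : listeners) {
          listener.beforeAppend();
        }
      } catch (InterruptedException e) {
        // Restore the interrupt flag and let the handler thread exit.
        Thread.currentThread().interrupt();
      }
    });
    handler.start();
    handler.join();
    return TimeUnit.NANOSECONDS.toMillis(System.nanoTime() - start);
  }

  public static void main(String[] args) throws InterruptedException {
    long plain = runHandlerOnce(List.<AppendListener>of());
    long delayed = runHandlerOnce(List.<AppendListener>of(new SleepingListener(200)));
    System.out.println("without listener: " + plain + " ms");
    System.out.println("with sleeping listener: " + delayed + " ms");
  }
}
```

In the test under review, the delay presumably serves to let the sync and the failing append queue up behind the handler, so that they are processed as part of one batch with {{endOfBatch}} still false.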

Good job!

> RegionServer hang when aborting
> -------------------------------
>
>                 Key: HBASE-16960
>                 URL: https://issues.apache.org/jira/browse/HBASE-16960
>             Project: HBase
>          Issue Type: Bug
>            Reporter: binlijin
>            Assignee: binlijin
>         Attachments: 16960.ut.missing.final.piece.txt, HBASE-16960.patch, HBASE-16960_master_v2.patch,
>                      HBASE-16960_master_v3.patch, RingBufferEventHandler.png, RingBufferEventHandler_exception.png,
>                      SyncFuture.png, SyncFuture_exception.png, rs1081.jstack
>
>
> We have seen the regionserver hang while aborting several times. This takes all regions on this
> regionserver out of service, and all affected applications then stop working.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
