hbase-issues mailing list archives

From "Yu Li (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HBASE-16960) RegionServer hang when aborting
Date Tue, 01 Nov 2016 08:43:58 GMT

    [ https://issues.apache.org/jira/browse/HBASE-16960?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15624799#comment-15624799 ]

Yu Li commented on HBASE-16960:

Wow, clever method to reproduce the issue [~aoxiang]!

Skimmed the patch, overall LGTM, some minor comments:
1. Add some comments about the steps of the case, something like:
  /**
   * Reproduce the lockup that happens when there are no further syncs after an append fails,
   * leaving an isolated sync to wait forever. See HBASE-16960. If the behavior below is broken,
   * this test will time out because it is locked up.
   * <p/>
   * Steps to reproduce:<br/>
   * 1. Trigger server abort through dodgyWAL1<br/>
   * 2. Add a {@link DummyWALActionsListener} to dodgyWAL2 to make the ringbuffer event handler
   * sleep for a while, thus keeping {@code endOfBatch} false<br/>
   * 3. Publish a sync, then an append that throws an exception, and check whether the sync
   * returns
   */
  @Test(timeout = 20000)
  public void testLockup16960() throws IOException {

2. Add some comments around {{DummyWALActionsListener}} for better understanding, like:
    // Add a listener to force ringbuffer event handler sleep for a while
    dodgyWAL2.registerWALActionsListener(new DummyWALActionsListener());
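
For readers without the patch at hand, step 3 above can be pictured with a small, self-contained toy model. This is only a sketch under stated assumptions: {{ToyHandler}}, {{publishSync}}, {{publishFailingAppend}} and the {{failPendingSyncsOnAppendError}} switch are hypothetical names, not HBase classes or the actual change in the patch. It assumes the lockup comes from a queued sync that nothing completes once the append has failed, and it uses a timeout only so the demo itself cannot hang.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

/** Toy model only: none of these names are HBase classes. */
public class IsolatedSyncSketch {

  /** Hypothetical stand-in for a WAL event handler that queues syncs. */
  static final class ToyHandler {
    private final List<CompletableFuture<Long>> pendingSyncs = new ArrayList<>();
    private final boolean failPendingSyncsOnAppendError;

    ToyHandler(boolean failPendingSyncsOnAppendError) {
      this.failPendingSyncsOnAppendError = failPendingSyncsOnAppendError;
    }

    /** Queue a sync; in this model only a later event can complete it. */
    CompletableFuture<Long> publishSync() {
      CompletableFuture<Long> future = new CompletableFuture<>();
      pendingSyncs.add(future);
      return future;
    }

    /** Simulate an append that always fails, like a deliberately faulty WAL. */
    void publishFailingAppend() {
      IOException cause = new IOException("simulated append failure");
      if (failPendingSyncsOnAppendError) {
        // Assumed remedy: release every waiting sync with the failure.
        for (CompletableFuture<Long> future : pendingSyncs) {
          future.completeExceptionally(cause);
        }
        pendingSyncs.clear();
      }
      // Otherwise the queued syncs stay pending and nothing completes them.
    }
  }

  /** Returns true if the sync came back (with a value or an error) in time. */
  public static boolean syncReturns(boolean failPendingSyncsOnAppendError) {
    ToyHandler handler = new ToyHandler(failPendingSyncsOnAppendError);
    CompletableFuture<Long> sync = handler.publishSync();
    handler.publishFailingAppend();
    try {
      sync.get(200, TimeUnit.MILLISECONDS);
      return true;
    } catch (ExecutionException e) {
      return true; // the sync returned, carrying the append failure
    } catch (TimeoutException e) {
      return false; // isolated sync: it would wait forever without the timeout
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
      return false;
    }
  }

  public static void main(String[] args) {
    System.out.println("sync returns when left isolated: " + syncReturns(false));
    System.out.println("sync returns when failed on append error: " + syncReturns(true));
  }
}
```

In this model the first call reports {{false}} and the second reports {{true}}, which is the same distinction the {{@Test(timeout = 20000)}} guard is meant to catch in the real test.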

Good job!

> RegionServer hang when aborting
> -------------------------------
>                 Key: HBASE-16960
>                 URL: https://issues.apache.org/jira/browse/HBASE-16960
>             Project: HBase
>          Issue Type: Bug
>            Reporter: binlijin
>            Assignee: binlijin
>         Attachments: 16960.ut.missing.final.piece.txt, HBASE-16960.patch, HBASE-16960_master_v2.patch,
>                      HBASE-16960_master_v3.patch, RingBufferEventHandler.png, RingBufferEventHandler_exception.png,
>                      SyncFuture.png, SyncFuture_exception.png, rs1081.jstack
> We have seen the regionserver hang while aborting several times. This takes all regions on that
> regionserver out of service, and all affected applications then stop working.

