hbase-issues mailing list archives

From "Yu Li (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HBASE-16960) RegionServer hang when aborting
Date Thu, 03 Nov 2016 20:04:58 GMT

    [ https://issues.apache.org/jira/browse/HBASE-16960?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15634074#comment-15634074 ]

Yu Li commented on HBASE-16960:
-------------------------------

Thanks for checking this [~mantonov]

bq. How bad is this for you? How frequently you see that Yu Li and binlijin?
It's really bad, as mentioned in the JIRA description:
{noformat}
We see regionserver hang when aborting several times and cause all regions on this regionserver
out of service and then all affected applications stop works.
{noformat}

This does not happen that frequently, but it does happen: in our case, before we addressed
it, we saw such a hang about twice a week.

bq. Any follow-ups on that?...
Not yet, but will do. This is another topic and will be big I assume...

bq. Thinking to move it to 1.3.1, where I'd bring those changed and bake in. Thoughts (that
depends on how bad this issue is) ?
I'd suggest having it in 1.3.0, since it causes a severe problem and occurs with some frequency.
The issue is already reproduced and covered by the UT case here, so it's relatively safe. What's
more, we have had this fix running online here at Alibaba under really high workload for
about one week with no regression observed, and I guess this makes it safer to bring
in (Smile)? Thanks.

> RegionServer hang when aborting
> -------------------------------
>
>                 Key: HBASE-16960
>                 URL: https://issues.apache.org/jira/browse/HBASE-16960
>             Project: HBase
>          Issue Type: Bug
>            Reporter: binlijin
>            Assignee: binlijin
>         Attachments: 16960.ut.missing.final.piece.txt, HBASE-16960.branch-1.v1.patch,
HBASE-16960.patch, HBASE-16960_master_v2.patch, HBASE-16960_master_v3.patch, HBASE-16960_master_v4.patch,
RingBufferEventHandler.png, RingBufferEventHandler_exception.png, SyncFuture.png, SyncFuture_exception.png,
rs1081.jstack
>
>
> We see regionserver hang when aborting several times and cause all regions on this regionserver
out of service and then all affected applications stop works.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
