hbase-issues mailing list archives

From "stack (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HBASE-19457) Debugging flaky TestTruncateTableProcedure#testRecoveryAndDoubleExecutionPreserveSplits
Date Sat, 16 Dec 2017 00:09:22 GMT

    [ https://issues.apache.org/jira/browse/HBASE-19457?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16293435#comment-16293435 ]

stack commented on HBASE-19457:
-------------------------------

I can't find a case of three tiers of procedures. We would need to try it, but I don't see why it wouldn't work.

Yeah, it's turning up some greens now where it used to be solid red: https://builds.apache.org/view/H-L/view/HBase/job/HBase-Find-Flaky-Tests/lastSuccessfulBuild/artifact/dashboard.html

The new failure types -- 2 out of 5 -- seem different. As you say, let's keep an eye on it.

Thanks for the new JIRAs. Yeah, let's sort out state in meta.

> Debugging flaky TestTruncateTableProcedure#testRecoveryAndDoubleExecutionPreserveSplits
> ---------------------------------------------------------------------------------------
>
>                 Key: HBASE-19457
>                 URL: https://issues.apache.org/jira/browse/HBASE-19457
>             Project: HBase
>          Issue Type: Bug
>            Reporter: Appy
>            Assignee: Appy
>         Attachments: HBASE-19457.master.001.patch, patch1, test-output.txt
>
>
> Trying to explain the bug in a more general way that does not require an understanding of ProcedureV2.
> Truncating table operation:
> ....
> delete region states from meta
> delete table state from meta
> ....
> add new regions to meta with state null.
> ....crash
> ....recovery: TableStateManager treats a table with null state as ENABLED. AM treats regions with null state as offline. Combined result: AM starts assigning the new regions from the incomplete truncate operation.
> Fix: Mark the table as disabled instead of deleting its state.
> ----
> *patch1*
> Just added some logging to help with debugging:
> - 60s was too short a timeout, so increased it
> - Added some useful log statements
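
The failure sequence in the description above can be sketched as a small toy model. This is only an illustration of the reasoning, not HBase code: the class TruncateRecoverySketch, the Meta holder, and the methods effectiveTableState, regionsToAssignOnRecovery, and crashMidTruncate are all hypothetical names, and the table and region names are made up. The only behavior taken from the description is that a null table state is treated as ENABLED and a null region state is treated as offline.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Toy model of the recovery decision described in the issue; not HBase code. */
public class TruncateRecoverySketch {
    enum TableState { ENABLED, DISABLED }

    /** Hypothetical stand-in for the meta table. */
    static final class Meta {
        // Missing entry models "table state deleted from meta" (null state).
        final Map<String, TableState> tableStates = new LinkedHashMap<>();
        // Region name -> state string; a null value models "state null".
        final Map<String, String> regionStates = new LinkedHashMap<>();
    }

    /** As described: a table with null state is treated as ENABLED. */
    static TableState effectiveTableState(Meta meta, String table) {
        TableState state = meta.tableStates.get(table);
        return state == null ? TableState.ENABLED : state;
    }

    /** As described: regions with null state count as offline and get assigned
     *  when the table is considered ENABLED. */
    static List<String> regionsToAssignOnRecovery(Meta meta, String table) {
        List<String> toAssign = new ArrayList<>();
        if (effectiveTableState(meta, table) != TableState.ENABLED) {
            return toAssign; // disabled table: nothing is assigned
        }
        for (Map.Entry<String, String> e : meta.regionStates.entrySet()) {
            if (e.getValue() == null) {
                toAssign.add(e.getKey());
            }
        }
        return toAssign;
    }

    /** Replays the truncate steps up to the crash point. */
    static Meta crashMidTruncate(boolean markDisabledInsteadOfDelete) {
        Meta meta = new Meta();
        meta.tableStates.put("t", TableState.DISABLED);
        meta.regionStates.put("old-region-1", "CLOSED");

        meta.regionStates.clear();                        // delete region states from meta
        if (markDisabledInsteadOfDelete) {
            meta.tableStates.put("t", TableState.DISABLED); // proposed fix
        } else {
            meta.tableStates.remove("t");                 // delete table state from meta
        }
        meta.regionStates.put("new-region-1", null);      // add new regions with state null
        meta.regionStates.put("new-region-2", null);
        return meta;                                      // crash happens here
    }

    public static void main(String[] args) {
        System.out.println("without fix: "
            + regionsToAssignOnRecovery(crashMidTruncate(false), "t"));
        System.out.println("with fix:    "
            + regionsToAssignOnRecovery(crashMidTruncate(true), "t"));
    }
}
```

In this model, deleting the table state lets recovery pick up the half-created regions, while keeping the table marked DISABLED leaves them alone until the truncate resumes.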



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
