hbase-issues mailing list archives

From "Jeffrey Zhong (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HBASE-7006) [MTTR] Improve Region Server Recovery Time - Distributed Log Replay
Date Fri, 17 May 2013 07:35:18 GMT

    [ https://issues.apache.org/jira/browse/HBASE-7006?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13660436#comment-13660436 ]

Jeffrey Zhong commented on HBASE-7006:
--------------------------------------

[~zjushch] 
{quote}
I think we should store the last flushed sequence id for each store of region, otherwise would
cause the problem of data correctness when replaying logs.
{quote}
You're right. Yesterday [~enis] showed me HBASE-6059, and I realized that we need to
do the above as you suggested. I had considered this before and thought that storing one id
per region would be enough, which turns out not to be true. I'll fix that as a follow-up
issue. Thanks for mentioning this!
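
To illustrate why one id per region may not be enough, here is a minimal sketch of the replay filter. It is not HBase code: the names WalEdit, edits_to_replay_per_store and edits_to_replay_per_region are hypothetical, and it assumes each store (column family) can have persisted edits up to a different sequence id.

```python
from typing import Dict, List, NamedTuple


class WalEdit(NamedTuple):
    """Hypothetical stand-in for one WAL entry."""
    seq_id: int
    store: str  # column family the edit belongs to
    payload: str


def edits_to_replay_per_store(
    edits: List[WalEdit], last_flushed_by_store: Dict[str, int]
) -> List[WalEdit]:
    # Replay an edit only if its own store has not persisted it yet.
    # A store with no recorded id is treated as never flushed.
    return [e for e in edits if e.seq_id > last_flushed_by_store.get(e.store, -1)]


def edits_to_replay_per_region(
    edits: List[WalEdit], last_flushed_for_region: int
) -> List[WalEdit]:
    # A single id for the whole region cannot tell the stores apart.
    return [e for e in edits if e.seq_id > last_flushed_for_region]


edits = [WalEdit(5, "cf1", "a"), WalEdit(6, "cf2", "b"), WalEdit(9, "cf1", "c")]
flushed = {"cf1": 8, "cf2": 4}  # assumed per-store state at crash time

per_store = edits_to_replay_per_store(edits, flushed)
with_max = edits_to_replay_per_region(edits, max(flushed.values()))
with_min = edits_to_replay_per_region(edits, min(flushed.values()))

print([e.seq_id for e in per_store])  # edits each store is actually missing
print([e.seq_id for e in with_max])   # drops edit 6, which cf2 never persisted
print([e.seq_id for e in with_min])   # re-applies edit 5, which cf1 already has
```

In this toy case neither choice of a single region-wide id gives the same result as the per-store filter: the larger id skips an edit that one store still needs, and the smaller id replays an edit that another store already holds.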
                
> [MTTR] Improve Region Server Recovery Time - Distributed Log Replay
> -------------------------------------------------------------------
>
>                 Key: HBASE-7006
>                 URL: https://issues.apache.org/jira/browse/HBASE-7006
>             Project: HBase
>          Issue Type: New Feature
>          Components: MTTR
>            Reporter: stack
>            Assignee: Jeffrey Zhong
>            Priority: Critical
>             Fix For: 0.98.0, 0.95.1
>
>         Attachments: hbase-7006-addendum.patch, hbase-7006-combined.patch, hbase-7006-combined-v1.patch,
hbase-7006-combined-v4.patch, hbase-7006-combined-v5.patch, hbase-7006-combined-v6.patch,
hbase-7006-combined-v7.patch, hbase-7006-combined-v8.patch, hbase-7006-combined-v9.patch,
LogSplitting Comparison.pdf, ProposaltoimprovelogsplittingprocessregardingtoHBASE-7006-v2.pdf
>
>
> Just saw an interesting issue where a cluster went down hard and 30 nodes had 1700 WALs
to replay. Replay took almost an hour. It looks like it could run faster; much of the
time is spent zk'ing and nn'ing.
> Putting in 0.96 so it gets a look at least.  Can always punt.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
