hbase-dev mailing list archives

From "stack (JIRA)" <j...@apache.org>
Subject [jira] Commented: (HBASE-1008) [performance] The replay of logs on server crash takes way too long
Date Sat, 16 May 2009 05:46:45 GMT

    [ https://issues.apache.org/jira/browse/HBASE-1008?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12710074#action_12710074
] 

stack commented on HBASE-1008:
------------------------------

J-D, is it true that we read in all the logs before we start splitting?  It looks that way
after going back to the patch.  If so, I missed that -- my fault -- and I think this is a problem.

Theoretically, we can have at most 64 logs under a regionserver, each of which has ~64MB of
edits.  That's ~4GB of edits that we need to pull in before we start processing.
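
For reference, the worst case above works out as follows. This is only a back-of-the-envelope sketch: the class and method names are made up for illustration and are not part of the patch.

```java
// Hypothetical helper, not HBase code: estimates how much edit data is
// buffered if every log is read before any splitting starts.
final class ReplayMemoryEstimate {
    /** Bytes held in memory when all logs are read up front. */
    static long worstCaseBytes(int maxLogs, long bytesPerLog) {
        return (long) maxLogs * bytesPerLog;
    }

    public static void main(String[] args) {
        long bytes = worstCaseBytes(64, 64L * 1024 * 1024); // 64 logs of ~64MB each
        System.out.println(bytes / (1024L * 1024 * 1024) + " GB"); // prints "4 GB"
    }
}
```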

Could we run the writer threads after every Nth file read instead, say every 5 or 10?
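
A minimal sketch of what I mean, assuming the split can be driven as "read N logs, write, repeat". The names here (BatchedLogSplit, splitInBatches, the writer callback) are hypothetical and do not come from the patch; the callback stands in for whatever the writer threads actually do.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

// Hypothetical illustration, not HBase code.
final class BatchedLogSplit {
    /**
     * Walks the logs in order and hands each batch of batchSize logs to the
     * writer step before reading any more. Returns the number of batches flushed.
     */
    static <T> int splitInBatches(List<T> logs, int batchSize, Consumer<List<T>> writer) {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        int flushes = 0;
        List<T> batch = new ArrayList<>(batchSize);
        for (T log : logs) {
            batch.add(log);                 // stands in for reading one log's edits
            if (batch.size() == batchSize) {
                writer.accept(batch);       // stands in for running the writer threads
                batch = new ArrayList<>(batchSize);
                flushes++;
            }
        }
        if (!batch.isEmpty()) {             // leftover logs when the count is not a multiple of N
            writer.accept(batch);
            flushes++;
        }
        return flushes;
    }
}
```

With N = 10 and ~64MB per log, roughly 640MB of edits would be held at any one time rather than ~4GB; with N = 5, roughly 320MB.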

Thanks.

> [performance] The replay of logs on server crash takes way too long
> -------------------------------------------------------------------
>
>                 Key: HBASE-1008
>                 URL: https://issues.apache.org/jira/browse/HBASE-1008
>             Project: Hadoop HBase
>          Issue Type: Improvement
>            Reporter: stack
>            Assignee: Jean-Daniel Cryans
>            Priority: Blocker
>             Fix For: 0.20.0, 0.19.3
>
>         Attachments: 1008-v2.patch, hbase-1008-3.patch, hbase-1008-v4-0.19.patch, hbase-1008-v4.patch
>
>
> Watching recovery from a crash on streamy.com where there were 1048 logs and replay is
> running at a rate of about 20 seconds each.  Meantime these regions are not online.  This is
> way too long to wait on recovery for a live site.  Marking critical.  Performance related
> so priority and in 0.20.0.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

