hbase-issues mailing list archives

From "Jingyun Tian (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HBASE-19358) Improve the stability of splitting log when do fail over
Date Tue, 28 Nov 2017 12:00:00 GMT

    [ https://issues.apache.org/jira/browse/HBASE-19358?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16268630#comment-16268630 ]

Jingyun Tian commented on HBASE-19358:

[~carp84] Based on my recent tests, log splitting performs well when the number of writer
threads is 50 and max splitters is 5. However, I'm not sure how many threads are needed to
make full use of HDFS capacity. I will update the performance numbers after all my tests are done.
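
As a rough illustration of what these settings imply, the sketch below applies the bound
given in the issue description quoted further down to the two values from the test. The
class and method names are made up for this example and are not HBase APIs, and the region
count is a hypothetical figure, not a measurement.

```java
// Illustrative arithmetic only; nothing here calls into HBase.
class SplitStreamBound {

    // Proposed design: each splitter writes through a fixed pool of writer threads.
    static int newBound(int maxSplitters, int writerThreads) {
        return maxSplitters * writerThreads;
    }

    // Previous design: each splitter opens one writer per region found in the log.
    static int oldBound(int maxSplitters, int regionsInLog) {
        return maxSplitters * regionsInLog;
    }

    public static void main(String[] args) {
        int maxSplitters = 5;    // hbase.regionserver.wal.max.splitters (value from the test)
        int writerThreads = 50;  // hbase.regionserver.hlog.splitlog.writer.threads (value from the test)
        int regionsInLog = 1000; // hypothetical number of regions with edits in one log

        System.out.println("new upper bound: " + newBound(maxSplitters, writerThreads));
        System.out.println("old upper bound: " + oldBound(maxSplitters, regionsInLog));
    }
}
```

With the tested values the proposed design opens at most 250 streams per region server at
once, regardless of how many regions the log contains.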

> Improve the stability of splitting log when do fail over
> --------------------------------------------------------
>                 Key: HBASE-19358
>                 URL: https://issues.apache.org/jira/browse/HBASE-19358
>             Project: HBase
>          Issue Type: Improvement
>          Components: MTTR
>    Affects Versions: 0.98.24
>            Reporter: Jingyun Tian
>         Attachments: newLogic.jpg, previousLogic.jpg
> The way we split logs now is shown in the following figure:
> !https://issues.apache.org/jira/secure/attachment/12899558/previousLogic.jpg!
> The problem is that the OutputSink writes the recovered edits while the log is being split,
> which means it creates one WriterAndPath for each region. If the cluster is small and the
> number of regions per region server is large, too many HDFS streams are created at the same
> time. Splitting is then prone to failure, since each datanode needs to handle too many streams.
> Thus I came up with a new way to split the log.
> !https://issues.apache.org/jira/secure/attachment/12899557/newLogic.jpg!
> We cache the recovered edits until they exceed the memory limit we set or we reach the end
> of the log; then a thread pool does the rest: writing the edits to files and moving the
> files to their destination.
> The biggest benefit is that we can control the number of streams created during splitting:
> it will not exceed *_hbase.regionserver.wal.max.splitters * hbase.regionserver.hlog.splitlog.writer.threads_*,
> whereas previously it was *_hbase.regionserver.wal.max.splitters * the number of regions the hlog contains_*.
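
The buffering scheme described above can be sketched roughly as follows. This is not the
code from the patch: the class BufferedSplitSketch and all of its members are hypothetical,
the "write" is a counter rather than an HDFS stream, and the sketch assumes a single reader
thread calling append(). It only shows how a fixed-size pool caps the number of writers
that are active at once.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical sketch: buffer edits per region, flush them through a bounded pool.
class BufferedSplitSketch {
    private final Map<String, List<byte[]>> buffered = new HashMap<>();
    private final List<Future<?>> pending = new ArrayList<>();
    private final ExecutorService writers;
    private final long maxBufferedBytes;
    private long bufferedBytes = 0;

    final AtomicInteger openStreams = new AtomicInteger();
    final AtomicInteger peakStreams = new AtomicInteger();
    final AtomicInteger editsWritten = new AtomicInteger();

    BufferedSplitSketch(int writerThreads, long maxBufferedBytes) {
        // The pool size is the cap on concurrently open "streams" for this splitter.
        this.writers = Executors.newFixedThreadPool(writerThreads);
        this.maxBufferedBytes = maxBufferedBytes;
    }

    // Called by the single reader thread for every edit read from the log.
    void append(String region, byte[] edit) {
        buffered.computeIfAbsent(region, r -> new ArrayList<>()).add(edit);
        bufferedBytes += edit.length;
        if (bufferedBytes >= maxBufferedBytes) {
            flush(); // memory limit reached: hand the buffered edits to the pool
        }
    }

    // Submits one write task per buffered region, then empties the buffer.
    void flush() {
        for (Map.Entry<String, List<byte[]>> entry : buffered.entrySet()) {
            final String region = entry.getKey();
            final List<byte[]> edits = entry.getValue();
            pending.add(writers.submit(() -> writeRegion(region, edits)));
        }
        buffered.clear();
        bufferedBytes = 0;
    }

    // Stand-in for "open a stream, write the edits, move the file to its destination".
    private void writeRegion(String region, List<byte[]> edits) {
        int open = openStreams.incrementAndGet();
        peakStreams.accumulateAndGet(open, Math::max);
        try {
            editsWritten.addAndGet(edits.size());
        } finally {
            openStreams.decrementAndGet();
        }
    }

    // End of log: flush what is left and wait for all write tasks to finish.
    void close() throws Exception {
        flush();
        for (Future<?> task : pending) {
            task.get();
        }
        writers.shutdown();
    }

    public static void main(String[] args) throws Exception {
        BufferedSplitSketch splitter = new BufferedSplitSketch(4, 1024);
        for (int i = 0; i < 1000; i++) {
            splitter.append("region-" + (i % 100), new byte[16]);
        }
        splitter.close();
        System.out.println("edits written: " + splitter.editsWritten.get());
        System.out.println("peak streams within pool size: " + (splitter.peakStreams.get() <= 4));
    }
}
```

One consequence visible in the sketch is that a region whose edits are flushed more than
once ends up with more than one write task, so a real implementation has to decide how
those outputs are named or merged. The sketch leaves that open.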

This message was sent by Atlassian JIRA