hbase-issues mailing list archives

From "chunhui shen (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HBASE-6134) Improvement for split-worker to speed up distributed-split-log
Date Tue, 05 Jun 2012 01:37:23 GMT

    [ https://issues.apache.org/jira/browse/HBASE-6134?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13289067#comment-13289067 ]

chunhui shen commented on HBASE-6134:
-------------------------------------

On the review board, Prakash Khemani said:
bq."But the old code had a serious drawback – it would read the entire log file in memory
before writing it out. Also the old code assumed that multiple log files were being split
at the same time, but that is no longer true with distributed log splitting.
Whatever approach we take, I don’t think we should re-introduce buffering of the entire
log file in memory."

Since we cap the buffer size at 128MB, I am not sure what adverse effect using the buffer would have.
In any case, log splitting happens infrequently.

What do others think?
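
To make the distinction under discussion concrete, here is a minimal sketch of a size-capped buffer, as opposed to holding the whole log file in memory. The class name BoundedEditBuffer, its methods, and the flush-when-full policy are all hypothetical and chosen for illustration; this is not the code in the attached patches or in HBase itself.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch: holds log entries in memory only up to a byte cap. */
class BoundedEditBuffer {
    private final long maxBytes;
    private final List<byte[]> entries = new ArrayList<>();
    private long heldBytes = 0;
    private long peakBytes = 0;
    private int flushCount = 0;

    BoundedEditBuffer(long maxBytes) {
        this.maxBytes = maxBytes;
    }

    /** Adds one entry, flushing first if it would push the buffer past the cap. */
    void add(byte[] entry) {
        if (!entries.isEmpty() && heldBytes + entry.length > maxBytes) {
            flush();
        }
        // Note: a single entry larger than maxBytes is still accepted on its own.
        entries.add(entry);
        heldBytes += entry.length;
        peakBytes = Math.max(peakBytes, heldBytes);
    }

    /** Empties the buffer; a real splitter would hand the entries to writers here. */
    void flush() {
        entries.clear();
        heldBytes = 0;
        flushCount++;
    }

    long peakBytes() { return peakBytes; }

    int flushCount() { return flushCount; }
}
```

With a cap like this, memory use is bounded by the cap (128MB in the setting mentioned above) plus at most one oversized entry, regardless of how large the log file is.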


                
> Improvement for split-worker to speed up distributed-split-log
> --------------------------------------------------------------
>
>                 Key: HBASE-6134
>                 URL: https://issues.apache.org/jira/browse/HBASE-6134
>             Project: HBase
>          Issue Type: Improvement
>          Components: wal
>            Reporter: chunhui shen
>            Assignee: chunhui shen
>            Priority: Critical
>             Fix For: 0.96.0
>
>         Attachments: HBASE-6134.patch, HBASE-6134v2.patch, HBASE-6134v3.patch
>
>
> First, we ran a test comparing local-master-splitting and distributed-log-splitting.
> Environment: 34 hlog files, 5 regionservers (after killing one, only 4 regionservers do the
> splitting work), 400 regions in one hlog file
> local-master-split: 60s+
> distributed-log-splitting: 165s+
> In fact, in our production environment, distributed-log-splitting also took 60s with
> 30 regionservers for 34 hlog files (the regionservers may be under high load).
> We found that a split-worker took about 20s to split one log file
> (30ms~50ms per writer.close(); 10ms per writer creation).
> I think we could improve this by
> parallelizing the creation and closing of writers in threads.
> In the patch, the logic for distributed-log-splitting is changed to match local-master-splitting,
> and the closes are parallelized in threads.
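
The reported numbers are roughly consistent with sequential writer handling: assuming one writer per region, 400 writers at 30ms~50ms per close is 12s~20s, plus about 4s to create them at 10ms each, which is close to the 20s observed per log file. The sketch below shows one way the close step could be spread over a thread pool. ParallelCloser, closeAll, and the thread-count parameter are hypothetical names for illustration; this is not the code from the attached patches.

```java
import java.io.Closeable;
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

/** Hypothetical sketch: closes many writers concurrently instead of one by one. */
class ParallelCloser {

    /**
     * Submits one close task per writer to a fixed-size pool and waits for all of them.
     * Returns the number of writers closed; if any close failed, the first failure is
     * rethrown after every close has been attempted.
     */
    static int closeAll(List<? extends Closeable> writers, int threads)
            throws IOException, InterruptedException {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            List<Future<?>> pending = new ArrayList<>();
            for (Closeable writer : writers) {
                // The lambda returns a value, so it is submitted as a Callable,
                // which is allowed to throw the IOException from close().
                pending.add(pool.submit(() -> {
                    writer.close();
                    return null;
                }));
            }
            IOException firstFailure = null;
            int closed = 0;
            for (Future<?> result : pending) {
                try {
                    result.get();
                    closed++;
                } catch (ExecutionException e) {
                    if (firstFailure == null) {
                        Throwable cause = e.getCause();
                        firstFailure = (cause instanceof IOException)
                                ? (IOException) cause
                                : new IOException(cause);
                    }
                }
            }
            if (firstFailure != null) {
                throw firstFailure;
            }
            return closed;
        } finally {
            pool.shutdown();
        }
    }
}
```

If each close really is dominated by waiting rather than CPU work, closing with N threads should bring the 12s~20s close phase down to roughly 1/N of that, though the actual gain would need to be measured on a real cluster.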


       
