hbase-issues mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "stack (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HBASE-6134) Improvement for split-worker to speed up distributed-split-log
Date Tue, 05 Jun 2012 04:15:23 GMT

    [ https://issues.apache.org/jira/browse/HBASE-6134?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13289136#comment-13289136
] 

stack commented on HBASE-6134:
------------------------------

@Chunhui You are saying we only buffer 128M worth of the WAL, not the entire log file as Prakash
says (You sure?  Do we not read it all up into memory to bucket it by region and then after
that is done, then we start writing out the recovered.edits file?  I could be wrong -- going
by memory).  If we are only using a 128M buffer flushing to the recovered.edits file as we
go, then that'd be fine I'd say.   Are there other improvements to be had given the circumstance
is now one reader only writing multiple files?  Good on you.
                
> Improvement for split-worker to speed up distributed-split-log
> --------------------------------------------------------------
>
>                 Key: HBASE-6134
>                 URL: https://issues.apache.org/jira/browse/HBASE-6134
>             Project: HBase
>          Issue Type: Improvement
>          Components: wal
>            Reporter: chunhui shen
>            Assignee: chunhui shen
>            Priority: Critical
>             Fix For: 0.96.0
>
>         Attachments: HBASE-6134.patch, HBASE-6134v2.patch, HBASE-6134v3.patch
>
>
> First,we do the test between local-master-splitting and distributed-log-splitting
> Environment:34 hlog files, 5 regionservers,(after kill one, only 4 rs do ths splitting
work), 400 regions in one hlog file
> local-master-split:60s+
> distributed-log-splitting:165s+
> In fact, in our production environment, distributed-log-splitting also took 60s with
30 regionservers for 34 hlog files (regionserver may be in high load)
> We found split-worker split one log file took about 20s
> (30ms~50ms per writer.close(); 10ms per create writers )
> I think we could do the improvement for this:
> Parallelizing the create and close writers in threads
> In the patch, change the logic for  distributed-log-splitting same as the local-master-splitting
and parallelizing the close in threads.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

       

Mime
View raw message