hbase-issues mailing list archives

From "Anoop Sam John (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HBASE-6134) Improvement for split-worker to speed up distributed-split-log
Date Tue, 05 Jun 2012 05:13:23 GMT

    [ https://issues.apache.org/jira/browse/HBASE-6134?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13289155#comment-13289155 ]

Anoop Sam John commented on HBASE-6134:
---------------------------------------

From the patch and the current HLogSplitter code, I can see that only a single 128M (default
value) buffer is used. Data is read from the WAL into this buffer, and parallel writers take
data from the buffer and write it to the recovered.edits file of the corresponding region.
Reading into the buffer and writing by the writers happen in parallel, in producer-consumer
fashion.

This is how the split proceeds while an RS is splitting one file. If the same RS splits another
file, another HLogSplitter instance is created for it, and that instance has its own buffer.
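
To make the producer-consumer flow above concrete, here is a minimal sketch. It is not the HLogSplitter code: Edit, SplitSketch and split() are hypothetical names, the buffer is bounded by entry count rather than by 128M of bytes, and in-memory lists stand in for the recovered.edits files.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical stand-in for one WAL entry: the region it belongs to plus a payload.
final class Edit {
    final String region;
    final String payload;
    Edit(String region, String payload) { this.region = region; this.payload = payload; }
}

final class SplitSketch {
    // Sentinel that tells a writer thread the reader has finished.
    private static final Edit END = new Edit("", "");

    // One reader (the calling thread) fills a bounded buffer; numWriters threads drain it
    // and append each edit to the list of its region (stand-in for recovered.edits).
    // Assumption: ordering of edits within a region is NOT preserved by this sketch.
    static Map<String, List<String>> split(List<Edit> wal, int bufferCapacity, int numWriters) {
        BlockingQueue<Edit> buffer = new ArrayBlockingQueue<>(bufferCapacity);
        Map<String, List<String>> recovered = new ConcurrentHashMap<>();
        List<Thread> writers = new ArrayList<>();
        for (int i = 0; i < numWriters; i++) {
            Thread t = new Thread(() -> {
                try {
                    while (true) {
                        Edit e = buffer.take();          // blocks while the buffer is empty
                        if (e == END) return;
                        List<String> out =
                            recovered.computeIfAbsent(e.region, r -> new ArrayList<>());
                        synchronized (out) { out.add(e.payload); }
                    }
                } catch (InterruptedException ie) {
                    Thread.currentThread().interrupt();
                }
            });
            t.start();
            writers.add(t);
        }
        try {
            for (Edit e : wal) buffer.put(e);            // blocks while the buffer is full
            for (int i = 0; i < numWriters; i++) buffer.put(END);
            for (Thread t : writers) t.join();
        } catch (InterruptedException ie) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(ie);
        }
        return recovered;
    }
}
```

Each call to split() allocates its own buffer, which mirrors the point that a second file being split on the same RS gets a second HLogSplitter instance with a separate buffer.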

                
> Improvement for split-worker to speed up distributed-split-log
> --------------------------------------------------------------
>
>                 Key: HBASE-6134
>                 URL: https://issues.apache.org/jira/browse/HBASE-6134
>             Project: HBase
>          Issue Type: Improvement
>          Components: wal
>            Reporter: chunhui shen
>            Assignee: chunhui shen
>            Priority: Critical
>             Fix For: 0.96.0
>
>         Attachments: HBASE-6134.patch, HBASE-6134v2.patch, HBASE-6134v3.patch
>
>
> First, we ran a test comparing local-master-splitting and distributed-log-splitting.
> Environment: 34 hlog files, 5 regionservers (after killing one, only 4 RS do the splitting
> work), 400 regions in one hlog file
> local-master-split: 60s+
> distributed-log-splitting: 165s+
> In fact, in our production environment, distributed-log-splitting also took 60s with
> 30 regionservers for 34 hlog files (the regionservers may be under high load).
> We found that the split-worker took about 20s to split one log file
> (30ms~50ms per writer.close(); 10ms per writer creation).
> I think we could improve this by:
> Parallelizing the creation and closing of writers in threads
> In the patch, the logic for distributed-log-splitting is changed to be the same as
> local-master-splitting, and the close is parallelized in threads.
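
The figures in the description are consistent with serial writer handling: assuming one writer per region, 400 regions at roughly 40ms~60ms each (10ms create plus 30ms~50ms close) come to about 16s~24s per log file. The sketch below shows one way to close writers in a thread pool. It is not the code from the attached patches; ParallelCloser and closeAll() are hypothetical names, and plain java.io.Closeable objects stand in for the recovered.edits writers.

```java
import java.io.Closeable;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

final class ParallelCloser {
    // Closes every writer on a fixed-size pool and returns how many close() calls failed.
    // Assumption: the writers are independent, so they can be closed in any order.
    static int closeAll(List<? extends Closeable> writers, int numThreads) {
        ExecutorService pool = Executors.newFixedThreadPool(numThreads);
        try {
            List<Future<?>> pending = new ArrayList<>();
            for (Closeable w : writers) {
                // Submitted as a Callable so that the IOException from close() is allowed.
                pending.add(pool.submit(() -> { w.close(); return null; }));
            }
            int failures = 0;
            for (Future<?> f : pending) {
                try {
                    f.get();                 // wait for this close to finish
                } catch (ExecutionException e) {
                    failures++;              // close() threw; keep waiting for the others
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                    throw new IllegalStateException(e);
                }
            }
            return failures;
        } finally {
            pool.shutdown();
        }
    }
}
```

With N threads the wall-clock cost of the close phase drops toward 1/N of the serial cost, as long as the underlying filesystem can absorb the concurrent close calls.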


       
