hbase-issues mailing list archives

From "HBase Review Board (JIRA)" <j...@apache.org>
Subject [jira] Commented: (HBASE-3308) SplitTransaction.splitStoreFiles slows splits a lot
Date Wed, 08 Dec 2010 01:04:01 GMT

    [ https://issues.apache.org/jira/browse/HBASE-3308?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12969110#action_12969110 ]

HBase Review Board commented on HBASE-3308:
-------------------------------------------

Message from: stack@duboce.net

-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
http://review.cloudera.org/r/1273/#review2043
-----------------------------------------------------------

Ship it!


+1  Minor comment below.


/branches/0.90/src/main/java/org/apache/hadoop/hbase/regionserver/SplitTransaction.java
<http://review.cloudera.org/r/1273/#comment6447>

    Why not have an upper bound?  With 100 files, that's 100 threads doing FS operations.  I bet
that with an upper bound of 10 on the ExecutorService, it would complete faster than with an
unbounded ExecutorService.


- stack
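
To illustrate the suggestion above, here is a minimal sketch of a bounded pool, not the code under review. The class name `BoundedSplitSketch`, the helper `runBounded`, and the constant `MAX_THREADS` are hypothetical, and the sleeping task only stands in for creating a reference file. The cap of 10 is the figure floated in the comment, not a measured optimum.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

class BoundedSplitSketch {

    // Hypothetical cap on concurrent FS operations (the review floats 10).
    static final int MAX_THREADS = 10;

    /**
     * Runs all tasks on a pool of at most MAX_THREADS threads and returns
     * their results in submission order. A failed task surfaces as an
     * ExecutionException from Future.get().
     */
    static <T> List<T> runBounded(List<Callable<T>> tasks) throws Exception {
        List<T> results = new ArrayList<>();
        if (tasks.isEmpty()) {
            return results; // newFixedThreadPool(0) would throw
        }
        // Never start more threads than there are tasks.
        int nThreads = Math.min(MAX_THREADS, tasks.size());
        ExecutorService pool = Executors.newFixedThreadPool(nThreads);
        try {
            // invokeAll blocks until every task has completed.
            List<Future<T>> futures = pool.invokeAll(tasks);
            for (Future<T> f : futures) {
                results.add(f.get());
            }
            return results;
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) throws Exception {
        List<Callable<String>> tasks = new ArrayList<>();
        // Two references per store file; 4 store files gives 8 tasks.
        for (int i = 0; i < 8; i++) {
            final int n = i;
            tasks.add(() -> {
                Thread.sleep(50); // stand-in for creating one reference file
                return "reference-" + n;
            });
        }
        long start = System.nanoTime();
        List<String> created = runBounded(tasks);
        long elapsedMs = (System.nanoTime() - start) / 1_000_000;
        System.out.println(created.size() + " references in " + elapsedMs + " ms");
    }
}
```

With a fixed-size pool, a region with 100 store files queues its work behind 10 threads instead of starting 100 threads at once. Whether that actually finishes sooner depends on the filesystem and would need measuring.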





> SplitTransaction.splitStoreFiles slows splits a lot
> ---------------------------------------------------
>
>                 Key: HBASE-3308
>                 URL: https://issues.apache.org/jira/browse/HBASE-3308
>             Project: HBase
>          Issue Type: Improvement
>            Reporter: Jean-Daniel Cryans
>            Priority: Critical
>             Fix For: 0.92.0
>
>
> Recently I've been seeing some slow splits in our production environment triggering timeouts, so I decided to take a closer look into the issue.
> According to my debugging, we spend almost all the time it takes to split on creating the reference files. Each file in my testing takes at least 300ms to create, and averages around 600ms. Since we create two references per store file, it means that a region with 4 store files can easily take up to 5 seconds to split just to create those references.
> An intuitive improvement would be to create those files in parallel, so at least it wouldn't be much slower when we're splitting a higher number of files. Stack left the following comment in the code:
> {noformat}
> // TODO: If the below were multithreaded would we complete steps in less
> // elapsed time?  St.Ack 20100920
> {noformat}

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

