hbase-issues mailing list archives

From "Hari Krishna Dara (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HBASE-13959) Region splitting takes too long because it uses a single thread in most common cases
Date Fri, 26 Jun 2015 11:03:05 GMT

    [ https://issues.apache.org/jira/browse/HBASE-13959?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14602700#comment-14602700 ]

Hari Krishna Dara commented on HBASE-13959:
-------------------------------------------

Lars, is something missing in your comment?

The current number of 8 threads is just something I thought would keep the max split duration
acceptable. With 20 storefiles and about 700ms of overhead per storefile from the reference file
creation alone, 8 threads can keep the total split time to around 5 to 6 seconds. If we have a
sense of the max number of storefiles and a specific target for the split time, then we can set
the thread count to meet that goal. In our case, the client times out in 11 seconds, so keeping
the max split time well under that is the goal. Instead of having a fixed thread count that
is configurable, one alternative is to have a configurable goal in seconds and then choose the
number of threads needed to meet it (using heuristics).
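
As a rough illustration of the goal-based alternative, here is a minimal sketch of such a heuristic. It is not code from any of the attached patches. The class and method names are hypothetical, and the model is deliberately simple: each storefile needs two reference files, each costing a fixed time, on top of a fixed overhead for the rest of the split. The 4000ms overhead used in the usage note is an assumption inferred from the 18s total and 14s reference file phase quoted in the issue description.

```java
// Hypothetical helper, not part of HBase: picks a thread count for a split-time goal.
final class SplitThreadHeuristic {
    private SplitThreadHeuristic() {}

    // Smallest thread count that fits the goal under the simple model above.
    static int threadsForGoal(int storefiles, long refFileMs, long fixedOverheadMs, long goalMs) {
        if (refFileMs <= 0) {
            throw new IllegalArgumentException("refFileMs must be positive");
        }
        if (storefiles <= 0) {
            return 1; // nothing to split, one thread is enough
        }
        int tasks = 2 * storefiles; // two reference files per storefile
        long budgetMs = goalMs - fixedOverheadMs; // time left for reference files
        long maxRounds = budgetMs / refFileMs; // sequential rounds that fit in the budget
        if (maxRounds < 1) {
            return tasks; // goal not reachable in this model: one thread per reference file
        }
        return (int) ((tasks + maxRounds - 1) / maxRounds); // ceil(tasks / maxRounds)
    }

    // Estimated split time for a given thread count under the same model.
    static long estimateSplitMs(int storefiles, int threads, long refFileMs, long fixedOverheadMs) {
        int tasks = 2 * storefiles;
        long rounds = (tasks + threads - 1) / threads; // ceil(tasks / threads)
        return fixedOverheadMs + rounds * refFileMs;
    }
}
```

With the numbers from this issue (20 storefiles, 350ms per reference file, an assumed 4000ms of other overhead) and a 6000ms goal, `threadsForGoal(20, 350, 4000, 6000)` returns 8, and `estimateSplitMs(20, 8, 350, 4000)` gives 5750ms. The same model gives 18000ms for a single thread. A real heuristic would also need to account for variance in the close() latency under load.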

If minimizing the split time is the goal, we could have one thread for each reference
file creation (i.e., 40 threads in the case of 20 storefiles), and then the split time will always
stay low at about 2 to 4 seconds.
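
To make the idea concrete, the following is a sketch of running reference file creations on a bounded pool, with one task per reference file rather than one per store. It does not use HBase classes: `ParallelSplitSketch`, `splitStoreFiles`, and the `simulatedCloseMs` delay (standing in for the slow close() call) are all hypothetical and only illustrate the concurrency pattern.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical sketch, not HBase code: one task per reference file on a bounded pool.
final class ParallelSplitSketch {
    private ParallelSplitSketch() {}

    // Returns the number of (simulated) reference files created.
    static int splitStoreFiles(List<String> storefiles, int maxThreads, long simulatedCloseMs) {
        if (storefiles.isEmpty()) {
            return 0;
        }
        int tasks = 2 * storefiles.size(); // two reference files per storefile
        int threads = Math.max(1, Math.min(maxThreads, tasks)); // never more threads than tasks
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            List<Future<String>> pending = new ArrayList<>();
            for (String storefile : storefiles) {
                for (String half : new String[] {"bottom", "top"}) {
                    pending.add(pool.submit(() -> {
                        Thread.sleep(simulatedCloseMs); // stand-in for the slow close()
                        return storefile + "." + half;
                    }));
                }
            }
            int created = 0;
            for (Future<String> f : pending) {
                f.get(); // wait for completion and surface any task failure
                created++;
            }
            return created;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // preserve the interrupt status
            throw new IllegalStateException("interrupted while splitting storefiles", e);
        } catch (ExecutionException e) {
            throw new IllegalStateException("reference file creation failed", e.getCause());
        } finally {
            pool.shutdown(); // already-submitted tasks still run to completion
        }
    }
}
```

Passing `maxThreads` equal to twice the storefile count gives the one-thread-per-reference-file case described above, while a smaller value such as 8 caps the pool size.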

I also traced the reference file creation and found that the final FSDataOutputStream.close()
takes almost all of the time that goes into the reference file creation, which is between 300
and 400ms. In the same environment (using NNBenchWithoutMR), I found some time back that small
file creations take a minimum of 200ms and peak at about 378ms under load (200+ clients),
so the timings I am seeing with the reference file creation seem about right.

> Region splitting takes too long because it uses a single thread in most common cases
> ------------------------------------------------------------------------------------
>
>                 Key: HBASE-13959
>                 URL: https://issues.apache.org/jira/browse/HBASE-13959
>             Project: HBase
>          Issue Type: Bug
>          Components: regionserver
>    Affects Versions: 0.98.12
>            Reporter: Hari Krishna Dara
>            Assignee: Hari Krishna Dara
>            Priority: Critical
>             Fix For: 0.98.14
>
>         Attachments: 13959-suggest.txt, HBASE-13959-2.patch, HBASE-13959-3.patch, HBASE-13959-4.patch, HBASE-13959.patch, region-split-durations-compared.png
>
>
> When storefiles need to be split as part of a region split, the current logic uses a
> threadpool whose size is set to the number of stores. Since the most common table
> setup involves only a single column family, this translates to a single store, and so
> the threadpool runs with a single thread. However, in a write-heavy workload, there could
> be several tens of storefiles in a store at the time of splitting, and with a threadpool size
> of one, these files end up getting split sequentially.
> With a bit of tracing, I noticed that it takes an average of 350ms to create a single
> reference file, and splitting each storefile involves creating two of these, so with a storefile
> count of 20, it takes about 14s just to get through this phase (2 reference files for
> each storefile), pushing the total time the region is offline to 18s or more. For environments
> that are set up to fail fast, this makes the client exhaust all retries and fail with NotServingRegionException.
> The fix should increase the concurrency of this operation.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
