hbase-issues mailing list archives

From "Andrew Purtell (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HBASE-11263) Share the open/close store file thread pool for all store in a region
Date Fri, 30 May 2014 17:09:02 GMT

    [ https://issues.apache.org/jira/browse/HBASE-11263?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14013948#comment-14013948 ]

Andrew Purtell commented on HBASE-11263:

Do we need storeFileThreadPoolCounter? It looks like it is effectively being used as a boolean
flag. Could you achieve the same by testing whether storeFileThreadPool is null inside an
exclusive section? That would save some heap per HRegion.
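
A minimal sketch of the null-check idea, assuming the counter only ever distinguishes "pool exists" from "pool does not exist". The class `LazyStoreFilePool` and its methods are hypothetical names for illustration and are not taken from the patch; `synchronized` stands in for the exclusive section.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

// Hypothetical holder for illustration only; not the HRegion code from the patch.
final class LazyStoreFilePool {
  private final int maxThreads;
  // null means "no pool right now"; guarded by the monitor of 'this'.
  private ExecutorService pool;

  LazyStoreFilePool(int maxThreads) {
    this.maxThreads = Math.max(1, maxThreads);
  }

  /** Returns the shared pool, creating it on first use. */
  synchronized ExecutorService get() {
    if (pool == null) {
      pool = Executors.newFixedThreadPool(maxThreads);
    }
    return pool;
  }

  /** Shuts the pool down if it exists; a later get() creates a fresh one. */
  synchronized void release() {
    if (pool != null) {
      pool.shutdown();
      pool = null;
    }
  }

  /** The null test replaces a separate counter or flag field. */
  synchronized boolean isActive() {
    return pool != null;
  }
}
```

If several stores must release the pool independently and only the last release should shut it down, a real reference count would still be needed; the sketch covers only the boolean case.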

What happens if you use one thread pool for opening and closing store files per RegionServer?
If we have thousands of HRegions and they all have thread pools, surely we are not able to
fully schedule that degree of parallelism?
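
To illustrate the question, here is a rough sketch in which many regions submit work to one server-wide pool. `ServerWidePoolDemo` and `peakConcurrency` are hypothetical names, and the one-millisecond sleep is only a stand-in for opening or closing a store file. However many regions submit tasks, the number running at once never exceeds the pool size.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical demo for illustration only; not HBase code.
final class ServerWidePoolDemo {
  /** Runs one task per region on a single pool; returns the peak number of tasks running at once. */
  static int peakConcurrency(int regions, int poolThreads) {
    ExecutorService serverPool = Executors.newFixedThreadPool(poolThreads);
    AtomicInteger running = new AtomicInteger();
    AtomicInteger peak = new AtomicInteger();
    List<Callable<Void>> tasks = new ArrayList<>();
    for (int i = 0; i < regions; i++) {
      tasks.add(() -> {
        int now = running.incrementAndGet();
        peak.accumulateAndGet(now, Math::max); // remember the highest value seen
        Thread.sleep(1); // stand-in for opening or closing a store file
        running.decrementAndGet();
        return null;
      });
    }
    try {
      for (Future<Void> f : serverPool.invokeAll(tasks)) {
        f.get(); // surface any task failure
      }
    } catch (InterruptedException | ExecutionException e) {
      throw new IllegalStateException(e);
    } finally {
      serverPool.shutdown();
    }
    return peak.get();
  }
}
```

For example, `ServerWidePoolDemo.peakConcurrency(1000, 4)` schedules 1000 tasks but reports a peak of at most 4.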

> Share the open/close store file thread pool for all store in a region
> ---------------------------------------------------------------------
>                 Key: HBASE-11263
>                 URL: https://issues.apache.org/jira/browse/HBASE-11263
>             Project: HBase
>          Issue Type: Improvement
>          Components: regionserver
>    Affects Versions: 0.99.0
>            Reporter: Liu Shaohui
>            Assignee: Liu Shaohui
>            Priority: Minor
>         Attachments: HBASE-11263-trunk-v1.diff
> Currently, the open/close store file thread pool is divided equally among all stores of a region.
> {code}
>   protected ThreadPoolExecutor getStoreFileOpenAndCloseThreadPool(
>       final String threadNamePrefix) {
>     int numStores = Math.max(1, this.htableDescriptor.getFamilies().size());
>     int maxThreads = Math.max(1,
>         conf.getInt(HConstants.HSTORE_OPEN_AND_CLOSE_THREADS_MAX,
>             HConstants.DEFAULT_HSTORE_OPEN_AND_CLOSE_THREADS_MAX)
>             / numStores);
>     return getOpenAndCloseThreadPool(maxThreads, threadNamePrefix);
>   }
> {code}
> This is not optimal in the following scenarios:
> # The data in some column families is very large, with many hfiles in those stores, while others may be very small, in-memory column families.
> # Usually we reserve some column families for later needs. The threads allotted to these column families are wasted.
> The simple fix is to share one big thread pool among all stores for opening and closing hfiles.
> Suggestions are welcome.
> Thanks. 
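
The effect described in the issue can be sketched with some simple arithmetic. The helper below is hypothetical and not HBase code. It assumes every hfile takes the same time to open or close, and it counts "rounds", where one round is one file per available thread.

```java
// Hypothetical helper for comparing the two schemes; not HBase code.
final class StoreThreadMath {
  /** Threads each store gets under the equal split shown in the issue. */
  static int threadsPerStore(int maxThreads, int numStores) {
    return Math.max(1, maxThreads / Math.max(1, numStores));
  }

  /** Rounds needed to process 'files' tasks with 'threads' workers (ceiling division). */
  static int rounds(int files, int threads) {
    return (files + threads - 1) / threads;
  }

  /** Rounds when each store uses only its own slice of threads; stores proceed in parallel. */
  static int roundsWithSplit(int[] filesPerStore, int maxThreads) {
    int perStore = threadsPerStore(maxThreads, filesPerStore.length);
    int worst = 0;
    for (int files : filesPerStore) {
      worst = Math.max(worst, rounds(files, perStore));
    }
    return worst;
  }

  /** Rounds when all stores feed one pool of maxThreads workers. */
  static int roundsWithSharedPool(int[] filesPerStore, int maxThreads) {
    int total = 0;
    for (int files : filesPerStore) {
      total += files;
    }
    return rounds(total, Math.max(1, maxThreads));
  }
}
```

With 8 threads and four stores holding 12, 1, 0, and 0 hfiles, the equal split gives each store 2 threads, so the large store needs 6 rounds while 5 of the 8 threads sit idle. A shared pool handles all 13 files in 2 rounds.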

This message was sent by Atlassian JIRA