hive-issues mailing list archives

From "Vihang Karajgaonkar (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HIVE-15879) Fix HiveMetaStoreChecker.checkPartitionDirs method
Date Fri, 24 Feb 2017 19:04:44 GMT

    [ https://issues.apache.org/jira/browse/HIVE-15879?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15883338#comment-15883338
] 

Vihang Karajgaonkar commented on HIVE-15879:
--------------------------------------------

I agree that the patch does not improve the case of a single level of partitioning; it performs
about the same as the existing approach. I ran a simple test on a table with a single partition
key and ~1800 partitions on S3, and both implementations took about the same time, ~60 sec. But
the benefits of this approach show up as soon as the number of partition keys increases.

I repeated the test above with 2 partition keys and 10*10 = 100 partitions. The results
below show a significant performance gain with the default configs.

|| Default pool size ||  Before || After ||
|| Time taken (sec) | 19.8 | 3.27 |

Hi [~rajesh.balamohan], I can change the JIRA description and category to "Improvement" if
you think that is more appropriate. Thanks!

Also updating the review board with patch HIVE-15879.03.patch


> Fix HiveMetaStoreChecker.checkPartitionDirs method
> --------------------------------------------------
>
>                 Key: HIVE-15879
>                 URL: https://issues.apache.org/jira/browse/HIVE-15879
>             Project: Hive
>          Issue Type: Bug
>            Reporter: Vihang Karajgaonkar
>            Assignee: Vihang Karajgaonkar
>         Attachments: HIVE-15879.01.patch, HIVE-15879.02.patch, HIVE-15879.03.patch
>
>
> HIVE-15803 fixes the msck hang issue in the HiveMetaStoreChecker.checkPartitionDirs method
> by adding a check to see whether the thread pool has any spare threads. If not, it falls back
> to single-threaded listing of the files.
> {noformat}
>     if (pool != null) {
>       synchronized (pool) {
>         // In case of recursive calls, it is possible to deadlock with TP. Check TP usage here.
>         if (pool.getActiveCount() < pool.getMaximumPoolSize()) {
>           useThreadPool = true;
>         }
>         if (!useThreadPool) {
>           if (LOG.isDebugEnabled()) {
>             LOG.debug("Not using threadPool as active count:" + pool.getActiveCount()
>                 + ", max:" + pool.getMaximumPoolSize());
>           }
>         }
>       }
>     }
> {noformat}
> Based on the javadoc of getActiveCount() below
> bq. Returns the approximate number of threads that are actively executing tasks.
> it returns only an approximate number of threads, and there is no guarantee that it always
> returns the exact number of active threads. This still exposes the method implementation to
> the msck hang bug in rare corner cases.
> We could either:
> 1. Use an atomic counter to track exactly how many threads are actively running
> 2. Revisit the method itself to make it much simpler, e.g. look into the possibility
> of changing the recursive implementation to an iterative implementation where worker threads
> pick tasks from a queue until the queue is empty.
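
As a rough illustration of option 2, here is a minimal sketch of an iterative, level-by-level listing. It is not the code in the attached patches. It assumes a local directory tree read through java.nio.file rather than the Hadoop FileSystem API, and the names IterativePartitionLister, listSubDirs and findLeafPartitionDirs are hypothetical. It also simplifies the queue idea: the list of directories at the current depth plays the role of the queue, and only the calling thread waits for results. Since no pool task ever blocks on another pool task, the pool cannot deadlock on itself, whatever getActiveCount() reports.

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class IterativePartitionLister {

  /** Lists the immediate sub-directories of one directory (a single listing call). */
  static List<Path> listSubDirs(Path dir) throws IOException {
    List<Path> result = new ArrayList<>();
    try (DirectoryStream<Path> stream = Files.newDirectoryStream(dir)) {
      for (Path child : stream) {
        if (Files.isDirectory(child)) {
          result.add(child);
        }
      }
    }
    return result;
  }

  /**
   * Walks the tree one partition level at a time. Each level is listed in
   * parallel on the pool; tasks never submit or wait for other tasks.
   */
  static List<Path> findLeafPartitionDirs(Path tableDir, int partitionDepth, ExecutorService pool)
      throws InterruptedException, ExecutionException {
    List<Path> currentLevel = new ArrayList<>();
    currentLevel.add(tableDir);
    for (int depth = 0; depth < partitionDepth; depth++) {
      // One independent listing task per directory at this depth.
      List<Callable<List<Path>>> tasks = new ArrayList<>();
      for (Path dir : currentLevel) {
        tasks.add(() -> listSubDirs(dir));
      }
      // invokeAll blocks the caller (not a pool thread) until the level is done.
      List<Path> nextLevel = new ArrayList<>();
      for (Future<List<Path>> future : pool.invokeAll(tasks)) {
        nextLevel.addAll(future.get());
      }
      currentLevel = nextLevel;
    }
    return currentLevel;
  }
}
```

With a table laid out as a=.../b=..., calling findLeafPartitionDirs(tableDir, 2, Executors.newFixedThreadPool(n)) would return the b=... directories. A real implementation would still need to validate partition directory names and handle stray files, which this sketch leaves out.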



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
