hive-dev mailing list archives

From "Ryan P (JIRA)" <j...@apache.org>
Subject [jira] [Created] (HIVE-12077) MSCK Repair table should fix partitions in batches
Date Fri, 09 Oct 2015 12:38:26 GMT
Ryan P created HIVE-12077:
-----------------------------

             Summary: MSCK Repair table should fix partitions in batches 
                 Key: HIVE-12077
                 URL: https://issues.apache.org/jira/browse/HIVE-12077
             Project: Hive
          Issue Type: Bug
          Components: Hive
            Reporter: Ryan P


If a user runs MSCK REPAIR TABLE on a table whose directory contains a large number of
untracked partitions, the Hive Metastore (HMS) fails with an OutOfMemoryError. I suspect
this is because it attempts to add all of the partitions in one large bulk load in an
effort to save time. The resulting collection can grow so large that HMS runs out of memory.

Instead, I suggest that Hive include a configurable batch size that HMS can use to break
the load up into smaller chunks.
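
To illustrate the idea, here is a minimal sketch in Python rather than Hive's own code.
Every name in it (batches, repair_in_batches, add_partitions, batch_size) is hypothetical
and is not an existing Hive or HMS API. It only shows that the list of untracked
partitions can be handed over in bounded slices, so that no single call has to carry
the whole collection:

```python
from typing import Callable, Iterator, List, Sequence


def batches(items: Sequence[str], batch_size: int) -> Iterator[List[str]]:
    """Yield consecutive slices of at most batch_size items."""
    if batch_size <= 0:
        raise ValueError("batch_size must be positive")
    for start in range(0, len(items), batch_size):
        yield list(items[start:start + batch_size])


def repair_in_batches(
    untracked: Sequence[str],
    add_partitions: Callable[[List[str]], None],
    batch_size: int,
) -> int:
    """Pass untracked partitions to add_partitions one batch at a time.

    add_partitions is a hypothetical stand-in for whatever call registers
    partitions with the metastore. Returns the number of calls made.
    """
    calls = 0
    for batch in batches(untracked, batch_size):
        add_partitions(batch)  # each call carries at most batch_size partitions
        calls += 1
    return calls
```

With a batch size of 3, seven untracked partitions would be added in three calls instead
of one. The trade-off is more round trips in exchange for a bounded request size.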



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
