hive-dev mailing list archives

From Sergio Peña (JIRA) <j...@apache.org>
Subject [jira] [Created] (HIVE-15881) Use new thread count variable name instead of mapred.dfsclient.parallelism.max
Date Fri, 10 Feb 2017 22:35:41 GMT
Sergio Peña created HIVE-15881:
----------------------------------

             Summary: Use new thread count variable name instead of mapred.dfsclient.parallelism.max
                 Key: HIVE-15881
                 URL: https://issues.apache.org/jira/browse/HIVE-15881
             Project: Hive
          Issue Type: Task
          Components: Query Planning
            Reporter: Sergio Peña
            Assignee: Sergio Peña
            Priority: Minor


The Utilities class has two methods, {{getInputSummary}} and {{getInputPaths}}, that use the
variable {{mapred.dfsclient.parallelism.max}} to get the summary of a list of input locations
in parallel. These methods are Hive-related, but the variable name does not look like it is
specific to Hive.

Also, the above variable is neither defined in HiveConf nor used anywhere else in Hive. The only
other reference I found is in the Hadoop MR1 code.

I'd like to propose deprecating {{mapred.dfsclient.parallelism.max}} and using a different
variable name, such as {{hive.get.input.listing.num.threads}}, that reflects the intention
of the variable. The old variable could then be removed in Hive 3.x.
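
As a rough illustration of the deprecation path, the lookup could prefer the new name and fall back to the old one until it is removed. This is only a sketch under assumptions: the class {{ListingThreads}}, the method {{resolve}}, the use of a plain Map in place of the real configuration object, and the default of 0 threads are all hypothetical and not taken from the Hive code.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper, not part of Hive: resolves the listing thread count
// from the proposed property, falling back to the deprecated one.
class ListingThreads {
    static final String NEW_KEY = "hive.get.input.listing.num.threads";
    static final String OLD_KEY = "mapred.dfsclient.parallelism.max";

    // Assumed default for this sketch only: 0 threads when neither key is set.
    static final int DEFAULT_THREADS = 0;

    static int resolve(Map<String, String> conf) {
        // Prefer the new, Hive-specific name.
        String value = conf.get(NEW_KEY);
        if (value == null) {
            // Fall back to the deprecated name so existing setups keep working.
            value = conf.get(OLD_KEY);
        }
        if (value == null) {
            return DEFAULT_THREADS;
        }
        // Clamp negative values to zero rather than failing.
        return Math.max(0, Integer.parseInt(value.trim()));
    }

    public static void main(String[] args) {
        Map<String, String> conf = new HashMap<>();
        conf.put(OLD_KEY, "4");
        System.out.println("only old key set: " + resolve(conf));
        conf.put(NEW_KEY, "8");
        System.out.println("both keys set: " + resolve(conf));
    }
}
```

With this ordering, a deployment that still sets only the old name keeps its behavior, while anyone who sets the new name gets that value regardless of the old one.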



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
