hive-issues mailing list archives

From "Illya Yalovyy (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HIVE-15881) Use new thread count variable name instead of mapred.dfsclient.parallelism.max
Date Mon, 13 Feb 2017 21:27:41 GMT

    [ https://issues.apache.org/jira/browse/HIVE-15881?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15864477#comment-15864477 ]

Illya Yalovyy commented on HIVE-15881:
--------------------------------------

I think the property name for Utilities#getInputSummary should be "hive.exec.input.summary.max.threads",
to be consistent with other properties in HiveConf. This value is not used as-is to create
a thread pool; it is only an upper limit on the thread pool size. If the number of input paths
is less than hive.exec.input.summary.max.threads, the number of input paths is used instead.
This means the actual number of threads will be <= hive.exec.input.summary.max.threads.
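
The sizing rule described above can be sketched roughly as follows. This is only an illustration, not the actual Utilities#getInputSummary code: the class name, the effectiveThreads helper, and the sample values are hypothetical, and both inputs are assumed to be positive.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

public class InputSummaryPoolSketch {

    // Hypothetical helper: the configured value is only an upper bound,
    // so the pool is never larger than the number of input paths.
    // Assumes both arguments are positive.
    static int effectiveThreads(int numInputPaths, int maxThreads) {
        return Math.min(numInputPaths, maxThreads);
    }

    public static void main(String[] args) {
        int maxThreads = 8;     // stand-in for hive.exec.input.summary.max.threads
        int numInputPaths = 3;  // assumed number of input paths for this example

        int threads = effectiveThreads(numInputPaths, maxThreads);
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            System.out.println("pool size = " + threads);
        } finally {
            pool.shutdown();
        }
    }
}
```

With 3 input paths and a limit of 8 the pool gets 3 threads; with 20 input paths and the same limit it gets 8.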

> Use new thread count variable name instead of mapred.dfsclient.parallelism.max
> ------------------------------------------------------------------------------
>
>                 Key: HIVE-15881
>                 URL: https://issues.apache.org/jira/browse/HIVE-15881
>             Project: Hive
>          Issue Type: Task
>          Components: Query Planning
>            Reporter: Sergio Peña
>            Assignee: Sergio Peña
>            Priority: Minor
>
> The Utilities class has two methods, {{getInputSummary}} and {{getInputPaths}}, that use the variable {{mapred.dfsclient.parallelism.max}} to get the summary of a list of input locations in parallel. These methods are Hive related, but the variable name does not look like it is specific to Hive.
> Also, the above variable is neither in HiveConf nor used anywhere else. I found only one reference to it, in the Hadoop MR1 code.
> I'd like to propose the deprecation of {{mapred.dfsclient.parallelism.max}}, and the use of a different variable name, such as {{hive.get.input.listing.num.threads}}, that reflects the intention of the variable. The removal of the old variable might happen in Hive 3.x.
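
One way the proposed deprecation could work during a transition period is to read the new name first and fall back to the old one. The sketch below is an assumption, not Hive's implementation: it uses java.util.Properties as a stand-in for the real configuration object, the class and method names are hypothetical, and the lookup order is a guess at what "deprecation" would mean here. The new key is the name proposed in the issue description.

```java
import java.util.Properties;

public class ThreadCountLookupSketch {

    // Name proposed in the issue description (may differ from the final name).
    static final String NEW_KEY = "hive.get.input.listing.num.threads";
    // Old name that the issue proposes to deprecate.
    static final String OLD_KEY = "mapred.dfsclient.parallelism.max";

    // Hypothetical lookup: prefer the new key, fall back to the deprecated
    // key, and use the supplied default when neither is set.
    static int threadCount(Properties conf, int defaultValue) {
        String value = conf.getProperty(NEW_KEY);
        if (value == null) {
            value = conf.getProperty(OLD_KEY);
        }
        return value == null ? defaultValue : Integer.parseInt(value.trim());
    }

    public static void main(String[] args) {
        Properties conf = new Properties();
        conf.setProperty(OLD_KEY, "4"); // only the deprecated name is set
        System.out.println(threadCount(conf, 0));
    }
}
```

Preferring the new key lets existing deployments that still set the old name keep working until the old name is removed.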



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
