hive-dev mailing list archives

From "Siying Dong (JIRA)" <j...@apache.org>
Subject [jira] Commented: (HIVE-2051) getInputSummary() to call FileSystem.getContentSummary() in parallel
Date Fri, 18 Mar 2011 22:21:29 GMT

    [ https://issues.apache.org/jira/browse/HIVE-2051?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13008657#comment-13008657 ]

Siying Dong commented on HIVE-2051:
-----------------------------------

Joydeep, sorry you were talking about ExecutionException about InterruptedException. In that
case, I'll just rethrow it.

> getInputSummary() to call FileSystem.getContentSummary() in parallel
> --------------------------------------------------------------------
>
>                 Key: HIVE-2051
>                 URL: https://issues.apache.org/jira/browse/HIVE-2051
>             Project: Hive
>          Issue Type: Improvement
>            Reporter: Siying Dong
>            Assignee: Siying Dong
>            Priority: Minor
>         Attachments: HIVE-2051.1.patch, HIVE-2051.2.patch, HIVE-2051.3.patch, HIVE-2051.4.patch
>
>
> getInputSummary() now calls FileSystem.getContentSummary() on each input path one by one,
> which can be extremely slow when the number of input paths is huge. By calling those
> functions in parallel, we can cut latency in most cases.
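
For illustration only, the idea in the description above could look roughly like
the sketch below. It is not taken from any of the attached patches. The class
ParallelInputSummary, the method totalLength, and the sizeOf callback are
hypothetical names; sizeOf stands in for a per-path lookup such as
FileSystem.getContentSummary(path).getLength(), so that the example runs without
Hadoop on the classpath. Wrapping both ExecutionException and
InterruptedException in an IOException is one possible choice, assumed here.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.function.ToLongFunction;

public class ParallelInputSummary {

    /**
     * Sums the size of every path, running the per-path lookups concurrently.
     * sizeOf is a hypothetical stand-in for a slow call such as
     * FileSystem.getContentSummary(path).getLength().
     */
    public static long totalLength(List<String> paths,
                                   ToLongFunction<String> sizeOf,
                                   int threads) throws IOException {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            // Submit one task per path so the slow lookups overlap.
            List<Future<Long>> pending = new ArrayList<>();
            for (String path : paths) {
                Callable<Long> task = () -> sizeOf.applyAsLong(path);
                pending.add(pool.submit(task));
            }

            // Collect the results in submission order.
            long total = 0;
            for (Future<Long> result : pending) {
                try {
                    total += result.get();
                } catch (ExecutionException e) {
                    // The lookup itself failed: rethrow with the original cause.
                    throw new IOException("content summary failed", e.getCause());
                } catch (InterruptedException e) {
                    // Restore the interrupt flag before rethrowing.
                    Thread.currentThread().interrupt();
                    throw new IOException("interrupted while waiting", e);
                }
            }
            return total;
        } finally {
            // Stop the worker threads whether or not every lookup succeeded.
            pool.shutdownNow();
        }
    }
}
```

A call might look like totalLength(paths, p -> lookupSize(p), 8), where
lookupSize is whatever per-path call is slow. The thread count is a tuning
parameter, and total latency approaches that of the slowest batch of lookups
rather than the sum of all of them.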

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira
