Date: Fri, 18 Mar 2011 05:10:29 +0000 (UTC)
From: "MIS (JIRA)"
To: hive-dev@hadoop.apache.org
Reply-To: dev@hive.apache.org
Message-ID: <902258588.11095.1300425029568.JavaMail.tomcat@hel.zones.apache.org>
In-Reply-To: <1816002944.14574.1299881039536.JavaMail.tomcat@hel.zones.apache.org>
Subject: [jira] Commented: (HIVE-2051) getInputSummary() to call FileSystem.getContentSummary() in parallel

    [ 
https://issues.apache.org/jira/browse/HIVE-2051?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13008328#comment-13008328 ]

MIS commented on HIVE-2051:
---------------------------

Yes, the executor must be shut down once jobs have been submitted to it, even though the submitted jobs may already have completed.

However, what we need not do after the executor is shut down is wait for termination to finish, since that is redundant: every job submitted to the executor will already have completed by the time we shut it down. This is ensured by the calls to result.get(). In other words, the following piece of code is not required:

+      do {
+        try {
+          executor.awaitTermination(Integer.MAX_VALUE, TimeUnit.SECONDS);
+          executorDone = true;
+        } catch (InterruptedException e) {
+        }
+      } while (!executorDone);

> getInputSummary() to call FileSystem.getContentSummary() in parallel
> --------------------------------------------------------------------
>
>                 Key: HIVE-2051
>                 URL: https://issues.apache.org/jira/browse/HIVE-2051
>             Project: Hive
>          Issue Type: Improvement
>            Reporter: Siying Dong
>            Assignee: Siying Dong
>            Priority: Minor
>         Attachments: HIVE-2051.1.patch, HIVE-2051.2.patch, HIVE-2051.3.patch, HIVE-2051.4.patch
>
>
> getInputSummary() now calls FileSystem.getContentSummary() one path at a time, which can be extremely slow when the number of input paths is huge. By calling those functions in parallel, we can cut latency in most cases.

-- 
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira
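
To illustrate the point made in the comment above, here is a minimal sketch of the pattern under discussion. It is not the code from any HIVE-2051 patch. The class name ParallelSummarySketch and the helpers fakeContentLength() and totalLength() are hypothetical, and fakeContentLength() merely stands in for a call such as FileSystem.getContentSummary(). The sketch assumes every submitted Future is waited on with get() before shutdown() is called, which is the condition under which a later awaitTermination() would have no unfinished task left to wait for.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

class ParallelSummarySketch {

    // Hypothetical stand-in for an expensive per-path lookup;
    // here it simply returns the length of the path string.
    static long fakeContentLength(String path) {
        return path.length();
    }

    // Submits one task per path, then collects every result with get().
    static long totalLength(List<String> paths, int numThreads) throws Exception {
        ExecutorService executor = Executors.newFixedThreadPool(numThreads);
        try {
            List<Future<Long>> results = new ArrayList<>();
            for (final String path : paths) {
                Callable<Long> task = () -> fakeContentLength(path);
                results.add(executor.submit(task));
            }

            long total = 0;
            for (Future<Long> result : results) {
                // get() blocks until this particular task has finished,
                // so after the loop every submitted task is complete.
                total += result.get();
            }
            return total;
        } finally {
            // Still needed so the pool's worker threads are released,
            // but no awaitTermination() loop follows: there is no
            // pending task left for it to wait on.
            executor.shutdown();
        }
    }

    public static void main(String[] args) throws Exception {
        List<String> paths = Arrays.asList("/a", "/bb", "/ccc");
        System.out.println(totalLength(paths, 2)); // prints 9
    }
}
```

Note that if get() throws (for example because a task failed), the finally block still shuts the executor down. Tasks that were submitted but not yet collected may then still be running, which is a case the sketch does not try to handle.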