From: jwills@apache.org
To: commits@crunch.apache.org
Reply-To: dev@crunch.apache.org
Subject: git commit: CRUNCH-408: Make HFileSource correctly estimate file sizes when there are wildcards in the path. Contributed by Chao Shi.
Date: Mon, 2 Jun 2014 01:49:44 +0000 (UTC)

Repository: crunch
Updated Branches:
  refs/heads/master c135bba63 -> fcb861edc

CRUNCH-408: Make HFileSource correctly estimate file sizes when there are wildcards in the path. Contributed by Chao Shi.
Project: http://git-wip-us.apache.org/repos/asf/crunch/repo
Commit: http://git-wip-us.apache.org/repos/asf/crunch/commit/fcb861ed
Tree: http://git-wip-us.apache.org/repos/asf/crunch/tree/fcb861ed
Diff: http://git-wip-us.apache.org/repos/asf/crunch/diff/fcb861ed

Branch: refs/heads/master
Commit: fcb861edce7a5b2a76e2a21300b054932174bc47
Parents: c135bba
Author: Josh Wills
Authored: Sun Jun 1 13:29:46 2014 -0700
Committer: Josh Wills
Committed: Sun Jun 1 13:29:46 2014 -0700

----------------------------------------------------------------------
 .../org/apache/crunch/io/hbase/HFileSource.java | 19 +------------------
 1 file changed, 1 insertion(+), 18 deletions(-)
----------------------------------------------------------------------

http://git-wip-us.apache.org/repos/asf/crunch/blob/fcb861ed/crunch-hbase/src/main/java/org/apache/crunch/io/hbase/HFileSource.java
----------------------------------------------------------------------
diff --git a/crunch-hbase/src/main/java/org/apache/crunch/io/hbase/HFileSource.java b/crunch-hbase/src/main/java/org/apache/crunch/io/hbase/HFileSource.java
index b8b6df2..c21cc47 100644
--- a/crunch-hbase/src/main/java/org/apache/crunch/io/hbase/HFileSource.java
+++ b/crunch-hbase/src/main/java/org/apache/crunch/io/hbase/HFileSource.java
@@ -120,28 +120,11 @@ public class HFileSource extends FileSourceImpl implements ReadableSou
     long sum = 0;
     for (Path path : getPaths()) {
       try {
-        sum += getSizeInternal(conf, path);
+        sum += SourceTargetHelper.getPathSize(conf, new Path(path, "*"));
       } catch (IOException e) {
         LOG.warn("Failed to estimate size of " + path);
       }
     }
     return sum;
   }
-
-  private long getSizeInternal(Configuration conf, Path path) throws IOException {
-    FileSystem fs = path.getFileSystem(conf);
-    FileStatus[] statuses = fs.listStatus(path, HFileInputFormat.HIDDEN_FILE_FILTER);
-    if (statuses == null) {
-      return 0;
-    }
-    long sum = 0;
-    for (FileStatus status : statuses) {
-      if (status.isDir()) {
-        sum += SourceTargetHelper.getPathSize(fs, status.getPath());
-      } else {
-        sum += status.getLen();
-      }
-    }
-    return sum;
-  }
 }
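
For readers without a Hadoop cluster at hand, the idea behind the one-line change (append "*" to each source path, then sum the sizes of whatever the glob matches, descending into directories) can be sketched on the local filesystem. This is only an analogy built on java.nio, not the Crunch or Hadoop code path: the class name GlobSizeSketch and the method globbedSize are hypothetical, and the sketch does not try to reproduce the hidden-file handling or error behavior of SourceTargetHelper.getPathSize.

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical local-filesystem analogy; not part of Crunch.
public class GlobSizeSketch {

  // Sums the sizes of all entries in dir whose names match glob.
  // Matched directories are summed recursively over all of their children.
  public static long globbedSize(Path dir, String glob) throws IOException {
    long sum = 0;
    try (DirectoryStream<Path> matches = Files.newDirectoryStream(dir, glob)) {
      for (Path match : matches) {
        sum += Files.isDirectory(match) ? globbedSize(match, "*") : Files.size(match);
      }
    }
    return sum;
  }

  public static void main(String[] args) throws IOException {
    // Build a small tree: one file at the top level, one inside a subdirectory.
    Path root = Files.createTempDirectory("hfile-sketch");
    Path family = Files.createDirectory(root.resolve("family"));
    Files.write(root.resolve("a.bin"), new byte[10]);
    Files.write(family.resolve("b.bin"), new byte[32]);

    // Equivalent in spirit to estimating the size of "<root>/*".
    System.out.println(globbedSize(root, "*"));
  }
}
```

Delegating to a glob-aware helper, as the commit does, means a source path that itself contains wildcards is expanded by the filesystem rather than being passed to a plain directory listing.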