Return-Path: X-Original-To: apmail-spark-commits-archive@minotaur.apache.org Delivered-To: apmail-spark-commits-archive@minotaur.apache.org Received: from mail.apache.org (hermes.apache.org [140.211.11.3]) by minotaur.apache.org (Postfix) with SMTP id D2955109B7 for ; Wed, 1 Jan 2014 01:48:58 +0000 (UTC) Received: (qmail 31201 invoked by uid 500); 1 Jan 2014 01:48:58 -0000 Delivered-To: apmail-spark-commits-archive@spark.apache.org Received: (qmail 31169 invoked by uid 500); 1 Jan 2014 01:48:58 -0000 Mailing-List: contact commits-help@spark.incubator.apache.org; run by ezmlm Precedence: bulk List-Help: List-Unsubscribe: List-Post: List-Id: Reply-To: dev@spark.incubator.apache.org Delivered-To: mailing list commits@spark.incubator.apache.org Received: (qmail 31157 invoked by uid 99); 1 Jan 2014 01:48:58 -0000 Received: from nike.apache.org (HELO nike.apache.org) (192.87.106.230) by apache.org (qpsmtpd/0.29) with ESMTP; Wed, 01 Jan 2014 01:48:58 +0000 X-ASF-Spam-Status: No, hits=-2000.4 required=5.0 tests=ALL_TRUSTED,RP_MATCHES_RCVD X-Spam-Check-By: apache.org Received: from [140.211.11.3] (HELO mail.apache.org) (140.211.11.3) by apache.org (qpsmtpd/0.29) with SMTP; Wed, 01 Jan 2014 01:48:53 +0000 Received: (qmail 30530 invoked by uid 99); 1 Jan 2014 01:48:32 -0000 Received: from tyr.zones.apache.org (HELO tyr.zones.apache.org) (140.211.11.114) by apache.org (qpsmtpd/0.29) with ESMTP; Wed, 01 Jan 2014 01:48:32 +0000 Received: by tyr.zones.apache.org (Postfix, from userid 65534) id B532B911087; Wed, 1 Jan 2014 01:48:31 +0000 (UTC) Content-Type: text/plain; charset="us-ascii" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit From: rxin@apache.org To: commits@spark.incubator.apache.org Date: Wed, 01 Jan 2014 01:48:48 -0000 Message-Id: <0226bd1037344a668c9bf64fbf794b1a@git.apache.org> In-Reply-To: References: X-Mailer: ASF-Git Admin Mailer Subject: [18/20] git commit: minor improvements X-Virus-Checked: Checked by ClamAV on apache.org minor improvements Project: http://git-wip-us.apache.org/repos/asf/incubator-spark/repo Commit: http://git-wip-us.apache.org/repos/asf/incubator-spark/commit/acb03230 Tree: http://git-wip-us.apache.org/repos/asf/incubator-spark/tree/acb03230 Diff: http://git-wip-us.apache.org/repos/asf/incubator-spark/diff/acb03230 Branch: refs/heads/master Commit: acb0323053d270a377e497e975b2dfe59e2f997c Parents: d6cded7 Author: Hossein Falaki Authored: Tue Dec 31 15:34:26 2013 -0800 Committer: Hossein Falaki Committed: Tue Dec 31 15:34:26 2013 -0800 ---------------------------------------------------------------------- core/src/main/scala/org/apache/spark/rdd/PairRDDFunctions.scala | 5 ++--- core/src/main/scala/org/apache/spark/rdd/RDD.scala | 4 +++- 2 files changed, 5 insertions(+), 4 deletions(-) ---------------------------------------------------------------------- http://git-wip-us.apache.org/repos/asf/incubator-spark/blob/acb03230/core/src/main/scala/org/apache/spark/rdd/PairRDDFunctions.scala ---------------------------------------------------------------------- diff --git a/core/src/main/scala/org/apache/spark/rdd/PairRDDFunctions.scala b/core/src/main/scala/org/apache/spark/rdd/PairRDDFunctions.scala index 1dc5f8d..088b298 100644 --- a/core/src/main/scala/org/apache/spark/rdd/PairRDDFunctions.scala +++ b/core/src/main/scala/org/apache/spark/rdd/PairRDDFunctions.scala @@ -229,9 +229,8 @@ class PairRDDFunctions[K: ClassTag, V: ClassTag](self: RDD[(K, V)]) } val mergeHLL = (h1: SerializableHyperLogLog, h2: SerializableHyperLogLog) => h1.merge(h2) - combineByKey(createHLL, mergeValueHLL, mergeHLL, partitioner).map { - case (k, v) => (k, v.value.cardinality()) - } + combineByKey(createHLL, mergeValueHLL, mergeHLL, partitioner).mapValues(_.value.cardinality()) + } /** http://git-wip-us.apache.org/repos/asf/incubator-spark/blob/acb03230/core/src/main/scala/org/apache/spark/rdd/RDD.scala ---------------------------------------------------------------------- diff --git a/core/src/main/scala/org/apache/spark/rdd/RDD.scala b/core/src/main/scala/org/apache/spark/rdd/RDD.scala index 74fab48..161fd06 100644 --- a/core/src/main/scala/org/apache/spark/rdd/RDD.scala +++ b/core/src/main/scala/org/apache/spark/rdd/RDD.scala @@ -809,7 +809,9 @@ abstract class RDD[T: ClassTag]( } def mergeCounters(c1: SerializableHyperLogLog, c2: SerializableHyperLogLog) = c1.merge(c2) - mapPartitions(hllCountPartition).reduce(mergeCounters).value.cardinality() + val zeroCounter = new SerializableHyperLogLog(new HyperLogLog(relativeSD)) + mapPartitions(hllCountPartition).aggregate(zeroCounter)(mergeCounters, mergeCounters) + .value.cardinality() } /**