spark-commits mailing list archives

From marmb...@apache.org
Subject spark git commit: [SPARK-10321] sizeInBytes in HadoopFsRelation
Date Thu, 27 Aug 2015 23:38:14 GMT
Repository: spark
Updated Branches:
  refs/heads/branch-1.5 351e849bb -> fc4c3bf43


[SPARK-10321] sizeInBytes in HadoopFsRelation

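Override sizeInBytes in HadoopFsRelation so that its size estimate is the total length of the relation's files, which lets the planner choose a broadcast join for small relations.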

cc marmbrus

Author: Davies Liu <davies@databricks.com>

Closes #8490 from davies/sizeInByte.

(cherry picked from commit 54cda0deb6bebf1470f16ba5bcc6c4fb842bdac1)
Signed-off-by: Michael Armbrust <michael@databricks.com>


Project: http://git-wip-us.apache.org/repos/asf/spark/repo
Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/fc4c3bf4
Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/fc4c3bf4
Diff: http://git-wip-us.apache.org/repos/asf/spark/diff/fc4c3bf4

Branch: refs/heads/branch-1.5
Commit: fc4c3bf43626ecce75a909d9d0f1acd973f75fbf
Parents: 351e849
Author: Davies Liu <davies@databricks.com>
Authored: Thu Aug 27 16:38:00 2015 -0700
Committer: Michael Armbrust <michael@databricks.com>
Committed: Thu Aug 27 16:38:10 2015 -0700

----------------------------------------------------------------------
 .../src/main/scala/org/apache/spark/sql/sources/interfaces.scala   | 2 ++
 1 file changed, 2 insertions(+)
----------------------------------------------------------------------


http://git-wip-us.apache.org/repos/asf/spark/blob/fc4c3bf4/sql/core/src/main/scala/org/apache/spark/sql/sources/interfaces.scala
----------------------------------------------------------------------
diff --git a/sql/core/src/main/scala/org/apache/spark/sql/sources/interfaces.scala b/sql/core/src/main/scala/org/apache/spark/sql/sources/interfaces.scala
index dff726b..7b030b7 100644
--- a/sql/core/src/main/scala/org/apache/spark/sql/sources/interfaces.scala
+++ b/sql/core/src/main/scala/org/apache/spark/sql/sources/interfaces.scala
@@ -518,6 +518,8 @@ abstract class HadoopFsRelation private[sql](maybePartitionSpec: Option[Partitio
 
   override def inputFiles: Array[String] = cachedLeafStatuses().map(_.getPath.toString).toArray
 
+  override def sizeInBytes: Long = cachedLeafStatuses().map(_.getLen).sum
+
   /**
    * Partition columns.  Can be either defined by [[userDefinedPartitionColumns]] or automatically
    * discovered.  Note that they should always be nullable.
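
For readers who want to see the effect of the added override in isolation, here is a minimal sketch that needs neither Spark nor Hadoop. It assumes the planner treats a relation as a broadcast candidate when its size estimate is at or below spark.sql.autoBroadcastJoinThreshold (10 MB by default). `LeafFile`, `SizeInBytesSketch` and `fitsBroadcast` are hypothetical names introduced for illustration; they are not part of the patch or of Spark's API.

```scala
// Hypothetical stand-in for org.apache.hadoop.fs.FileStatus; only the
// path and length are modeled here.
final case class LeafFile(path: String, len: Long)

object SizeInBytesSketch {
  // Same shape as the added override: sum the length of every leaf file.
  def sizeInBytes(leafFiles: Seq[LeafFile]): Long = leafFiles.map(_.len).sum

  // Assumed planner-side check: a relation is a broadcast candidate when
  // its estimated size does not exceed the configured threshold.
  def fitsBroadcast(estimate: Long, thresholdBytes: Long): Boolean =
    estimate <= thresholdBytes

  def main(args: Array[String]): Unit = {
    val files = Seq(
      LeafFile("part-00000", 4L * 1024 * 1024),
      LeafFile("part-00001", 3L * 1024 * 1024)
    )
    val estimate = sizeInBytes(files)
    // 10 MB, assumed here as the value of spark.sql.autoBroadcastJoinThreshold.
    val threshold = 10L * 1024 * 1024
    println(s"estimated size = $estimate bytes, " +
      s"broadcast candidate = ${fitsBroadcast(estimate, threshold)}")
  }
}
```

Note that the estimate is the on-disk size of the files, so compressed or columnar formats may occupy considerably more memory once decoded than the figure suggests.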



