spark-commits mailing list archives

From saru...@apache.org
Subject spark git commit: [SPARK-14368][PYSPARK] Support python.spark.worker.memory with upper-case unit.
Date Tue, 05 Apr 2016 03:20:04 GMT
Repository: spark
Updated Branches:
  refs/heads/branch-1.6 91530b09e -> 285cb9c66


[SPARK-14368][PYSPARK] Support spark.python.worker.memory with upper-case unit.

## What changes were proposed in this pull request?

This fix addresses an issue in PySpark where `spark.python.worker.memory`
could only be configured with a lower-case unit (`k`, `m`, `g`, `t`). It
allows the upper-case units (`K`, `M`, `G`, `T`) to be used as well, so that the
setting conforms to the JVM memory string format specified in the documentation.
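
A minimal sketch of the parsing behavior after this change, modeled on the
`_parse_memory` helper in `python/pyspark/rdd.py` that the diff touches. The
name `parse_memory_mb` is hypothetical and is used only to keep the example
standalone. The result is expressed in megabytes, which is an assumption read
off the unit table (`'m': 1`).

```python
def parse_memory_mb(s):
    """Parse a JVM-style memory string such as '512m' or '2G' into megabytes.

    Hypothetical standalone version of the patched helper, for illustration.
    """
    # Multipliers relative to one megabyte, same table as in the diff.
    units = {'g': 1024, 'm': 1, 't': 1 << 20, 'k': 1.0 / 1024}
    unit = s[-1].lower()  # normalize so 'G' and 'g' are treated alike
    if unit not in units:
        raise ValueError("invalid format: " + s)
    return int(float(s[:-1]) * units[unit])


print(parse_memory_mb("2g"))     # 2048
print(parse_memory_mb("2G"))     # 2048
print(parse_memory_mb("1024K"))  # 1
```

Lower-casing the suffix once and reusing it for both the membership check and
the lookup keeps the two from disagreeing, which is the mismatch the one-line
change in the diff removes.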

## How was this patch tested?

This fix adds a test that covers the change.

Author: Yong Tang <yong.tang.github@outlook.com>

Closes #12163 from yongtang/SPARK-14368.

(cherry picked from commit 7db56244fa3dba92246bad6694f31bbf68ea47ec)
Signed-off-by: Kousuke Saruta <sarutak@oss.nttdata.co.jp>


Project: http://git-wip-us.apache.org/repos/asf/spark/repo
Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/285cb9c6
Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/285cb9c6
Diff: http://git-wip-us.apache.org/repos/asf/spark/diff/285cb9c6

Branch: refs/heads/branch-1.6
Commit: 285cb9c66238d67ea8dc8c07358802b57a0d9f84
Parents: 91530b0
Author: Yong Tang <yong.tang.github@outlook.com>
Authored: Tue Apr 5 12:19:20 2016 +0900
Committer: Kousuke Saruta <sarutak@oss.nttdata.co.jp>
Committed: Tue Apr 5 12:19:50 2016 +0900

----------------------------------------------------------------------
 python/pyspark/rdd.py   |  2 +-
 python/pyspark/tests.py | 12 ++++++++++++
 2 files changed, 13 insertions(+), 1 deletion(-)
----------------------------------------------------------------------


http://git-wip-us.apache.org/repos/asf/spark/blob/285cb9c6/python/pyspark/rdd.py
----------------------------------------------------------------------
diff --git a/python/pyspark/rdd.py b/python/pyspark/rdd.py
index 00bb9a6..1ed098c 100644
--- a/python/pyspark/rdd.py
+++ b/python/pyspark/rdd.py
@@ -115,7 +115,7 @@ def _parse_memory(s):
     2048
     """
     units = {'g': 1024, 'm': 1, 't': 1 << 20, 'k': 1.0 / 1024}
-    if s[-1] not in units:
+    if s[-1].lower() not in units:
         raise ValueError("invalid format: " + s)
     return int(float(s[:-1]) * units[s[-1].lower()])
 

http://git-wip-us.apache.org/repos/asf/spark/blob/285cb9c6/python/pyspark/tests.py
----------------------------------------------------------------------
diff --git a/python/pyspark/tests.py b/python/pyspark/tests.py
index 5cb0a1b..7e072c0 100644
--- a/python/pyspark/tests.py
+++ b/python/pyspark/tests.py
@@ -1966,6 +1966,18 @@ class ContextTests(unittest.TestCase):
             self.assertGreater(sc.startTime, 0)
 
 
+class ConfTests(unittest.TestCase):
+    def test_memory_conf(self):
+        memoryList = ["1T", "1G", "1M", "1024K"]
+        for memory in memoryList:
+            sc = SparkContext(conf=SparkConf().set("spark.python.worker.memory", memory))
+            l = list(range(1024))
+            random.shuffle(l)
+            rdd = sc.parallelize(l, 4)
+            self.assertEqual(sorted(l), rdd.sortBy(lambda x: x).collect())
+            sc.stop()
+
+
 @unittest.skipIf(not _have_scipy, "SciPy not installed")
 class SciPyTests(PySparkTestCase):
 



