spark-reviews mailing list archives

From HyukjinKwon <...@git.apache.org>
Subject [GitHub] spark pull request #20902: [SPARK-23770][R] Exposes repartitionByRange in Sp...
Date Mon, 26 Mar 2018 06:42:22 GMT
GitHub user HyukjinKwon opened a pull request:

    https://github.com/apache/spark/pull/20902

    [SPARK-23770][R] Exposes repartitionByRange in SparkR

    ## What changes were proposed in this pull request?
    
    This PR proposes to expose `repartitionByRange` in SparkR.
    
    ```R
    > df <- createDataFrame(iris)
    ...
    > getNumPartitions(repartitionByRange(df, 3, col = df$Species))
    [1] 3
    ```
    
    ## How was this patch tested?
    
    Tested manually, and unit tests were added. The difference from `repartition` can be seen below:
    
    ```R
    > df <- createDataFrame(mtcars)
    > take(repartition(df, 10, df$wt), 3)
       mpg cyl  disp  hp drat    wt  qsec vs am gear carb
    1 14.3   8 360.0 245 3.21 3.570 15.84  0  0    3    4
    2 10.4   8 460.0 215 3.00 5.424 17.82  0  0    3    4
    3 32.4   4  78.7  66 4.08 2.200 19.47  1  1    4    1
    > take(repartitionByRange(df, 10, df$wt), 3)
       mpg cyl disp hp drat    wt  qsec vs am gear carb
    1 30.4   4 75.7 52 4.93 1.615 18.52  1  1    4    2
    2 33.9   4 71.1 65 4.22 1.835 19.90  1  1    4    1
    3 27.3   4 79.0 66 4.08 1.935 18.90  1  1    4    1
    ```
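
    The outputs above differ because range partitioning places rows with neighbouring `wt` values in the same partition, so the first partition holds the lightest cars. The following is a conceptual sketch in base R only, not SparkR's implementation: the names `num_partitions`, `bounds`, `range_id` and `hash_id` are made up for illustration, quantiles stand in for the boundaries that Spark estimates by sampling, and the modulo expression is only a toy stand-in for a real hash function.

    ```R
    # Conceptual sketch in base R; no Spark session is needed or used.
    wt <- mtcars$wt
    num_partitions <- 10  # same partition count as in the example above

    # Range partitioning: split the key space into contiguous intervals.
    # Assumption: quantiles approximate the boundaries Spark would sample.
    probs <- seq(0, 1, length.out = num_partitions + 1)
    bounds <- quantile(wt, probs = probs[-c(1, length(probs))], names = FALSE)
    range_id <- findInterval(wt, bounds)  # 0-based partition index per row

    # Toy hash-style assignment, shown only for contrast with range_id.
    hash_id <- (round(wt * 1000) %/% 5) %% num_partitions

    # With range partitioning, every key in a lower partition is no larger
    # than every key in a higher partition.
    range_max <- tapply(wt, range_id, max)
    range_min <- tapply(wt, range_id, min)
    ordered_ranges <- all(head(range_max, -1) <= tail(range_min, -1))

    # Keys that land in the first range partition, in ascending order.
    first_range <- sort(wt[range_id == 0])
    print(ordered_ranges)
    print(first_range)
    ```

    Comparing `split(wt, range_id)` with `split(wt, hash_id)` shows the same contrast as the two `take` calls: the range-based groups are contiguous slices of `wt`, while the hash-style groups mix light and heavy cars.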

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/HyukjinKwon/spark r-repartitionByRange

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/20902.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #20902
    
----
commit 264b50e9f480647d0a807e8c591a7a36944322ce
Author: hyukjinkwon <gurwls223@...>
Date:   2018-03-26T04:37:53Z

    Expose repartitionByRange in SparkR

----



