spark-reviews mailing list archives

From WeichenXu123 <...@git.apache.org>
Subject [GitHub] spark issue #20806: [SPARK-23661][SQL] Implement treeAggregate on Dataset AP...
Date Thu, 15 Mar 2018 02:49:30 GMT
Github user WeichenXu123 commented on the issue:

    https://github.com/apache/spark/pull/20806
  
    @viirya Yes, `treeAggregate` should only apply to global aggregation.
    But in this PR the API has to use `seqOp`/`combOp`.
    What I expect is that the DataFrame version of `treeAggregate` can exploit built-in aggregate functions
(supposing that in the future we have a built-in aggregate function for vector types).
    
    If `dataset.groupBy()` is not given any key column, it groups the whole dataset as one group,
so it matches the `treeAggregate` case. Or do you have a better design?
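
To make the `seqOp`/`combOp` shape concrete, here is a rough pure-Python sketch of a global tree aggregation. It is not Spark code and not the API proposed in this PR: `tree_aggregate`, its `fan_in` parameter, and the sample partitions are hypothetical names used only for illustration (Spark's RDD `treeAggregate` takes a `depth` argument instead). The whole input is treated as one group, which is the case that a key-less `groupBy()` would correspond to.

```python
from functools import reduce


def tree_aggregate(partitions, zero, seq_op, comb_op, fan_in=2):
    """Hypothetical sketch: fold each partition, then merge partials in rounds.

    `zero` should be immutable (or seq_op must not mutate it), because the
    same object is used as the starting value for every partition.
    """
    if fan_in < 2:
        raise ValueError("fan_in must be at least 2")

    # Step 1: seq_op folds the rows of each partition into one partial result.
    partials = [reduce(seq_op, part, zero) for part in partitions]
    if not partials:
        return zero

    # Step 2: comb_op merges partial results, fan_in at a time, level by
    # level, instead of sending every partial to a single node at once.
    while len(partials) > 1:
        partials = [
            reduce(comb_op, partials[i:i + fan_in])
            for i in range(0, len(partials), fan_in)
        ]
    return partials[0]


# Global (sum, count) over made-up data split into four partitions.
parts = [[1, 2, 3], [4, 5], [6], [7, 8, 9, 10]]
seq_op = lambda acc, x: (acc[0] + x, acc[1] + 1)
comb_op = lambda a, b: (a[0] + b[0], a[1] + b[1])

total, count = tree_aggregate(parts, (0, 0), seq_op, comb_op)
```

A built-in aggregate function would replace the hand-written `seq_op`/`comb_op` pair here; the multi-level merge in step 2 would stay the same.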


