spark-reviews mailing list archives

From WeichenXu123
Subject [GitHub] spark issue #20806: [SPARK-23661][SQL] Implement treeAggregate on Dataset AP...
Date Thu, 15 Mar 2018 02:49:30 GMT
Github user WeichenXu123 commented on the issue:
    @viirya Yes, `treeAggregate` should only apply to global aggregation.
    But in this PR the API has to use `seqOp`/`combOp`.
    What I expect is that the DataFrame version of `treeAggregate` can exploit built-in aggregate functions (suppose that in the future we have a built-in aggregate function for the vector type).
    If `dataset.groupBy()` is not given any key column, it groups the whole dataset into a single group, so it matches the `treeAggregate` case. Or do you have a better design?
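
To make the `seqOp`/`combOp` shape discussed above concrete, here is a minimal pure-Python sketch of tree-style aggregation over in-memory partitions. It is not Spark's implementation and does not use the Spark API: the function `tree_aggregate`, its `fan_in` parameter, and the sample data are all hypothetical names chosen for illustration. It only shows that folding each partition with `seq_op` and then merging partial results in rounds with `comb_op` yields the same answer as a single global aggregate with no grouping key.

```python
from functools import reduce


def tree_aggregate(partitions, zero, seq_op, comb_op, fan_in=2):
    """Hypothetical helper: aggregate a list of partitions in tree fashion.

    `zero` must be immutable here, because the same object seeds every partition.
    """
    # Step 1: fold each partition locally with seq_op, starting from zero.
    partials = [reduce(seq_op, part, zero) for part in partitions]
    # Step 2: merge partial results in rounds, at most `fan_in` per merge,
    # instead of sending every partial result to one place at once.
    while len(partials) > 1:
        partials = [
            reduce(comb_op, partials[i:i + fan_in])
            for i in range(0, len(partials), fan_in)
        ]
    return partials[0] if partials else zero


# Sample data (an assumption for illustration): four partitions of integers.
partitions = [[1, 2, 3], [4, 5], [6], [7, 8, 9, 10]]


def seq_op(acc, x):
    # Fold one element into a (sum, count) accumulator.
    return (acc[0] + x, acc[1] + 1)


def comb_op(a, b):
    # Merge two (sum, count) accumulators.
    return (a[0] + b[0], a[1] + b[1])


result = tree_aggregate(partitions, (0, 0), seq_op, comb_op)

# The same quantity computed as one global aggregate over all rows (no key).
flat = [x for part in partitions for x in part]
global_result = (sum(flat), len(flat))

print(result, global_result)
```

Both values agree, which is the property the comment relies on: a global aggregate with no key column is the case that a tree-structured merge can serve.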

