spark-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Koert Kuipers <ko...@tresata.com>
Subject Re: Aggregating over sorted data
Date Tue, 20 Dec 2016 03:32:01 GMT
take a look at:
https://issues.apache.org/jira/browse/SPARK-15798


On Dec 19, 2016 00:17, "Robin East" <robin.east@xense.co.uk> wrote:

This is also a feature we need for our time-series processing



> On 19 Dec 2016, at 04:07, Liang-Chi Hsieh <viirya@gmail.com> wrote:
>
>
> Hi,
>
> As I know, Spark SQL doesn't provide native support for this feature now.
> After searching, I found only few database systems support it, e.g.,
> PostgreSQL.
>
> Actually based on the Spark SQL's aggregate system, I think it is not very
> difficult to add the support for this feature. The problem is how
frequently
> this feature is needed for Spark SQL users and if it is worth adding this,
> because as I see, this feature is not very common.
>
> Alternative possible to achieve this in current Spark SQL, is to use
> Aggregator with Dataset API. You can write your custom Aggregator which
has
> an user-defined JVM object as buffer to hold the input data into your
> aggregate function. But you may need to write necessary encoder for the
> buffer object.
>
> If you really need this feature, you may open a Jira to ask others'
opinion
> about this feature.
>
>
>
>
>
>
> -----
> Liang-Chi Hsieh | @viirya
> Spark Technology Center
> http://www.spark.tc/
> --
> View this message in context: http://apache-spark-
developers-list.1001551.n3.nabble.com/Aggregating-over-
sorted-data-tp19999p20273.html
> Sent from the Apache Spark Developers List mailing list archive at
Nabble.com.
>
> ---------------------------------------------------------------------
> To unsubscribe e-mail: dev-unsubscribe@spark.apache.org
>


---------------------------------------------------------------------
To unsubscribe e-mail: dev-unsubscribe@spark.apache.org

Mime
View raw message