spark-user mailing list archives

From Bahubali Jain <bahub...@gmail.com>
Subject Re: Time based aggregation in Real time Spark Streaming
Date Mon, 01 Dec 2014 16:41:30 GMT
Hi,
You can give every message that falls in the same 3-minute interval a common
key (for example, the interval's start time), group by that key, and then
add up the counts per group.
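
A minimal sketch of that idea in plain Python (not Spark code), under a few
assumptions that are not in the original message: the timestamps use the
layout day-month-year:hour:minute, windows are aligned to a caller-chosen
origin (12:01 here, so the buckets line up with the sample), and the names
window_start, aggregate and SAMPLE are hypothetical. In Spark, the same key
function could be applied with map and the counts combined with reduceByKey.

```python
from collections import defaultdict
from datetime import datetime, timedelta

TS_FORMAT = "%d-%m-%Y:%H:%M"       # assumed layout of the sample timestamps
WINDOW = timedelta(minutes=3)      # size of one time slice

# Sample rows from the question: (item id, item type, timestamp)
SAMPLE = [
    (1, "X", "1-12-2014:12:01"),
    (1, "X", "1-12-2014:12:02"),
    (1, "X", "1-12-2014:12:03"),
    (1, "y", "1-12-2014:12:04"),
    (1, "y", "1-12-2014:12:05"),
    (1, "y", "1-12-2014:12:06"),
]


def window_start(ts, origin):
    """Start of the 3-minute window (aligned to `origin`) that contains `ts`."""
    return origin + ((ts - origin) // WINDOW) * WINDOW


def aggregate(messages, origin):
    """Group by (item id, item type, window start) and count each group.

    Returns rows of (item id, item type, count, first timestamp, last timestamp).
    """
    groups = defaultdict(list)
    for item_id, item_type, raw_ts in messages:
        ts = datetime.strptime(raw_ts, TS_FORMAT)
        key = (item_id, item_type, window_start(ts, origin))
        groups[key].append(ts)

    rows = []
    for (item_id, item_type, _start), stamps in sorted(groups.items()):
        rows.append((item_id, item_type, len(stamps), min(stamps), max(stamps)))
    return rows


if __name__ == "__main__":
    for row in aggregate(SAMPLE, origin=datetime(2014, 12, 1, 12, 1)):
        print(row)
```

The choice of origin matters: with windows aligned to the top of the hour
instead, 12:01 and 12:02 would share a bucket with 12:00, and the groups
would no longer match the sample result.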

Thanks
On Dec 1, 2014 9:02 PM, "pankaj" <pankajentc@gmail.com> wrote:

> Hi,
>
> My incoming messages carry a timestamp as one field, and I have to perform
> aggregation over 3-minute time slices.
>
> Message sample
>
> "Item ID" "Item Type" "timeStamp"
> 1                  X               1-12-2014:12:01
> 1                  X               1-12-2014:12:02
> 1                  X               1-12-2014:12:03
> 1                  y               1-12-2014:12:04
> 1                  y               1-12-2014:12:05
> 1                  y               1-12-2014:12:06
>
> Aggregation Result
> ItemId   ItemType   count   aggregationStartTime   aggrEndTime
> 1        X          3       1-12-2014:12:01        1-12-2014:12:03
> 1        y          3       1-12-2014:12:04        1-12-2014:12:06
>
> What is the best way to perform time-based aggregation in Spark?
> Kindly suggest.
>
> Thanks
>
