flink-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Aljoscha Krettek <aljos...@apache.org>
Subject Re: Finding the average temperature
Date Fri, 19 Feb 2016 09:41:52 GMT
as I understand it the “temp_reading_timestamp” field is not a key on which you can partition
your data. This is a field that would be used for assigning the elements to timestamps. In
you data you also have the “probeID” field. This is a field that could be used to parallelize
computation, for example you could
do the following:

val inputStream = <define some source>

val result = inputStream
  .assignAscendingTimestamps { e => e.temp_reading_timestamp }
  .keyBy { e => e.probeID }
  .apply(new SumFunction(), new ComputeAverageFunction())


(Where SumFunction() would sum up temperatures and keep a count and ComputeAverageFunction()
would divide the sum by the count.)

In this way, computation is parallelized because it can be spread across several machines
and partitioned by the key. Without such a key everything has to be computed on one machine
because a global view of the data is required.

> On 18 Feb 2016, at 17:54, Nirmalya Sengupta <sengupta.nirmalya@gmail.com> wrote:
> Hello Aljoscha <aljoscha@apache.org>,
> You mentioned: '.. Yes, this is right if you temperatures don’t have any other field
on which you could partition them. '.
> What I am failing to understand is that if temperatures are partitioned on some other
field (in my use-case, I have one such: the temp_reading_timestamp), they will be pushed to
different nodes (different threads in local run) based on that field. Because they will be
computed (scattered) and later collected (gathered), how could I arrive at the _running_ average
temperature? The client application needs to know *how the average temperature is changing
over time'. 
> Could you please fill in the gap in my understanding?
> -- Nirmalya
> -- 
> Software Technologist
> http://www.linkedin.com/in/nirmalyasengupta
> "If you have built castles in the air, your work need not be lost. That is where they
should be.
> Now put the foundation under them."

View raw message