storm-user mailing list archives

From: Andrew Xor
Subject: Batch Process tuples emitted by different streams
Date: Thu, 31 Jul 2014 02:14:13 GMT

I have a scenario where a single bolt receives the outputs of multiple
spouts (each spout is a live stream of emitted sensor values). From my
understanding, the task running the processing bolt will receive each
tuple on its own, one at a time, separately for each stream.

The thing is that I want the bolt to process the values of all the
streams in the same tick. One method (if the processing bolt has only one
thread) is to wait for a short period of time or for some number of ticks
(for example, process and emit once every x received tuples, storing the
tuples received from each stream in a map in the meantime); that way the
bolt collects tuples from every stream and can process them as a batch.
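
To make the idea concrete, here is a minimal sketch of the buffering I have
in mind, written as plain Java with no Storm dependency. The class
StreamBatcher and its method names are hypothetical, and it assumes each
tuple carries a single numeric sensor value and that the bolt runs
single-threaded:

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper, not part of Storm: buffers values per stream and
// releases them as one batch after every x received tuples.
final class StreamBatcher {
    private final int batchSize; // the "x" from the description above
    private final Map<String, List<Double>> buffer = new LinkedHashMap<>();
    private int received = 0;

    StreamBatcher(int batchSize) {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        this.batchSize = batchSize;
    }

    // Stores one value under its stream id. Returns the whole batch once
    // x tuples have arrived in total, otherwise returns null.
    Map<String, List<Double>> add(String streamId, double value) {
        buffer.computeIfAbsent(streamId, k -> new ArrayList<>()).add(value);
        received++;
        if (received < batchSize) {
            return null;
        }
        // Copy the map (the lists are handed over as-is), then reset.
        Map<String, List<Double>> batch = new LinkedHashMap<>(buffer);
        buffer.clear();
        received = 0;
        return batch;
    }
}
```

Inside a bolt I would presumably call add(...) from execute(Tuple), using
something like tuple.getSourceComponent() as the stream id, and emit
whenever the result is not null. Note that counting tuples alone does not
guarantee that every stream is represented in a batch.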

Would that be a sound approach in a non-transactional topology, or should
I use Trident to ensure ordering? Also, I could not find in Storm's
documentation whether chronological ordering is enforced in any way. For
example, let's say that we have two spouts that each emit two tuples:

 Spout1: (Tuple1, t1), (Tuple2, t2)
 Spout2: (Tuple3, t1), (Tuple4, t2)

In which order will the bolt receive the tuples? Will the chronological
order be preserved in a Trident topology?
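
If arrival order cannot be relied on, one alternative would be to group
tuples by their timestamp instead, so that the result does not depend on
the order in which the bolt sees them. The following is only a sketch in
plain Java: TimestampAligner is a hypothetical class, and it assumes every
spout emits exactly one tuple per timestamp. Entries for a timestamp that
never completes would stay in memory, so a real version would need some
form of eviction.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical helper, not part of Storm: collects one value per source
// for each timestamp and releases the group once all sources reported.
final class TimestampAligner {
    private final Set<String> expectedSources;
    private final Map<Long, Map<String, String>> pending = new HashMap<>();

    TimestampAligner(Set<String> expectedSources) {
        this.expectedSources = new HashSet<>(expectedSources);
    }

    // Returns the complete group for this timestamp, or null while at
    // least one expected source is still missing.
    Map<String, String> add(String source, long timestamp, String value) {
        Map<String, String> group =
                pending.computeIfAbsent(timestamp, k -> new HashMap<>());
        group.put(source, value);
        if (!group.keySet().containsAll(expectedSources)) {
            return null;
        }
        pending.remove(timestamp);
        return group;
    }

    // Number of timestamps still waiting for at least one source.
    int pendingCount() {
        return pending.size();
    }
}
```

With the two spouts from the example, feeding the tuples in any
interleaving yields the group {Spout1=Tuple1, Spout2=Tuple3} for t1 and
{Spout1=Tuple2, Spout2=Tuple4} for t2, each released as soon as its second
member arrives.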
