storm-user mailing list archives

From Neil Carroll
Subject RE: Can Storm write an Aggregate Record to Postgres or SQL Server?
Date Tue, 08 Apr 2014 23:11:40 GMT
Many thanks! 
Date: Tue, 8 Apr 2014 15:05:48 -0500
Subject: Re: Can Storm write an Aggregate Record to Postgres or SQL Server?

Can you elaborate on how you want to "aggregate" data? If each log entry is essentially a
timestamp, a transaction type (since you mentioned this), and a numerical value (which you
want to sum during the 5-minute window), then you don't need tick tuples.

The way we do aggregation is by mapping a timestamp into a bucket (e.g., your 5-minute window),
grouping by the timestamp and transactionType, and using trident's persistentAggregate functionality
to compute the sum.

Something like this:

    TridentTopology topology = new TridentTopology();
    Stream stream = topology.newStream("spout2", spout)
      .each(new Fields("bytes"), new BinaryToString(), new Fields("string"))
      .each(new Fields("string"), new LogParser(), new Fields("timestamp", "transactionType",
      .each(new Fields("timestamp"), new Bucket(entry.getValue()), new Fields("bucketStart",
      .groupBy(new Fields("bucketStart", "transactionType"))
      .persistentAggregate(stateFactory, new Fields("value"), new Sum(), new Fields("count"))
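
To make the bucketing and grouping idea concrete outside of Trident, here is a minimal plain-Java sketch. It is not the Bucket helper referenced above and uses no Storm classes; the class name BucketSum, the "bucketStart|transactionType" string key, and the assumption that timestamps are epoch milliseconds are all hypothetical choices made for illustration.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class BucketSum {
    // Width of the aggregation window: 5 minutes, in milliseconds (assumed unit).
    public static final long WINDOW_MS = 5 * 60 * 1000L;

    // Map an epoch-millisecond timestamp to the start of its 5-minute window.
    // floorDiv keeps the result correct for negative timestamps as well.
    public static long bucketStart(long timestampMs) {
        return Math.floorDiv(timestampMs, WINDOW_MS) * WINDOW_MS;
    }

    // Sum the values per (bucketStart, transactionType) pair.
    // The key is a simple "bucketStart|type" string, chosen only for brevity.
    public static Map<String, Long> aggregate(long[] timestamps, String[] types, long[] values) {
        Map<String, Long> sums = new LinkedHashMap<>();
        for (int i = 0; i < timestamps.length; i++) {
            String key = bucketStart(timestamps[i]) + "|" + types[i];
            sums.merge(key, values[i], Long::sum);
        }
        return sums;
    }

    public static void main(String[] args) {
        long[] ts = {0L, 299_999L, 300_000L, 310_000L};
        String[] types = {"login", "login", "login", "purchase"};
        long[] values = {2L, 3L, 5L, 7L};
        // Print one line per (bucket, transaction type) with its summed value.
        aggregate(ts, types, values)
            .forEach((key, sum) -> System.out.println(key + " = " + sum));
    }
}
```

In this sketch the first two "login" entries fall into the window starting at 0 and are summed together, while the entries at 300000 ms and later land in the next window. In the Trident version, groupBy and persistentAggregate play the role of the map and the merge call, and the state factory decides where the sums are stored.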

Of course, you have to write the LogParser yourself since it depends on the format of the
input messages. This example assumes a Kafka spout which is why it starts by parsing a "bytes"
field. You can see the various helper functions here:


On Tue, Apr 8, 2014 at 2:03 PM, Huang, Roger wrote:

Take a look at using “tick tuples”

and the Storm RDBMS bolt


From: Neil Carroll

Sent: Tuesday, April 08, 2014 1:42 PM


Subject: Can Storm write an Aggregate Record to Postgres or SQL Server?


I'm new to Storm and want to use it to aggregate log data over a 5-minute period and write
aggregate records (for each transaction type) into a DBMS (SQL Server or Postgres).
I believe Storm can do this. Is there sample code available?





Cody A. Ray, LEED AP

