incubator-s4-user mailing list archives

From Matthieu Morel <>
Subject Re: Loading data from Accumulo (or any other database)
Date Mon, 23 Jul 2012 21:09:01 GMT

This is certainly possible through an "adapter" app, i.e. code that
converts some input into S4 events.

Have a look at the S4 piper implementation (about to be released); it
includes an example application that loads data from a Twitter stream.

In your case, you'd load data from a database, and would just need to
implement the database iterator. Depending on your requirements, you can
also parallelize the adapter (it's an S4 application after all), instead of
relying on a single PE as in the Twitter adapter example.
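
To make the idea concrete, here is a rough sketch of the adapter pattern in
plain Java. It is not the S4 API: RowEvent, DatabaseAdapter and AdapterSketch
are hypothetical names made up for illustration, and the in-memory map only
stands in for a database scan (an Accumulo Scanner is likewise iterable over
key/value entries). In a real adapter, the downstream consumer would be
replaced by whatever S4 provides for emitting events into a stream.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Consumer;

/** Hypothetical event type; a real S4 app would use S4's own event class. */
final class RowEvent {
    final String rowId;
    final String value;

    RowEvent(String rowId, String value) {
        this.rowId = rowId;
        this.value = value;
    }
}

/** Hypothetical adapter: walks a row iterator and emits one event per row. */
final class DatabaseAdapter {
    private final Iterator<Map.Entry<String, String>> rows;
    private final Consumer<RowEvent> downstream;

    DatabaseAdapter(Iterator<Map.Entry<String, String>> rows,
                    Consumer<RowEvent> downstream) {
        this.rows = rows;
        this.downstream = downstream;
    }

    /** Converts every remaining row into an event; returns the number emitted. */
    int run() {
        int emitted = 0;
        while (rows.hasNext()) {
            Map.Entry<String, String> row = rows.next();
            downstream.accept(new RowEvent(row.getKey(), row.getValue()));
            emitted++;
        }
        return emitted;
    }
}

final class AdapterSketch {
    public static void main(String[] args) {
        // Stand-in for a database table; insertion order mimics a sorted scan.
        Map<String, String> table = new LinkedHashMap<>();
        table.put("row1", "alpha");
        table.put("row2", "beta");

        // Stand-in for the first Processing Element: it just collects events.
        List<RowEvent> received = new ArrayList<>();

        int emitted = new DatabaseAdapter(table.entrySet().iterator(), received::add).run();
        System.out.println("emitted " + emitted + " events");
    }
}
```

If you parallelize the adapter, one option (again an assumption, not
something the example app does) is to give each adapter instance its own
iterator over a disjoint range of rows, so that no row is emitted twice.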

Hope this helps,


On Mon, Jul 23, 2012 at 10:17 PM, Sean Pines <> wrote:

> Hello,
> I was wondering if there was a way to load data from a database into an S4
> workflow. I'm not sure if you're familiar with Accumulo, but they provide a
> POJO to scan over the database one row at a time. I would like to be able
> to loop over the table one row at a time and emit an event (for each row)
> to the first Processing Element.
> If this is possible within S4, could you please provide an example of how
> to do so? Ideally I'd like to have all of this done within Java (the only
> 'ingest' examples I can find are via shell scripts).
> Thanks in advance!
> Sean
