streams-dev mailing list archives

From Steve Blackmon <>
Subject Re: Substantial commit to new branch
Date Tue, 21 Jan 2014 22:54:44 GMT

Following up to let everyone know this work has been merged with master.

Additionally, a few small changes to storm-core and an initial
implementation of storm wrappers have been committed.

Anyone interested in how storm would be used to deploy data pipelines
can pull down the latest trunk, install, and then build

At a high level, moreover-metabase-storm/pom.xml and
MoreoverMetabaseTopology are a template for how easy I think it should
be to assemble a storm topology from streams components.
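
For a rough feel of the provider / processor / persister pattern without
building anything, here is a minimal in-process sketch. It is not the
streams-core API and it does not use storm: the names PipelineSketch,
Provider, Processor, Persister, run, and demo are hypothetical stand-ins
chosen for illustration, and the "activity:" prefix is a placeholder for
a real conversion to activitystreams format.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class PipelineSketch {

    // Hypothetical stand-ins for the three roles described for streams-core.
    // These are NOT the real streams-core interfaces.
    interface Provider<T> { List<T> read(); }
    interface Processor<I, O> { O process(I input); }
    interface Persister<T> { void write(T item); }

    // Wire one provider, one processor, and one persister together:
    // every item read is processed and then handed to the persister.
    static <I, O> void run(Provider<I> provider,
                           Processor<I, O> processor,
                           Persister<O> persister) {
        for (I item : provider.read()) {
            persister.write(processor.process(item));
        }
    }

    // Example assembly: the provider returns the given messages, the
    // processor tags each one (placeholder for a format conversion),
    // and the persister collects results in an in-memory list.
    static List<String> demo(List<String> messages) {
        List<String> sink = new ArrayList<>();
        Provider<String> provider = () -> messages;
        Processor<String, String> processor = s -> "activity:" + s;
        Persister<String> persister = sink::add;
        run(provider, processor, persister);
        return sink;
    }

    public static void main(String[] args) {
        // prints [activity:a, activity:b]
        System.out.println(demo(Arrays.asList("a", "b")));
    }
}
```

In a storm deployment the same three roles would presumably be wrapped
as spouts and bolts rather than called in a loop; the sketch only shows
how the pieces are meant to compose.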

Steve Blackmon

On Fri, Jan 10, 2014 at 4:47 PM, Steve Blackmon <> wrote:
> Greetings,
> Yesterday I completed a push of code we've been using to ingest data streams
> from several major data providers, validate their messages, and convert them
> to activitystreams format. There are some new top-level modules, including
>    a) streams-core - standard interfaces for the atomic units of streams -
> providers, persisters, and processors
>    b) streams-pojo - Jackson-compatible beans generated from activitystreams
> json schemas
>    c) streams-contrib - a collection of implementation modules, two or more
> of which can be imported into a new project and woven together to create a
> customized performant data stream to execute with java jar, storm jar,
> hadoop jar, yarn jar, etc...
>    d) streams-config - a typesafe-based configuration scheme that allows
> individual modules and coordinator code to pull the configuration parameters
> they require or support from supplied defaults, environment variables,
> run-time property files, command line parameters, or accessible HTTP
> end-points.
> I'd love to see this project emerge as a code workspace where social data
> vendors and consumers collaborate to ease the process of integration, and
> facilitate data interchange with public data schemas and protocols such as
> xml and json activitystreams formats.  No jvm-centric social data
> interoperability ecosystem exists today to my knowledge.  Hopefully this
> code will become a valuable starting point.  We have additional assets we
> will commit to streams-contrib in the coming months as we get them cleaned
> up, compliant with the streams-core interfaces, unit-tested, and real-world
> tested.
> I've also created a separate external repository with some reference data
> pipelines that demonstrate how to assemble various modules into end-to-end
> streams at  Today it contains
> a working twitter gardenhose to activitystreams java process, and a
> storm-based firehose processor that is still WIP.  More to come in this repo
> as well.
> Would love to get feedback on the concepts, patterns, and interfaces
> proposed.  Will seek to merge with master in the standard 72 hours unless
> anyone objects.
> Best,
> Steve Blackmon
