streams-dev mailing list archives

From Jason Letourneau <>
Subject Re: Substantial commit to new branch
Date Tue, 21 Jan 2014 23:08:43 GMT
This is bad@$$ - hoping to spin it up in the next couple of days!

On Tue, Jan 21, 2014 at 5:54 PM, Steve Blackmon <> wrote:
> Hello,
> Following up to let everyone know this work has been merged with master.
> Additionally, a few small changes to storm-core and an initial
> implementation of storm wrappers have been committed.
> Anyone interested in how storm would be used to deploy data pipelines,
> pull down the latest trunk, install, and then build
> At a high level, moreover-metabase-storm/pom.xml and
> MoreoverMetabaseTopology are a template for how easy I think it should
> be to assemble a storm topology from streams components.
> Steve Blackmon
> On Fri, Jan 10, 2014 at 4:47 PM, Steve Blackmon <> wrote:
>> Greetings,
>> Yesterday I completed a push of code we've been using to ingest data streams
>> from several major data providers, validate their messages, and convert them
>> to activitystreams format. There are some new top-level modules, including
>>    a) streams-core - standard interfaces for the atomic units of streams -
>> providers, persisters, and processors
>>    b) streams-pojo - Jackson-compatible beans generated from activitystreams
>> json schemas
>>    c) streams-contrib - a collection of implementation modules, two or more
>> of which can be imported into a new project and woven together to create a
>> customized performant data stream to execute with java jar, storm jar,
>> hadoop jar, yarn jar, etc...
>>    d) streams-config - a typesafe-based configuration scheme that allows
>> individual modules and coordinator code to pull the configuration parameters
>> they require or support from supplied defaults, environment variables,
>> run-time property files, command line parameters, or accessible HTTP
>> end-points.
>> I'd love to see this project emerge as a code workspace where social data
>> vendors and consumers collaborate to ease the process of integration, and
>> facilitate data interchange with public data schemas and protocols such as
>> xml and json activitystreams formats.  No jvm-centric social data
>> interoperability ecosystem exists today to my knowledge.  Hopefully this
>> code will become a valuable starting point.  We have additional assets we
>> will commit to streams-contrib in the coming months as we get them cleaned
>> up, compliant with the streams-core interfaces, unit-tested, and real-world
>> tested.
>> I've also created a separate external repository with some reference data
>> pipelines that demonstrate how to assemble various modules into end-to-end
>> streams at  Today it contains
>> a working twitter gardenhose to activitystreams java process, and a
>> storm-based firehose processor that is still WIP.  More to come in this repo
>> as well.
>> Would love to get feedback on the concepts, patterns, and interfaces
>> proposed.  Will seek to merge with master in the standard 72 hours unless
>> anyone objects.
>> Best,
>> Steve Blackmon
