streams-dev mailing list archives

From Steve Blackmon <sblack...@apache.org>
Subject Substantial commit to new branch
Date Fri, 10 Jan 2014 22:47:16 GMT
Greetings,

Yesterday I completed a push of code we've been using to ingest data
streams from several major data providers, validate their messages, and
convert them to activitystreams format. There are some new top-level
modules, including:
   a) streams-core - standard interfaces for the atomic units of streams -
providers, persisters, and processors
   b) streams-pojo - Jackson-compatible beans generated from
activitystreams JSON schemas
   c) streams-contrib - a collection of implementation modules, two or more
of which can be imported into a new project and woven together to create a
customized, performant data stream to execute with java jar, storm jar,
hadoop jar, yarn jar, etc.
   d) streams-config - a Typesafe-based configuration scheme that allows
individual modules and coordinator code to pull the configuration
parameters they require or support from supplied defaults, environment
variables, run-time property files, command-line parameters, or accessible
HTTP endpoints.
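
To make the provider / processor / persister split described under
streams-core a bit more concrete, here is a minimal sketch of how three such
units might be woven into one stream.  Every name in it (Provider, Processor,
Persister, runStream, run) is a hypothetical stand-in chosen for illustration
and is not the actual streams-core interface; the "conversion" step is a
placeholder, not a real activitystreams mapping.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Optional;

// Illustrative only: all type and method names here are assumptions,
// not the real streams-core API.
public class PipelineSketch {

    /** Emits raw items from some source. */
    interface Provider {
        List<String> read();
    }

    /** Validates or converts one item; an empty result drops the item. */
    interface Processor {
        Optional<String> process(String item);
    }

    /** Writes a finished item to some sink. */
    interface Persister {
        void write(String item);
    }

    /** Passes each provided item through the processors in order, then persists what survives. */
    static void runStream(Provider provider, List<Processor> processors, Persister persister) {
        for (String item : provider.read()) {
            Optional<String> current = Optional.of(item);
            for (Processor p : processors) {
                // Once an item has been dropped, later processors are skipped.
                current = current.flatMap(p::process);
            }
            current.ifPresent(persister::write);
        }
    }

    /** Wires a list-backed provider, two toy processors, and an in-memory persister. */
    static List<String> run(List<String> rawMessages) {
        List<String> sink = new ArrayList<>();
        // Validation step: drop blank messages.
        Processor validate = msg -> msg.isBlank() ? Optional.empty() : Optional.of(msg);
        // Placeholder conversion step: trim and upper-case the message.
        Processor convert = msg -> Optional.of(msg.trim().toUpperCase());
        runStream(() -> rawMessages, List.of(validate, convert), sink::add);
        return sink;
    }

    public static void main(String[] args) {
        System.out.println(run(List.of("hello", " ", "world")));
    }
}
```

The point of the sketch is only that each unit sits behind a small interface,
so a module can be swapped without touching the others.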

I'd love to see this project emerge as a code workspace where social data
vendors and consumers collaborate to ease the process of integration, and
facilitate data interchange with public data schemas and protocols such as
XML and JSON activitystreams formats.  To my knowledge, no JVM-centric social
data interoperability ecosystem exists today.  Hopefully this
code will become a valuable starting point.  We have additional assets we
will commit to streams-contrib in the coming months as we get them cleaned
up, compliant with the streams-core interfaces, unit-tested, and real-world
tested.

I've also created a separate external repository with some reference data
pipelines that demonstrate how to assemble various modules into end-to-end
streams at https://github.com/w2ogroup/streams-examples.  Today it contains
a working Twitter gardenhose to activitystreams Java process, and a
Storm-based firehose processor that is still WIP.  More to come in this
repo as well.

Would love to get feedback on the concepts, patterns, and interfaces
proposed.  Will seek to merge with master in the standard 72 hours unless
anyone objects.

Best,
Steve Blackmon
