incubator-cvs mailing list archives

From Apache Wiki <wikidi...@apache.org>
Subject [Incubator Wiki] Update of "DataflowProposal" by jbonofre
Date Wed, 20 Jan 2016 16:42:58 GMT
Dear Wiki user,

You have subscribed to a wiki page or wiki category on "Incubator Wiki" for change notification.

The "DataflowProposal" page has been changed by jbonofre:
https://wiki.apache.org/incubator/DataflowProposal?action=diff&rev1=2&rev2=3

  
   * MapReduce - http://research.google.com/archive/mapreduce.html
   * Dataflow model  - http://www.vldb.org/pvldb/vol8/p1792-Akidau.pdf
-  * FlumeJava - http://notes.stephenholiday.com/FlumeJava.pdf
+  * FlumeJava - http://research.google.com/pubs/pub35650.html
   * MillWheel - http://research.google.com/pubs/pub41378.html
  
  Dataflow was designed from the start to provide a portable programming layer. When you define
a data processing pipeline with the Dataflow model, you are creating a job that can be
executed by any number of Dataflow processing engines. Several runners have been
developed to run Dataflow pipelines on other open source runtimes, including Dataflow runners
for Apache Flink and Apache Spark. There is also a “direct runner”, for execution on the
developer machine (mainly for dev/debug purposes). Another runner allows a Dataflow program
to run on a managed service, Google Cloud Dataflow, in Google Cloud Platform. The Dataflow
Java SDK is already available on GitHub and is independent of the Google Cloud Dataflow service.
A Python SDK is currently in active development.
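
To illustrate the portability idea described above, here is a minimal sketch in Python. It is not the Dataflow SDK API: the names Pipeline, DirectRunner, LazyRunner, split_words and lowercase are hypothetical and exist only for this example. The sketch assumes a pipeline is simply an ordered list of transforms, defined once and then handed to interchangeable runners.

```python
# Hypothetical illustration only: none of these names come from the Dataflow SDK.
from typing import Callable, Iterable, List

Transform = Callable[[Iterable], Iterable]


class Pipeline:
    """Engine-independent job description: an ordered list of transforms."""

    def __init__(self) -> None:
        self.transforms: List[Transform] = []

    def apply(self, transform: Transform) -> "Pipeline":
        self.transforms.append(transform)
        return self  # returning self allows chained apply() calls


class DirectRunner:
    """Runs in-process and materializes the data after every transform."""

    def run(self, pipeline: Pipeline, source: Iterable) -> list:
        data = list(source)
        for transform in pipeline.transforms:
            data = list(transform(data))
        return data


class LazyRunner:
    """Stand-in for a different engine: chains the transforms lazily and
    materializes the result only once, at the end."""

    def run(self, pipeline: Pipeline, source: Iterable) -> list:
        stream: Iterable = iter(source)
        for transform in pipeline.transforms:
            stream = transform(stream)
        return list(stream)


def split_words(lines: Iterable[str]) -> Iterable[str]:
    """Split each input line into whitespace-separated words."""
    for line in lines:
        yield from line.split()


def lowercase(words: Iterable[str]) -> Iterable[str]:
    """Normalize every word to lower case."""
    return (word.lower() for word in words)


# The pipeline is defined once, with no reference to any runner.
pipeline = Pipeline().apply(split_words).apply(lowercase)
lines = ["Hello Dataflow", "hello Flink"]

# The same definition is executed by two different runners.
direct_result = DirectRunner().run(pipeline, lines)
lazy_result = LazyRunner().run(pipeline, lines)
```

Both runners receive the same pipeline object and should produce the same list of words. The real runners for Flink, Spark and Google Cloud Dataflow translate a pipeline into each engine's own execution model, which this sketch does not attempt to show.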

---------------------------------------------------------------------
To unsubscribe, e-mail: cvs-unsubscribe@incubator.apache.org
For additional commands, e-mail: cvs-help@incubator.apache.org

