From: Robert Metzger
To: "dev@flink.incubator.apache.org"
Date: Wed, 7 Jan 2015 11:43:46 +0100
Subject: Re: svn commit: r1650029 - in /flink: _posts/2015-01-06-december-in-flink.md site/blog/index.html site/blog/page2/index.html site/blog/page3/index.html site/news/2015/ site/news/2015/01/ site/news/2015/01/06/ site/news/2015/01/06/december-in-flink.html

Just FYI, the svnpubsub for the website is currently not working.
This is the respective issue for the website migration: https://issues.apache.org/jira/browse/INFRA-8915

On Wed, Jan 7, 2015 at 11:40 AM, wrote:

> Author: ktzoumas
> Date: Wed Jan  7 10:40:31 2015
> New Revision: 1650029
>
> URL: http://svn.apache.org/r1650029
> Log:
> Added blog post - December 2014 in the Flink community
>
> Added:
>     flink/_posts/2015-01-06-december-in-flink.md
>     flink/site/news/2015/
>     flink/site/news/2015/01/
>     flink/site/news/2015/01/06/
>     flink/site/news/2015/01/06/december-in-flink.html
> Modified:
>     flink/site/blog/index.html
>     flink/site/blog/page2/index.html
>     flink/site/blog/page3/index.html
>
> Added: flink/_posts/2015-01-06-december-in-flink.md
> URL: http://svn.apache.org/viewvc/flink/_posts/2015-01-06-december-in-flink.md?rev=1650029&view=auto
> ==============================================================================
> --- flink/_posts/2015-01-06-december-in-flink.md (added)
> +++ flink/_posts/2015-01-06-december-in-flink.md Wed Jan  7 10:40:31 2015
> @@ -0,0 +1,62 @@
> +---
> +layout: post
> +title: 'December 2014 in the Flink community'
> +date: 2015-01-06 10:00:00
> +categories: news
> +---
> +
> +This is the first blog post of a "newsletter"-like series where we give a summary of the monthly activity in the Flink community. As the Flink project grows, this can serve as a "tl;dr" for people that are not following the Flink dev and user mailing lists, or those that are simply overwhelmed by the traffic.
> +
> +###Flink graduation
> +
> +The biggest news is that the Apache board approved Flink as a top-level Apache project! The Flink team is working closely with the Apache press team for an official announcement, so stay tuned for details!
> +
> +###New Flink website
> +
> +The [Flink website](http://flink.apache.org) got a total make-over, both in terms of appearance and content.
> +
> +###Flink IRC channel
> +
> +A new IRC channel called #flink was created at irc.freenode.org. An easy way to access the IRC channel is through the [web client](http://webchat.freenode.net/). Feel free to stop by to ask anything or share your ideas about Apache Flink!
> +
> +###Meetups and Talks
> +
> +Apache Flink was presented at the [Amsterdam Hadoop User Group](http://www.meetup.com/Netherlands-Hadoop-User-Group/events/218635152).
> +
> +##Notable code contributions
> +
> +**Note:** Code contributions listed here may not be part of a release or even the current snapshot yet.
> +
> +###[Streaming Scala API](https://github.com/apache/incubator-flink/pull/275)
> +
> +The Flink Streaming Java API recently got its Scala counterpart. Once merged, Flink Streaming users can use both Scala and Java for their development. The Flink Streaming Scala API is built as a thin layer on top of the Java API, making sure that the APIs are kept easily in sync.
> +
> +###[Intermediate datasets](https://github.com/apache/incubator-flink/pull/254)
> +
> +This pull request introduces a major change in the Flink runtime. Currently, the Flink runtime is based on the notion of operators that exchange data through channels. With the PR, intermediate data sets that are produced by operators become first-class citizens in the runtime. While this does not have any user-facing impact yet, it lays the groundwork for a slew of future features such as blocking execution, fine-grained fault tolerance, and more efficient data sharing between cluster and client.
> +
> +###[Configurable execution mode](https://github.com/apache/incubator-flink/pull/259)
> +
> +This pull request allows the user to change the object-reuse behaviour. Before this pull request, some operations would reuse objects passed to the user function while others would always create new objects. This introduces a system-wide switch and changes all operators to either reuse objects or not reuse them.
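For readers wondering what such a switch looks like in practice, here is a minimal, hypothetical sketch. It is based on how later Flink releases expose the object-reuse toggle through ExecutionConfig; the exact API introduced by this pull request may differ.

    import org.apache.flink.api.common.functions.MapFunction;
    import org.apache.flink.api.java.ExecutionEnvironment;

    public class ObjectReuseSwitchSketch {
        public static void main(String[] args) throws Exception {
            ExecutionEnvironment env = ExecutionEnvironment.getExecutionEnvironment();

            // System-wide switch: with reuse enabled, Flink may hand the same object
            // instance to a user function repeatedly, so functions must not keep
            // references to input objects across invocations.
            env.getConfig().enableObjectReuse();
            // env.getConfig().disableObjectReuse();  // the safe default: fresh objects

            env.fromElements(1, 2, 3)
               .map(new MapFunction<Integer, Integer>() {
                   @Override
                   public Integer map(Integer value) {
                       return value * 2;
                   }
               })
               .print();
        }
    }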
> +
> +###[Distributed Coordination via Akka](https://github.com/apache/incubator-flink/pull/149)
> +
> +Another major change is a complete rewrite of the JobManager / TaskManager components in Scala. In addition to that, the old RPC service was replaced by Actors, using the Akka framework.
> +
> +###[Sorting of very large records](https://github.com/apache/incubator-flink/pull/249)
> +
> +Flink's internal sort-algorithms were improved to better handle large records (multiple 100s of megabytes or larger). Previously, the system did in some cases hold instances of multiple large records, resulting in high memory consumption and JVM heap thrashing. Through this fix, large records are streamed through the operators, reducing the memory consumption and GC pressure. The system now requires much less memory to support algorithms that work on such large records.
> +
> +###[Kryo Serialization as the new default fallback](https://github.com/apache/incubator-flink/pull/271)
> +
> +Flink's built-in type serialization framework handles all common types very efficiently. Prior versions used Avro to serialize types that the built-in framework could not handle. Flink's serialization system has improved a lot over time and by now surpasses the capabilities of Avro in many cases. Kryo now serves as the default fallback serialization framework, supporting a much broader range of types.
> +
> +###[Hadoop FileSystem support](https://github.com/apache/incubator-flink/pull/268)
> +
> +This change permits users to use all file systems supported by Hadoop with Flink. In practice this means that users can use Flink with Tachyon, Google Cloud Storage (including out-of-the-box Flink YARN support on Google Compute Cloud), FTP and all the other file system implementations for Hadoop.
> +
> +##Heading to the 0.8.0 release
> +
> +The community is working hard together with the Apache infra team to migrate the Flink infrastructure to a top-level project. At the same time, the Flink community is working on the Flink 0.8.0 release, which should be out very soon.
> \ No newline at end of file
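As a concrete illustration of the Kryo fallback mentioned under "Kryo Serialization as the new default fallback" above, here is a small, hypothetical sketch. It uses the ExecutionConfig API of later Flink releases, and the class LegacyEvent is made up: a type that is not recognised as a tuple or POJO is handled by the generic, Kryo-backed serializer, and can optionally be pre-registered for a more compact encoding.

    import org.apache.flink.api.java.DataSet;
    import org.apache.flink.api.java.ExecutionEnvironment;

    public class KryoFallbackSketch {

        // Not a Flink POJO (no default constructor, private final field without
        // accessors), so the type analyzer classifies it as a generic type and
        // the Kryo-based fallback serializer takes over.
        public static class LegacyEvent {
            private final String payload;

            public LegacyEvent(String payload) {
                this.payload = payload;
            }

            @Override
            public String toString() {
                return "LegacyEvent(" + payload + ")";
            }
        }

        public static void main(String[] args) throws Exception {
            ExecutionEnvironment env = ExecutionEnvironment.getExecutionEnvironment();

            // Optional tuning: pre-registering the class lets Kryo write a small
            // integer tag instead of the fully qualified class name per record.
            env.getConfig().registerKryoType(LegacyEvent.class);

            DataSet<LegacyEvent> events =
                    env.fromElements(new LegacyEvent("a"), new LegacyEvent("b"));

            events.print();
        }
    }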
> Modified: flink/site/blog/page2/index.html
> URL: http://svn.apache.org/viewvc/flink/site/blog/page2/index.html?rev=1650029&r1=1650028&r2=1650029&view=diff
> ==============================================================================
>
> Accessing Data Stored in MongoDB with Stratosphere (/news/2014/01/28/querying_mongodb.html)
> 28 Jan 2014
>
> We recently merged a pull request (https://github.com/stratosphere/stratosphere/pull/437) that allows you to use any existing Hadoop InputFormat (http://developer.yahoo.com/hadoop/tutorial/module5.html#inputformat) with Stratosphere. So you can now (in the 0.5-SNAPSHOT and upwards versions) define a Hadoop-based data source:
>
>     HadoopDataSource source = new HadoopDataSource(new TextInputFormat(), new JobConf(), "Input Lines");
>     TextInputFormat.addInputPath(source.getJobConf(), new Path(dataInput));
>
> We describe in the following article how to access data stored in MongoDB (http://www.mongodb.org/) with Stratosphere. This allows users to join data from multiple sources (e.g. MongoDB and HDFS) or perform machine learning with the documents stored in MongoDB.
>
> The approach here is to use the MongoInputFormat that was developed for Apache Hadoop but now also runs with Stratosphere.
>
>     JobConf conf = new JobConf();
>     conf.set("mongo.input.uri", "mongodb://localhost:27017/enron_mail.messages");
>     HadoopDataSource src = new HadoopDataSource(new MongoInputFormat(), conf, "Read from Mongodb", new WritableWrapperConverter());
>
> Example Program
>
> The example program reads data from the enron dataset (http://www.cs.cmu.edu/%7Eenron/) that contains about 500k internal e-mails. The data is stored in MongoDB and the Stratosphere program counts the number of e-mails per day.
>
> The complete code of this sample program is available on GitHub (https://github.com/stratosphere/stratosphere-mongodb-example).
>
> Prepare MongoDB and the Data
>
>     bunzip2 enron_mongo.tar.bz2
>     tar xvf enron_mongo.tar
>     mongorestore dump/enron_mail/messages.bson
>
> We used Robomongo to visually examine the dataset stored in MongoDB.
>
> Build MongoInputFormat
>
> MongoDB offers an InputFormat for Hadoop on their GitHub page (https://github.com/mongodb/mongo-hadoop). The code is not available in any Maven repository, so we have to build the jar file on our own.
>
> * Check out the repository
>
>     git clone https://github.com/mongodb/mongo-hadoop.git
>     cd mongo-hadoop
>
> * Set the appropriate Hadoop version in the build.sbt, we used 1.1.
>
>     hadoopRelease in ThisBuild := "1.1"
>
> * Build the input format
>
>     ./sbt package
>
> The jar-file is now located in core/target.
>
> The Stratosphere Program
>
> Now we have everything prepared to run the Stratosphere program. I only ran it on my local computer, out of Eclipse. To do that, check out the code ...
>
>     git clone https://github.com/stratosphere/stratosphere-mongodb-example.git
>
> ... and import it as a Maven project into your Eclipse. You have to manually add the previously built mongo-hadoop jar-file as a dependency. You can now press the "Run" button and see how Stratosphere executes the little program. It was running for about 8 seconds on the 1.5 GB dataset.
>
> The result (located in /tmp/enronCountByDay) now looks like this.
>
>     11,Fri Sep 26 10:00:00 CEST 1997
>     154,Tue Jun 29 10:56:00 CEST 1999
>     292,Tue Aug 10 12:11:00 CEST 1999
>     185,Thu Aug 12 18:35:00 CEST 1999
>     26,Fri Mar 19 12:33:00 CET 1999
>
> There is one thing left I want to point out here. MongoDB represents objects stored in the database as JSON documents. Since Stratosphere's standard types do not support JSON documents, I was using the WritableWrapper here. This wrapper allows you to use any Hadoop datatype with Stratosphere.
>
> The following code example shows how the JSON documents are accessed in Stratosphere.
>
>     public void map(Record record, Collector<Record> out) throws Exception {
>         Writable valWr = record.getField(1, WritableWrapper.class).value();
>         BSONWritable value = (BSONWritable) valWr;
>         Object headers = value.getDoc().get("headers");
>         BasicDBObject headerOb = (BasicDBObject) headers;
>         String date = (String) headerOb.get("Date");
>         // further date processing
>     }
>
> Please use the comments if you have questions or if you want to showcase your own MongoDB-Stratosphere integration.
>
> Written by Robert Metzger (@rmetzger_).
> Modified: flink/site/blog/page3/index.html
> URL: http://svn.apache.org/viewvc/flink/site/blog/page3/index.html?rev=1650029&r1=1650028&r2=1650029&view=diff
> ==============================================================================
>
> Stratosphere Demo Accepted for ICDE 2013 (/news/2012/10/15/icde2013.html)
> 15 Oct 2012
>
> Our demo submission "Peeking into the Optimization of Data Flow Programs with MapReduce-style UDFs" has been accepted for ICDE 2013 in Brisbane, Australia. The demo illustrates the contributions of our VLDB 2012 paper "Opening the Black Boxes in Data Flow Optimization" [PDF] (/assets/papers/optimizationOfDataFlowsWithUDFs_13.pdf) and [Poster PDF] (/assets/papers/optimizationOfDataFlowsWithUDFs_poster_13.pdf).
>
> Visit our poster, enjoy the demo, and talk to us if you are going to attend ICDE 2013.
>
> Abstract:
> Data flows are a popular abstraction to define data-intensive processing tasks. In order to support a wide range of use cases, many data processing systems feature MapReduce-style user-defined functions (UDFs). In contrast to UDFs as known from relational DBMS, MapReduce-style UDFs have less strict templates. These templates do not alone provide all the information needed to decide whether they can be reordered with relational operators and other UDFs. However, it is well known that reordering operators such as filters, joins, and aggregations can yield runtime improvements by orders of magnitude.
> We demonstrate an optimizer for data flows that is able to reorder operators with MapReduce-style UDFs written in an imperative language. Our approach leverages static code analysis to extract information from UDFs which is used to reason about the reorderability of UDF operators. This information is sufficient to enumerate a large fraction of the search space covered by conventional RDBMS optimizers, including filter and aggregation push-down, bushy join orders, and choice of physical execution strategies based on interesting properties.
> We demonstrate our optimizer and a job submission client that allows users to peek step-by-step into each phase of the optimization process: the static code analysis of UDFs, the enumeration of reordered candidate data flows, the generation of physical execution plans, and their parallel execution. For the demonstration, we provide a selection of relational and non-relational data flow programs which highlight the salient features of our approach.
>
> Version 0.2 Released (/news/2012/08/21/release02.html)
> 21 Aug 2012