flink-commits mailing list archives

From rmetz...@apache.org
Subject svn commit: r1610413 [4/4] - in /incubator/flink: ./ _layouts/ site/ site/docs/0.6-SNAPSHOT/ site/docs/0.6-SNAPSHOT/img/
Date Mon, 14 Jul 2014 14:10:07 GMT
Modified: incubator/flink/site/docs/0.6-SNAPSHOT/yarn_setup.html
URL: http://svn.apache.org/viewvc/incubator/flink/site/docs/0.6-SNAPSHOT/yarn_setup.html?rev=1610413&r1=1610412&r2=1610413&view=diff
==============================================================================
--- incubator/flink/site/docs/0.6-SNAPSHOT/yarn_setup.html (original)
+++ incubator/flink/site/docs/0.6-SNAPSHOT/yarn_setup.html Mon Jul 14 14:10:06 2014
@@ -109,6 +109,7 @@
       <li>Setup &amp; Configuration
         <ul>
           <li><a href="local_setup.html">Local Setup</a></li>
+          <li><a href="building.html">Build Flink</a></li>
           <li><a href="cluster_setup.html">Cluster Setup</a></li>
           <li><a href="yarn_setup.html">YARN Setup</a></li>
           <li><a href="config.html">Configuration</a></li>
@@ -159,10 +160,10 @@
 <a href="#introducing-yarn">Introducing YARN</a>
 <ul>
 <li>
-<a href="#start-stratosphere-session">Start Stratosphere Session</a>
+<a href="#start-flink-session">Start Flink Session</a>
 <ul>
 <li>
-<a href="#download-stratosphere-for-yarn">Download Stratosphere for YARN</a>
+<a href="#download-flink-for-yarn">Download Flink for YARN</a>
 </li>
 <li>
 <a href="#start-a-session">Start a Session</a>
@@ -172,10 +173,10 @@
 </ul>
 </li>
 <li>
-<a href="#submit-job-to-stratosphere">Submit Job to Stratosphere</a>
+<a href="#submit-job-to-flink">Submit Job to Flink</a>
 </li>
 <li>
-<a href="#build-stratosphere-for-a-specific-hadoop-version">Build Stratosphere for a specific Hadoop Version</a>
+<a href="#build-yarn-client-for-a-specific-hadoop-version">Build YARN client for a specific Hadoop version</a>
 </li>
 <li>
 <a href="#background">Background</a>
@@ -187,13 +188,13 @@
 
 <p>Start YARN session with 4 Taskmanagers (each with 4 GB of Heapspace):</p>
 <div class="highlight"><pre><code class="language-bash" data-lang="bash">wget https://github.com/stratosphere/stratosphere/releases/download/release-0.5.1/stratosphere-bin-0.5.1-yarn.tar.gz
-tar xvzf stratosphere-dist-0.5.1-yarn.tar.gz
-<span class="nb">cd </span>stratosphere-yarn-0.5.1/
+tar xvzf flink-dist-0.5.1-yarn.tar.gz
+<span class="nb">cd </span>flink-yarn-0.5.1/
 ./bin/yarn-session.sh -n <span class="m">4</span> -jm <span class="m">1024</span> -tm 4096
 </code></pre></div>
 <h1 id="introducing-yarn">Introducing YARN</h1>
 
-<p>Apache <a href="http://hadoop.apache.org/">Hadoop YARN</a> is a cluster
resource management framework. It allows to run various distributed applications on top of
a cluster. Stratosphere runs on YARN next to other applications. Users do not have to setup
or install anything if there is already a YARN setup.</p>
+<p>Apache <a href="http://hadoop.apache.org/">Hadoop YARN</a> is a cluster
resource management framework. It allows to run various distributed applications on top of
a cluster. Flink runs on YARN next to other applications. Users do not have to setup or install
anything if there is already a YARN setup.</p>
 
 <p><strong>Requirements</strong></p>
 
@@ -202,23 +203,23 @@ tar xvzf stratosphere-dist-0.5.1-yarn.ta
 <li>HDFS</li>
 </ul>
 
-<p>If you have troubles using the Stratosphere YARN client, have a look in the <a
href="/docs/0.5/general/faq.html">FAQ section</a>.</p>
+<p>If you have troubles using the Flink YARN client, have a look in the <a href="/docs/0.5/general/faq.html">FAQ
section</a>.</p>
 
-<h2 id="start-stratosphere-session">Start Stratosphere Session</h2>
+<h2 id="start-flink-session">Start Flink Session</h2>
 
-<p>Follow these instructions to learn how to launch a Stratosphere Session within your
YARN cluster.</p>
+<p>Follow these instructions to learn how to launch a Flink Session within your YARN
cluster.</p>
 
-<p>A session will start all required Stratosphere services (JobManager and TaskManagers)
so that you can submit programs to the cluster. Note that you can run multiple programs per
session.</p>
+<p>A session will start all required Flink services (JobManager and TaskManagers) so
that you can submit programs to the cluster. Note that you can run multiple programs per session.</p>
 
-<h3 id="download-stratosphere-for-yarn">Download Stratosphere for YARN</h3>
+<h3 id="download-flink-for-yarn">Download Flink for YARN</h3>
 
 <p>Download the YARN tgz package on the <a href="/downloads/#nightly">download
page</a>. It contains the required files.</p>
 
-<p>If you want to build the YARN .tgz file from sources, follow the build instructions.
Make sure to use the <code>-Dhadoop.profile=2</code> profile. You can find the
file in <code>stratosphere-dist/target/stratosphere-dist--yarn.tar.gz</code> (<em>Note:
The version might be different for you</em> ).</p>
+<p>If you want to build the YARN .tgz file from sources, follow the build instructions.
Make sure to use the <code>-Dhadoop.profile=2</code> profile. You can find the
file in <code>flink-dist/target/flink-dist--yarn.tar.gz</code> (<em>Note:
The version might be different for you</em> ).</p>
 
 <p>Extract the package using:</p>
-<div class="highlight"><pre><code class="language-bash" data-lang="bash">tar xvzf stratosphere-dist-0.5.1-yarn.tar.gz
-<span class="nb">cd </span>stratosphere-yarn-0.5.1/
+<div class="highlight"><pre><code class="language-bash" data-lang="bash">tar xvzf flink-dist-0.5.1-yarn.tar.gz
+<span class="nb">cd </span>flink-yarn-0.5.1/
 </code></pre></div>
 <h3 id="start-a-session">Start a Session</h3>
 
@@ -242,35 +243,35 @@ tar xvzf stratosphere-dist-0.5.1-yarn.ta
 <p><strong>Example:</strong> Issue the following command to allocate 10
TaskTrackers, with 8 GB of memory each:</p>
 <div class="highlight"><pre><code class="language-bash" data-lang="bash">./bin/yarn-session.sh -n <span class="m">10</span> -tm 8192
 </code></pre></div>
-<p>The system will use the configuration in <code>conf/stratosphere-config.yaml</code>.
Please follow our <a href="config.html">configuration guide</a> if you want to
change something. Stratosphere on YARN will overwrite the following configuration parameters
<code>jobmanager.rpc.address</code> (because the JobManager is always allocated
at different machines) and <code>taskmanager.tmp.dirs</code> (we are using the
tmp directories given by YARN).</p>
+<p>The system will use the configuration in <code>conf/flink-config.yaml</code>.
Please follow our <a href="config.html">configuration guide</a> if you want to
change something. Flink on YARN will overwrite the following configuration parameters <code>jobmanager.rpc.address</code>
(because the JobManager is always allocated at different machines) and <code>taskmanager.tmp.dirs</code>
(we are using the tmp directories given by YARN).</p>
 
 <p>The example invocation starts 11 containers, since there is one additional container
for the ApplicationMaster and JobTracker.</p>
 
-<p>Once Stratosphere is deployed in your YARN cluster, it will show you the connection
details of the JobTracker.</p>
+<p>Once Flink is deployed in your YARN cluster, it will show you the connection details
of the JobTracker.</p>
 
 <p>The client has to remain open to keep the deployment running. We suggest to use
<code>screen</code>, which will start a detachable shell:</p>
 
 <ol>
 <li>Open <code>screen</code>,</li>
-<li>Start Stratosphere on YARN,</li>
+<li>Start Flink on YARN,</li>
 <li>Use <code>CTRL+a</code>, then press <code>d</code> to detach
the screen session,</li>
 <li>Use <code>screen -r</code> to resume again.</li>
 </ol>
 
-<h1 id="submit-job-to-stratosphere">Submit Job to Stratosphere</h1>
+<h1 id="submit-job-to-flink">Submit Job to Flink</h1>
 
-<p>Use the following command to submit a Stratosphere program to the YARN cluster:</p>
-<div class="highlight"><pre><code class="language-bash" data-lang="bash">./bin/stratosphere
+<p>Use the following command to submit a Flink program to the YARN cluster:</p>
+<div class="highlight"><pre><code class="language-bash" data-lang="bash">./bin/flink
 </code></pre></div>
 <p>Please refer to the documentation of the <a href="cli.html">commandline client</a>.</p>
 
 <p>The command will show you a help menu like this:</p>
 <div class="highlight"><pre><code class="language-bash" data-lang="bash"><span class="o">[</span>...<span class="o">]</span>
-Action <span class="s2">&quot;run&quot;</span> compiles and submits a Stratosphere program.
+Action <span class="s2">&quot;run&quot;</span> compiles and submits a Flink program.
   <span class="s2">&quot;run&quot;</span> action arguments:
      -a,--arguments &lt;programArgs&gt;   Program arguments
      -c,--class &lt;classname&gt;         Program class
-     -j,--jarfile &lt;jarfile&gt;         Stratosphere program JAR file
+     -j,--jarfile &lt;jarfile&gt;         Flink program JAR file
      -m,--jobmanager &lt;host:port&gt;    Jobmanager to which the program is submitted
      -w,--wait                      Wait <span class="k">for</span> program to
finish
 <span class="o">[</span>...<span class="o">]</span>
@@ -280,49 +281,26 @@ Action <span class="s2">&quot;run&quot;<
 <p><strong>Example</strong></p>
 <div class="highlight"><pre><code class="language-bash" data-lang="bash">wget -O apache-license-v2.txt http://www.apache.org/licenses/LICENSE-2.0.txt
 
-./bin/stratosphere run -j ./examples/stratosphere-java-examples-0.5.1-WordCount.jar <span class="se">\</span>
+./bin/flink run -j ./examples/flink-java-examples-0.5.1-WordCount.jar <span class="se">\</span>
                        -a <span class="m">1</span> file://<span class="sb">`</span><span class="nb">pwd</span><span class="sb">`</span>/apache-license-v2.txt file://<span class="sb">`</span><span class="nb">pwd</span><span class="sb">`</span>/wordcount-result.txt
 </code></pre></div>
 <p>If there is the following error, make sure that all TaskManagers started:</p>
-<div class="highlight"><pre><code class="language-bash" data-lang="bash">Exception
in thread <span class="s2">&quot;main&quot;</span> eu.stratosphere.compiler.CompilerException:
+<div class="highlight"><pre><code class="language-bash" data-lang="bash">Exception
in thread <span class="s2">&quot;main&quot;</span> org.apache.flinkcompiler.CompilerException:
     Available instances could not be determined from job manager: Connection timed out.
 </code></pre></div>
 <p>You can check the number of TaskManagers in the JobManager web interface. The address
of this interface is printed in the YARN session console.</p>
 
 <p>If the TaskManagers do not show up after a minute, you should investigate the issue
using the log files.</p>
 
-<h1 id="build-stratosphere-for-a-specific-hadoop-version">Build Stratosphere for a
specific Hadoop Version</h1>
+<h1 id="build-yarn-client-for-a-specific-hadoop-version">Build YARN client for a specific
Hadoop version</h1>
 
-<p>This section covers building Stratosphere for a specific Hadoop version. Most users
do not need to do this manually.
-The problem is that Stratosphere uses HDFS and YARN which are both from Apache Hadoop. There
exist many different builds of Hadoop (from both the upstream project and the different Hadoop
distributions). Typically errors arise with the RPC services. An error could look like this:</p>
-<div class="highlight"><pre><code class="language-text" data-lang="text">ERROR:
The job was not successfully submitted to the nephele job manager:
-    eu.stratosphere.nephele.executiongraph.GraphConversionException: Cannot compute input
splits for TSV:
-    java.io.IOException: Failed on local exception: com.google.protobuf.InvalidProtocolBufferException:
-    Protocol message contained an invalid tag (zero).; Host Details :
-</code></pre></div>
-<p><strong>Example</strong></p>
-<div class="highlight"><pre><code class="language-text" data-lang="text">mvn -Dhadoop.profile=2 -Pcdh-repo -Dhadoop.version=2.2.0-cdh5.0.0-beta-2 -DskipTests package
-</code></pre></div>
-<p>The commands in detail:</p>
-
-<ul>
-<li> <code>-Dhadoop.profile=2</code> activates the Hadoop YARN profile
of Stratosphere. This will enable all components of Stratosphere that are compatible with
Hadoop 2.2</li>
-<li> <code>-Pcdh-repo</code> activates the Cloudera Hadoop dependencies.
If you want other vendor&#39;s Hadoop dependencies (not in maven central) add the repository
to your local maven configuration in <code>~/.m2/</code>.</li>
-<li><code>-Dhadoop.version=2.2.0-cdh5.0.0-beta-2</code> sets a special
version of the Hadoop dependencies. Make sure that the specified Hadoop version is compatible
with the profile you activated.</li>
-</ul>
-
-<p>If you want to build HDFS for Hadoop 2 without YARN, use the following parameter:</p>
-<div class="highlight"><pre><code class="language-text" data-lang="text">-P!include-yarn
-</code></pre></div>
-<p>Some Cloudera versions (such as <code>2.0.0-cdh4.2.0</code>) require
this, since they have a new HDFS version with the old YARN API.</p>
-
-<p>Please post to the <em>Stratosphere mailinglist</em>(<a href="mailto:dev@flink.incubator.apache.org">dev@flink.incubator.apache.org</a>)
or create an issue on <a href="https://issues.apache.org/jira/browse/FLINK">Jira</a>,
if you have issues with your YARN setup and Stratosphere.</p>
+<p>Users using Hadoop distributions from companies like Hortonworks, Cloudera or MapR
might have to build Flink against their specific versions of Hadoop (HDFS) and YARN. Please
read the <a href="building.html">build instructions</a> for more details.</p>
 
 <h1 id="background">Background</h1>
 
-<p>This section briefly describes how Stratosphere and YARN interact. </p>
+<p>This section briefly describes how Flink and YARN interact. </p>
 
-<p><img src="img/StratosphereOnYarn.svg" class="img-responsive"></p>
+<p><img src="img/FlinkOnYarn.svg" class="img-responsive"></p>
 
 <p>The YARN client needs to access the Hadoop configuration to connect to the YARN
resource manager and to HDFS. It determines the Hadoop configuration using the following strategy:</p>
 
@@ -331,13 +309,13 @@ The problem is that Stratosphere uses HD
 <li>If the above strategy fails (this should not be the case in a correct YARN setup),
the client is using the <code>HADOOP_HOME</code> environment variable. If it is
set, the client tries to access <code>$HADOOP_HOME/etc/hadoop</code> (Hadoop 2)
and <code>$HADOOP_HOME/conf</code> (Hadoop 1).</li>
 </ul>
 
-<p>When starting a new Stratosphere YARN session, the client first checks if the requested
resources (containers and memory) are available. After that, it uploads a jar that contains
Stratosphere and the configuration to HDFS (step 1).</p>
+<p>When starting a new Flink YARN session, the client first checks if the requested
resources (containers and memory) are available. After that, it uploads a jar that contains
Flink and the configuration to HDFS (step 1).</p>
 
 <p>The next step of the client is to request (step 2) a YARN container to start the
<em>ApplicationMaster</em> (step 3). Since the client registered the configuration
and jar-file as a resource for the container, the NodeManager of YARN running on that particular
machine will take care of preparing the container (e.g. downloading the files). Once that
has finished, the <em>ApplicationMaster</em> (AM) is started.</p>
 
-<p>The <em>JobManager</em> and AM are running in the same container. Once
they successfully started, the AM knows the address of the JobManager (its own host). It is
generating a new Stratosphere configuration file for the TaskManagers (so that they can connect
to the JobManager). The file is also uploaded to HDFS. Additionally, the <em>AM</em>
container is also serving Stratosphere&#39;s web interface.</p>
+<p>The <em>JobManager</em> and AM are running in the same container. Once
they successfully started, the AM knows the address of the JobManager (its own host). It is
generating a new Flink configuration file for the TaskManagers (so that they can connect to
the JobManager). The file is also uploaded to HDFS. Additionally, the <em>AM</em>
container is also serving Flink&#39;s web interface.</p>
 
-<p>After that, the AM starts allocating the containers for Stratosphere&#39;s TaskManagers,
which will download the jar file and the modified configuration from the HDFS. Once these
steps are completed, Stratosphere is set up and ready to accept Jobs.</p>
+<p>After that, the AM starts allocating the containers for Flink&#39;s TaskManagers,
which will download the jar file and the modified configuration from the HDFS. Once these
steps are completed, Flink is set up and ready to accept Jobs.</p>
 
 
       <div style="padding-top:30px" id="disqus_thread"></div>
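
Note on the Background hunk above: the HADOOP_HOME fallback it describes can be sketched in shell. This is a minimal sketch under stated assumptions, not the YARN client's actual code. The function name find_hadoop_conf_fallback is hypothetical, trying the Hadoop 2 path before the Hadoop 1 path is an assumption, and the earlier steps of the lookup strategy are not visible in this diff and are left out.

```shell
# Hypothetical helper (not part of Flink): prints the first existing Hadoop
# configuration directory below $HADOOP_HOME, or returns 1 if none is found.
find_hadoop_conf_fallback() {
    # Nothing to try when HADOOP_HOME is unset or empty.
    [ -n "${HADOOP_HOME:-}" ] || return 1

    # Assumed order: Hadoop 2 layout first, then Hadoop 1 layout.
    for dir in "$HADOOP_HOME/etc/hadoop" "$HADOOP_HOME/conf"; do
        if [ -d "$dir" ]; then
            printf '%s\n' "$dir"
            return 0
        fi
    done
    return 1
}
```

Calling find_hadoop_conf_fallback with HADOOP_HOME pointing at a Hadoop 2 installation would print the etc/hadoop directory below it. A non-zero exit status means neither directory exists.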

Modified: incubator/flink/site/how-to-contribute.html
URL: http://svn.apache.org/viewvc/incubator/flink/site/how-to-contribute.html?rev=1610413&r1=1610412&r2=1610413&view=diff
==============================================================================
--- incubator/flink/site/how-to-contribute.html (original)
+++ incubator/flink/site/how-to-contribute.html Mon Jul 14 14:10:06 2014
@@ -149,7 +149,35 @@
 <li><p>It is typically helpful to switch to a <em>topic branch</em>
for the changes. To create a dedicated branch based on the current master, use the following
command:</p>
 <div class="highlight"><pre><code class="language-text" data-lang="text">git checkout -b myBranch master
 </code></pre></div></li>
-<li><p>Now you can create your changes, compile the code, and validate the changes.
Here are some pointers on how to <a href="https://github.com/apache/incubator-flink/#eclipse-setup-and-debugging">set
up the Eclipse IDE for development</a>, and how to <a href="https://github.com/apache/incubator-flink/#build-stratosphere">build
the code</a>.</p></li>
+<li><p>Now you can create your changes, compile the code, and validate the changes.
Here are some pointers on how to <a href="https://github.com/apache/incubator-flink/#build-apache-flink">build
the code</a>.
+In addition to that, we recommend setting up Eclipse (or IntelliJ) using the &quot;Import
Maven Project&quot; feature. If you want to work on the scala code you will need the following
plugins:</p>
+
+<p>Eclipse 4.x:</p>
+
+<ul>
+<li>scala-ide: <a href="http://download.scala-ide.org/sdk/e38/scala210/stable/site">http://download.scala-ide.org/sdk/e38/scala210/stable/site</a></li>
+<li>m2eclipse-scala: <a href="http://alchim31.free.fr/m2e-scala/update-site">http://alchim31.free.fr/m2e-scala/update-site</a></li>
+<li>build-helper-maven-plugin: <a href="https://repository.sonatype.org/content/repositories/forge-sites/m2e-extras/0.15.0/N/0.15.0.201206251206/">https://repository.sonatype.org/content/repositories/forge-sites/m2e-extras/0.15.0/N/0.15.0.201206251206/</a></li>
+</ul>
+
+<p>Eclipse 3.7:</p>
+
+<ul>
+<li>scala-ide: <a href="http://download.scala-ide.org/sdk/e37/scala210/stable/site">http://download.scala-ide.org/sdk/e37/scala210/stable/site</a></li>
+<li>m2eclipse-scala: <a href="http://alchim31.free.fr/m2e-scala/update-site">http://alchim31.free.fr/m2e-scala/update-site</a></li>
+<li>build-helper-maven-plugin: <a href="https://repository.sonatype.org/content/repositories/forge-sites/m2e-extras/0.14.0/N/0.14.0.201109282148/">https://repository.sonatype.org/content/repositories/forge-sites/m2e-extras/0.14.0/N/0.14.0.201109282148/</a></li>
+</ul>
+
+<p>When you don&#39;t have the plugins your project will have build errors, you
can just close the scala projects and ignore them.</p>
+
+<p>Import the Flink source code using Maven&#39;s Import tool:</p>
+
+<ul>
+<li>Select &quot;Import&quot; from the &quot;File&quot;-menu.</li>
+<li>Expand &quot;Maven&quot; node, select &quot;Existing Maven Projects&quot;,
and click &quot;next&quot; button</li>
+<li>Select the root directory by clicking on the &quot;Browse&quot; button
and navigate to the top folder of the cloned Flink git repository.</li>
+<li>Ensure that all projects are selected and click the &quot;Finish&quot;
button.</li>
+</ul></li>
 <li><p>After you have finalized your contribution, verify the compliance with
the contribution guidelines (see below), and commit them. To make the changes easily mergeable,
please rebase them to the latest version of the main repositories master branch. Assuming
you created a topic branch (step 3), you can follow this sequence of commands to do that:
 Switch to the master branch, update it to the latest revision, switch back to your topic
branch, and rebase it on top of the master branch.</p>
 <div class="highlight"><pre><code class="language-text" data-lang="text">git checkout master
@@ -198,6 +226,8 @@ git rebase master
 
 <p><strong>ASF git web interface</strong>: <a href="https://git-wip-us.apache.org/repos/asf?p=incubator-flink.git;a=summary">https://git-wip-us.apache.org/repos/asf?p=incubator-flink.git;a=summary</a></p>
 
+<p><strong>ASF svn for the website</strong>: <a href="https://svn.apache.org/repos/asf/incubator/flink/">https://svn.apache.org/repos/asf/incubator/flink/</a>.</p>
+
 <p>Details on how to set the credentials for the ASF git repostiory are <a href="https://git-wip-us.apache.org/">linked
here</a>.
 To merge pull requests from our GitHub mirror, there is a script in the source <code>./tools/merge_pull_request.sh.template</code>.
Rename it to <code>merge_pull_request.sh</code> with the appropriate settings
and use it for merging.</p>
 


