flink-user mailing list archives

From Albert Giménez <alb...@datacamp.com>
Subject Re: Flink session on Yarn - ClassNotFoundException
Date Fri, 01 Sep 2017 07:33:26 GMT
Thanks for the replies :)

I managed to get it working following the instructions here <https://ci.apache.org/projects/flink/flink-docs-release-1.3/setup/building.html#vendor-specific-versions>,
but I ran into a few issues that I guess are specific to HDInsight, or at least to the HDP version
it uses. To summarize:

Hadoop version 
After running “hadoop version”, the result was “2.7.3.2.6.1.3-4”.
However, when building I got errors that some dependencies from the Hortonworks repo
could not be found, for instance zookeeper “3.4.6.2.6.1.3-4”.
I browsed the Hortonworks repo <http://repo.hortonworks.com/content/repositories/releases/org/apache/zookeeper/zookeeper/>
to find a suitable version and ended up using 2.7.3.2.6.1.31-3 instead.
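
In case it's useful, this is roughly how I checked (the curl URL is just an example for the zookeeper artifact; an HTTP 200 means that version exists in the repo):

hadoop version | head -1
curl -s -o /dev/null -w "%{http_code}\n" http://repo.hortonworks.com/content/repositories/releases/org/apache/zookeeper/zookeeper/3.4.6.2.6.1.31-3/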

Scala version
I also had issues with dependencies when I tried using Scala 2.11.11, so I compiled
against 2.11.7 instead.

So, the Maven command I used was this:
mvn install -DskipTests -Dscala.version=2.11.7 -Pvendor-repos -Dhadoop.version=2.7.3.2.6.1.31-3
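
After the build, you can sanity-check that the vendor Hadoop classes made it in by looking for the shaded Hadoop jar (the exact jar name depends on your Flink version):

ls build-target/lib/flink-shaded-hadoop2*.jar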

Azure Jars
With all that, I still had ClassNotFoundException errors when trying to start my Flink session,
for instance “java.lang.ClassNotFoundException: Class org.apache.hadoop.fs.adl.HdiAdlFileSystem”.
To fix that, I had to find the Azure-specific jars I needed. For that, I checked which
jars Spark was using and copied / symlinked them into Flink’s “lib” directory:
/usr/hdp/current/spark2-client/jars/*azure*.jar
/usr/lib/hdinsight-datalake/adls2-oauth2-token-provider.jar
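
Roughly, the commands were something like this (assuming FLINK_HOME points at your Flink install; the exact paths may differ per HDInsight version):

ln -s /usr/hdp/current/spark2-client/jars/*azure*.jar $FLINK_HOME/lib/
ln -s /usr/lib/hdinsight-datalake/adls2-oauth2-token-provider.jar $FLINK_HOME/lib/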

Guava Jar
Finally, for some reason my jobs were failing (the Cassandra driver was complaining about
the Guava version being too old, although I had the right version in my assembled jar).
I just downloaded the version I needed (in my case, 23.0 <http://central.maven.org/maven2/com/google/guava/guava/23.0/guava-23.0.jar>)
and also put that into Flink’s lib directory.
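
If you want to script that step, something like this should work (again assuming FLINK_HOME, with the version pinned to what my job needed):

cd $FLINK_HOME/lib && curl -O http://central.maven.org/maven2/com/google/guava/guava/23.0/guava-23.0.jar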

I hope it helps other people trying to run Flink on Azure HDInsight :)

Kind regards,

Albert

> On Aug 31, 2017, at 8:18 PM, Banias H <banias4spark@gmail.com> wrote:
> 
> We had the same issue. Get the HDP version, for example from /usr/hdp/current/hadoop-client/hadoop-common-<version>.jar. Then rebuild Flink from source:
> mvn clean install -DskipTests -Pvendor-repos -Dhadoop.version=<version>
> 
> for example: mvn clean install -DskipTests -Pvendor-repos -Dhadoop.version=2.7.3.2.6.1.0-129
> 
> Copy build-target/ to the cluster and set it up. Export HADOOP_CONF_DIR and YARN_CONF_DIR according to your environment. You should have no problem starting the session.
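> For example, on a typical HDP cluster both point at the same directory:
> export HADOOP_CONF_DIR=/etc/hadoop/conf
> export YARN_CONF_DIR=/etc/hadoop/conf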
> 
> 
> On Wed, Aug 30, 2017 at 6:45 AM, Federico D'Ambrosio <fedexist@gmail.com> wrote:
> Hi,
> What is your "hadoop version" output? I'm asking because you said your Hadoop distribution
> is in /usr/hdp, so it looks like you're using Hortonworks HDP, just like me. That would make it
> a third-party distribution, and you'd need to build Flink from source according to this:
> https://ci.apache.org/projects/flink/flink-docs-release-1.3/setup/building.html#vendor-specific-versions
> 
> Federico D'Ambrosio
> 
> On 30 Aug 2017 at 13:33, "albert" <albert@datacamp.com> wrote:
> Hi Chesnay,
> 
> Thanks for your reply. I did download the binaries matching my Hadoop
> version (2.7); that's why I was wondering whether the issue had something to do
> with the exact Hadoop version Flink is compiled against, or whether there might be
> things missing in my environment.
> 
> 
> 
> --
> Sent from: http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/
> 

