flink-user mailing list archives

From: Thomas Götzinger <m...@simplydevelop.de>
Subject: Re: Flink on EC2
Date: Sun, 08 Nov 2015 18:06:14 GMT
Hi Fabian,

thanks for the reply. I use a Karamel recipe to install Flink on EC2. Currently I am using flink-0.9.1-bin-hadoop24.tgz
<http://apache.mirrors.spacedump.net/flink/flink-0.9.1/flink-0.9.1-bin-hadoop24.tgz>.

That file includes the NativeS3FileSystem. First I tried the standard
Karamel recipe on GitHub, hopshadoop/flink-chef <https://github.com/hopshadoop/flink-chef>,
but it is on version 0.9.0 and the NativeS3FileSystem is not included.
So I forked the GitHub project as goetzingert/flink-chef.
Although the class file is included, the application throws a ClassNotFoundException for the
class above.
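
(As a quick sanity check, one can list the dist jar's contents to confirm the class really is packaged; the jar path below is an assumption and depends on the installation layout:)

    # look for the native S3 filesystem class inside the Flink dist jar
    # (jar path is hypothetical; adjust to the actual installation)
    jar tf /usr/local/flink/lib/flink-dist-0.9.1.jar | grep NativeS3FileSystem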
In my project I added the following to conf/core-site.xml:

  <property>
    <name>fs.s3n.impl</name>
    <value>org.apache.hadoop.fs.s3native.NativeS3FileSystem</value>
  </property>
  <property>
    <name>fs.s3n.awsAccessKeyId</name>
    <value>….</value>
  </property>
  <property>
    <name>fs.s3n.awsSecretAccessKey</name>
    <value>...</value>
  </property>
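
(Note: Flink only picks this file up if it knows where the Hadoop configuration directory is. A minimal sketch of the corresponding flink-conf.yaml entry, assuming a hypothetical config directory:)

    # flink-conf.yaml — point Flink at the directory containing core-site.xml
    # (the path is an assumption for illustration)
    fs.hdfs.hadoopconf: /usr/local/flink/conf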

I also tried the programmatic configuration:

    XMLConfiguration config = new XMLConfiguration(configPath);

    env = ExecutionEnvironment.getExecutionEnvironment();
    Configuration configuration = GlobalConfiguration.getConfiguration();
    // the key must be "fs.s3n.impl" (not "fs.s3.impl") to match the s3n:// URI scheme
    configuration.setString("fs.s3n.impl", "org.apache.hadoop.fs.s3native.NativeS3FileSystem");
    configuration.setString("fs.s3n.awsAccessKeyId", "..");
    configuration.setString("fs.s3n.awsSecretAccessKey", "..");
    // note: fs.hdfs.hdfssite normally points at hdfs-site.xml; core-site.xml settings
    // are usually picked up via fs.hdfs.hadoopconf instead
    configuration.setString("fs.hdfs.hdfssite", Template.class.getResource("/conf/core-site.xml").toString());
    GlobalConfiguration.includeConfiguration(configuration);


Any idea why the class is not on the classpath? Is there another script to set up Flink
on an EC2 cluster?

When will Flink 0.10 be released?



Regards

 
Thomas Götzinger

Freelance Computer Scientist

 
Glockenstraße 2a

D-66882 Hütschenhausen OT Spesbach

Mobile: +49 (0)176 82180714

Private: +49 (0) 6371 954050

mail@simplydevelop.de
epost: thomas.goetzinger@epost.de




> On 29.10.2015, at 09:47, Fabian Hueske <fhueske@gmail.com> wrote:
> 
> Hi Thomas,
> 
> until recently, Flink shipped its own S3 FileSystem implementation, which wasn't fully
> tested and was buggy.
> We removed that implementation and now use (in 0.10-SNAPSHOT) Hadoop's S3 implementation
> by default.
> 
> If you want to continue using 0.9.1, you can configure Flink to use Hadoop's implementation.
> See this answer on StackOverflow and the linked email thread [1].
> If you switch to the 0.10-SNAPSHOT version (which will be released in a few days as 0.10.0),
> things become a bit easier and Hadoop's implementation is used by default. The documentation
> shows how to configure your access keys [2].
> 
> Please don't hesitate to ask if something is unclear or not working.
> 
> Best, Fabian
> 
> [1] http://stackoverflow.com/questions/32959790/run-apache-flink-with-amazon-s3
> [2] https://ci.apache.org/projects/flink/flink-docs-master/apis/example_connectors.html
> 
> 2015-10-29 9:35 GMT+01:00 Thomas Götzinger <mail@simplydevelop.de>:
> Hello Flink Team,
> 
> We at Fraunhofer IESE are evaluating Flink for a project, and I'm a bit frustrated at
> the moment.
> 
> I've written a few test cases with the Flink API and want to deploy them to a Flink EC2
> cluster. I set up the cluster using the
> Karamel recipe presented in the following video:
>
> https://www.youtube.com/watch?v=m_SkhyMV0to
> 
> The setup works fine and the hello-flink app runs. But afterwards I wanted to copy
> some data from an S3 bucket to the local EC2 HDFS cluster.
> 
> The command hadoop fs -ls s3n://... works, as does cat, etc.
> But if I want to copy the data with distcp, the command freezes and does not respond
> until it times out.
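>
> (For illustration, the kind of distcp invocation described above, with hypothetical bucket and target paths:)
>
>     hadoop distcp s3n://my-bucket/data hdfs:///data/import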
> 
> After trying a few things I gave up and started on another solution: I want to access the
> S3 bucket directly from Flink and import the data using a small Flink program that just
> reads from S3 and writes to local Hadoop. This works fine locally, but on the cluster the
> NativeS3FileSystem class is missing (ClassNotFoundException), although it is included in
> the jar file of the installation.
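>
> (For reference, a minimal sketch of such an import job; the bucket and HDFS paths are hypothetical:)
>
>     // uses org.apache.flink.api.java.ExecutionEnvironment (Flink 0.9 batch API)
>     ExecutionEnvironment env = ExecutionEnvironment.getExecutionEnvironment();
>     env.readTextFile("s3n://my-bucket/input/")   // hypothetical S3 bucket
>        .writeAsText("hdfs:///data/input/");      // hypothetical HDFS target
>     env.execute("S3 to HDFS import");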

> 
> 
> I forked the Chef recipe and updated it to Flink 0.9.1, but I hit the same issue.
> 
> Is there another simple script to install Flink with Hadoop on an EC2 cluster and a
> working s3n filesystem?
> 
> 
> 
> 
> Freelancer 
> 
> on behalf of Fraunhofer IESE Kaiserslautern
> 
> 
> -- 
> Best regards
> 
>  
> Thomas Götzinger
> 
> Freelance Computer Scientist
> 
>  
> Glockenstraße 2a
> 
> D-66882 Hütschenhausen OT Spesbach
> 
> Mobile: +49 (0)176 82180714
> 
> Homezone: +49 (0) 6371 735083
> 
> Private: +49 (0) 6371 954050
> 
> mail@simplydevelop.de
> epost: thomas.goetzinger@epost.de

