flink-user mailing list archives

From Dominique Rondé <dominique.ro...@allsecur.de>
Subject HDFS to Kafka
Date Tue, 12 Jul 2016 16:30:37 GMT
Hi folks,

At first glance I have a very simple problem. I would like to read lines
from some text files in HDFS and send them to a Kafka topic. I use the
following code to do that:

DataStream<String> hdfsDatasource = env.readTextFile(
        "hdfs://" + parameterTool.getRequired("hdfs_env")
        + "/user/flink/" + parameterTool.getRequired("hdfs_path") + "/");
hdfsDatasource.addSink(new FlinkKafkaProducer08<String>(
        parameterTool.getRequired("brokerlist"),
        parameterTool.getRequired("topic"),
        new SimpleStringSchema()));

Everything works fine. But I need a way to traverse the source folder
recursively and pick up text files in subfolders. In my batch jobs this
works fine with the "recursive.file.enumeration" parameter, but in the
streaming environment there is no way to pass this configuration to
the readTextFile method.

Can someone give me a hint?

Cheers

Dominique

