flink-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Raja.Aravapalli <Raja.Aravapa...@target.com>
Subject Bucketing/Rolling Sink: New timestamp appeded to the part file name everytime a new part file is rolled
Date Thu, 31 Aug 2017 17:12:07 GMT


I have a flink application that is streaming data into HDFS and I am using Bucketing Sink
for that. And, I want to know if is it possible to rename the part files that is being created
in the base hdfs directory.

Right now I am using the below code for including the timestamp into part-file name, but the
problem I am facing is the timestamp is not changing for the new part file that is being rolled!

BucketingSink<String> HdfsSink = new BucketingSink<String> (hdfsOutputPath);

HdfsSink.setBucketer(new BasePathBucketer<String>());
HdfsSink.setBatchSize(1024 * 1024 * hdfsOutputBatchSizeInMB); // this means 'hdfsOutputBatchSizeInMB'
HdfsSink.setPartPrefix("PART-FILE-" + Long.toString(System.currentTimeMillis()));

Can someone please suggest me, what code changes I can try so that I get a new timestamp for
every part file that is being rolled new?

Thanks a lot.

View raw message