flink-user mailing list archives

From ShB <shon.balakris...@gmail.com>
Subject Adding headers to tuples before writing to S3
Date Mon, 23 Oct 2017 17:50:19 GMT
Hi,

I'm working with Flink for data analytics and reporting. When a user
requests a report, a Flink cluster runs some computations on the data,
generates the final report (a DataSet of tuples), and uploads it to S3,
after which an email is sent to the user's email address. The uploaded file
therefore needs to be the final, complete report that is sent to the user.

I'm struggling to add a header row to the final tuple DataSet before
writing it to S3. The header is a tuple of the same arity as the final
DataSet, but with all String fields, whereas my final report tuple DataSet
has Long, Double, etc.
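
To illustrate the type mismatch, here is a minimal plain-Java sketch (not
Flink code). It assumes every field can be rendered with String.valueOf and
contains no commas, so the header and the data rows end up as lines of the
same type. The class name HeaderCsvSketch, the methods toCsvLine and
withHeader, and the column names are hypothetical, made up for this example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.StringJoiner;

public class HeaderCsvSketch {

    // Render any mix of field types (Long, Double, String, ...) as one CSV line.
    // Assumption: no quoting or escaping is needed (fields contain no commas).
    static String toCsvLine(Object... fields) {
        StringJoiner line = new StringJoiner(",");
        for (Object field : fields) {
            line.add(String.valueOf(field));
        }
        return line.toString();
    }

    // Put the all-String header first, followed by the typed rows as CSV lines.
    static List<String> withHeader(String[] header, List<Object[]> rows) {
        List<String> lines = new ArrayList<>();
        lines.add(toCsvLine((Object[]) header));
        for (Object[] row : rows) {
            lines.add(toCsvLine(row));
        }
        return lines;
    }

    public static void main(String[] args) {
        // Hypothetical report rows: a Long field and a Double field.
        List<Object[]> rows = new ArrayList<>();
        rows.add(new Object[] {42L, 3.5});
        rows.add(new Object[] {7L, 0.25});
        for (String line : withHeader(new String[] {"userId", "score"}, rows)) {
            System.out.println(line);
        }
    }
}
```

This only works on an in-memory list, which is the part that does not scale;
it shows the shape of the output, not a distributed solution.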

I've been trying to write my own writeToS3 function, which creates a CSV
file containing the header and the DataSet tuples and then uploads it to S3,
but I'm having trouble scaling it to larger DataSet sizes.

Is there a recommended way to do this? Is there any way I can extend the
Flink writeAsCsv method to do this?

Thanks!