flink-user mailing list archives

From Kostas Kloudas <k.klou...@data-artisans.com>
Subject Re: Bucketing/Rolling Sink: How to overwrite method "openNewPartFile" - to append a new timestamp to part file path every time a new part file is being created
Date Mon, 09 Oct 2017 14:18:42 GMT
Hi Raja,

Since you know about this method, I suppose you have looked at the source code of the sink.
I think that including the timestamp of the element in the part file path is not as easy as
overriding openNewPartFile.

The reason is that the filenames serve as identities for the associated state of the bucket.
When the sink checks which part files should transition from pending to finished state, it
looks for complete equality of the filename, rather than a “contains()” match on a partial
filename.

A way to bypass this is to write the timestamp along with each element, so that when you
check the content of the file, you can see the timestamp of the first element. You will have
to write more data, though.
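
As a rough illustration of that idea, here is a minimal plain-Java sketch. It does not touch the
BucketingSink API at all: the class and method names (TimestampedLines, withTimestamp,
firstTimestamp) and the tab-separated layout are assumptions made up for this example. In a real
job the formatting step would probably live in a map function before the sink or in a custom
writer, which is not shown here.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of Flink: shows one way to store a timestamp next to each element.
class TimestampedLines {

    // Prefix the element with its timestamp (epoch millis), separated by a tab.
    static String withTimestamp(long timestampMillis, String element) {
        return timestampMillis + "\t" + element;
    }

    // Recover the timestamp of the first line of a part file's content.
    static long firstTimestamp(List<String> lines) {
        String first = lines.get(0);
        // The timestamp is everything before the first tab.
        return Long.parseLong(first.substring(0, first.indexOf('\t')));
    }

    public static void main(String[] args) {
        // Stand-in for the lines that would end up in one part file.
        List<String> partFile = new ArrayList<>();
        partFile.add(withTimestamp(1507558722000L, "first event"));
        partFile.add(withTimestamp(1507558725000L, "second event"));

        // Prints the timestamp of the first element written to the file.
        System.out.println(firstTimestamp(partFile));
    }
}
```

The cost is the one mentioned above: every record carries a few extra bytes, but the part file
names stay exactly as the sink expects them.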

Does this fit your needs?


> On Oct 6, 2017, at 11:02 PM, Raja.Aravapalli <Raja.Aravapalli@target.com> wrote:
> Hi,
> I want to overwrite the method “openNewPartFile” in the BucketingSink Class such
> that it creates part file name with inclusion of timestamp whenever it rolls a new part file.
> Can someone share some thoughts on how I can do this.
> Thanks a ton, in advance.
> Regards,
> Raja.
