flink-user mailing list archives

From Conrad Crampton <conrad.cramp...@SecData.com>
Subject Reload DistributedCache file?
Date Wed, 16 Aug 2017 16:41:22 GMT
I have a simple text file stored in HDFS that I use in a RichFilterFunction by way of a
DistributedCache file. The file is edited externally from time to time to add lines to it.
My FilterFunction also implements Runnable; its run method is scheduled with
scheduleAtFixedRate on a ScheduledExecutorService, and it reloads the file and stores the
results in a List in the Filter class.

I have realized the error of my ways: the file being reloaded is the cached copy, which is
copied to a temporary file location on the node where this instance of the Filter class is
loaded, and not the file in HDFS itself (the copy was made when the Flink job started).

Can anyone suggest a solution to this? I think it is a similar problem to the one that the
Add Side Inputs in Flink [1] proposal is trying to address, but that is not finalized yet.
Can anyone see a problem with having a thread in the main body of my Flink program that
reloads the HDFS file and registers the cache file as part of that reload process, e.g.

env.registerCachedFile(properties.getProperty("whitelist.location"), WHITELIST);

i.e. does this actually copy the file again from HDFS to temporary files on each node? I think
I’d still need the schedule I currently have that reloads within my Filter function, though,
as all the process described above would do is push the HDFS file to the temp location; it
would not actually refresh my List.

Any suggestions would be welcome.


[1] https://docs.google.com/document/d/1hIgxi2Zchww_5fWUHLoYiXwSBXjv-M5eOv-MKQYN3m4/edit#heading=h.pqg5z6g0mjm7
