flink-user mailing list archives

From Robert Metzger <rmetz...@apache.org>
Subject Re: Writing a DataSet to ElasticSearch
Date Mon, 09 Mar 2020 13:25:53 GMT
Hey Niels,

For the OOM problem: did you try the RocksDB state backend?

I don't think there's an ES OutputFormat.

I guess there's no way around implementing your own OutputFormat for ES, if
you want to use the DataSet API. It should not be too hard to implement.
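
A rough sketch of what such an OutputFormat could look like, assuming you
buffer records and ship them in batches in the Elasticsearch bulk (NDJSON)
format. To keep it compilable without Flink or an ES client on the classpath,
SketchOutputFormat is a local stand-in that only mirrors the
open/writeRecord/close lifecycle of Flink's OutputFormat, and BulkSender is a
hypothetical transport hook. All names here (SketchOutputFormat, BulkSender,
EsBulkOutputFormat, batchSize, toJson) are made up for illustration.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;
import java.util.function.Function;

// Local stand-in (assumption): mirrors the lifecycle methods of Flink's
// org.apache.flink.api.common.io.OutputFormat, minus configure(Configuration).
interface SketchOutputFormat<T> {
    void open(int taskNumber, int numTasks) throws IOException;
    void writeRecord(T record) throws IOException;
    void close() throws IOException;
}

// Hypothetical transport hook: a real implementation would POST the body
// to the Elasticsearch _bulk endpoint using an ES client.
interface BulkSender {
    void send(String ndjsonBody) throws IOException;
}

class EsBulkOutputFormat<T> implements SketchOutputFormat<T> {
    private final String index;
    private final int batchSize;
    private final Function<T, String> toJson; // record -> one-line JSON document
    private final BulkSender sender;
    private final List<String> buffer = new ArrayList<>();

    EsBulkOutputFormat(String index, int batchSize,
                       Function<T, String> toJson, BulkSender sender) {
        this.index = index;
        this.batchSize = batchSize;
        this.toJson = toJson;
        this.sender = sender;
    }

    @Override
    public void open(int taskNumber, int numTasks) {
        // Called once per parallel task; a real format would create its client here.
        buffer.clear();
    }

    @Override
    public void writeRecord(T record) throws IOException {
        buffer.add(toJson.apply(record));
        if (buffer.size() >= batchSize) {
            flush();
        }
    }

    @Override
    public void close() throws IOException {
        // Send whatever is left in the buffer before the task finishes.
        flush();
    }

    private void flush() throws IOException {
        if (buffer.isEmpty()) {
            return;
        }
        StringBuilder body = new StringBuilder();
        for (String doc : buffer) {
            // Bulk format: one action line, then one source line, each newline-terminated.
            body.append("{\"index\":{\"_index\":\"").append(index).append("\"}}\n");
            body.append(doc).append('\n');
        }
        sender.send(body.toString());
        buffer.clear();
    }
}
```

In an actual job the class would implement Flink's OutputFormat<T> directly and
be passed to DataSet.output(...). Since Flink's OutputFormat is Serializable and
gets shipped to the task managers, the ES client should be created in open()
rather than held as a constructor argument.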


On Sun, Mar 1, 2020 at 1:42 PM Niels Basjes <Niels@basjes.nl> wrote:

> Hi,
>
> I have a job in Flink 1.10.0 which creates data that I need to write to
> ElasticSearch.
> Because it really is a Batch (and doing it as a stream keeps giving OOM
> problems: big + unordered + groupby) I'm trying to do it as a real batch.
>
> To write a DataSet to some output (that is not a file) an OutputFormat
> implementation is needed.
>
> public DataSink<T> output(OutputFormat<T> outputFormat)
>
> The problem I have is that I have not been able to find an "OutputFormat"
> for ElasticSearch.
> Adding ES as a Sink to a DataStream is trivial because a Sink is provided
> out of the box.
>
> The only alternative I came up with is to write the output of my batch to
> a file and then load that (with a stream) into ES.
>
> What is the proper solution?
> Is there an OutputFormat for ES I can use that I overlooked?
>
> --
> Best regards / Met vriendelijke groeten,
>
> Niels Basjes
>
>
