flink-user mailing list archives

From Flavio Pompermaier <pomperma...@okkam.it>
Subject Re: Iterative queries on Flink
Date Mon, 30 Nov 2015 16:35:40 GMT
I think that with some support I could try to implement it. Actually, I
just need to add a persist(StorageLevel.OFF_HEAP) method to the DataSet
API (similar to what Spark does), write the data set to a Tachyon directory
configured in flink-conf.yaml, and then re-read it using its
generated name on Tachyon. Do you have other suggestions?
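
To make the idea concrete, here is a rough sketch of the persist-and-re-read pattern in plain Python. It is not Flink or Tachyon code: persist() and reread() are hypothetical helpers, a temporary local directory stands in for the configured Tachyon directory, and the sample rows are made up for illustration.

```python
import csv
import tempfile
import uuid
from pathlib import Path


def persist(rows, header, storage_dir):
    """Write rows to storage_dir under a generated name and return that name."""
    name = f"dataset-{uuid.uuid4().hex}.csv"
    with open(Path(storage_dir) / name, "w", newline="") as f:
        writer = csv.DictWriter(f, fieldnames=header)
        writer.writeheader()
        writer.writerows(rows)
    return name


def reread(name, storage_dir):
    """Load a previously persisted data set by its generated name."""
    with open(Path(storage_dir) / name, newline="") as f:
        return list(csv.DictReader(f))


# Assumption: stands in for the storage directory set in the configuration.
storage_dir = tempfile.mkdtemp()

# Made-up sample input; a real job would read this from the CSV source.
header = ["id", "value"]
source = [
    {"id": "a1", "value": "10"},
    {"id": "b2", "value": "250"},
    {"id": "a3", "value": "40"},
]

# First filter: materialize the result instead of recomputing it later.
first = [row for row in source if int(row["value"]) < 100]
first_name = persist(first, header, storage_dir)

# Second filter, chosen after looking at the first result. It starts from
# the persisted intermediate data, not from the original input.
second = [row for row in reread(first_name, storage_dir) if int(row["value"]) > 20]
```

Each exploratory step only touches the output of the previous one, at the cost of one write per persisted data set.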

On Mon, Nov 30, 2015 at 4:58 PM, Fabian Hueske <fhueske@gmail.com> wrote:

> The basic building blocks are there but I am not aware of any efforts to
> implement caching and add it to the API.
> 2015-11-30 16:55 GMT+01:00 Flavio Pompermaier <pompermaier@okkam.it>:
>> Is there any effort in this direction? Maybe I could achieve something
>> like that using Tachyon in some way?
>> On Mon, Nov 30, 2015 at 4:52 PM, Fabian Hueske <fhueske@gmail.com> wrote:
>>> Hi Flavio,
>>> Flink does not support caching of data sets in memory yet.
>>> Best, Fabian
>>> 2015-11-30 16:45 GMT+01:00 Flavio Pompermaier <pompermaier@okkam.it>:
>>>> Hi to all,
>>>> I was wondering if Flink could fit a use case where a user loads a
>>>> data set in memory and then wants to explore it interactively. Let's
>>>> say I want to load a CSV file, then filter out the rows where a column value
>>>> matches some criterion, then apply another criterion after seeing the results
>>>> of the first filter.
>>>> Is there a way to keep the data set in memory and modify it
>>>> interactively without re-reading the whole data set every time I want to chain
>>>> another operation onto it?
>>>> Best,
>>>> Flavio
