flink-user mailing list archives

From Maximilian Michels <...@apache.org>
Subject Re: Nested Iterations Outlook
Date Mon, 20 Jul 2015 15:53:55 GMT
You could do that but you might run into merge conflicts. Also keep in mind
that it is work in progress :)

On Mon, Jul 20, 2015 at 4:15 PM, Maximilian Alber <
alber.maximilian@gmail.com> wrote:

> Thanks!
>
> Ok, cool. If I would like to test it, I just need to merge those two pull
> requests into my current branch?
>
> Cheers,
> Max
>
> On Mon, Jul 20, 2015 at 4:02 PM, Maximilian Michels <mxm@apache.org>
> wrote:
>
>> Now that makes more sense :) I thought by "nested iterations" you meant
>> iterations in Flink that can be nested, i.e. starting an iteration inside
>> an iteration.
>>
>> The caching/pinning of intermediate results is still a work in progress
>> in Flink. It is actually in a state where it could be merged but some
>> pending pull requests got delayed because priorities changed a bit.
>>
>> Essentially, we need to merge these two pull requests:
>>
>> https://github.com/apache/flink/pull/858
>> This introduces session management, which allows keeping the
>> ExecutionGraph for the session.
>>
>> https://github.com/apache/flink/pull/640
>> Implements the actual backtracking and caching of the results.
>>
>> Once these are in, we can change the Java/Scala API to support
>> backtracking. I don't know exactly how Spark's API does it but,
>> essentially, it should then work by just creating new operations on an
>> existing DataSet and submitting to the cluster again.
>>
>> Cheers,
>> Max
>>
>> On Mon, Jul 20, 2015 at 3:31 PM, Maximilian Alber <
>> alber.maximilian@gmail.com> wrote:
>>
>>> Oh sorry, my fault. When I wrote it, I had iterations in mind.
>>>
>>> What I actually wanted to ask is how "resuming from intermediate results"
>>> will work with (non-nested) "non-Flink" iterations. By iterations I
>>> mean something like this:
>>>
>>> while(...):
>>>   - change params
>>>   - submit to cluster
>>>
>>> where the executed Flink program is more or less the same in each
>>> iteration, but with changing input sets, which are reused between
>>> different loop iterations.
>>>
>>> I might have gotten something wrong, because in our group we mentioned
>>> caching à la Spark for Flink and someone suggested that "pinning" will do
>>> that. Is that somewhat right?
>>>
>>> Thanks and Cheers,
>>> Max
>>>
>>> On Mon, Jul 20, 2015 at 1:06 PM, Maximilian Michels <mxm@apache.org>
>>> wrote:
>>>
>>>>  "So it is up to debate how the support for resuming from intermediate
>>>> results will look like." -> What's the current state of that debate?
>>>>
>>>> Since there is no support for nested iterations that I know of, the
>>>> debate on how intermediate results are integrated has not started yet.
>>>>
>>>>
>>>>> "Intermediate results are not produced within the iterations cycles."
>>>>> -> Ok, if there are none, what does it have to do with that debate? :-)
>>>>>
>>>>
>>>> I was referring to the existing support for intermediate results within
>>>> iterations. If we were to implement nested iterations, this could
>>>> (possibly) change. This is all very theoretical because there are no plans
>>>> to support nested iterations.
>>>>
>>>> Hope this clarifies. Otherwise, please restate your question because I
>>>> might have misunderstood.
>>>>
>>>> Cheers,
>>>> Max
>>>>
>>>>
>>>> On Mon, Jul 20, 2015 at 12:11 PM, Maximilian Alber <
>>>> alber.maximilian@gmail.com> wrote:
>>>>
>>>>> Thanks for the answer! But I need some clarification:
>>>>>
>>>>> "So it is up to debate how the support for resuming from intermediate
>>>>> results will look like." -> What's the current state of that debate?
>>>>> "Intermediate results are not produced within the iterations cycles."
>>>>> -> Ok, if there are none, what does it have to do with that debate? :-)
>>>>>
>>>>> Cheers,
>>>>> Max
>>>>>
>>>>> On Mon, Jul 20, 2015 at 10:50 AM, Maximilian Michels <mxm@apache.org>
>>>>> wrote:
>>>>>
>>>>>> Hi Max,
>>>>>>
>>>>>> You are right, there is no support for nested iterations yet. As far
>>>>>> as I know, there are no concrete plans to add support for it. So it is
>>>>>> up to debate how the support for resuming from intermediate results
>>>>>> will look like. Intermediate results are not produced within the
>>>>>> iterations cycles. Same would be true for nested iterations. So the
>>>>>> behavior for resuming from intermediate results should be alike for
>>>>>> nested iterations.
>>>>>>
>>>>>> Cheers,
>>>>>> Max
>>>>>>
>>>>>> On Fri, Jul 17, 2015 at 4:26 PM, Maximilian Alber <
>>>>>> alber.maximilian@gmail.com> wrote:
>>>>>>
>>>>>>> Hi Flinksters,
>>>>>>>
>>>>>>> as far as I know, there is still no support for nested iterations
>>>>>>> planned. Am I right?
>>>>>>>
>>>>>>> So my question is how such use cases should be handled in the future.
>>>>>>> More specifically: when pinning/caching becomes available, do you
>>>>>>> suggest using that feature and programming in "Spark" style? Or is
>>>>>>> there some other, more flexible mechanism planned for loops?
>>>>>>>
>>>>>>> Cheers,
>>>>>>> Max
>>>>>>>
>>>>>>
>>>>>>
>>>>>
>>>>
>>>
>>
>
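
A minimal sketch of the driver-side loop discussed in this thread ("change params, submit to cluster", with inputs reused between rounds). It is not taken from any message and does not use Flink: `ResultCache`, `load_and_preprocess`, `submit_job`, and `run_driver_loop` are hypothetical names, and the cache only stands in for the pinning/caching of intermediate results that the two pull requests are meant to provide.

```python
from typing import Callable, Dict, List


class ResultCache:
    """Hypothetical stand-in for pinned/cached intermediate results."""

    def __init__(self) -> None:
        self._store: Dict[str, List[float]] = {}
        self.computations = 0  # how often an input was actually computed

    def get_or_compute(self, key: str,
                       compute: Callable[[], List[float]]) -> List[float]:
        # Compute the result only on first use; later rounds reuse it.
        if key not in self._store:
            self._store[key] = compute()
            self.computations += 1
        return self._store[key]


def load_and_preprocess() -> List[float]:
    # Placeholder for the expensive part shared by every round.
    return [float(x) for x in range(1, 6)]


def submit_job(data: List[float], step_size: float) -> float:
    # Placeholder for "submit to cluster"; here just a local computation.
    return sum(x * step_size for x in data)


def run_driver_loop(cache: ResultCache, max_rounds: int = 5) -> List[float]:
    step_size = 1.0
    results = []
    for _ in range(max_rounds):
        # Reused input set: fetched from the cache after the first round.
        data = cache.get_or_compute("preprocessed", load_and_preprocess)
        results.append(submit_job(data, step_size))
        step_size /= 2  # "change params" before the next submission
    return results


cache = ResultCache()
results = run_driver_loop(cache)
```

The loop runs on the client, so each round is a separate submission; only the cached input carries over between rounds.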
