flink-dev mailing list archives

From Alexander Alexandrov <alexander.s.alexand...@gmail.com>
Subject Re: Collision of task number values for the same task
Date Tue, 31 May 2016 10:48:11 GMT
Sure, you can find them attached here (both jobmanager and taskmanager, the
problem was observed in the jobmanager logs).

If needed I can also share the binary to reproduce the issue.

I think the problem is related to the fact that the input splits are lazily
assigned to the task slots, and it seems that with 8 splits for 4 slots,
each (x/y) combination shows up twice in the log.

Moreover, I am currently analyzing the structure of the log files, and it
seems that the task ID is not reported consistently across the different
messages [1,2,3]. This makes it quite hard to implement an ETL job that
extracts the statistics from the log and feeds them into a database.
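
To illustrate the kind of check such an ETL step needs, here is a minimal
sketch in Python. The message shape, the task names and the IDs in SAMPLE are
assumptions made up for illustration, not copied from real jobmanager output,
and the helper names are hypothetical. The sketch groups state-change lines by
their (x/y) label and reports labels that occur with more than one ID.

```python
import re
from collections import defaultdict

# Assumed message shape (hypothetical, adjust to the real log format):
#   "<task name> (<x>/<y>) (<id>) switched from <OLD> to <NEW>"
STATE_CHANGE = re.compile(
    r"(?P<task>.+?) \((?P<index>\d+)/(?P<parallelism>\d+)\) "
    r"\((?P<exec_id>[0-9a-f]+)\) switched from (?P<old>\w+) to (?P<new>\w+)"
)


def group_executions(lines):
    """Map each (task, 'x/y') label to the set of IDs seen for it."""
    seen = defaultdict(set)
    for line in lines:
        match = STATE_CHANGE.search(line)
        if match is None:
            # Line does not carry an ID in the assumed shape, so it cannot
            # be attributed to a specific execution and is skipped.
            continue
        label = (match["task"], f"{match['index']}/{match['parallelism']}")
        seen[label].add(match["exec_id"])
    return dict(seen)


def collisions(lines):
    """Return only the labels that appear with more than one ID."""
    return {k: v for k, v in group_executions(lines).items() if len(v) > 1}


# Made-up sample lines; the last one has no ID and is ignored by the parser.
SAMPLE = [
    "DataSource (1/4) (aa01) switched from SCHEDULED to DEPLOYING",
    "DataSource (1/4) (aa01) switched from DEPLOYING to RUNNING",
    "DataSource (1/4) (bb02) switched from SCHEDULED to DEPLOYING",
    "DataSource (2/4) (cc03) switched from SCHEDULED to DEPLOYING",
    "Deploying DataSource (attempt #0) to host-1",
]

if __name__ == "__main__":
    for label, ids in collisions(SAMPLE).items():
        print(label, sorted(ids))
```

Lines that lack the ID, like the last sample line, cannot be joined with the
rest, which is why a consistent ID across all messages matters here.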

Would it be possible to push a fix which adds the task ID consistently
across all messages in the 1.0.x line? If yes, I will open a JIRA and work
on that this week.
I would like to get feedback on that from other people who are parsing
jobmanager / taskmanager logs, in order to avoid possible backwards
compatibility issues with job analysis tools on the release line.

[1]
https://github.com/apache/flink/blob/da23ee38e5b36ddf26a6a5a807efbbbcbfe1d517/flink-runtime/src/main/java/org/apache/flink/runtime/executiongraph/Execution.java#L370-L371
[2]
https://github.com/apache/flink/blob/da23ee38e5b36ddf26a6a5a807efbbbcbfe1d517/flink-runtime/src/main/java/org/apache/flink/runtime/executiongraph/Execution.java#L991-L992

Regards,
A.


2016-05-31 12:01 GMT+02:00 Ufuk Celebi <uce@apache.org>:

> On Tue, May 31, 2016 at 11:53 AM, Alexander Alexandrov
> <alexander.s.alexandrov@gmail.com> wrote:
> > Can somebody shed a light on the execution semantics of the scheduler
> > which will explain this behavior?
>
> The execution IDs are unique per execution attempt. Having two tasks
> with the same subtask index running at the same time is unexpected.
>
> Can you share the complete logs, please?
>
> – Ufuk
>
