flink-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Matthias J. Sax" <mj...@apache.org>
Subject Re: No error on dangling dataflow parts
Date Thu, 08 Oct 2015 15:12:10 GMT
I just skipped over the discussion. If I got it right, the question was
about if a dedicated sink operator must be the last in a flow or not. In
the case I described below, it is about distinct flows in a single
program. This was not discussed in the PR (or did I miss it?).

For sink/no-sink I don't care. However, having two data flow components
that are not connected to each other is an issue for me. I also would
raise an exception for a program like this:

Source1 --> Bolt --> SinkBolt
Source2 --> Bolt --> SinkBolt




-Matthias


On 10/08/2015 04:55 PM, Aljoscha Krettek wrote:
> We had some discussion on the Jira Issue when I changed the translation
> from operators to StreamGraph:
> https://issues.apache.org/jira/browse/FLINK-2398
> 
> There are arguments for both ways of doing it, I'm not partial towards any
> solution but if you want this changed you can start a discussion. (If we
> change it, however, this will probably not come in time for 0.10)
> 
> On Thu, 8 Oct 2015 at 16:42 Matthias J. Sax <mjsax@apache.org> wrote:
> 
>> Dropping would be less strange.
>>
>> However, raising an exception would be natural (at least to me)
>>
>>
>> On 10/08/2015 04:30 PM, Aljoscha Krettek wrote:
>>> What do you mean? The current behavior is strange or the other way round
>> would be strange?
>>>
>>> I think it is in line with what other Stream Processing Systems provide.
>> For example Storm and Google Dataflow behave similarly.
>>>
>>>> On 08 Oct 2015, at 16:25, Matthias J. Sax <mjsax@apache.org> wrote:
>>>>
>>>> Well. This behavior would also be kind of strange... (at least to me)
>>>>
>>>> On 10/08/2015 04:22 PM, Aljoscha Krettek wrote:
>>>>> Hi,
>>>>> I think Flink does in fact not drop the dangling parts. In streaming
>> it is allowed to have dangling operators that are not sinks. They are
>> executed and the output is just discarded.
>>>>>
>>>>> Cheers,
>>>>> Aljoscha
>>>>>> On 08 Oct 2015, at 16:18, Matthias J. Sax <mjsax@apache.org>
wrote:
>>>>>>
>>>>>> Hi,
>>>>>>
>>>>>> I just hit a problem in Storm Compatibility:
>>>>>> https://issues.apache.org/jira/browse/FLINK-2837
>>>>>>
>>>>>> If a bolt has multiple inputs, the topology is not translated
>> correctly
>>>>>> into a Flink streaming program. The point is, that the Flink program
>> can
>>>>>> be executed without an error, even if the assembled data flow has
>>>>>> dangling parts...
>>>>>>
>>>>>> For example:
>>>>>>
>>>>>> Source1 --+--+--> Bolt --> SinkBolt
>>>>>>         |  |
>>>>>> Source2 --+  |
>>>>>>            |
>>>>>> Source3 -----+
>>>>>>
>>>>>> Is translated to the following Flink program
>>>>>>
>>>>>> Source1 --> Bolt --> SinkBolt
>>>>>>
>>>>>> Source2 --> Bolt
>>>>>>
>>>>>> Source3 --> Bolt
>>>>>>
>>>>>> with Source2 and Source3 being added to the environment but not
>>>>>> connected correctly to the overall program because the Bolt is
>>>>>> instantiated three times and only a single bolt is connect to the
>> sink.
>>>>>> It is clear, that Flink just drops the dangling parts, as it builds
>> the
>>>>>> JobGraph starting from the sink and traversing backwards. I was just
>>>>>> wondering, if an error should actually occur.
>>>>>>
>>>>>>
>>>>>> -Matthias
>>>>>>
>>>>>
>>>>
>>>
>>
>>
> 


Mime
View raw message