flink-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Matthias J. Sax" <mj...@informatik.hu-berlin.de>
Subject Re: Storm Compatibility Improvement
Date Mon, 29 Jun 2015 22:22:31 GMT
That would also work. I thought about it already, too. Thanks for the
feedback. If two people have similar idea, it might be the right way to
got. I will just include all this stuff and open an PR. Than we can
evaluate it again.

-Matthias

On 06/30/2015 12:01 AM, Gyula Fóra wrote:
> By declare I mean we assume a Flink Tuple datatype and the user declares
> the name mapping (sorry its getting late).
> 
> Gyula Fóra <gyula.fora@gmail.com> ezt írta (időpont: 2015. jún. 29., H,
> 23:57):
> 
>> Ah ok, now I get what I didn't get before :)
>>
>> So you want to take some input stream , and execute a bolt implementation
>> on it. And the question is what input type to assume when the user wants to
>> use field name based access.
>>
>> Can't we force the user to declare the names of the inputs/outputs even in
>> this case? Otherwise there is not much we can do. Maybe stick to either
>> public fields or getter setters.
>>
>>
>> Matthias J. Sax <mjsax@informatik.hu-berlin.de> ezt írta (időpont: 2015.
>> jún. 29., H, 23:51):
>>
>>> Well. If a whole Storm topology is executed, this is of course the way
>>> to got. However, I want to have named-attribute access in the case of an
>>> embedded bolt (as a single operator) in a Flink program. And is this
>>> case, fields are not declared and do not have a name (eg, if the bolt's
>>> consumers emits a stream of type Tuple3)
>>>
>>> -Matthias
>>>
>>>
>>> On 06/29/2015 11:42 PM, Gyula Fóra wrote:
>>>> Hey,
>>>> I didn't look through the whole code so I probably don't get something
>>> but
>>>> why don't you just do what storm does? Keep a map from the field names
>>> to
>>>> indexes somewhere (make this accessible from the tuple) and then you can
>>>> just use a simple Flink tuple.
>>>>
>>>> I think this is what's happening in storm, they get the index from the
>>>> context, which knows the declared output fields.
>>>>
>>>> Gyula
>>>>
>>>> Matthias J. Sax <mjsax@informatik.hu-berlin.de> ezt írta (időpont:
>>> 2015.
>>>> jún. 29., H, 18:08):
>>>>
>>>>> Hi,
>>>>>
>>>>> I started to work on a missing feature for the Storm compatibility
>>>>> layer: named attribute access
>>>>>
>>>>> In Storm, each attribute of an input tuple can be accessed via index
or
>>>>> by name. Currently, only index access is supported. In order to support
>>>>> this feature in Flink (embedded Bolt in Flink program), I see two
>>>>> (independent and complementary) ways to support this feature:
>>>>>
>>>>>  1) the input type is a POJO
>>>>>  2) Flink's Tuple type is extended to support named attributes
>>>>>
>>>>> Right now I started a prototype for POJOs. I would like to extend Tuple
>>>>> type with named attributes. However, I am not sure how the community
>>>>> likes this idea.
>>>>>
>>>>> I would like to get some feedback for the POJO prototype, too. I use
>>>>> reflections and I am not sure if my code is elegant enough. You can
>>> find
>>>>> it here: https://github.com/mjsax/flink/tree/flink-storm-compatibility
>>>>>
>>>>>
>>>>> -Matthias
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>
>>>>
>>>
>>>
> 


Mime
View raw message