flink-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Aljoscha Krettek <aljos...@apache.org>
Subject Re: Flink 1.4.0 RC3 and Avro objects with maps have null values
Date Fri, 08 Dec 2017 13:13:37 GMT
I think this Avro issue might be related: https://issues.apache.org/jira/browse/AVRO-803

> On 7. Dec 2017, at 17:19, Timo Walther <twalthr@apache.org> wrote:
> 
> Hi Jörn,
> 
> thanks for the little example. Maybe Avro changed the behavior about maps from the old
version we used in 1.3 to the newest version in Flink 1.4.0.
> 
> I will investigate this and might open an issue for it.
> 
> Regards,
> Timo
> 
> 
> Am 12/7/17 um 5:07 PM schrieb Joern Kottmann:
>> Hello Timo,
>> 
>> thanks for your quick response. I can't share the code of that pipeline here.
>> 
>> The Flink version I am using is this one:
>> http://people.apache.org/~aljoscha/flink-1.4.0-rc3/flink-1.4.0-src.tgz
>> 
>> That was compiled by me with mvn install -DskipTests (for some reason
>> the tests failed, can also share that, maybe a firewall issue on
>> Fedora 27).
>> 
>> I managed to isolate the problem and it can be reproduced with a few
>> lines running in the IDE:
>> https://github.com/kottmann/flink-avro-issue
>> 
>> And it is indeed like you suspected, the key is a Utf8, and I pass in
>> a String to the get.
>> 
>> But why did that now break with Flink 1.4.0 and runs on Flink 1.3.2?
>> 
>> Thanks again!
>> Jörn
>> 
>> On Thu, Dec 7, 2017 at 5:07 PM, Joern Kottmann <kottmann@gmail.com> wrote:
>>> Hello Timo,
>>> 
>>> thanks for your quick response. I can't share the code of that pipeline here.
>>> 
>>> The Flink version I am using is this one:
>>> http://people.apache.org/~aljoscha/flink-1.4.0-rc3/flink-1.4.0-src.tgz
>>> 
>>> That was compiled by me with mvn install -DskipTests (for some reason
>>> the tests failed, can also share that, maybe a firewall issue on
>>> Fedora 27).
>>> 
>>> I managed to isolate the problem and it can be reproduced with a few
>>> lines running in the IDE:
>>> https://github.com/kottmann/flink-avro-issue
>>> 
>>> And it is indeed like you suspected, the key is a Utf8, and I pass in
>>> a String to the get.
>>> 
>>> But why did that now break with Flink 1.4.0 and runs on Flink 1.3.2?
>>> 
>>> Thanks again!
>>> Jörn
>>> 
>>> On Thu, Dec 7, 2017 at 3:57 PM, Timo Walther <twalthr@apache.org> wrote:
>>>> Can you also check the type of the keys in your map. Avro distinguished
>>>> between String and Utf8 class. Maybe this is why your key cannot be found.
>>>> 
>>>> Regards,
>>>> Timo
>>>> 
>>>> 
>>>> Am 12/7/17 um 3:54 PM schrieb Timo Walther:
>>>> 
>>>>> Hi Jörn,
>>>>> 
>>>>> could you tell us a bit more about your job? Did you import the flink-avro
>>>>> module? How does the Flink TypeInformation for your Avro type look like
>>>>> using println(ds.getType)? It sounds like a very weird error if the
>>>>> toString() method shows the key. Can you reproduce the error in your
IDE and
>>>>> enter the `get("theKey")` method using a debugger. I will loop in Aljoscha
>>>>> that worked on Avro recently.
>>>>> 
>>>>> Regards,
>>>>> Timo
>>>>> 
>>>>> 
>>>>> 
>>>>> Am 12/7/17 um 3:19 PM schrieb Joern Kottmann:
>>>>>> Hello,
>>>>>> 
>>>>>> after having a version mismatch between Avro in Flink 1.3.2 I decided
>>>>>> to see how things work with Flink 1.4.0.
>>>>>> 
>>>>>> The pipeline I am building runs now, deployed as standalone on YARN
>>>>>> with Flink 1.3.2 and putting it "FIRST" on the classpath (to use
Avro
>>>>>> 1.8.2 instead of an 1.7.x version).
>>>>>> 
>>>>>> The default setting for yarn.per-job-cluster.include-user-jar is
>>>>>> ORDERED which in my case was a bit tricky, because I first deployed
a
>>>>>> pipeline having a jar starting with "A" so it was placed before the
>>>>>> flink jars, later I worked on a mostly identical pipeline with a
name
>>>>>> a bit further down in the alphabet, and that was placed behind the
>>>>>> flink jars causing various issues. I would recommend to change this
as
>>>>>> default to LAST. Then the jar name has no impact on how things are
>>>>>> loaded.
>>>>>> Should I open a jira for this?
>>>>>> 
>>>>>> Now after updating to Flink 1.4.0 and also running on YARN as
>>>>>> standalone I am getting null values in a map field I did set in
>>>>>> earlier steps in the pipeline on Avro objects. The thing is a bit
>>>>>> weird, because avroObject.toString would show the values just fine,
>>>>>> but I get null on retrieval.
>>>>>> 
>>>>>> I have my own TableInputFormat to read from HBase and output a Tuple2
>>>>>> containing an Avro object.
>>>>>> Next comes a map which tries to access it.
>>>>>> 
>>>>>> The failing piece looks like this:
>>>>>> ao.getTheMap().get("theKey") (returns null)
>>>>>> Doing an ao.toString would show the "theMap" and the value of "theKey".
>>>>>> 
>>>>>> Is this a bug in Flink 1.4.0 or is there something wrong with my
>>>>>> dependencies?
>>>>>> 
>>>>>> Thanks,
>>>>>> Jörn
>>>>> 
> 


Mime
View raw message