flink-dev mailing list archives

From "Lisonbee, Todd" <todd.lison...@intel.com>
Subject RE: Kryo StackOverflowError
Date Mon, 11 Apr 2016 21:42:32 GMT
Hi,

I also got this error message when I had private inner classes:

public class A {
    private class B {
    }
}

I was able to fix it by making the inner classes public static:

public class A {
    public static class B {
    }
}
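
A minimal sketch of why this change probably helps, using plain reflection rather than Kryo itself. A non-static inner class keeps a hidden reference to its enclosing instance, so a serializer that walks every field would follow that reference into the outer object; a static nested class has no such link. The names OuterReferenceDemo, Inner, Nested and outerLinks are made up for illustration. Inner deliberately reads a field of the outer class, because some newer javac versions may drop the hidden reference when it is unused.

```java
import java.lang.reflect.Field;

class OuterReferenceDemo {
    private int size = 3;

    // Non-static inner class: every instance is tied to an enclosing OuterReferenceDemo.
    class Inner {
        int outerSize() {
            return size; // reads a field of the enclosing instance
        }
    }

    // Static nested class: no link to any enclosing instance.
    static class Nested {
        int value = 3;
    }

    // Counts the fields declared by `type` whose type is the enclosing class.
    // getDeclaredFields() also returns compiler-generated (synthetic) fields.
    static int outerLinks(Class<?> type) {
        int count = 0;
        for (Field f : type.getDeclaredFields()) {
            if (f.getType() == OuterReferenceDemo.class) {
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) {
        System.out.println("Inner links to outer:  " + outerLinks(Inner.class));
        System.out.println("Nested links to outer: " + outerLinks(Nested.class));
    }
}
```

Inner should report one link to the outer class and Nested none. Whether that hidden link is what triggers the StackOverflowError depends on what the outer object itself references.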

While debugging, it seemed that this error message can be caused by several different
things.

Thanks,

Todd


-----Original Message-----
From: Hilmi Yildirim [mailto:Hilmi.Yildirim@dfki.de] 
Sent: Sunday, April 10, 2016 11:36 AM
To: dev@flink.apache.org
Subject: Re: Kryo StackOverflowError

Hi,
I also had this problem and solved it.

In my case I had multiple objects which were created via anonymous classes. When I broadcast
these objects, the serializer tried to serialize them, and for that it tried to serialize
the anonymous classes. This caused the problem.

For example,

class A {

  def createObjects(): Array[Object] = {
    val objects = new java.util.ArrayList[Object]()
    for (...) {
      // anonymous class, created inside a method of A
      val obj = new Class {
        ...
      }
      objects.add(obj)
    }
    objects.toArray
  }
}

It tried to serialize "new Class". For that it tried to serialize the method createObjects().
And then it tried to serialize class A. To serialize class A it tried to serialize the method
createObjects. Or something like that, I do not remember the details. This caused the recursion.
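
A rough Java sketch of the same capture effect, as far as I can reconstruct it. It uses plain Java serialization as a stand-in for Kryo, so the failure shows up as a NotSerializableException instead of a StackOverflowError, but the cause is the same kind of hidden reference from the anonymous class back to the object that created it. Factory, Item and serializes are hypothetical names, not taken from the original code.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.NotSerializableException;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;

class AnonymousCaptureDemo {

    // Hypothetical stand-in for class A; deliberately NOT Serializable.
    static class Factory {
        private String label;

        Factory(String label) {
            this.label = label;
        }

        // Anonymous classes created in an instance method keep a reference to `this`.
        List<Serializable> createObjects() {
            List<Serializable> objects = new ArrayList<>();
            for (int i = 0; i < 2; i++) {
                objects.add(new Serializable() {
                    @Override
                    public String toString() {
                        return label; // reads Factory.this.label
                    }
                });
            }
            return objects;
        }
    }

    // Possible fix: a static nested class that copies what it needs.
    static class Item implements Serializable {
        private static final long serialVersionUID = 1L;
        private final String label;

        Item(String label) {
            this.label = label;
        }
    }

    // Returns true if Java serialization can write obj,
    // false if it runs into an object that is not Serializable.
    static boolean serializes(Object obj) {
        try (ObjectOutputStream out = new ObjectOutputStream(new ByteArrayOutputStream())) {
            out.writeObject(obj);
            return true;
        } catch (NotSerializableException e) {
            return false;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        Object anonymous = new Factory("a").createObjects().get(0);
        System.out.println("anonymous class:     " + serializes(anonymous));
        System.out.println("static nested class: " + serializes(new Item("a")));
    }
}
```

The anonymous instance fails because writing it also means writing the Factory it was created in, while the static nested Item only carries its own fields. With Kryo's field-by-field serializer, the analogous traversal would continue into the enclosing object rather than stop with an exception.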

BR,
Hilmi

On 10.04.2016 at 19:18, Stephan Ewen wrote:
> Hi!
>
> Is it possible that some datatype has a recursive structure nonetheless?
> Something like a linked list or so, which would create a large object graph?
>
> There seems to be a large object graph that the Kryo serializer traverses,
> which causes the StackOverflowError.
>
> Greetings,
> Stephan
>
>
> On Sun, Apr 10, 2016 at 6:24 PM, Andrew Palumbo <ap.dev@outlook.com> wrote:
>
>> Hi Stephan,
>>
>> thanks for answering.
>>
>> This is not from a recursive object. (It is used in a recursive method in the
>> test that is throwing this error, but the depth is only 2 and there are
>> no other Flink DataSet operations before execution is triggered, so it is
>> trivial.)
>>
>> Here is a Gist of the code, and the full output and stack trace:
>>
>> https://gist.github.com/andrewpalumbo/40c7422a5187a24cd03d7d81feb2a419
>>
>> The Error begins at line 178 of the "Output" file.
>>
>> Thanks
>>
>> ________________________________________
>> From: ewenstephan@gmail.com <ewenstephan@gmail.com> on behalf of Stephan
>> Ewen <sewen@apache.org>
>> Sent: Sunday, April 10, 2016 9:39 AM
>> To: dev@flink.apache.org
>> Subject: Re: Kryo StackOverflowError
>>
>> Hi!
>>
>> Sorry, I don't fully understand the diagnosis.
>> You say that this stack overflow is not from a recursive/object type?
>>
>> Long graphs of operations in Flink usually do not cause
>> StackOverflowErrors, because the whole graph is not processed
>> recursively.
>>
>> Can you paste the entire Stack Trace (for example to a gist)?
>>
>> Greetings,
>> Stephan
>>
>>
>> On Sun, Apr 10, 2016 at 4:42 AM, Andrew Palumbo <ap.dev@outlook.com>
>> wrote:
>>
>>> Hi all,
>>>
>>>
>>> I am working on a matrix multiplication operation for Mahout Flink
>>> Bindings that uses quite a few chained Flink DataSet operations.
>>>
>>>
>>> When testing, I am getting the following error:
>>>
>>>
>>> {...}
>>>
>>> 04/09/2016 22:30:35    CHAIN Reduce (Reduce at org.apache.mahout.flinkbindings.blas.FlinkOpABt$.abt_nograph(FlinkOpABt.scala:147)) -> FlatMap (FlatMap at org.apache.mahout.flinkbindings.drm.BlockifiedFlinkDrm.asRowWise(FlinkDrm.scala:93))(1/1) switched to CANCELED
>>> 04/09/2016 22:30:35    CHAIN Partition -> Map (Map at org.apache.mahout.flinkbindings.blas.FlinkOpABt$.pairwiseApply(FlinkOpABt.scala:240)) -> GroupCombine (GroupCombine at org.apache.mahout.flinkbindings.blas.FlinkOpABt$.abt_nograph(FlinkOpABt.scala:129)) -> Combine (Reduce at org.apache.mahout.flinkbindings.blas.FlinkOpABt$.abt_nograph(FlinkOpABt.scala:147))(3/3) switched to FAILED
>>> java.lang.StackOverflowError
>>>      at com.esotericsoftware.kryo.serializers.ObjectField.write(ObjectField.java:48)
>>>      at com.esotericsoftware.kryo.serializers.FieldSerializer.write(FieldSerializer.java:495)
>>>      at com.esotericsoftware.kryo.Kryo.writeObject(Kryo.java:523)
>>>      at com.esotericsoftware.kryo.serializers.ObjectField.write(ObjectField.java:61)
>>>      at com.esotericsoftware.kryo.serializers.FieldSerializer.write(FieldSerializer.java:495)
>>>      at com.esotericsoftware.kryo.Kryo.writeObject(Kryo.java:523)
>>>      at com.esotericsoftware.kryo.serializers.ObjectField.write(ObjectField.java:61)
>>>      at com.esotericsoftware.kryo.serializers.FieldSerializer.write(FieldSerializer.java:495)
>>>      at com.esotericsoftware.kryo.Kryo.writeObject(Kryo.java:523)
>>>      at com.esotericsoftware.kryo.serializers.ObjectField.write(ObjectField.java:61)
>>>      at com.esotericsoftware.kryo.serializers.FieldSerializer.write(FieldSerializer.java:495)
>>> {...}
>>>
>>>
>>> I've seen similar issues on the dev@flink list (and other places), but I
>>> believe that they were from recursive calls and objects which pointed back
>>> to themselves somehow.
>>>
>>>
>>> This is a relatively straightforward method; it just has several Flink
>>> operations before execution is triggered. If I remove some operations,
>>> e.g. a reduce, I can get the method to complete on a simple test; however,
>>> it will then, of course, be numerically incorrect.
>>>
>>>
>>> I am wondering if there is any workaround for this type of problem?
>>>
>>>
>>> Thank You,
>>>
>>>
>>> Andy
>>>


-- 
==================================================================
Hilmi Yildirim, M.Sc.
Researcher

DFKI GmbH
Intelligente Analytik für Massendaten
DFKI Projektbüro Berlin
Alt-Moabit 91c
D-10559 Berlin
Phone: +49 30 23895 1814

E-Mail: Hilmi.Yildirim@dfki.de

-------------------------------------------------------------
Deutsches Forschungszentrum fuer Kuenstliche Intelligenz GmbH
Firmensitz: Trippstadter Strasse 122, D-67663 Kaiserslautern

Geschaeftsfuehrung:
Prof. Dr. Dr. h.c. mult. Wolfgang Wahlster (Vorsitzender)
Dr. Walter Olthoff

Vorsitzender des Aufsichtsrats:
Prof. Dr. h.c. Hans A. Aukes

Amtsgericht Kaiserslautern, HRB 2313
-------------------------------------------------------------
