apex-dev mailing list archives

From "Ganelin, Ilya" <Ilya.Gane...@capitalone.com>
Subject Re: Stack overflow errors when launching job
Date Mon, 21 Mar 2016 18:18:08 GMT
Awesome - will try this and report results. Thanks!




On 3/21/16, 11:15 AM, "Munagala Ramanath" <ram@datatorrent.com> wrote:

>Please add "-XX:MaxJavaStackTraceDepth=-1" to the JVM options and
>regenerate the stack trace.
>Please note that the argument is a negative 1, which forces unlimited
>stack trace depth.
>
>For example:
>
><property>
>  <name>dt.attr.CONTAINER_JVM_OPTIONS</name>
>  <value>-XX:MaxJavaStackTraceDepth=-1</value>
></property>
>
>Ram
>
>On Mon, Mar 21, 2016 at 11:06 AM, Ganelin, Ilya <Ilya.Ganelin@capitalone.com>
>wrote:
>
>> Ram - that is the complete log. I have nothing else available, either
>> through YARN or through the DT UI.
>>
>>
>>
>>
>> On 3/21/16, 10:33 AM, "Munagala Ramanath" <ram@datatorrent.com> wrote:
>>
>> >The call chain is not complete; it ends abruptly with:
>> >
>> >at java.util.ArrayList.writeObject(ArrayList.java:742)
>> >at sun.reflect.GeneratedMethodAccessor9.invoke(Unknown Source)
>> >at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>> >at java.lang.reflect.Method.invoke(Method.java:606)
>> >at java.io.ObjectStreamClass.invokeWriteObject(ObjectStreamClass.java:988)
>> >at java.io.ObjectOutputStream.writeSerialData(ObjectOutputStream.java:1495)
>> >
>> >
>> >We need to see the point of origin.
>> >
>> >Ram
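The truncated frames above are plain java.io serialization recursing through nested ArrayLists. As a minimal sketch of the failure mode (not the Apex code itself; the depth threshold depends on stack size, and the class name here is illustrative), a deep, non-cyclic object graph is enough to reproduce this kind of StackOverflowError:

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectOutputStream;
import java.util.ArrayList;

public class DeepSerializationDemo {

  // Returns true if default Java serialization of a chain of 'depth'
  // nested ArrayLists blows the stack.
  public static boolean overflowsOnSerialize(int depth) throws IOException {
    ArrayList<Object> root = new ArrayList<>();
    ArrayList<Object> current = root;
    for (int i = 0; i < depth; i++) {
      ArrayList<Object> next = new ArrayList<>();
      current.add(next);
      current = next;
    }
    ObjectOutputStream oos = new ObjectOutputStream(new ByteArrayOutputStream());
    try {
      // Serialization descends one ArrayList.writeObject call (plus the
      // reflective plumbing seen in the trace) per nesting level, so stack
      // usage grows with the depth of the object graph.
      oos.writeObject(root);
      return false;
    } catch (StackOverflowError e) {
      return true;
    }
  }

  public static void main(String[] args) throws IOException {
    System.out.println("depth 100 overflows: " + overflowsOnSerialize(100));
    System.out.println("depth 200000 overflows: " + overflowsOnSerialize(200_000));
  }
}
```

This is why the full trace matters: the repeated serialization frames only show that the graph being written is extremely deep, while the earliest frames identify which object graph is being serialized.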
>> >
>> >On Mon, Mar 21, 2016 at 10:02 AM, Ganelin, Ilya <
>> >Ilya.Ganelin@capitalone.com> wrote:
>> >
>> >> I uploaded the complete stack trace to the gist in the issue:
>> >> https://gist.github.com/ilganeli/7f770374113b40ffa18a
>> >>
>> >>
>> >>
>> >>
>> >>
>> >> On 3/21/16, 9:38 AM, "Munagala Ramanath" <ram@datatorrent.com> wrote:
>> >>
>> >> >Ilya, could you upload a full stack trace of the failure so we can
>> >> >see where the call chain originated?
>> >> >
>> >> >Ram
>> >> >
>> >> >On Mon, Mar 21, 2016 at 9:21 AM, Ganelin, Ilya <
>> >> >Ilya.Ganelin@capitalone.com> wrote:
>> >> >
>> >> >> Chandni - my application fails when launching in YARN, not in local
>> >> >> mode. There is no custom partitioning - the code in the example is
>> >> >> complete for both the input and output classes.
>> >> >>
>> >> >>
>> >> >>
>> >> >> Sent with Good (www.good.com)
>> >> >> ________________________________
>> >> >> From: Chandni Singh <chandni@datatorrent.com>
>> >> >> Sent: Monday, March 21, 2016 3:45:46 AM
>> >> >> To: dev@apex.incubator.apache.org
>> >> >> Subject: Re: Stack overflow errors when launching job
>> >> >>
>> >> >> debug.zip
>> >> >> <https://drive.google.com/a/datatorrent.com/file/d/0BxX8sOLG8CxHLXFjUjBxM0hIZDg/view?usp=drive_web>
>> >> >>
>> >> >> Hi Ilya,
>> >> >>
>> >> >> Attached is the debug application with 20 partitions of input and
>> >> >> output operators. I changed the default locality. This application
>> >> >> doesn't fail in local mode.
>> >> >>
>> >> >> I am using the StatelessPartitioner for both input and output.
>> >> >> Test configuration is in ApplicationTest and cluster configuration
>> >> >> is in my-app-conf1.xml.
>> >> >>
>> >> >> Have you added custom partitioning? That may be causing the stack
>> >> >> overflow in the app master.
>> >> >>
>> >> >> Can you modify this application so that ApplicationTest throws this
>> >> >> stack overflow?
>> >> >>
>> >> >> - Chandni
>> >> >>
>> >> >>
>> >> >>
>> >> >>
>> >> >> On Sun, Mar 20, 2016 at 11:30 AM, Chandni Singh <
>> >> >> chandni@datatorrent.com> wrote:
>> >> >>
>> >> >> > Hi Ilya,
>> >> >> > As Ram mentioned, we don't know the beginning of the stack trace
>> >> >> > from where this is triggered. We can add JVM options in the
>> >> >> > configuration file so that the app master is deployed with those
>> >> >> > configurations.
>> >> >> >
>> >> >> > Anyway, I will look into creating this application (with 20
>> >> >> > partitions) and run it in local mode to find out where the
>> >> >> > problem is.
>> >> >> >
>> >> >> > Will get back to you today or tomorrow.
>> >> >> >
>> >> >> > Chandni
>> >> >> >
>> >> >> > On Sun, Mar 20, 2016 at 9:54 AM, Amol Kekre <amol@datatorrent.com>
>> >> >> > wrote:
>> >> >> >
>> >> >> >> Can we get on a webex to take a look?
>> >> >> >>
>> >> >> >> thks
>> >> >> >> Amol
>> >> >> >>
>> >> >> >>
>> >> >> >> On Sat, Mar 19, 2016 at 7:27 PM, Ganelin, Ilya <
>> >> >> >> Ilya.Ganelin@capitalone.com>
>> >> >> >> wrote:
>> >> >> >>
>> >> >> >> > I don't think I have any time, really, to connect to the
>> >> >> >> > container. The application launches and crashes almost
>> >> >> >> > immediately. Total runtime is 50 seconds.
>> >> >> >> >
>> >> >> >> >
>> >> >> >> >
>> >> >> >> > Sent with Good (www.good.com)
>> >> >> >> > ________________________________
>> >> >> >> > From: Munagala Ramanath <ram@datatorrent.com>
>> >> >> >> > Sent: Saturday, March 19, 2016 5:39:11 PM
>> >> >> >> > To: dev@apex.incubator.apache.org
>> >> >> >> > Subject: Re: Stack overflow errors when launching job
>> >> >> >> >
>> >> >> >> > There is some info here, near the end of the page:
>> >> >> >> >
>> >> >> >> > http://docs.datatorrent.com/troubleshooting/
>> >> >> >> >
>> >> >> >> > under the heading "How do I get a heap dump when a container
>> >> >> >> > gets an OutOfMemoryError?"
>> >> >> >> >
>> >> >> >> > However, since you're blowing the stack, you may need to
>> >> >> >> > manually run jmap on the running container, which may be
>> >> >> >> > difficult if it doesn't stay up for very long. There is a way
>> >> >> >> > to dump the heap programmatically, as described, for instance,
>> >> >> >> > here:
>> >> >> >> >
>> >> >> >> > https://blogs.oracle.com/sundararajan/entry/programmatically_dumping_heap_from_java
>> >> >> >> >
>> >> >> >> > Ram
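For the programmatic heap dump the linked post describes, a HotSpot-only sketch looks like the following (assumes a HotSpot JVM; the class name and dump path are illustrative). It talks to the `com.sun.management:type=HotSpotDiagnostic` MXBean:

```java
import com.sun.management.HotSpotDiagnosticMXBean;
import java.io.File;
import java.lang.management.ManagementFactory;

public class HeapDumper {

  // Writes an HPROF heap dump of the current JVM to 'path'.
  // 'live' limits the dump to reachable objects (triggers a GC first).
  public static void dumpHeap(String path, boolean live) throws Exception {
    new File(path).delete(); // dumpHeap refuses to overwrite an existing file
    HotSpotDiagnosticMXBean bean = ManagementFactory.newPlatformMXBeanProxy(
        ManagementFactory.getPlatformMBeanServer(),
        "com.sun.management:type=HotSpotDiagnostic",
        HotSpotDiagnosticMXBean.class);
    bean.dumpHeap(path, live);
  }

  public static void main(String[] args) throws Exception {
    String path = new File(System.getProperty("java.io.tmpdir"),
        "container-heap.hprof").getPath();
    dumpHeap(path, true);
    System.out.println("heap dump written: " + new File(path).length() + " bytes");
  }
}
```

The same dump can be taken externally with `jmap -dump:live,format=b,file=heap.hprof <pid>` while the container is still up.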
>> >> >> >> >
>> >> >> >> > On Sat, Mar 19, 2016 at 2:07 PM, Ganelin, Ilya <
>> >> >> >> > Ilya.Ganelin@capitalone.com>
>> >> >> >> > wrote:
>> >> >> >> >
>> >> >> >> > > How would we go about getting a heap dump?
>> >> >> >> > >
>> >> >> >> > >
>> >> >> >> > >
>> >> >> >> > > Sent with Good (www.good.com)
>> >> >> >> > > ________________________________
>> >> >> >> > > From: Yogi Devendra <yogidevendra@apache.org>
>> >> >> >> > > Sent: Saturday, March 19, 2016 12:19:26 AM
>> >> >> >> > > To: dev@apex.incubator.apache.org
>> >> >> >> > > Subject: Re: Stack overflow errors when launching job
>> >> >> >> > >
>> >> >> >> > > The stack trace in the gist shows some symptoms of infinite
>> >> >> >> > > recursion, but I could not figure out the exact cause.
>> >> >> >> > >
>> >> >> >> > > Can you please check your heap dump to see if there are any
>> >> >> >> > > cycles in the object hierarchy?
>> >> >> >> > >
>> >> >> >> > > ~ Yogi
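One caveat on the cycle hypothesis, offered as a hedged aside: plain java.io serialization (which the attached trace goes through) already handles reference cycles with a handle table and back-references, so a cycle alone does not overflow the stack; extreme nesting depth does. A minimal check, with an illustrative class name:

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectOutputStream;
import java.util.ArrayList;
import java.util.List;

public class CycleDemo {

  // Serializes a list that contains itself and reports whether it survived.
  public static boolean cycleSerializesWithoutOverflow() throws IOException {
    List<Object> self = new ArrayList<>();
    self.add(self); // a genuine cycle in the object graph
    ObjectOutputStream oos = new ObjectOutputStream(new ByteArrayOutputStream());
    try {
      // ObjectOutputStream records every object it writes in a handle table;
      // the second time it meets 'self' it emits a back-reference instead of
      // recursing, so the write terminates after one level.
      oos.writeObject(self);
      return true;
    } catch (StackOverflowError e) {
      return false;
    }
  }

  public static void main(String[] args) throws IOException {
    System.out.println("cycle serialized: " + cycleSerializesWithoutOverflow());
  }
}
```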
>> >> >> >> > >
>> >> >> >> > > On 19 March 2016 at 00:36, Ashwin Chandra Putta <
>> >> >> >> > > ashwinchandrap@gmail.com> wrote:
>> >> >> >> > >
>> >> >> >> > > > In the example you posted, do you have any locality
>> >> >> >> > > > constraint applied?
>> >> >> >> > > >
>> >> >> >> > > > From what I see, you have two operators - an HDFS input
>> >> >> >> > > > operator and an HDFS output operator. Each of them has 40
>> >> >> >> > > > partitions and you don't have any other constraints on
>> >> >> >> > > > them. And the partitioner implementation you are using is
>> >> >> >> > > > com.datatorrent.common.partitioner.StatelessPartitioner.
>> >> >> >> > > >
>> >> >> >> > > > Please confirm.
>> >> >> >> > > >
>> >> >> >> > > > Regards,
>> >> >> >> > > > Ashwin.
>> >> >> >> > > >
>> >> >> >> > > > On Thu, Mar 17, 2016 at 5:00 PM, Ganelin, Ilya <
>> >> >> >> > > > Ilya.Ganelin@capitalone.com>
>> >> >> >> > > > wrote:
>> >> >> >> > > >
>> >> >> >> > > > > I’ve updated the gist with a more complete example, and
>> >> >> >> > > > > updated the associated JIRA that I’ve created:
>> >> >> >> > > > > https://issues.apache.org/jira/browse/APEXCORE-392
>> >> >> >> > > > >
>> >> >> >> > > > >
>> >> >> >> > > > >
>> >> >> >> > > > >
>> >> >> >> > > > >
>> >> >> >> > > > > On 3/17/16, 4:33 AM, "Tushar Gosavi" <tushar@datatorrent.com>
>> >> >> >> > > > > wrote:
>> >> >> >> > > > >
>> >> >> >> > > > > >Hi,
>> >> >> >> > > > > >
>> >> >> >> > > > > >I created a sample application with operators from the
>> >> >> >> > > > > >given link, just a simple input and output, and created
>> >> >> >> > > > > >32 partitions of each. Could not reproduce the stack
>> >> >> >> > > > > >overflow issue. Do you have a small sample application
>> >> >> >> > > > > >which could reproduce this issue?
>> >> >> >> > > > > >
>> >> >> >> > > > > >  @Override
>> >> >> >> > > > > >  public void populateDAG(DAG dag, Configuration configuration)
>> >> >> >> > > > > >  {
>> >> >> >> > > > > >    NewlineFileInputOperator in = dag.addOperator("Input",
>> >> >> >> > > > > >        new NewlineFileInputOperator());
>> >> >> >> > > > > >    in.setDirectory("/user/tushar/data");
>> >> >> >> > > > > >    in.setPartitionCount(32);
>> >> >> >> > > > > >
>> >> >> >> > > > > >    HdfsFileOutputOperator out = dag.addOperator("Output",
>> >> >> >> > > > > >        new HdfsFileOutputOperator());
>> >> >> >> > > > > >    out.setFilePath("/user/tushar/outdata");
>> >> >> >> > > > > >    dag.getMeta(out).getAttributes().put(Context.OperatorContext.PARTITIONER,
>> >> >> >> > > > > >        new StatelessPartitioner<HdfsFileOutputOperator>(32));
>> >> >> >> > > > > >
>> >> >> >> > > > > >    dag.addStream("s1", in.output, out.input);
>> >> >> >> > > > > >  }
>> >> >> >> > > > > >
>> >> >> >> > > > > >-Tushar.
>> >> >> >> > > > > >
>> >> >> >> > > > > >
>> >> >> >> > > > > >
>> >> >> >> > > > > >On Thu, Mar 17, 2016 at 12:30 AM, Ganelin, Ilya <
>> >> >> >> > > > > >Ilya.Ganelin@capitalone.com> wrote:
>> >> >> >> > > > > >
>> >> >> >> > > > > >> Hi guys – I’m running into a very frustrating issue
>> >> >> >> > > > > >> where certain DAG configurations cause the following
>> >> >> >> > > > > >> error log (attached). When this happens, my application
>> >> >> >> > > > > >> even fails to launch. This does not seem to be a YARN
>> >> >> >> > > > > >> issue since it occurs even with a relatively small
>> >> >> >> > > > > >> number of partitions/memory.
>> >> >> >> > > > > >>
>> >> >> >> > > > > >> I’ve attached the input and output operators in question:
>> >> >> >> > > > > >> https://gist.github.com/ilganeli/7f770374113b40ffa18a
>> >> >> >> > > > > >>
>> >> >> >> > > > > >> I can get this to occur predictably by:
>> >> >> >> > > > > >>
>> >> >> >> > > > > >>   1.  Increasing the partition count on my input
>> >> >> >> > > > > >> operator (reads from HDFS) - values above 20 cause
>> >> >> >> > > > > >> this error
>> >> >> >> > > > > >>   2.  Increasing the partition count on my output
>> >> >> >> > > > > >> operator (writes to HDFS) - values above 20 cause
>> >> >> >> > > > > >> this error
>> >> >> >> > > > > >>   3.  Setting stream locality from the default to
>> >> >> >> > > > > >> either thread local, node local, or container_local
>> >> >> >> > > > > >> on the output operator
>> >> >> >> > > > > >>
>> >> >> >> > > > > >> This behavior is very frustrating as it’s preventing
>> >> >> >> > > > > >> me from partitioning my HDFS I/O appropriately and
>> >> >> >> > > > > >> thus from scaling to higher throughputs.
>> >> >> >> > > > > >>
>> >> >> >> > > > > >> Do you have any thoughts on what’s going wrong? I
>> >> >> >> > > > > >> would love your feedback.
>> >> >> >> > > > > >>
>> >> >> >> > > > > >> ________________________________________________________
>> >> >> >> > > > > >>
>> >> >> >> > > > > >> The information contained in this e-mail is confidential
>> >> >> >> > > > > >> and/or proprietary to Capital One and/or its affiliates
>> >> >> >> > > > > >> and may only be used solely in performance of work or
>> >> >> >> > > > > >> services for Capital One. The information transmitted
>> >> >> >> > > > > >> herewith is intended only for use by the individual or
>> >> >> >> > > > > >> entity to which it is addressed. If the reader of this
>> >> >> >> > > > > >> message is not the intended recipient, you are hereby
>> >> >> >> > > > > >> notified that any review, retransmission, dissemination,
>> >> >> >> > > > > >> distribution, copying or other use of, or taking of any
>> >> >> >> > > > > >> action in reliance upon this information is strictly
>> >> >> >> > > > > >> prohibited. If you have received this communication in
>> >> >> >> > > > > >> error, please contact the sender and delete the material
>> >> >> >> > > > > >> from your computer.
>> >> >> >> > > >
>> >> >> >> > > >
>> >> >> >> > > >
>> >> >> >> > > > --
>> >> >> >> > > >
>> >> >> >> > > > Regards,
>> >> >> >> > > > Ashwin.
>> >> >> >> > > >