apex-dev mailing list archives

From "Ganelin, Ilya" <Ilya.Gane...@capitalone.com>
Subject RE: Stack overflow errors when launching job
Date Tue, 22 Mar 2016 03:21:02 GMT
Hi, Chandni - we are presently dealing with some environment woes due to HDFS issues and amusingly
enough I can no longer reproduce this problem. I suspect that this might have been a symptom
of deeper cluster issues. If I am able to again reproduce it consistently, I'll let you know,
and, now that I know how to provide complete stack logs, I'll be able to provide those as
well.



Sent with Good (www.good.com)
________________________________
From: Chandni Singh <chandni@datatorrent.com>
Sent: Monday, March 21, 2016 7:29:27 PM
To: dev@apex.incubator.apache.org
Subject: Re: Stack overflow errors when launching job

Hi Ilya,

Are you available at 2 pm tomorrow for webex?

Chandni

On Mon, Mar 21, 2016 at 2:53 PM, Chandni Singh <chandni@datatorrent.com>
wrote:

> Ilya,
>
> I have launched the application on our Yarn cluster and I don't see this
> happening.
>
> Chandni
>
> On Sun, Mar 20, 2016 at 9:43 PM, Ganelin, Ilya <
> Ilya.Ganelin@capitalone.com> wrote:
>
>> Sure thing. If you guys have time tomorrow I can hop on a WebEx.
>>
>>
>>
>> Sent with Good (www.good.com)
>> ________________________________
>> From: Amol Kekre <amol@datatorrent.com>
>> Sent: Sunday, March 20, 2016 12:54:22 PM
>> To: dev@apex.incubator.apache.org
>> Subject: Re: Stack overflow errors when launching job
>>
>> Can we get on a webex to take a look?
>>
>> thks
>> Amol
>>
>>
>> On Sat, Mar 19, 2016 at 7:27 PM, Ganelin, Ilya <
>> Ilya.Ganelin@capitalone.com>
>> wrote:
>>
>> > I don't think I have any time really to connect to the container. The
>> > application launches and crashes almost immediately. Total runtime is 50
>> > seconds.
>> >
>> >
>> >
>> > Sent with Good (www.good.com)
>> > ________________________________
>> > From: Munagala Ramanath <ram@datatorrent.com>
>> > Sent: Saturday, March 19, 2016 5:39:11 PM
>> > To: dev@apex.incubator.apache.org
>> > Subject: Re: Stack overflow errors when launching job
>> >
>> > There is some info here, near the end of the page:
>> >
>> > http://docs.datatorrent.com/troubleshooting/
>> >
>> > under the heading "How do I get a heap dump when a container gets an
>> > OutOfMemoryError ?"
>> >
>> > However since you're blowing the stack, you may need to manually run jmap
>> > on the running container
>> > which may be difficult if it doesn't stay up for very long. There is a way
>> > to dump the heap programmatically
>> > as described, for instance, here:
>> >
>> > https://blogs.oracle.com/sundararajan/entry/programmatically_dumping_heap_from_java
>> >
>> > Ram
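
A minimal sketch of the programmatic heap dump Ram refers to, assuming a HotSpot-based JVM where com.sun.management.HotSpotDiagnosticMXBean is available. The class name HeapDumper and the temp-file naming are hypothetical and not part of Apex; where such a call would be hooked into a container (for example, from a catch block around the failing code) is left open here.

```java
import com.sun.management.HotSpotDiagnosticMXBean;
import java.io.File;
import java.io.IOException;
import java.lang.management.ManagementFactory;

// Hypothetical helper, not part of Apex: dumps the heap of the JVM it runs in.
public class HeapDumper {

    /**
     * Writes a heap dump of the current JVM to the given path.
     * The target file must not already exist; recent JDKs also expect
     * the file name to end in ".hprof".
     */
    public static void dumpHeap(String path, boolean liveObjectsOnly) throws IOException {
        HotSpotDiagnosticMXBean bean =
            ManagementFactory.getPlatformMXBean(HotSpotDiagnosticMXBean.class);
        bean.dumpHeap(path, liveObjectsOnly);
    }

    public static void main(String[] args) throws IOException {
        // Reserve a unique name, then delete the file because dumpHeap will not overwrite.
        File out = File.createTempFile("container-heap", ".hprof");
        out.delete();
        dumpHeap(out.getAbsolutePath(), true);
        System.out.println("Heap dump written to " + out.getAbsolutePath());
    }
}
```

The resulting .hprof file can be opened with the usual heap analysis tools. Unlike running jmap by hand, this approach does not depend on the container staying up long enough to attach to it.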
>> >
>> > On Sat, Mar 19, 2016 at 2:07 PM, Ganelin, Ilya <
>> > Ilya.Ganelin@capitalone.com>
>> > wrote:
>> >
>> > > How would we go about getting a heap dump?
>> > >
>> > >
>> > >
>> > > Sent with Good (www.good.com)
>> > > ________________________________
>> > > From: Yogi Devendra <yogidevendra@apache.org>
>> > > Sent: Saturday, March 19, 2016 12:19:26 AM
>> > > To: dev@apex.incubator.apache.org
>> > > Subject: Re: Stack overflow errors when launching job
>> > >
>> > > Stack trace in the gist shows some symptoms of infinite recursion.
>> > > But, I could not figure out exact cause for it.
>> > >
>> > > Can you please check your heap dump to see if there are any cycles in the
>> > > object hierarchy?
>> > >
>> > > ~ Yogi
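
One way to confirm the infinite-recursion symptom Yogi describes is to count how often each frame repeats in the captured stack trace. The sketch below is illustrative only: RecursionProbe, frameCounts, hottestFrame, and the deliberately runaway recurse method are hypothetical names, and nothing here is taken from the gist or from Apex.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical diagnostic helper: finds the frame that dominates a stack trace.
public class RecursionProbe {

    /** Counts how often each "class.method" frame appears in the throwable's stack trace. */
    public static Map<String, Integer> frameCounts(Throwable t) {
        Map<String, Integer> counts = new LinkedHashMap<>();
        for (StackTraceElement e : t.getStackTrace()) {
            counts.merge(e.getClassName() + "." + e.getMethodName(), 1, Integer::sum);
        }
        return counts;
    }

    /** Returns the most frequent frame, or null if the stack trace is empty. */
    public static String hottestFrame(Throwable t) {
        String best = null;
        int bestCount = 0;
        for (Map.Entry<String, Integer> entry : frameCounts(t).entrySet()) {
            if (entry.getValue() > bestCount) {
                best = entry.getKey();
                bestCount = entry.getValue();
            }
        }
        return best;
    }

    /** Deliberately unbounded recursion, used only to provoke a StackOverflowError. */
    public static int recurse(int n) {
        return recurse(n + 1) + 1;
    }

    public static void main(String[] args) {
        try {
            recurse(0);
        } catch (StackOverflowError e) {
            // The frame that repeats most often is the likely recursion site.
            System.out.println("Most frequent frame: " + hottestFrame(e));
        }
    }
}
```

A small group of frames that repeats for nearly the whole trace points at the recursive call chain, which narrows down where to look for a cycle in the object graph.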
>> > >
>> > > On 19 March 2016 at 00:36, Ashwin Chandra Putta <ashwinchandrap@gmail.com>
>> > > wrote:
>> > >
>> > > > In the example you posted, do you have any locality constraint applied?
>> > > >
>> > > > From what I see, you have two operators: an HDFS input operator and an
>> > > > HDFS output operator. Each of them has 40 partitions, and you don't have
>> > > > any other constraints on them. The partitioner implementation you are
>> > > > using is com.datatorrent.common.partitioner.StatelessPartitioner.
>> > > >
>> > > > Please confirm.
>> > > >
>> > > > Regards,
>> > > > Ashwin.
>> > > >
>> > > > On Thu, Mar 17, 2016 at 5:00 PM, Ganelin, Ilya <
>> > > > Ilya.Ganelin@capitalone.com>
>> > > > wrote:
>> > > >
>> > > > > I’ve updated the gist with a more complete example, and updated the
>> > > > > associated JIRA that I’ve created.
>> > > > > https://issues.apache.org/jira/browse/APEXCORE-392
>> > > > >
>> > > > >
>> > > > >
>> > > > >
>> > > > >
>> > > > > On 3/17/16, 4:33 AM, "Tushar Gosavi" <tushar@datatorrent.com> wrote:
>> > > > >
>> > > > > >Hi,
>> > > > >
>> > > > > >
>> > > > > >I created a sample application with operators from the given link, just a
>> > > > > >simple input and output, and created 32 partitions of each. I could not
>> > > > > >reproduce the stack overflow issue. Do you have a small sample application
>> > > > > >which could reproduce this issue?
>> > > > > >
>> > > > > >  @Override
>> > > > > >  public void populateDAG(DAG dag, Configuration configuration)
>> > > > > >  {
>> > > > > >    NewlineFileInputOperator in = dag.addOperator("Input", new NewlineFileInputOperator());
>> > > > > >    in.setDirectory("/user/tushar/data");
>> > > > > >    in.setPartitionCount(32);
>> > > > > >
>> > > > > >    HdfsFileOutputOperator out = dag.addOperator("Output", new HdfsFileOutputOperator());
>> > > > > >    out.setFilePath("/user/tushar/outdata");
>> > > > > >
>> > > > > >    dag.getMeta(out).getAttributes().put(Context.OperatorContext.PARTITIONER,
>> > > > > >        new StatelessPartitioner<HdfsFileOutputOperator>(32));
>> > > > > >
>> > > > > >    dag.addStream("s1", in.output, out.input);
>> > > > > >  }
>> > > > > >
>> > > > > >-Tushar.
>> > > > > >
>> > > > > >
>> > > > > >
>> > > > > >On Thu, Mar 17, 2016 at 12:30 AM, Ganelin, Ilya <Ilya.Ganelin@capitalone.com>
>> > > > > >wrote:
>> > > > > >
>> > > > > >> Hi guys – I’m running into a very frustrating issue where certain DAG
>> > > > > >> configurations cause the following error log (attached). When this happens,
>> > > > > >> my application fails even to launch. This does not seem to be a YARN issue,
>> > > > > >> since it occurs even with a relatively small number of partitions/memory.
>> > > > > >>
>> > > > > >> I’ve attached the input and output operators in question:
>> > > > > >> https://gist.github.com/ilganeli/7f770374113b40ffa18a
>> > > > > >>
>> > > > > >> I can get this to occur predictably by:
>> > > > > >>
>> > > > > >>   1.  Increasing the partition count on my input operator (reads from
>> > > > > >>       HDFS) - values above 20 cause this error
>> > > > > >>   2.  Increasing the partition count on my output operator (writes to
>> > > > > >>       HDFS) - values above 20 cause this error
>> > > > > >>   3.  Setting stream locality from the default to either thread-local,
>> > > > > >>       node-local, or container-local on the output operator
>> > > > > >>
>> > > > > >> This behavior is very frustrating, as it’s preventing me from partitioning
>> > > > > >> my HDFS I/O appropriately, which I need in order to scale to higher
>> > > > > >> throughputs.
>> > > > > >>
>> > > > > >> Do you have any thoughts on what’s going wrong? I would love your feedback.
>> > > > > >> ________________________________________________________
>> > > > > >>
>> > > > > >> The information contained in this e-mail is confidential and/or
>> > > > > >> proprietary to Capital One and/or its affiliates and may only be used
>> > > > > >> solely in performance of work or services for Capital One. The information
>> > > > > >> transmitted herewith is intended only for use by the individual or entity
>> > > > > >> to which it is addressed. If the reader of this message is not the intended
>> > > > > >> recipient, you are hereby notified that any review, retransmission,
>> > > > > >> dissemination, distribution, copying or other use of, or taking of any
>> > > > > >> action in reliance upon this information is strictly prohibited. If you
>> > > > > >> have received this communication in error, please contact the sender and
>> > > > > >> delete the material from your computer.
>> > > > > >>
>> > > > >
>> > > >
>> > > >
>> > > >
>> > > > --
>> > > >
>> > > > Regards,
>> > > > Ashwin.
>> > > >
>> > >
>> >
>>
>
>
