apex-users mailing list archives

From Sandesh Hegde <sand...@datatorrent.com>
Subject Re: Blocked operator PTOperator
Date Tue, 28 Feb 2017 21:09:19 GMT
Can you please attach the stacktrace of the operator?

You can increase the TIMEOUT_WINDOW_COUNT attribute; the AppMaster uses it
to decide when to kill a blocked operator.
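
As a rough sketch of how that attribute translates into wall-clock time: the
timeout should be approximately TIMEOUT_WINDOW_COUNT multiplied by the
streaming window size. The values 120 windows and 500 ms used below are
assumed defaults recalled from the Apex documentation, not taken from this
thread (they do roughly agree with the "time 61905ms" in the log); the class
and method names are hypothetical helpers, not part of Apex.

```java
// Hypothetical helper, not part of Apex: estimates the blocked-operator timeout.
public class BlockedOperatorTimeout {

    // Approximate time before the AppMaster treats an operator as blocked.
    static long timeoutMillis(int timeoutWindowCount, int streamingWindowSizeMillis) {
        return (long) timeoutWindowCount * streamingWindowSizeMillis;
    }

    // Smallest window count that covers the desired timeout (rounds up).
    static int windowCountFor(long desiredTimeoutMillis, int streamingWindowSizeMillis) {
        return (int) ((desiredTimeoutMillis + streamingWindowSizeMillis - 1)
                / streamingWindowSizeMillis);
    }

    public static void main(String[] args) {
        // Assumed defaults: 120 windows of 500 ms each.
        System.out.println(timeoutMillis(120, 500));      // 60000
        // Window count needed to tolerate a 5 minute stall.
        System.out.println(windowCountFor(300_000, 500)); // 600

        // In an application this value would then be set on the operator,
        // for example (from memory of the Apex API, please verify):
        //   dag.setAttribute(op, Context.OperatorContext.TIMEOUT_WINDOW_COUNT, 600);
    }
}
```

Raising the count only hides the stall from the AppMaster; if a callback is
genuinely blocked, the stack trace is still needed to find the cause.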

To take a stack trace, follow the instructions in this blog post:
https://www.datatorrent.com/blog/getting-stack-traces-apache-apex-applications/

On Tue, Feb 28, 2017 at 12:59 PM Sunil Parmar <sparmar@threatmetrix.com>
wrote:

> Ashwin,
> I don't see any such warning. I'll PM you the entire log file.
>
> On 2017-02-28 12:16 (-0800), Ashwin Chandra Putta <
> ashwinchandrap@gmail.com> wrote:
> > Sunil,
> > This might be related to checkpointing. See:
> >
> > https://github.com/apache/apex-core/blob/master/engine/src/main/java/com/datatorrent/stram/StreamingContainerManager.java#L2211-L2217
> >
> > Also check this piece of code:
> >
> > https://github.com/apache/apex-core/blob/master/engine/src/main/java/com/datatorrent/stram/StreamingContainerManager.java#L2031-L2044
> >
> > Can you paste the output of the warning from the code above, which starts
> > with 'Marking operator '?
> >
> > Regards,
> > Ashwin.
> >
> > On Tue, Feb 28, 2017 at 12:03 PM, Sunil Parmar <sparmar@threatmetrix.com>
> > wrote:
> >
> > > That doesn't seem to be the case. We do see the window id moving in
> > > the UI as well.
> > >
> > > On 2017-02-28 11:19 (-0800), Munagala Ramanath <ram@datatorrent.com>
> > > wrote:
> > > > It likely means that the operator is taking too long to return from
> > > > one of the callbacks like beginWindow(), endWindow(),
> > > > emitTuples(), etc. Do you have any potentially blocking calls to
> > > > external systems in any of those callbacks?
> > > >
> > > > Ram
> > > >
> > > > On Tue, Feb 28, 2017 at 11:09 AM, Sunil Parmar <sparmar@threatmetrix.com>
> > > > wrote:
> > > >
> > > > > 2017-02-27 19:43:21,926 INFO com.datatorrent.stram.StreamingContainerManager:
> > > > > Blocked operator PTOperator[id=3,name=eventUpdatesFormatter] container
> > > > > PTContainer[id=1(container_1487310232732_0027_02_000111),state=ACTIVE]
> > > > > time 61905ms
> > > > > 2017-02-27 19:43:22,928 INFO com.datatorrent.stram.StreamingAppMasterService:
> > > > > Completed containerId=container_1487310232732_0027_02_000111,
> > > > > state=COMPLETE, exitStatus=-105, diagnostics=Container killed by the
> > > > > ApplicationMaster.
> > > > > Container killed on request. Exit code is 143
> > > > > Container exited with a non-zero exit code 143
> > > > >
> > > > >
> > > > > Can anyone help us understand this error? We see that one of the
> > > > > operators keeps restarting its container; the above error is from
> > > > > the AppMaster log.
> > > > >
> > > > > Thanks,
> > > > > Sunil
> > > > >
> > > >
> > > >
> > > >
> > > > --
> > > >
> > > > _______________________________________________________
> > > >
> > > > Munagala V. Ramanath
> > > >
> > > > Software Engineer
> > > >
> > > > E: ram@datatorrent.com | M: (408) 331-5034 | Twitter: @UnknownRam
> > > >
> > > > www.datatorrent.com  |  apex.apache.org
> > > >
> > >
> >
> >
> >
> > --
> >
> > Regards,
> > Ashwin.
> >
>
