apex-dev mailing list archives

From "Kottapalli, Venkatesh" <VKottapa...@DIRECTV.com>
Subject RE: Reg container getting killed without throwing exceptions
Date Thu, 21 Jan 2016 08:21:49 GMT
Hi,
	
	Thanks for your support. 
	
	The issue was caused by the business processing logic that we had put in place: the operator was
not able to process the tuples. I implemented partitioning on the operator to distribute
the load, and that resolved the issue.

	As Ashwin suggested, the root cause appears to be that all the data is getting processed in the same
window, although the code is not supposed to do that. I will get back with questions on this
after my analysis.

Regards,
Venkatesh.
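
For readers of the archive, here is a minimal sketch of the idea behind partitioning an operator to distribute load: each tuple is routed to one of several partitions by its key, so no single instance has to process everything. This is not the Apex partitioning API; the class `KeyPartitioner` and its methods are hypothetical names used only for illustration, and hashing on a string key is an assumption about the data.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration only; Apex's own partitioning API is not used here.
class KeyPartitioner {
    // Maps a key to a partition index in [0, partitionCount).
    // floorMod keeps the index non-negative even for negative hash codes.
    static int partitionFor(String key, int partitionCount) {
        return Math.floorMod(key.hashCode(), partitionCount);
    }

    // Splits the incoming keys into per-partition lists so that each
    // partition only handles its own share of the load.
    static List<List<String>> distribute(List<String> keys, int partitionCount) {
        List<List<String>> partitions = new ArrayList<>();
        for (int i = 0; i < partitionCount; i++) {
            partitions.add(new ArrayList<>());
        }
        for (String key : keys) {
            partitions.get(partitionFor(key, partitionCount)).add(key);
        }
        return partitions;
    }
}
```

The same key always lands in the same partition, which matters if the operator groups tuples by key before emitting.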

-----Original Message-----
From: Thomas Weise [mailto:thomas@datatorrent.com] 
Sent: Wednesday, January 20, 2016 5:01 PM
To: dev@apex.incubator.apache.org
Subject: Re: Reg container getting killed without throwing exceptions

Venkatesh,

Can you please check the AM log for messages containing "heartbeat timeout"?

That would be a condition under which the container gets killed and where you won't find any
exceptions or messages in the container log.

Thanks,
Thomas


On Wed, Jan 20, 2016 at 4:44 PM, Ashwin Chandra Putta < ashwinchandrap@gmail.com> wrote:

> Venkatesh,
>
> I suspect the code in end window might be blocking. Can you
> try putting a log statement at the first line of end window?
>
> If you see that this log statement in end window is not printed, then it is
> possible that one of the process tuple calls is blocked
> although the previous tuples have been processed. You might want to
> put one log statement at the start of process tuple and one right before
> it returns, just to ensure that each process tuple call completes.
>
> If you are implementing IdleTimeHandler, then the handleIdleTime
> method call behaves similarly to processTuple.
>
> Just to give a little more detail on how it works: all the operator
> method calls happen in the same thread. For every window,
> beginWindow is called, followed by a loop of processTuple calls for all
> tuples within the window, followed by an endWindow call. In your case,
> it seems that beginWindow is called, and you also see a series of
> processTuple calls. We are not sure whether all the processTuple calls
> complete and whether endWindow is reached.
>
> Regards,
> Ashwin.
>
> On Wed, Jan 20, 2016 at 4:32 PM, Kottapalli, Venkatesh < 
> VKottapalli@directv.com> wrote:
>
> > Yes Ashwin, the window id isn’t moving forward; the current window
> > id for the operator is "-". In the operator, I see the processing
> > part in "processTuple" completing for incoming tuples, but
> > "endWindow" is not being called.
> >
> > -----Original Message-----
> > From: Ashwin Chandra Putta [mailto:ashwinchandrap@gmail.com]
> > Sent: Wednesday, January 20, 2016 3:57 PM
> > To: dev@apex.incubator.apache.org
> > Subject: Re: Reg container getting killed without throwing 
> > exceptions
> >
> > Venkatesh,
> >
> > If you do not see the window id moving forward, it usually means 
> > that the business logic is blocking the operator. Please check if 
> > window id is moving forward.
> >
> > Regards,
> > Ashwin.
> >
> > On Wed, Jan 20, 2016 at 3:37 PM, Gaurav Gupta 
> > <gaurav@datatorrent.com>
> > wrote:
> >
> > > Venkatesh,
> > >
> > > I think the actual issue is that the operator is getting blocked: you
> > > mentioned that it is taking too long to process and is
> > > not showing any processed or emitted tuples. The AM is not getting
> > > any heartbeat from the operator, so it kills it.
> > >
> > > Thanks
> > > - Gaurav
> > >
> > > > On Jan 20, 2016, at 3:17 PM, Kottapalli, Venkatesh <VKottapalli@DIRECTV.com> wrote:
> > > >
> > > > Thanks for your inputs Gaurav and Tim.
> > > >
> > > > When it is an OOM, I see it in the container logs, but in this
> > > > case I don’t find any.
> > > >
> > > > I see the processing part in the operator running and printing
> > > > logs without any issues end to end, but not reaching the end window.
> > > > It might be that the grouping logic we have added in the end window
> > > > is causing an OOM, but the container logs don’t show it.
> > > >
> > > > The operator is taking a long time to process. The total processed
> > > > and emitted count for that operator is always 0.
> > > >
> > > > I shall try to increase memory on the Application Master and the
> > > > container as well and see if it works. Otherwise, I will try a smaller
> > > > load and see whether it is a scaling issue caused by OOM.
> > > >
> > > > Right now, I don’t have access to the AM logs.
> > > >
> > > >
> > > > Regards,
> > > > Venkatesh.
> > > >
> > > > -----Original Message-----
> > > > From: Timothy Farkas [mailto:tim@datatorrent.com]
> > > > Sent: Wednesday, January 20, 2016 3:11 PM
> > > > To: dev@apex.incubator.apache.org
> > > > Subject: Re: Reg container getting killed without throwing 
> > > > exceptions
> > > >
> > > > Hey Venkatesh,
> > > >
> > > > How much memory is allocated to the App Master? You should allocate
> > > > at least 2GB to the App Master with this property.
> > > >
> > > >
> > > >  <property>
> > > >    <name>dt.attr.MASTER_MEMORY_MB</name>
> > > >    <value>2048</value>
> > > >  </property>
> > > >
> > > > Otherwise the App Master may die suddenly without printing anything
> > > > to the logs.
> > > >
> > > > Thanks,
> > > > Tim
> > > >
> > > > On Wed, Jan 20, 2016 at 2:47 PM, Gaurav Gupta 
> > > > <gaurav@datatorrent.com>
> > > > wrote:
> > > >
> > > >> Venkatesh,
> > > >>
> > > >> Did you see any OOM exception? It would be good to see the AM 
> > > >> logs and container logs to find out more.
> > > >>
> > > >> Thanks
> > > >> - Gaurav
> > > >>
> > > > >>> On Jan 20, 2016, at 2:42 PM, Kottapalli, Venkatesh <VKottapalli@directv.com> wrote:
> > > >>>
> > > >>> Hi,
> > > >>>
> > > >>>               I get the following message when the container 
> > > >>> is getting
> > > >> killed. I don't find logs for any exceptions being thrown. How 
> > > >> do we identify the root cause for this issue?
> > > >>> Sorry for being very abstract.
> > > >>>
> > > >>> Container killed by the ApplicationMaster.
> > > > >>> Container killed on request. Exit code is 143
> > > > >>> Container exited with a non-zero exit code 143
> > > >>>
> > > >>> Regards,
> > > >>> Venkatesh.
> > > >>
> > > >>
> > >
> > >
> >
> >
> > --
> >
> > Regards,
> > Ashwin.
> >
>
>
>
> --
>
> Regards,
> Ashwin.
>
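
The call order Ashwin describes earlier in the thread, together with the log statements he suggests, can be sketched as follows. This is a self-contained stand-in and does not use the Apex operator API: `LoggingOperator`, `WindowLoop`, and the in-memory `log` list are hypothetical names chosen for illustration, and a real operator would use its normal logger instead of a list.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-in for an operator; it does not use the Apex API.
class LoggingOperator {
    final List<String> log = new ArrayList<>();
    private long windowId;

    void beginWindow(long windowId) {
        this.windowId = windowId;
        log.add("beginWindow " + windowId);
    }

    void processTuple(String tuple) {
        log.add("processTuple start " + tuple);
        // Business logic would go here. If it blocks, the next line is
        // never logged, which pinpoints the stuck call.
        log.add("processTuple end " + tuple);
    }

    void endWindow() {
        // First line of endWindow: if this never appears, endWindow was not reached.
        log.add("endWindow " + windowId);
        // Grouping or emit logic would follow here.
    }
}

class WindowLoop {
    // Drives one window on the calling thread: beginWindow, then one
    // processTuple call per tuple, then endWindow.
    static List<String> runWindow(LoggingOperator op, long windowId, List<String> tuples) {
        op.beginWindow(windowId);
        for (String tuple : tuples) {
            op.processTuple(tuple);
        }
        op.endWindow();
        return op.log;
    }

    public static void main(String[] args) {
        LoggingOperator op = new LoggingOperator();
        for (String line : runWindow(op, 1L, Arrays.asList("a", "b"))) {
            System.out.println(line);
        }
    }
}
```

Because everything runs on one thread, a "processTuple start" entry without a matching "processTuple end" means that call is blocked, and a complete set of pairs with no "endWindow" entry points at whatever runs between the last tuple and the end of the window.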