Return-Path: X-Original-To: apmail-airavata-dev-archive@www.apache.org Delivered-To: apmail-airavata-dev-archive@www.apache.org Received: from mail.apache.org (hermes.apache.org [140.211.11.3]) by minotaur.apache.org (Postfix) with SMTP id A90D0114FE for ; Wed, 23 Apr 2014 13:35:32 +0000 (UTC) Received: (qmail 97763 invoked by uid 500); 23 Apr 2014 13:35:20 -0000 Delivered-To: apmail-airavata-dev-archive@airavata.apache.org Received: (qmail 97726 invoked by uid 500); 23 Apr 2014 13:35:18 -0000 Mailing-List: contact dev-help@airavata.apache.org; run by ezmlm Precedence: bulk List-Help: List-Unsubscribe: List-Post: List-Id: Reply-To: dev@airavata.apache.org Delivered-To: mailing list dev@airavata.apache.org Received: (qmail 97596 invoked by uid 99); 23 Apr 2014 13:35:11 -0000 Received: from nike.apache.org (HELO nike.apache.org) (192.87.106.230) by apache.org (qpsmtpd/0.29) with ESMTP; Wed, 23 Apr 2014 13:35:11 +0000 X-ASF-Spam-Status: No, hits=1.5 required=5.0 tests=HTML_MESSAGE,RCVD_IN_DNSWL_LOW,SPF_PASS X-Spam-Check-By: apache.org Received-SPF: pass (nike.apache.org: domain of glahiru@gmail.com designates 209.85.212.179 as permitted sender) Received: from [209.85.212.179] (HELO mail-wi0-f179.google.com) (209.85.212.179) by apache.org (qpsmtpd/0.29) with ESMTP; Wed, 23 Apr 2014 13:35:07 +0000 Received: by mail-wi0-f179.google.com with SMTP id z2so1148924wiv.6 for ; Wed, 23 Apr 2014 06:34:45 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113; h=mime-version:in-reply-to:references:date:message-id:subject:from:to :content-type; bh=z4xQiB8rHr/x2QeA7yiCyPwOjlOO6MWzoo6mR8WIk+I=; b=Ognqmn/GioJWzfH4ea2DnrFeAl6EpVrz65HjiIy5ezRmDyjUo3iQf9hIq7y/qKN1mu Yr6HdRHg3IEb70VwFoyYEcqrf5wsxYXOywl5vv42G+yFooe3wsrkvBwxeR8Yr0YE+eQg c94Hcy58V+m+ybbsiV430ho8rJn4G1gSY9BxeBFV/g3vlr4tinwkugl51ilReu3MqA/A RdpRFkzoEnCxvx5v/PICCHU5xLRSt2yTLCOcCEnECe1Nr+1f8VQGVlxN2nybZbkDIAAu MinhafZwc1082c6x/Vk57DTVlMgIbAHmCENnPKroWnA2WqD4JBSZzk6nFQbybJEdkZYf grBg== 
MIME-Version: 1.0
X-Received: by 10.194.186.140 with SMTP id fk12mr12774485wjc.47.1398260085122; Wed, 23 Apr 2014 06:34:45 -0700 (PDT)
Received: by 10.216.191.136 with HTTP; Wed, 23 Apr 2014 06:34:45 -0700 (PDT)
In-Reply-To:
References:
Date: Wed, 23 Apr 2014 09:34:45 -0400
Message-ID:
Subject: Re: Update on BES Provider implementation
From: Lahiru Gunathilake
To: dev
Content-Type: multipart/alternative; boundary=047d7bea4518ad6b6c04f7b5cba8
X-Virus-Checked: Checked by ClamAV on apache.org

--047d7bea4518ad6b6c04f7b5cba8
Content-Type: text/plain; charset=UTF-8

Hi Shahbaz,

I had a look at the code and I think the actual error is not an NPE in the monitoring logic itself. Rather, inside the catch clause we get an NPE because currentMonitorID is null. If you change the code as follows and run again, we will get some meaningful information. I can see you have followed the same implementation as QstatMonitor; I will change the code in QstatMonitor too.

            else if (!this.queue.contains(take)) {
                // we put the job back to the queue only if its state is not unknown
                if (currentMonitorID == null) {
                    // no job was selected before the failure, so only report the user and host
                    logger.error("Monitoring the jobs failed, for user: " + take.getUserName()
                            + " in Host: " + currentHostDescription.getType().getHostAddress());
                } else if (currentMonitorID.getFailedCount() < 2) {
                    try {
                        currentMonitorID.setFailedCount(currentMonitorID.getFailedCount() + 1);
                        this.queue.put(take);
                    } catch (InterruptedException e1) {
                        e1.printStackTrace();
                    }
                } else {
                    logger.error(e.getMessage());
                    logger.error("Tried to monitor the job 3 times, so dropping the job with ID: " + currentMonitorID.getJobID());
                }
            }
            throw new AiravataMonitorException("Error retrieving the job status", e);
        }

Thanks
Lahiru

On Wed, Apr 23, 2014 at 9:18 AM, Shahbaz Memon wrote:

> Thanks Lahiru.
>
> airavata.log -> https://gigamove.rz.rwth-aachen.de/d/id/3pxEa6Ksf9Vf39
>
> Cheers,
>
> Shahbaz
>
>
> On Wed, Apr 23, 2014 at 3:07 PM, Lahiru Gunathilake wrote:
>
>> Hi Shahbaz,
>>
>> Are you seeing any logs on the server?
>>
>> Regards
>> Lahiru
>>
>>
>> On Wed, Apr 23, 2014 at 9:00 AM, Shahbaz Memon wrote:
>>
>>> Hi all,
>>>
>>> I am facing an issue while testing the BES pull monitor implementation.
>>>
>>> Before stating my issue, let me describe the current state of the implementation.
>>>
>>> For the BES extension I have forked the GitHub repository under the following URL:
>>>
>>> https://github.com/msmemon/airavata
>>>
>>> In the forked sources most of the classes are untouched, apart from a couple of modifications and additions. I have also modified the project POMs with multiple dependency exclusions to avoid class loading horrors.
>>>
>>> There is a partially tested implementation available with input/output handlers, provider, and monitor classes.
>>>
>>> For monitoring (which is where I am facing the issue), I have written a pull monitor that is very similar to the QStat one; the only difference is the connection object, which contains a different credential and proxy client instance suitable for BES-supported endpoints.
>>>
>>> Now my issue is:
>>>
>>> During the job submission process, the input handler and provider are properly invoked, and after that BESPullJobMonitor [1] throws an NPE, so my workflow does not reach the final phase of output handler invocation and completion.
>>>
>>> java.lang.NullPointerException
>>>         at org.apache.airavata.job.monitor.impl.pull.bes.BESPullJobMonitor.startPulling(BESPullJobMonitor.java:173)
>>>         at org.apache.airavata.job.monitor.impl.pull.bes.BESPullJobMonitor.run(BESPullJobMonitor.java:60)
>>>         at java.lang.Thread.run(Thread.java:744)
>>>
>>> Maybe I am not following the new monitoring extensions correctly. Any feedback is more than welcome.
>>>
>>> [1] https://github.com/msmemon/airavata/blob/master/tools/job-monitor/src/main/java/org/apache/airavata/job/monitor/impl/pull/bes/BESPullJobMonitor.java
>>>
>>> Thanks in advance,
>>>
>>> Shahbaz
>>>
>>>
>>> ------------------------------------------------------------------------------------------------
>>> ------------------------------------------------------------------------------------------------
>>> Forschungszentrum Juelich GmbH
>>> 52425 Juelich
>>> Sitz der Gesellschaft: Juelich
>>> Eingetragen im Handelsregister des Amtsgerichts Dueren Nr. HR B 3498
>>> Vorsitzender des Aufsichtsrats: MinDir Dr. Karl Eugen Huthmacher
>>> Geschaeftsfuehrung: Prof. Dr. Achim Bachem (Vorsitzender),
>>> Karsten Beneke (stellv. Vorsitzender), Prof. Dr.-Ing. Harald Bolt,
>>> Prof. Dr. Sebastian M. Schmidt
>>> ------------------------------------------------------------------------------------------------
>>> ------------------------------------------------------------------------------------------------
>>>
>>
>>
>> --
>> System Analyst Programmer
>> PTI Lab
>> Indiana University
>>
>
>

--
System Analyst Programmer
PTI Lab
Indiana University

--047d7bea4518ad6b6c04f7b5cba8
Content-Type: text/html; charset=UTF-8
Content-Transfer-Encoding: quoted-printable
Hi Shahbaz,

I had a look at the code and I think the actual error is not an NPE in the monitoring logic itself. Rather, inside the catch clause we get an NPE because currentMonitorID is null. If you change the code as follows and run again, we will get some meaningful information. I can see you have followed the same implementation as QstatMonitor; I will change the code in QstatMonitor too.


            else if (!this.queue.contains(take)) {
                // we put the job back to the queue only if its state is not unknown
                if (currentMonitorID == null) {
                    // no job was selected before the failure, so only report the user and host
                    logger.error("Monitoring the jobs failed, for user: " + take.getUserName()
                            + " in Host: " + currentHostDescription.getType().getHostAddress());
                } else if (currentMonitorID.getFailedCount() < 2) {
                    try {
                        currentMonitorID.setFailedCount(currentMonitorID.getFailedCount() + 1);
                        this.queue.put(take);
                    } catch (InterruptedException e1) {
                        e1.printStackTrace();
                    }
                } else {
                    logger.error(e.getMessage());
                    logger.error("Tried to monitor the job 3 times, so dropping the job with ID: " + currentMonitorID.getJobID());
                }
            }
            throw new AiravataMonitorException("Error retrieving the job status", e);
        }
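
To show the guard in isolation, here is a minimal, self-contained Java sketch of the same decision: report and stop when no job was selected, requeue on the first two failures, drop on the third. It is not the Airavata code. RetrySketch, JobRecord, onPollFailure and MAX_FAILURES are hypothetical names standing in for the monitor's catch block and MonitorID, and the queue is a plain LinkedBlockingQueue.

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

final class RetrySketch {

    // Hypothetical stand-in for MonitorID; only the fields this sketch needs.
    static final class JobRecord {
        private final String jobId;
        private int failedCount;

        JobRecord(String jobId) { this.jobId = jobId; }
        String getJobId() { return jobId; }
        int getFailedCount() { return failedCount; }
        void setFailedCount(int failedCount) { this.failedCount = failedCount; }
    }

    // Assumed limit: two requeues are allowed, the third failure drops the job.
    static final int MAX_FAILURES = 2;

    // Decides what happens to a job after a failed status poll.
    // Returns "no-job", "requeued" or "dropped" so the outcome is easy to check.
    static String onPollFailure(JobRecord current, BlockingQueue<JobRecord> queue, Exception cause) {
        if (current == null) {
            // The failure happened before any job was selected. Dereferencing
            // 'current' here would throw a second NPE and hide 'cause'.
            System.err.println("Monitoring failed before a job was selected: " + cause);
            return "no-job";
        }
        if (current.getFailedCount() < MAX_FAILURES) {
            current.setFailedCount(current.getFailedCount() + 1);
            queue.offer(current); // the queue is unbounded in this sketch, so offer() succeeds
            return "requeued";
        }
        System.err.println("Dropping job " + current.getJobId() + " after repeated failures: " + cause);
        return "dropped";
    }

    public static void main(String[] args) {
        BlockingQueue<JobRecord> queue = new LinkedBlockingQueue<>();
        Exception cause = new IllegalStateException("status lookup failed");
        JobRecord job = new JobRecord("job-1");

        System.out.println(onPollFailure(null, queue, cause)); // no job selected yet
        System.out.println(onPollFailure(job, queue, cause));  // first failure
        System.out.println(onPollFailure(job, queue, cause));  // second failure
        System.out.println(onPollFailure(job, queue, cause));  // third failure
    }
}
```

The point of the null branch is that the original exception stays visible in the log instead of being replaced by an NPE thrown from the error handler.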

Thanks
Lahiru


On Wed, Apr 23, 2014 at 9:18 AM, Shahbaz Memon <m.memon@fz-juelich.de> wrote:
Thanks Lahiru.

airavata.log -> https://gigamove.rz.rwth-aachen.de/d/id/3pxEa6Ksf9Vf39
Cheers,

Shahbaz


On Wed, Apr 23, 2014 at 3:07 PM, Lahiru Gunathilake <glahiru@gmail.com> wrote:
Hi Shahbaz,

Are you seeing any logs on the server?

Regards
Lahiru


On Wed, Apr 23, 2014 at 9:00 AM, Shahbaz Memon <m.memon@fz-juelich.de> wrote:
Hi all,

I am facing an issue while testing the BES pull monitor implementation.

Before stating my issue, let me describe the current state of the implementation.

For the BES extension I have forked the GitHub repository under the following URL:

https://github.com/msmemon/airavata

In the forked sources most of the classes are untouched, apart from a couple of modifications and additions. I have also modified the project POMs with multiple dependency exclusions to avoid class loading horrors.

There is a partially tested implementation available with input/output handlers, provider, and monitor classes.

For monitoring (which is where I am facing the issue), I have written a pull monitor that is very similar to the QStat one; the only difference is the connection object, which contains a different credential and proxy client instance suitable for BES-supported endpoints.

Now my issue is:

During the job submission process, the input handler and provider are properly invoked, and after that BESPullJobMonitor [1] throws an NPE, so my workflow does not reach the final phase of output handler invocation and completion.

java.lang.NullPointerException
        at org.apache.airavata.job.monitor.impl.pull.bes.BESPullJobMonitor.startPulling(BESPullJobMonitor.java:173)
        at org.apache.airavata.job.monitor.impl.pull.bes.BESPullJobMonitor.run(BESPullJobMonitor.java:60)
        at java.lang.Thread.run(Thread.java:744)

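Maybe I am not following the new monitoring extensions correctly. Any feedback is more than welcome.

[1] https://github.com/msmemon/airavata/blob/master/tools/job-monitor/src/main/java/org/apache/airavata/job/monitor/impl/pull/bes/BESPullJobMonitor.java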
Thanks in advance,

Shahbaz



------------------------------------------------------------------------------------------------
------------------------------------------------------------------------------------------------
Forschungszentrum Juelich GmbH
52425 Juelich
Sitz der Gesellschaft: Juelich
Eingetragen im Handelsregister des Amtsgerichts Dueren Nr. HR B 3498
Vorsitzender des Aufsichtsrats: MinDir Dr. Karl Eugen Huthmacher
Geschaeftsfuehrung: Prof. Dr. Achim Bachem (Vorsitzender),
Karsten Beneke (stellv. Vorsitzender), Prof. Dr.-Ing. Harald Bolt,
Prof. Dr. Sebastian M. Schmidt
------------------------------------------------------------------------------------------------
------------------------------------------------------------------------------------------------




--
System Analyst Programmer
PTI Lab
Indiana University




--
System Analyst Programmer
PTI Lab
Indiana University
--047d7bea4518ad6b6c04f7b5cba8--