Date: Tue, 1 Apr 2014 18:43:22 +0000 (UTC)
From: "Eroma (JIRA)"
To: dev@airavata.apache.org
Reply-To: dev@airavata.apache.org
Subject: [jira] [Updated] (AIRAVATA-755) Incompatible parameters on HPC Configuration needs validation process


     [ https://issues.apache.org/jira/browse/AIRAVATA-755?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Eroma updated AIRAVATA-755:
---------------------------

    Fix Version/s: 0.13

> Incompatible parameters on HPC Configuration needs validation process
> ---------------------------------------------------------------------
>
>                 Key: AIRAVATA-755
>                 URL: https://issues.apache.org/jira/browse/AIRAVATA-755
>             Project: Airavata
>          Issue Type: Improvement
>          Components: GFac, XBaya
>    Affects Versions: 0.6
>            Reporter: Pedro da Silveira
>            Priority: Minor
>             Fix For: 0.13
>
>
> I set up a workflow to run on Lonestar with the following HPC configuration
> Max Wall time: 1440
> CPU Count: 64
> Node: 6
> Processor Per Node: 12
> Min Memory: 10240
> Max: 15360
> It had 4 inputs and one single output. Unfortunately, my workflow task never got submitted to Lonestar.
> The Airavata Server had the error message below:
> So, I only changed the parameter from "CPU Count: 64" to "CPU Count: 72" and my workflow task got submitted to Lonestar correctly.
> ===========
> Error Message
> ======================================================================
> [INFO] 	-----DATA-----
> [INFO] 		https://gridftp1.ls4.tacc.utexas.edu:50393/16289878754566973521/8943296923859968446/
> [INFO] 		gridftp1.ls4.tacc.utexas.edu:2119/jobmanager-sge
> [INFO] 		null
> [INFO] 		null
> [INFO] 		/C=US/O=National Center for Supercomputing Applications/CN=OGCE Community User
> [INFO] 		null
> [INFO] 		&( queue = "normal" )( stdout = "/scratch/01437/ogce/Vlab/Phonon/__p3_14/AppPhononSingle_Wed_Jan_30_19_33_57_CST_2013_81e76edb-dec5-4a04-ac8b-dead149cf5b3/lonestar_application.stdout" )( count = "64" )( executable = "/scratch/01437/ogce/Vlab/Phonon/executePhonon.sh" )( stderr = "/scratch/01437/ogce/Vlab/Phonon/__p3_14/AppPhononSingle_Wed_Jan_30_19_33_57_CST_2013_81e76edb-dec5-4a04-ac8b-dead149cf5b3/lonestar_application.stderr" )( maxwalltime = "1440" )( hostCount = "6" )( minmemory = "10240" )( project = "TG-STA110014S" )( jobtype = "mpi" )( environment = ("inputData" "/scratch/01437/ogce/Vlab/Phonon/__p3_14/AppPhononSingle_Wed_Jan_30_19_33_57_CST_2013_81e76edb-dec5-4a04-ac8b-dead149cf5b3/inputData" ) ("outputData" "/scratch/01437/ogce/Vlab/Phonon/__p3_14/AppPhononSingle_Wed_Jan_30_19_33_57_CST_2013_81e76edb-dec5-4a04-ac8b-dead149cf5b3/outputData" ))( proxy_timeout = "1" )( arguments = "///scratch/01437/ogce/Vlab/Phonon/__p3_14/AppPhononSingle_Wed_Jan_30_19_33_57_CST_2013_81e76edb-dec5-4a04-ac8b-dead149cf5b3/inputData/Pwscf_Input" "///scratch/01437/ogce/Vlab/Phonon/__p3_14/AppPhononSingle_Wed_Jan_30_19_33_57_CST_2013_81e76edb-dec5-4a04-ac8b-dead149cf5b3/inputData/Cd_PON_sp_LDA.vdb" "///scratch/01437/ogce/Vlab/Phonon/__p3_14/AppPhononSingle_Wed_Jan_30_19_33_57_CST_2013_81e76edb-dec5-4a04-ac8b-dead149cf5b3/inputData/Te_PON_LDA.vdb" "///scratch/01437/ogce/Vlab/Phonon/__p3_14/AppPhononSingle_Wed_Jan_30_19_33_57_CST_2013_81e76edb-dec5-4a04-ac8b-dead149cf5b3/inputData/Phonon_Input" )( directory = "/scratch/01437/ogce/Vlab/Phonon/__p3_14/AppPhononSingle_Wed_Jan_30_19_33_57_CST_2013_81e76edb-dec5-4a04-ac8b-dead149cf5b3" )( maxmemory = "15360" )
> [INFO] 	-----END DATA-----
> [INFO] Status is zero
> [INFO] Status of job https://gridftp1.ls4.tacc.utexas.edu:50393/16289878754566973521/8943296923859968446/is FAILED
> [INFO] 	-----DATA-----
> [INFO] 		Status of job https://gridftp1.ls4.tacc.utexas.edu:50393/16289878754566973521/8943296923859968446/is FAILED
> [INFO] 	-----END DATA-----
> [INFO] Job Error Code: 14
> [ERROR] Context passed was NULL.
> java.lang.RuntimeException: Context passed was NULL.
> 	at org.apache.airavata.workflow.tracking.impl.ProvenanceNotifierImpl.sendingFault(ProvenanceNotifierImpl.java:496)
> 	at org.apache.airavata.workflow.tracking.impl.ProvenanceNotifierImpl.sendingFault(ProvenanceNotifierImpl.java:485)
> 	at org.apache.airavata.core.gfac.notification.impl.WorkflowTrackingNotification.executionFail(WorkflowTrackingNotification.java:108)
> 	at org.apache.airavata.core.gfac.notification.impl.DefaultNotifier.executionFail(DefaultNotifier.java:135)
> 	at org.apache.airavata.core.gfac.provider.impl.GramProvider.executeApplication(GramProvider.java:225)
> 	at org.apache.airavata.core.gfac.provider.AbstractProvider.execute(AbstractProvider.java:69)
> 	at org.apache.airavata.core.gfac.services.impl.AbstractSimpleService.execute(AbstractSimpleService.java:118)
> 	at org.apache.airavata.core.gfac.GfacAPI.gridJobSubmit(GfacAPI.java:140)
> 	at org.apache.airavata.xbaya.invoker.EmbeddedGFacInvoker.invoke(EmbeddedGFacInvoker.java:256)
> 	at org.apache.airavata.xbaya.interpretor.WorkflowInterpreter.handleWSComponent(WorkflowInterpreter.java:749)
> 	at org.apache.airavata.xbaya.interpretor.WorkflowInterpreter.executeDynamically(WorkflowInterpreter.java:533)
> 	at org.apache.airavata.xbaya.interpretor.WorkflowInterpreter.scheduleDynamically(WorkflowInterpreter.java:218)
> 	at org.apache.airavata.xbaya.interpretor.WorkflowInterpretorSkeleton.executeWorkflow(WorkflowInterpretorSkeleton.java:389)
> 	at org.apache.airavata.xbaya.interpretor.WorkflowInterpretorSkeleton.access$400(WorkflowInterpretorSkeleton.java:87)
> 	at org.apache.airavata.xbaya.interpretor.WorkflowInterpretorSkeleton$2.run(WorkflowInterpretorSkeleton.java:382)
> 	at java.lang.Thread.run(Thread.java:680)
> [INFO] 	-----DATA-----
> [INFO] 		Job Protocol : https
> Host name : gridftp1.ls4.tacc.utexas.edu
> Port number : 50393
> Url path : 16289878754566973521/8943296923859968446/
> User : null
> Pwd : null
> on host lonestar4.tacc.teragrid.org Job Exit Code = 14
> [INFO] 	-----END DATA-----
> [ERROR] Job Protocol : https
> Host name : gridftp1.ls4.tacc.utexas.edu
> Port number : 50393
> Url path : 16289878754566973521/8943296923859968446/
> User : null
> Pwd : null
> on host lonestar4.tacc.teragrid.org Job Exit Code = 14
> org.apache.airavata.core.gfac.exception.JobSubmissionFault: Job Protocol : https
> Host name : gridftp1.ls4.tacc.utexas.edu
> Port number : 50393
> Url path : 16289878754566973521/8943296923859968446/
> User : null
> Pwd : null
> on host lonestar4.tacc.teragrid.org Job Exit Code = 14
> 	at org.apache.airavata.core.gfac.provider.impl.GramProvider.executeApplication(GramProvider.java:222)
> 	at org.apache.airavata.core.gfac.provider.AbstractProvider.execute(AbstractProvider.java:69)
> 	at org.apache.airavata.core.gfac.services.impl.AbstractSimpleService.execute(AbstractSimpleService.java:118)
> 	at org.apache.airavata.core.gfac.GfacAPI.gridJobSubmit(GfacAPI.java:140)
> 	at org.apache.airavata.xbaya.invoker.EmbeddedGFacInvoker.invoke(EmbeddedGFacInvoker.java:256)
> 	at org.apache.airavata.xbaya.interpretor.WorkflowInterpreter.handleWSComponent(WorkflowInterpreter.java:749)
> 	at org.apache.airavata.xbaya.interpretor.WorkflowInterpreter.executeDynamically(WorkflowInterpreter.java:533)
> 	at org.apache.airavata.xbaya.interpretor.WorkflowInterpreter.scheduleDynamically(WorkflowInterpreter.java:218)
> 	at org.apache.airavata.xbaya.interpretor.WorkflowInterpretorSkeleton.executeWorkflow(WorkflowInterpretorSkeleton.java:389)
> 	at org.apache.airavata.xbaya.interpretor.WorkflowInterpretorSkeleton.access$400(WorkflowInterpretorSkeleton.java:87)
> 	at org.apache.airavata.xbaya.interpretor.WorkflowInterpretorSkeleton$2.run(WorkflowInterpretorSkeleton.java:382)
> 	at java.lang.Thread.run(Thread.java:680)
> Caused by: java.lang.Exception: Job Protocol : https
> Host name : gridftp1.ls4.tacc.utexas.edu
> Port number : 50393
> Url path : 16289878754566973521/8943296923859968446/
> User : null
> Pwd : null
> on host lonestar4.tacc.teragrid.org Job Exit Code = 14
> 	... 12 more
> Exception in thread "Thread-58" org.apache.airavata.workflow.model.exceptions.WorkflowRuntimeException: org.apache.airavata.workflow.model.exceptions.WorkflowException: Job Protocol : https
> Host name : gridftp1.ls4.tacc.utexas.edu
> Port number : 50393
> Url path : 16289878754566973521/8943296923859968446/
> User : null
> Pwd : null
> on host lonestar4.tacc.teragrid.org Job Exit Code = 14
> 	at org.apache.airavata.xbaya.interpretor.WorkflowInterpretorSkeleton.executeWorkflow(WorkflowInterpretorSkeleton.java:392)
> 	at org.apache.airavata.xbaya.interpretor.WorkflowInterpretorSkeleton.access$400(WorkflowInterpretorSkeleton.java:87)
> 	at org.apache.airavata.xbaya.interpretor.WorkflowInterpretorSkeleton$2.run(WorkflowInterpretorSkeleton.java:382)
> 	at java.lang.Thread.run(Thread.java:680)
> Caused by: org.apache.airavata.workflow.model.exceptions.WorkflowException: Job Protocol : https
> Host name : gridftp1.ls4.tacc.utexas.edu
> Port number : 50393
> Url path : 16289878754566973521/8943296923859968446/
> User : null
> Pwd : null
> on host lonestar4.tacc.teragrid.org Job Exit Code = 14
> 	at org.apache.airavata.xbaya.invoker.EmbeddedGFacInvoker.invoke(EmbeddedGFacInvoker.java:321)
> 	at org.apache.airavata.xbaya.interpretor.WorkflowInterpreter.handleWSComponent(WorkflowInterpreter.java:749)
> 	at org.apache.airavata.xbaya.interpretor.WorkflowInterpreter.executeDynamically(WorkflowInterpreter.java:533)
> 	at org.apache.airavata.xbaya.interpretor.WorkflowInterpreter.scheduleDynamically(WorkflowInterpreter.java:218)
> 	at org.apache.airavata.xbaya.interpretor.WorkflowInterpretorSkeleton.executeWorkflow(WorkflowInterpretorSkeleton.java:389)
> 	... 3 more
> Caused by: org.apache.airavata.core.gfac.exception.JobSubmissionFault: Job Protocol : https
> Host name : gridftp1.ls4.tacc.utexas.edu
> Port number : 50393
> Url path : 16289878754566973521/8943296923859968446/
> User : null
> Pwd : null
> on host lonestar4.tacc.teragrid.org Job Exit Code = 14
> 	at org.apache.airavata.core.gfac.provider.impl.GramProvider.executeApplication(GramProvider.java:222)
> 	at org.apache.airavata.core.gfac.provider.AbstractProvider.execute(AbstractProvider.java:69)
> 	at org.apache.airavata.core.gfac.services.impl.AbstractSimpleService.execute(AbstractSimpleService.java:118)
> 	at org.apache.airavata.core.gfac.GfacAPI.gridJobSubmit(GfacAPI.java:140)
> 	at org.apache.airavata.xbaya.invoker.EmbeddedGFacInvoker.invoke(EmbeddedGFacInvoker.java:256)
> 	... 7 more
> Caused by: java.lang.Exception: Job Protocol : https
> Host name : gridftp1.ls4.tacc.utexas.edu
> Port number : 50393
> Url path : 16289878754566973521/8943296923859968446/
> User : null
> Pwd : null
> on host lonestar4.tacc.teragrid.org Job Exit Code = 14
> 	... 12 more
> ======================================================================



--
This message was sent by Atlassian JIRA
(v6.2#6252)
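
The report does not say which rule the failing configuration broke, but the numbers suggest one: 6 nodes x 12 processors per node = 72, and the job was accepted only after CPU Count was changed from 64 to 72. A minimal sketch of the kind of pre-submission check the issue title asks for might look like the following. It is written in Python for brevity and is not Airavata code; `HpcConfig`, `validate`, and the field names are hypothetical, and the rule "CPU Count must equal Node x Processor Per Node" is an assumption inferred from this one report, not a documented Lonestar or GRAM requirement.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class HpcConfig:
    """Hypothetical container for the fields listed in the report."""
    cpu_count: int
    node_count: int
    processors_per_node: int
    min_memory: int
    max_memory: int
    max_wall_time: int


def validate(config: HpcConfig) -> List[str]:
    """Return a list of problems; an empty list means none were found."""
    problems = []

    # Every count must be a positive number before the other checks make sense.
    for name in ("cpu_count", "node_count", "processors_per_node", "max_wall_time"):
        if getattr(config, name) <= 0:
            problems.append(f"{name} must be positive")

    # Assumed rule, inferred from the report: 64 was rejected, 6 x 12 = 72 worked.
    expected = config.node_count * config.processors_per_node
    if config.cpu_count != expected:
        problems.append(
            f"cpu_count is {config.cpu_count}, but node_count x "
            f"processors_per_node = {config.node_count} x "
            f"{config.processors_per_node} = {expected}"
        )

    # A minimum above the maximum can never be satisfied.
    if config.min_memory > config.max_memory:
        problems.append("min_memory is larger than max_memory")

    return problems


if __name__ == "__main__":
    # The configuration from the report that never reached Lonestar.
    reported = HpcConfig(cpu_count=64, node_count=6, processors_per_node=12,
                         min_memory=10240, max_memory=15360, max_wall_time=1440)
    for problem in validate(reported):
        print(problem)
```

Running such a check before the job description is built would let the client reject the configuration with a readable message instead of surfacing "Job Exit Code = 14" after submission.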