airavata-dev mailing list archives

From Suresh Marru <sma...@apache.org>
Subject Re: Error while running a BES job
Date Wed, 24 Jun 2015 14:29:54 GMT
Hi Shahbaz,

Sorry, I missed responding to this earlier. Please see below:

> On Jun 12, 2015, at 12:52 PM, Shahbaz Memon <m.memon@fz-juelich.de> wrote:
> 
> Hi Suresh and Devs,
> 
> I am testing job resource requirement mappings between Airavata-Thrift and JSDL resource model.

This is a good comparison; we need to make sure all the relevant JSDL semantics are
covered.

> In the Airavata Thrift model I would propose including a NumberOfCPUsPerNode (or IndividualCPUCount) attribute as one more resource requirement field. With this field defined, the user will be able to specify either TotalCPUCount (which is already there) or TotalNodeCount together with IndividualCPUCount. I believe it should be easy to include this field in the existing resource model. What do you think?

I debated adding this field early on and compared against SAGA, JSDL, RSL, and a few other APIs
such as Moab web services. I could not conclusively narrow down which of these three parameters
to include. The problem with including all three is that they could contradict each other,
making validation tricky. I think we need to select two of the three; having all three will
confuse usage. Maybe Kenneth has an opinion?
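
To make the contradiction concrete, here is a rough Python sketch of the check that would be
needed if all three values were accepted. It is illustrative only: the field names are
hypothetical and are not the names in the Airavata Thrift model, and the rule that at least
two values must be given is an assumption made for this example.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ResourceRequest:
    # Hypothetical field names, loosely following the terms used in this thread.
    total_cpu_count: Optional[int] = None
    node_count: Optional[int] = None
    cpus_per_node: Optional[int] = None


def resolve(req: ResourceRequest) -> ResourceRequest:
    """Derive the missing value from the other two, or reject a contradiction."""
    total, nodes, per_node = req.total_cpu_count, req.node_count, req.cpus_per_node
    given = [v for v in (total, nodes, per_node) if v is not None]
    if any(v <= 0 for v in given):
        raise ValueError("resource counts must be positive")
    if len(given) < 2:
        # Assumption for this sketch: one value alone is treated as underspecified.
        raise ValueError("at least two of the three values are required")

    if total is None:
        total = nodes * per_node
    elif nodes is None:
        if total % per_node != 0:
            raise ValueError("total CPU count is not a multiple of CPUs per node")
        nodes = total // per_node
    elif per_node is None:
        if total % nodes != 0:
            raise ValueError("total CPU count is not a multiple of node count")
        per_node = total // nodes
    elif total != nodes * per_node:
        # All three were supplied and they disagree: this is the tricky case.
        raise ValueError("total CPU count does not equal nodes * CPUs per node")

    return ResourceRequest(total, nodes, per_node)
```

With only two of the three fields in the model, the last branch disappears, which is the
simplification argued for above.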

> Another point: if I set the application's parallelism type to Parallel, I am not able to specify a parallel environment instance supported by the target resource, for example one of Generic MPI, OpenMPI, MPICH2, POE, IntelMPI, etc. I think this matters for a deployment scenario where multiple MPI environments are supported but the user wants to submit a job using the parallel environment of her choice. This use case is taken from the JSDL-SPMD specification [1], and UNICORE supports it :).

In the current app catalog design [1], the interfaces and deployments are decoupled, so the
parallel environment is described in the application deployment [2]. If your question is how
a user can specify a particular parallel environment, currently that is done implicitly by
picking the right deployment. We could re-think whether we should allow users/gateways
to select a deployment based on the parallelism. This reverses the current approach. This
is timely, as we are actively revisiting all data models. I will keep this in mind and get
back to you on this.
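
As a rough illustration of the reversed approach, the Python sketch below selects a deployment
from a requested parallel environment instead of asking the user to pick the deployment
directly. Everything in it is hypothetical: the `Deployment` record, its fields, and
`pick_deployment` are made up for this example and do not reflect the actual app catalog model
or API.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Deployment:
    # Hypothetical record; not the Airavata application deployment struct.
    deployment_id: str
    compute_host: str
    parallel_env: str  # e.g. "OpenMPI", "MPICH2", "IntelMPI"


def pick_deployment(
    deployments: List[Deployment],
    compute_host: str,
    parallel_env: Optional[str] = None,
) -> Deployment:
    """Return the first deployment on the host that matches the requested environment."""
    candidates = [d for d in deployments if d.compute_host == compute_host]
    if parallel_env is not None:
        wanted = parallel_env.lower()
        candidates = [d for d in candidates if d.parallel_env.lower() == wanted]
    if not candidates:
        raise LookupError("no deployment matches the requested host and environment")
    # With no environment requested, this falls back to the first deployment on the host.
    return candidates[0]
```

A gateway could then pass the user's choice (say "IntelMPI") along with the target host, and
fall back to the current implicit behavior when no environment is given.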

Suresh 

[1] - https://cwiki.apache.org/confluence/display/AIRAVATA/Airavata+Application+Catalog
[2] - https://github.com/apache/airavata/blob/master/thrift-interface-descriptions/airavata-api/application_deployment_model.thrift

> 
> Best Regards,
> 
> Shahbaz
> 
> [1] https://www.ogf.org/documents/GFD.115.pdf
> 
> On Wed, Jun 10, 2015 at 4:33 PM, Suresh Marru <smarru@apache.org> wrote:
> Thanks, Shahbaz, for testing this in the 0.15 branch. Once you confirm the new cluster is working well for Unicore, we can make sure the release candidate is used by Ultrascan production and then add a release note indicating that the BES provider is being used in production by Ultrascan with the 0.15 release.
> 
> Suresh
> 
>> On Jun 10, 2015, at 10:27 AM, Shahbaz Memon <m.memon@fz-juelich.de> wrote:
>> 
>> It's working now. Thanks.
>> 
>> On Wed, Jun 10, 2015 at 3:24 PM, Chathuri Wimalasena <kamalasini@gmail.com> wrote:
>> Hi Shahbaz,
>> 
>> I fixed the issue. Please pull the latest changes and check whether it is resolved.
>> 
>> Thanks..
>> Chathuri
>> 
>> On Wed, Jun 10, 2015 at 8:26 AM, Shahbaz Memon <m.memon@fz-juelich.de> wrote:
>> Hi Devs, 
>> 
>> Just checked out the 0.15 branch. 
>> 
>> While creating and launching an Echo experiment on a BES endpoint, I see the following exception:
>> 
>> [ERROR] Error while updating the resource JOB_DETAIL
>> org.apache.airavata.registry.cpi.RegistryException: java.lang.ClassCastException: java.lang.String cannot be cast to org.apache.airavata.registry.cpi.CompositeIdentifier
>>         at org.apache.airavata.persistance.registry.jpa.impl.RegistryImpl.update(RegistryImpl.java:270)
>>         at org.apache.airavata.gfac.core.utils.GFacUtils.updateJobStatus(GFacUtils.java:255)
>>         at org.apache.airavata.gfac.bes.provider.impl.BESProvider.execute(BESProvider.java:196)
>> ..
>> ..
>> Caused by: java.lang.ClassCastException: java.lang.String cannot be cast to org.apache.airavata.registry.cpi.CompositeIdentifier
>>         at org.apache.airavata.persistance.registry.jpa.impl.RegistryImpl.update(RegistryImpl.java:239)
>>         ... 11 more
>> 
>> 
>> A more detailed log is pasted at http://pastebin.com/jCMnMmsu
>> 
>> Thanks,
>> 
>> Shahbaz