airavata-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Marlon Pierce <marpi...@iu.edu>
Subject Re: Orchestration Component implementation review
Date Mon, 20 Jan 2014 14:41:18 GMT
+1 also. We have lots of discussions, photographs of whiteboards, etc,
but we need to capture this in the wiki.  Volunteers?


Marlon

On 1/20/14 12:42 AM, Saminda Wijeratne wrote:
> On Sun, Jan 19, 2014 at 8:09 PM, Danushka Menikkumbura <
> danushka.menikkumbura@gmail.com> wrote:
>
>> Hi all,
>>
>> Sorry for jumping in.
>>
>> Can we please have the following just to make things little easy?. Also to
>> make sure everybody is on the same page.
>>
>> 1. High-level architecture diagram of the Orchestration component
>> including how it fits into the current architecture.
>>
>> 2. Threading model and the data model.
>>
> +1. Lahiru since that we now have considerable amount of implementation
> done on the Orchestrator do you think we could create a simple wiki article
> with a summarized facts about the component and with those images you've
> attached on JIRA?
>
>> Cheers,
>> Danushka
>>
>>
>> On Mon, Jan 20, 2014 at 4:16 AM, Lahiru Gunathilake <glahiru@gmail.com>wrote:
>>
>>>
>>>
>>> On Fri, Jan 17, 2014 at 1:29 PM, Amila Jayasekara <
>>> thejaka.amila@gmail.com> wrote:
>>>
>>>>
>>>>
>>>> On Fri, Jan 17, 2014 at 10:32 AM, Saminda Wijeratne <samindaw@gmail.com>wrote:
>>>>
>>>>> Following are few thoughts I had during my review of the component,
>>>>>
>>>>> *Multi-threaded vs single threaded*
>>>>> If we are going to have multi-threaded job submission the
>>>>> implementation should work on handling race conditions. Essentially
>>>>> JobSubmitter should be able to "lock" an experiment request before
>>>>> continuing processing that request so that other JobSubmitters accessing
>>>>> the experiment requests a the same time would skip it.
>>>>>
>>>> +1. These are implementation details.
>>>>
>>>>
>>>>> *Orchestrator service*
>>>>> We might want to think of the possibility in future where we will be
>>>>> having multiple deployments of an Airavata service. This could particularly
>>>>> be true for SciGaP. We may have to think how some of the internal data
>>>>> structures/SPIs should be updated to accomodate such requirements in
future.
>>>>>
>>>> +1.
>>>>
>>>>
>>>>> *Orchestrator Component configurations*
>>>>> I see alot of places where the orchestrator can have configurations.
I
>>>>> think its too early finalize them, but I think we can start refactoring
>>>>> them out perhaps to the airavata-server.properties. I'm also seeing the
>>>>> orchestrator is now hardcoded to use default/admin gateway and username.
I
>>>>> think it should come from the request itself.
>>>>>
>>>> +1. But in overall we may need to change the way we handle
>>>> configurations within Airavata. Currently we have multiple configuration
>>>> files and multiple places where we read configurations. IMO we should have
>>>> a separate module to handle configurations. Only this module should be
>>>> aware how to intepret configurations in the file and provide a component
>>>> interface to access those configuration values.
>>>>
>>> I like this approach, this will work with deploying components separate
>>> (if we are planning to do that). We can come to an approach like we have in
>>> Airavata API (Managers), like we have ServerSetting might composed of
>>> multiple settings (GFACsettings, OrchestratorSettings). So during isolated
>>> deployments some Settings objects could be null.
>>>
>>>>
>>>>> *Visibility of API functions*
>>>>> I think initialize(), shutdown() and startJobSubmitter() functions
>>>>> should not be part of the API because I don't see a scenario where the
>>>>> gateway developer would be responsible for using them. They serve a more
>>>>> internal purpose of managing the orchestrator component IMO. As Amila
>>>>> pointed out so long ago (wink) functions that do not concern outside
>>>>> parties should not be used as part of the API.
>>>>>
>>>> +1
>>>>
>>>>
>>>>> *Return values of Orchestrator API*
>>>>> IMO unless it is specifically required to do so I think the functions
>>>>> does not necessarily need to return anything other than throw exceptions
>>>>> when needed. For example the launchExperiment can simply return void
if all
>>>>> is succesful and return an exception if something fails. Handling issues
>>>>> with a try catch is not only simpler but also the explanations are readily
>>>>> available for the user.
>>>>>
>>>> +1. Also try to have different exception for different scenarios. For
>>>> example if persistence (hypothetical) fails, DatabasePersistenceException,
>>>> if validation fails, ValidationFailedException etc ... Then the developer
>>>> who uses the API can catch these different exceptions and act on them
>>>> appropriately.
>>>>
>>>>
>>>>> *Data persisted in registry*
>>>>> ExperimentRequest.getUsername() : I think we should clarify what this
>>>>> username denotes. In current API, in experiment submission we consider
two
>>>>> types of users. Submission user (the user who submits the experiment
to the
>>>>> Airavata Server - this is inferred by the request itself) and the execution
>>>>> user (the user who corelates to the application executions of the gateway
-
>>>>> thus this user can be a different user for different gateway, eg: community
>>>>> user, gateway user).
>>>>> I think we should persist the date/time of the experiment request as
>>>>> well.
>>>>>
>>>> +1
>>>>
>>>>>  Also when retrying of API functions in the case of a failure in an
>>>>> previous attempt there should be a way to not to repeat already performed
>>>>> steps or gracefully roleback and redo those required steps as necessary.
>>>>> While such actions could be transparent to the user sometimes it might
make
>>>>> sense to allow user to be notified of success/failure of a retry. However
>>>>> this might mean keeping additional records at the registry level.
>>>>>
>>>> In addition we should also have a way of cleaning up unsubmitted
>>>> experiment ids. (But not sure whether you want to address this right now).
>>>> The way I see this is to have a periodic thread which goes through the
>>>> table and clear up experiments which are not submitted for a defined time.
>>>>
>>> +1
>>>
>>> Lahiru
>>>
>>>> BTW, nice review notes, Saminda.
>>>>
>>>> Thanks
>>>> Amila
>>>>
>>>>
>>>>
>>>
>>> --
>>> System Analyst Programmer
>>> PTI Lab
>>> Indiana University
>>>
>>


Mime
View raw message