airavata-dev mailing list archives

From Chathura Herath
Subject Re: XBaya/Hadoop Integration - Concern
Date Sat, 22 Jun 2013 05:16:59 GMT
Let's think about this a bit more.
Hadoop is a data-driven model, which is very different from the MPI
model that we deal with in scientific computing. When you launch an MPI
job, the number of nodes is decided by you (meaning Suresh or Sudhakar,
who configure the app), who know how many nodes are available and
required for the job. Now if you think about Hadoop, the number of
nodes/partitions is decided by the framework dynamically (maybe using
the Formatters). I believe this is the reason why this idea of dynamic
node selection was left untouched for the most part.
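
To illustrate the contrast above, here is a minimal sketch in Python. It assumes a simplified model in which the framework creates one map task per fixed-size input split; real Hadoop input formats apply more rules than this. The function names and the 128 MiB split size are hypothetical and are not taken from Airavata or Hadoop.

```python
# Simplified illustration only: these helpers are hypothetical and do not
# reproduce the actual logic of Hadoop's input formats.

def mpi_node_count(configured_nodes: int) -> int:
    """MPI style: whoever configures the application fixes the node count."""
    if configured_nodes <= 0:
        raise ValueError("configured_nodes must be positive")
    return configured_nodes


def estimated_map_tasks(input_bytes: int, split_bytes: int) -> int:
    """Data-driven style: the task count follows from the input size.

    Assumes one task per split and ignores per-file boundaries.
    """
    if input_bytes < 0:
        raise ValueError("input_bytes must not be negative")
    if split_bytes <= 0:
        raise ValueError("split_bytes must be positive")
    # Integer ceiling division, so a partial trailing split still counts.
    return (input_bytes + split_bytes - 1) // split_bytes


MIB = 1024 * 1024
ASSUMED_SPLIT_BYTES = 128 * MIB  # assumed split size, for illustration only
```

With these assumptions, `estimated_map_tasks(1024 * MIB, ASSUMED_SPLIT_BYTES)` grows as the input grows, while `mpi_node_count(16)` stays whatever the person configuring the job chose.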

I agree that we need a configuration file to store all the
configuration that is static for the most part but required for Hadoop
job launch. The current encapsulation of Hadoop configuration is not the
best way and can certainly be improved.

The point I am trying to make is that maybe what we need is a
HadoopAccountDescription.xml file instead of trying to push this into
the existing host description. Their semantics are different, as are
the parameters and the model. The host description schema was defined
with supercomputing applications in mind; maybe this schema has been
revisited and rethought since I last saw it. I wouldn't worry about
this if you are working on a conference paper due in three weeks. But
it is definitely something to think about.
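
As a rough sketch of what such a file could hold, the Python snippet below builds a small XML document for a Hadoop account. No schema for HadoopAccountDescription.xml is defined in this thread, so every element name, attribute, and property key here is an assumption made for illustration; only the deployment model values (single node, local cluster, EMR) come from the discussion quoted below.

```python
import xml.etree.ElementTree as ET


def build_hadoop_account_description(name, deployment_model, static_properties):
    """Return a hypothetical HadoopAccountDescription document as a string.

    All element and attribute names are illustrative assumptions, not an
    existing Airavata schema.
    """
    root = ET.Element("hadoopAccountDescription")
    ET.SubElement(root, "name").text = name
    # Deployment model, e.g. "single-node", "local-cluster" or "emr".
    ET.SubElement(root, "deploymentModel").text = deployment_model
    props = ET.SubElement(root, "staticProperties")
    # Sort keys so the generated document is stable across runs.
    for key in sorted(static_properties):
        prop = ET.SubElement(props, "property", {"name": key})
        prop.text = static_properties[key]
    return ET.tostring(root, encoding="unicode")


# Example values are placeholders, not real configuration keys.
example_xml = build_hadoop_account_description(
    "example-account",
    "local-cluster",
    {"example.static.setting": "value"},
)
```

Keeping this separate from the host description would let the Hadoop-specific fields evolve without touching the schema used for supercomputing hosts.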

On Fri, Jun 21, 2013 at 11:40 PM, Lahiru Gunathilake wrote:
> Hi Danushka,
> I am +1 for this approach, but I am sure you need to patch gfac-core
> without breaking default gfac functionality.
> Lahiru
> On Fri, Jun 21, 2013 at 7:44 PM, Danushka Menikkumbura wrote:
>> Hadoop deployment model (single node, local cluster, EMR, etc.) is not
>> exactly a host, as in Airavata, but is along the lines of a host IMO.
>> Therefore we can still stick to a similar model but need to have a
>> different UI to configure them. Hadoop jobs would still be
>> treated differently and be configured in the workflow itself (i.e. the
>> current implementation), as opposed to being predefined as in GFac
>> applications.
>> Please kindly let me know if you think otherwise.
>> Cheers,
>> Danushka
>> On Wed, Jun 19, 2013 at 12:57 AM, Danushka Menikkumbura wrote:
>>> Hi All,
>>> The current UI implementation does not take application/host description
>>> into account simply because they have little or no meaning in the Hadoop
>>> world as I believe. The current implementation enables configuring each
>>> individual job using the UI (Please see the attached xbaya-hadoop.png).
>>> The upside of this approach is that new jobs could be added/configured
>>> dynamically, without adding application descriptions/generating
>>> code/compiling/re-deploying/etc. The downside is that it is different from
>>> general GFac application invocation, where each application has an
>>> associated application/host/etc. Nevertheless we are trying to incorporate
>>> something that does not quite fit into application/host domain.
>>> Thoughts appreciated.
>>> Thanks,
>>> Danushka
> --
> System Analyst Programmer
> PTI Lab
> Indiana University

Chathura Herath Ph.D
