airavata-dev mailing list archives

From "Miller, Mark" <mmil...@sdsc.edu>
Subject RE: Evaluate Suitable Scientific Workflow Language for Airavata.
Date Fri, 19 Sep 2014 17:17:00 GMT
Very interesting thoughts Bruce.
The issue of sustainability is certainly a key one, and building dependencies on another platform
always brings risk.

Going back to Lavanya’s comment, I think this thought is critical.
What workflow tools in use really have big uptake by users?

At CIPRES we have tended to focus on being driven by what users are actually asking for
(of course we are always behind schedule, but in adding new features this is not the worst
thing).
What do we know about the requirements for workflow within Airavata from the user POV?

It seems like the correct place to start.

Mark

From: Bruce Barkstrom [mailto:brbarkstrom@gmail.com]
Sent: Friday, September 19, 2014 9:29 AM
To: dev@oodt.apache.org
Cc: architecture@airavata.apache.org; dev
Subject: Re: Evaluate Suitable Scientific Workflow Language for Airavata.

One factor that should be included in the group's deliberations
on adding a workflow language to the other things in OODT
is the impact on long-term maintenance.  While there's a lot
of enthusiasm in the developer community right now, we need
to think about what happens when development turns into
maintenance.  The account that follows is based on my experience
with trying to resurrect a W3C-related project to visualize RDF
graphs.
The project is called IsaViz.  It's even got a W3 web site:
http://www.w3.org/2001/11/IsaViz/
IsaViz identifies itself as a visual authoring tool for RDF.
Right up near the top are two dates that should serve as
a cautionary note for people who want to pick up this tool:
Current Stable Version: October 2007 and Current Development Version:
May 2007.  It also looks like the site was maintained by a single
developer who did the development as a postdoc project.
The overview page was last modified on Oct. 21, 2007.
The installation instructions were last modified on Oct. 18, 2004.
IsaViz uses a number of tools.  The Installation Instructions
identify the following:

  *   A Java Virtual Machine version 1.3.0 or later (1.4 strongly recommended - see
      Known problems <http://www.w3.org/2001/11/IsaViz/overview.html#bugs>)
  *   A distribution of IsaViz, which contains the following Java JAR files:

     *   IsaViz itself (isaviz.jar)
     *   Zoomable Visual Transformation Machine (zvtm.jar)
     *   Jena 2.1 for IsaViz 2.1, Jena 1.6.1 for IsaViz 1.2
     *   Xerces-J version 2 (xercesImpl.jar, xmlParserAPIs.jar) for IsaViz 2, Xerces 1.4.4
         for IsaViz 1.2

  *   GraphViz from AT&T version 1.8.9 or later (version 1.7.x is no longer supported
      in IsaViz 2, and has actually only been tested with version 1.9). Note: some instances
      of version 1.10.0 had a bug that produced incomplete SVG files, but it has been corrected
      in subsequent releases (newer versions can be obtained on the graphviz.org site
      <http://www.graphviz.org/pub/graphviz/CURRENT>).
So, what complications ensue?
1.  Java has moved way beyond version 1.3 or 1.4.  Since Java can deprecate code and
since there are now both Oracle and OpenJDK, there may be some unpleasant surprises that
need fixes.  I haven't seen comments from the community on whether or not these might be
significant.  The IsaViz documentation refers to the ancient time when Sun controlled
the language.  Apparently, the IsaViz code was only tested with Sun's j2se/1.3 or 1.4.
2.  I suppose the jar files from IsaViz version 2 would be the place to start in reconstructing
this piece of software.  However, one should be careful about this when getting into the
installation scripts from the Installation Instructions.
3.  The Zoomable Visual Transformation Machine project is on Sourceforge.  It's apparently
done in Java.  However, the IsaViz code used version 0.9.0, while the current Sourceforge
project (at http://zvtm.sourceforge.net/) is now up to 0.11.1 for the stable version (Aug. 2013)
with a more recent development version (0.11.2 - snapshot; June 2014).  No idea if
there would be any serious ramifications from this change.
4.  The Installation Instructions have a link to the HP Jena site.  If you follow it, the
page says "Oops! ..."  Jena was moved from HP to Apache.  So if you want to do
Jena, you now need to consult <https://jena.apache.org/>.  I'm not sure exactly how
the Apache Jena source code or binary installations compare with what IsaViz is expecting.
As a note, Jena is a BIG chunk of software.  I think the tutorials on RDF (including OWL
and related reasoners) are going to take a novice user (including many scientists) a month
or two of dedicated time to work through.  I don't know how easy IsaViz would be to install
without at least a basic understanding of RDF and of the related triple store database.
5.  Xerces-J is the XML Java parser (see <http://Xerces.apache.org>), which is now up to
version 2.11.0.  Again, it isn't clear what kinds of difficulties one would encounter in using
this library.
6.  GraphViz (at <http://www.graphviz.org/>) is now at version 2.38.0-1.  AT&T seems
to be maintaining a lot of installation options.  I was interested in Ubuntu - and then
there are different versions of that.
As an additional note, Linux has developed a bunch of variants.  A particularly active
area of development is the creation of automated package managers - often with centralized
control over installation procedures and source code libraries.  The packages have dependencies
on the libraries -- and there's no guarantee that an RPM package has the same dependencies
as a Debian package.  This is a bit like the DOI guarantee of providing a unique location to
obtain original items - although publishers have been known to substitute new versions of
the unique object for the "true" original.
At the same time, software packages with complex networks of dependencies are
not exactly easy to maintain with Linux (or Unix) scripts.  Exploring the integrity of
the whole package requires a fair amount of work by experienced system administrators.
If the intent is to produce data archives (or data production facilities) that have long-term
maintainability, they need to handle replication [see Barkstrom and Mattman, 2010, ESI]
of objects, as well as transparency.  The key attributes of such systems need to be
simplicity, provenance integrity, and reliability.  They aren't easy attributes to maintain
over the long haul.  The article on "being digital" in the current CACM has a useful
perspective on how our enthusiasm for "rupture talk" plays out:
Haigh, T., 2014: We Have Never Been Digital, CACM, 57, No. 09, 24-28
Peter Denning's article that follows immediately in the print version
[Denning, P. J., 2014: The Profession of IT: Learning for the New Digital Age,
CACM, 57, No. 09, 29-31] offers some additional perspective that's probably
relevant to the issue of the learning curve for new technologies.  That curve
is usually underestimated.  While everyone wants "user friendly" tools, it isn't
easy for developers to get an accurate idea of how many person-hours of work
it will require to make a user proficient enough to use new tools, particularly in the
presence of "version churn" like we can see in the IsaViz example.
Bruce B.


On Thu, Sep 18, 2014 at 2:54 PM, Lavanya Ramakrishnan <lramakrishnan@lbl.gov> wrote:
Here is my 2c -

I think it is important to try and understand what your users are going to
do with workflow and what kind of language they are used to
(domain-specific, functional, etc). There are processes, called user-centered
design processes, that you can use to do this; or, at a minimum, do an
informal study.

A couple of years ago, we did an introspection on why all the existing
workflow tools didn't have the uptake we had assumed they would. I have been
part of a half dozen different tools over my career. We have since launched
a project called Tigres - http://tigres.lbl.gov/ where we have learned a
lot due to using a user-centered design approach. We have an IEEE eScience
paper on our initial work - which you might find interesting. I am also
happy to share more details on Tigres and/or the process.

Lavanya





On Thu, Sep 18, 2014 at 10:53 AM, BW <bwebb@mysoftcloud.com> wrote:

> Is there a list of graphical BEL workflow tools?
>
> On Thursday, September 18, 2014, Mattmann, Chris A (3980)
> <chris.a.mattmann@jpl.nasa.gov> wrote:
>
> > Hi Guys,
> >
> > I've been interested in this too - we don't per se have a specific
> > OODT workflow language, but we specify workflows using XML and
> > other configuration (we are also thinking of moving to JSON for
> > this).
> >
> > In the past I've also looked at YAWL and BPEL - both seem complex
> > to me.
> >
> > I wonder at the end of the day if we should adopt something more
> > modern like PIG or some other data flow type of language (PIG
> > is really neat).
> >
> > Cheers,
> > Chris
> >
> >
> > ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
> > Chris Mattmann, Ph.D.
> > Chief Architect
> > Instrument Software and Science Data Systems Section (398)
> > NASA Jet Propulsion Laboratory Pasadena, CA 91109 USA
> > Office: 168-519, Mailstop: 168-527
> > Email: chris.a.mattmann@nasa.gov
> > WWW:  http://sunset.usc.edu/~mattmann/
> > ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
> > Adjunct Associate Professor, Computer Science Department
> > University of Southern California, Los Angeles, CA 90089 USA
> > ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
> >
> >
> >
> >
> >
> >
> > -----Original Message-----
> > From: Shameera Rathnayaka <shameerainfo@gmail.com>
> > Reply-To: "architecture@airavata.apache.org"
> > <architecture@airavata.apache.org>
> > Date: Thursday, September 18, 2014 8:26 AM
> > To: "architecture@airavata.apache.org" <architecture@airavata.apache.org>,
> > dev <dev@airavata.apache.org>
> > Subject: Evaluate Suitable Scientific Workflow Language for Airavata.
> >
> > >Hi All,
> > >
> > >As we all know, Airavata has its own workflow language called XWF. When XWF
> > >was introduced, the main focus points were interoperability and
> > >convertibility. But with years of experience we are convinced that these
> > >requirements are not really useful when it comes to real-world use cases.
> > >Also, XWF is a bulky XML-based language in which we attach WSDLs and the
> > >workflow image itself. But with the recent changes, the WSDL part is being
> > >removed from XWF.
> > >
> > >It is worth evaluating the scientific workflow languages available in
> > >industry and finding out their pros and cons. At the end of this evaluation
> > >we need to come up with an idea of how we should improve the Airavata
> > >workflow language: either we improve the existing XWF language, change
> > >completely to a new language available in industry, or write a new
> > >lightweight language. The basic requirements we expect from the improvement
> > >are high usability, flexibility, light weight, and real-time monitoring
> > >support. As you can see, these requirements do not come directly from
> > >workflow languages, but we need a workflow language that helps to support
> > >them.
> > >
> > >After reading a few papers and googling, I have initially come up with the
> > >following three existing languages:
> > >1. YAWL <http://www.yawlfoundation.org/>
> > >2. WS-BPEL
> > >3. SIDL
> > ><http://computation.llnl.gov/casc/components/index.html#page=home>
> > >
> > >In my opinion SIDL is more familiar in the scientific domain; Radical-SAGA
> > >also uses a slightly modified version of SIDL. Other than the above three
> > >languages, we could come up with a simple workflow language based on JSON (or
> > >YAML) which supports all our requirements to some extent.
> > >
> > >It would be great if I could get more input regarding the $Subject from the
> > >Airavata community. You are all more than welcome to provide any type of
> > >suggestions.
> > >
> > >Thanks,
> > >Shameera.
> > >
> > >
> > >
> > >--
> > >Best Regards,
> > >Shameera Rathnayaka.
> > >
> > >email: shameera AT apache.org, shameerainfo AT gmail.com
> > >Blog : http://shameerarathnayaka.blogspot.com/
> >
> >
>
