airavata-dev mailing list archives

From "Mattmann, Chris A (388J)" <chris.a.mattm...@jpl.nasa.gov>
Subject Re: Big data challenges in Airavata
Date Tue, 25 Sep 2012 04:34:47 GMT
Hi Danushka,

Thanks for your email and glad that you are interested in the SKA work. You can find some
of the information that we've been working on with the SKA in some research papers that
I've written on the subject. Let me see about putting them online on my USC website
and then sending a link back to the list.

I would love to see an integration coming out of this that brings together OODT and Airavata.
I am going to reply with more specifics as soon as I can get a little more time to respond
in full. In the meantime, let me get those papers up so we can talk more.

Thanks!

Cheers,
Chris

On Sep 24, 2012, at 5:10 PM, Danushka Menikkumbura wrote:

> Hi all,
> 
> I am a student of 2012 M.Sc.(CS) batch of University of Moratuwa, Sri
> Lanka. Big data is one of the areas that I research and I am currently
> looking into possibilities and challenges in bringing in big data
> capabilities to science gateways under the supervision of Dr. Shahani
> Weerawarana. With the knowledge that I have gathered so far, I understand
> that Airavata lacks strength in this area.
> 
> Basically, support for big data in Airavata could take different forms.
> 
> 1. Simply make big data techniques available during workflow execution.
> This could be in the form of MapReduce (Hadoop), BigTable data models
> (Cassandra), etc. The idea is to handle huge data volumes as mentioned in
> [1] (e.g. the 700 TB/sec data flood off the SKA [2] in the near future).
> 
> 2. Use a big-data-ready distributed filesystem (e.g. HDFS) as the core
> filesystem of Airavata and make it available across the framework.
> 
> 3. Challenges related to data provenance [3], [4].
> 
> I believe you see things better when you look at Airavata from these
> perspectives and maybe you have already put thoughts into these aspects.
> 
> Please share your thoughts and help me understand what I should actually
> look into.
> 
> [1] - http://www.slideshare.net/Hadoop_Summit/big-data-challenges-at-nasa
> [2] - http://en.wikipedia.org/wiki/Square_Kilometre_Array
> [3] - http://rac.uits.iu.edu/sites/default/files/SimmhanICWS06.pdf
> [4] - http://bit.ly/PC2Eq4
> 
> Thanks,
> Danushka


++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
Chris Mattmann, Ph.D.
Senior Computer Scientist
NASA Jet Propulsion Laboratory Pasadena, CA 91109 USA
Office: 171-266B, Mailstop: 171-246
Email: chris.a.mattmann@nasa.gov
WWW:   http://sunset.usc.edu/~mattmann/
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
Adjunct Assistant Professor, Computer Science Department
University of Southern California, Los Angeles, CA 90089 USA
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++

