From: "Mattmann, Chris A (388J)"
Reply-To: dev@airavata.apache.org
Subject: Re: Big data challenges in Airavata
Date: Tue, 25 Sep 2012 04:34:47 +0000

Hi Danushka,

Thanks for your email and glad that you are interested in the SKA work. You
can find some of the information that we've been working on with the SKA in
some research papers that I've written on the subject. Let me see about
putting them online on my USC website and then sending a link back to the list.

I would love to see an integration coming out of this that brings together
OODT and Airavata.

I am going to reply with some more specifics as soon as I can get a little
more time to respond in full. In the meantime, let me get those papers up so
we can talk more.

Thanks!

Cheers,
Chris

On Sep 24, 2012, at 5:10 PM, Danushka Menikkumbura wrote:

> Hi all,
>
> I am a student of 2012 M.Sc.(CS) batch of University of Moratuwa, Sri
> Lanka. Big data is one of the areas that I research and I am currently
> looking into possibilities and challenges in bringing in big data
> capabilities to science gateways under the supervision of Dr. Shahani
> Weerawarana. With the knowledge that I have gathered so far, I understand
> that Airavata lacks strength in this area.
>
> Basically, support for big data in Airavata could take different shapes.
>
> 1. Simply make big data techniques available during workflow execution.
> This could be in the form of MapReduce (Hadoop), BigTable data models
> (Cassandra), etc. The idea is to handle huge data volumes as mentioned in
> [1]. (e.g. 700 TB/sec data flood off the SKA [2] in the near future).
>
> 2. Using a big-data-ready distributed filesystem as the core filesystem of
> Airavata (e.g. HDFS) and make it available across the framework.
>
> 3. Challenges related to data provenance [3], [4].
>
> I believe you see things better when you look at Airavata from these
> perspectives and maybe you have already put thoughts into these aspects.
>
> Please share your thoughts and help me understand what I should actually
> look into.
>
> [1] - http://www.slideshare.net/Hadoop_Summit/big-data-challenges-at-nasa
> [2] - http://en.wikipedia.org/wiki/Square_Kilometre_Array
> [3] - http://rac.uits.iu.edu/sites/default/files/SimmhanICWS06.pdf
> [4] - http://bit.ly/PC2Eq4
>
> Thanks,
> Danushka

++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
Chris Mattmann, Ph.D.
Senior Computer Scientist
NASA Jet Propulsion Laboratory Pasadena, CA 91109 USA
Office: 171-266B, Mailstop: 171-246
Email: chris.a.mattmann@nasa.gov
WWW: http://sunset.usc.edu/~mattmann/
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
Adjunct Assistant Professor, Computer Science Department
University of Southern California, Los Angeles, CA 90089 USA
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++