incubator-general mailing list archives

From "Mattmann, Chris A (3980)" <chris.a.mattm...@jpl.nasa.gov>
Subject Re: [PROPOSAL] Climate Model Diagnostic Analyzer
Date Mon, 06 Apr 2015 17:53:31 GMT
Awesome, please add yourself as a mentor. I’d appreciate it!

++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
Chris Mattmann, Ph.D.
Chief Architect
Instrument Software and Science Data Systems Section (398)
NASA Jet Propulsion Laboratory Pasadena, CA 91109 USA
Office: 168-519, Mailstop: 168-527
Email: chris.a.mattmann@nasa.gov
WWW:  http://sunset.usc.edu/~mattmann/
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
Adjunct Associate Professor, Computer Science Department
University of Southern California, Los Angeles, CA 90089 USA
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++






-----Original Message-----
From: James Carman <james@carmanconsulting.com>
Reply-To: "general@incubator.apache.org" <general@incubator.apache.org>
Date: Monday, April 6, 2015 at 10:05 AM
To: "general@incubator.apache.org" <general@incubator.apache.org>
Subject: Re: [PROPOSAL] Climate Model Diagnostic Analyzer

>I would love to help out.  I don't know much about the problem domain,
>but I am a "sciency" kind of guy.
>
>
>On Mon, Apr 6, 2015 at 12:30 PM, Mattmann, Chris A (3980)
><chris.a.mattmann@jpl.nasa.gov> wrote:
>> :) you volunteering as a mentor? Could use your help!
>>
>> Sent from my iPhone
>>
>>> On Apr 6, 2015, at 9:18 AM, James Carman <james@carmanconsulting.com> wrote:
>>>
>>> Apache Camdan?
>>>
>>> On Monday, March 23, 2015, Mattmann, Chris A (3980) <
>>> chris.a.mattmann@jpl.nasa.gov> wrote:
>>>
>>>> Hi Everyone,
>>>>
>>>> I am pleased to submit for consideration to the Apache Incubator
>>>> the Climate Model Diagnostic Analyzer proposal. We are actively
>>>> soliciting interested mentors in this project related to climate
>>>> science and analytics and big data.
>>>>
>>>> Please find the wiki text of the proposal below and the link up
>>>> on the wiki here:
>>>>
>>>> https://wiki.apache.org/incubator/ClimateModelDiagnosticAnalyzerProposal
>>>>
>>>> Thank you for your consideration!
>>>>
>>>> Cheers,
>>>> Chris
>>>> (on behalf of the Climate Model Diagnostic Analyzer community)
>>>>
>>>> = Apache ClimateModelDiagnosticAnalyzer Proposal =
>>>>
>>>> == Abstract ==
>>>>
>>>> The Climate Model Diagnostic Analyzer (CMDA) provides web services for
>>>> multi-aspect physics-based and phenomenon-oriented climate model
>>>> performance evaluation and diagnosis through the comprehensive and
>>>> synergistic use of multiple observational data, reanalysis data, and
>>>> model outputs.
>>>>
>>>> == Proposal ==
>>>>
>>>> The proposed web-based tools let users display, analyze, and download
>>>> earth science data interactively. These tools help scientists quickly
>>>> examine data to identify specific features (e.g., trends and
>>>> geographical distributions) and determine whether further study is
>>>> needed. All of the tools are designed and implemented to be general so
>>>> that data from models, observations, and reanalysis are processed and
>>>> displayed in a unified way to facilitate fair comparisons. The services
>>>> prepare and display data as a colored map or an X-Y plot and allow
>>>> users to download the analyzed data. Basic visual capabilities include
>>>> 1) displaying a two-dimensional variable as a map, zonal mean, or time
>>>> series, and 2) displaying a three-dimensional variable as a zonal mean,
>>>> a two-dimensional slice at a specific altitude, or a vertical profile.
>>>> General analysis can be done using the difference, scatter plot, and
>>>> conditional sampling services. All the tools support linear or
>>>> logarithmic display scales and allow users to specify a temporal range
>>>> and months in a year. The source/input datasets for these tools are
>>>> CMIP5 model outputs, Obs4MIP observational datasets, and ECMWF
>>>> reanalysis datasets. They are stored on the server and are selectable
>>>> by a user through the web services.
>>>>
>>>> === Service descriptions ===
>>>>
>>>> 1. '''Two dimensional variable services'''
>>>>
>>>> * Map of two-dimensional variable:  This service displays a
>>>> two-dimensional variable as a colored longitude-latitude map with
>>>> values represented by a color scheme. Longitude and latitude ranges can
>>>> be specified to magnify a specific region.
>>>>
>>>> * Two-dimensional variable zonal mean:  This service plots the zonal
>>>> mean value of a two-dimensional variable as a function of latitude as
>>>> an X-Y plot.
>>>>
>>>> * Two-dimensional variable time series:  This service displays the
>>>> average of a two-dimensional variable over the specified region as a
>>>> function of time as an X-Y plot.
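
The zonal-mean and time-series services described above can be sketched roughly as follows. This is only an illustration, not CMDA's actual implementation: the function names, the (time, lat, lon) array layout, and the cosine-of-latitude area weighting for the regional average are all assumptions made for the example.

```python
import numpy as np


def zonal_mean(field):
    """Average a (lat, lon) field over longitude, ignoring NaNs.

    Returns one value per latitude, suitable for an X-Y plot.
    """
    return np.nanmean(field, axis=-1)


def regional_mean_series(data, lats, lons, lat_range, lon_range):
    """Mean of (time, lat, lon) data over a lat/lon box, one value per time.

    Assumption: grid cells are weighted by cos(latitude) to approximate
    their relative area on a regular latitude-longitude grid.
    """
    lat_mask = (lats >= lat_range[0]) & (lats <= lat_range[1])
    lon_mask = (lons >= lon_range[0]) & (lons <= lon_range[1])
    sub = data[:, lat_mask][:, :, lon_mask]
    # Weight array of shape (n_lat_selected, n_lon_selected).
    weights = np.cos(np.deg2rad(lats[lat_mask]))[:, None] * np.ones(lon_mask.sum())
    return (sub * weights).sum(axis=(1, 2)) / weights.sum()


if __name__ == "__main__":
    # Synthetic data only; no real CMIP5 or observational values are used.
    lats = np.linspace(-88.75, 88.75, 72)
    lons = np.linspace(1.25, 358.75, 144)
    rng = np.random.default_rng(0)
    data = rng.normal(size=(12, lats.size, lons.size))
    print(zonal_mean(data[0]).shape)
    print(regional_mean_series(data, lats, lons, (-30, 30), (120, 280)).shape)
```

The regional average here does not handle missing values; a real service would also need to mask them before weighting.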
>>>>
>>>> 2. '''Three dimensional variable services'''
>>>>
>>>> * Map of a two-dimensional slice of a three-dimensional variable:
>>>> This service displays a two-dimensional slice of a three-dimensional
>>>> variable at a specific altitude as a colored longitude-latitude map
>>>> with values represented by a color scheme.
>>>>
>>>> * Three-dimensional zonal mean:  This service computes the zonal mean
>>>> of the specified three-dimensional variable and displays it as a
>>>> colored altitude-latitude map.
>>>>
>>>> * Vertical profile of a three-dimensional variable:  This service
>>>> computes the area-weighted average of a three-dimensional variable
>>>> over the specified region and displays the average as a function of
>>>> pressure level (altitude) as an X-Y plot.
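
A minimal sketch of the vertical-profile computation described above, assuming a (level, lat, lon) array on a regular latitude-longitude grid. The proposal only says "area weighted"; using cos(latitude) as the weight and skipping NaN cells are assumptions of this example, and `vertical_profile` is a hypothetical name.

```python
import numpy as np


def vertical_profile(data, lats, lons, lat_range, lon_range):
    """Area-weighted regional mean of (level, lat, lon) data per level.

    Assumptions: cos(latitude) approximates relative cell area, and NaN
    cells (missing data) are excluded from both numerator and denominator.
    """
    lat_mask = (lats >= lat_range[0]) & (lats <= lat_range[1])
    lon_mask = (lons >= lon_range[0]) & (lons <= lon_range[1])
    sub = data[:, lat_mask][:, :, lon_mask]
    # Broadcast the per-latitude weight to the full (level, lat, lon) shape.
    w = np.broadcast_to(
        np.cos(np.deg2rad(lats[lat_mask]))[None, :, None], sub.shape
    )
    valid = np.isfinite(sub)
    num = np.where(valid, sub * w, 0.0).sum(axis=(1, 2))
    den = np.where(valid, w, 0.0).sum(axis=(1, 2))
    # A level with no valid cells yields NaN rather than raising.
    with np.errstate(invalid="ignore", divide="ignore"):
        return num / den
```

The returned array has one value per pressure level, so it can be plotted directly against the level coordinate as an X-Y plot.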
>>>>
>>>> 3. '''General services'''
>>>>
>>>> * Difference of two variables:  This service displays the difference
>>>> between two variables, each of which can be either a two-dimensional
>>>> variable or a slice of a three-dimensional variable at a specified
>>>> altitude, as colored longitude-latitude maps.
>>>>
>>>> * Scatter and histogram plots of two variables:  This service displays
>>>> the scatter plot (X-Y plot) of two specified variables and the
>>>> histograms of the two variables. The number of samples can be
>>>> specified, and the correlation is computed. Each of the two variables
>>>> can be either a two-dimensional variable or a slice of a
>>>> three-dimensional variable at a specific altitude.
>>>>
>>>> * Conditional sampling:  This service lets the user sort a physical
>>>> quantity of two or three dimensions according to the values of another
>>>> variable (an environmental condition, e.g., SST), which may be a
>>>> two-dimensional variable or a slice of a three-dimensional variable at
>>>> a specific altitude. For a two-dimensional quantity, the result is
>>>> displayed as an X-Y plot, and for a three-dimensional quantity, it is
>>>> displayed as a colored map.
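
As a rough illustration of the scatter/correlation and conditional-sampling services above, the sketch below bins one field by the values of another and computes a sampled Pearson correlation. The function names, the choice of bin edges, and the use of a per-bin mean as the sorted result are assumptions for the example, not a description of how CMDA implements these services.

```python
import numpy as np


def conditional_sample(quantity, condition, bin_edges):
    """Mean of `quantity` within each bin of `condition`.

    Both inputs are co-located arrays of the same shape (for example two
    lat/lon maps). Returns (bin_means, bin_counts); empty bins give NaN.
    """
    q = np.ravel(quantity).astype(float)
    c = np.ravel(condition).astype(float)
    ok = np.isfinite(q) & np.isfinite(c)
    q, c = q[ok], c[ok]
    counts, _ = np.histogram(c, bins=bin_edges)
    sums, _ = np.histogram(c, bins=bin_edges, weights=q)
    with np.errstate(invalid="ignore", divide="ignore"):
        means = sums / counts
    return means, counts


def sampled_correlation(x, y, n_samples, seed=0):
    """Pearson correlation of x and y on a random subsample of valid points."""
    x = np.ravel(x).astype(float)
    y = np.ravel(y).astype(float)
    ok = np.isfinite(x) & np.isfinite(y)
    x, y = x[ok], y[ok]
    rng = np.random.default_rng(seed)
    idx = rng.choice(x.size, size=min(n_samples, x.size), replace=False)
    return np.corrcoef(x[idx], y[idx])[0, 1]
```

For a two-dimensional quantity, plotting the bin means against the bin centers gives the X-Y plot the service describes; a three-dimensional quantity could be handled by applying the same binning level by level.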
>>>>
>>>>
>>>> == Background and Rationale ==
>>>>
>>>> The Intergovernmental Panel on Climate Change (IPCC) Fourth Assessment
>>>> Report stressed the need for the comprehensive and innovative
>>>> evaluation of climate models with newly available global observations.
>>>> The traditional approach to climate model evaluation, which is the
>>>> comparison of a single parameter at a time, identifies symptomatic
>>>> model biases and errors but fails to diagnose the model problems. The
>>>> model diagnosis process requires physics-based multi-variable
>>>> comparisons, which typically involve large-volume and heterogeneous
>>>> datasets, and computationally demanding and data-intensive operations.
>>>> We propose to develop a computationally efficient information system
>>>> to enable physics-based multi-variable model performance evaluations
>>>> and diagnoses through the comprehensive and synergistic use of
>>>> multiple observational data, reanalysis data, and model outputs.
>>>>
>>>> Satellite observations have been widely used in model-data
>>>> inter-comparisons and model evaluation studies. These studies normally
>>>> involve the comparison of a single parameter at a time using a time
>>>> and space average. For example, modeling cloud-related processes in
>>>> global climate models requires cloud parameterizations that provide
>>>> quantitative rules for expressing the location, frequency of
>>>> occurrence, and intensity of the clouds in terms of multiple
>>>> large-scale model-resolved parameters such as temperature, pressure,
>>>> humidity, and wind. One can evaluate the performance of the cloud
>>>> parameterization by comparing the cloud water content with satellite
>>>> data and can identify symptomatic model biases or errors. However, in
>>>> order to understand the cause of the biases and errors, one has to
>>>> simultaneously investigate several parameters that are integrated in
>>>> the cloud parameterization.
>>>>
>>>> Such studies, aimed at a multi-parameter model diagnosis, require
>>>> locating, understanding, and manipulating multi-source observation
>>>> datasets, model outputs, and (re)analysis outputs that are physically
>>>> distributed, massive in volume, heterogeneous in format, and provide
>>>> little information on data quality and production legacy.
>>>> Additionally, these studies involve various data preparation and
>>>> processing steps that can easily become computationally demanding
>>>> since many datasets have to be combined and processed simultaneously.
>>>> Scientists notoriously spend more than 60% of their research time just
>>>> preparing datasets before they can be analyzed.
>>>>
>>>> To address these challenges, we propose to build the Climate Model
>>>> Diagnostic Analyzer (CMDA), which will enable a streamlined and
>>>> structured preparation of multiple large-volume and heterogeneous
>>>> datasets, and provide a computationally efficient approach to
>>>> processing the datasets for model diagnosis. We will leverage the
>>>> existing information technologies and scientific tools that we
>>>> developed in our current NASA ROSES COUND, MAP, and AIST projects. We
>>>> will use open-source web-service technology. We will make CMDA
>>>> complementary to other climate model analysis tools currently
>>>> available to the research community (e.g., PCMDI’s CDAT and NCAR’s
>>>> CCMVal) by focusing on missing capabilities such as conditional
>>>> sampling, probability distribution functions, and cluster analysis of
>>>> multiple-instrument datasets. Users will be able to use a web browser
>>>> to interface with CMDA.
>>>>
>>>> == Current Status ==
>>>>
>>>> The current version of ClimateModelDiagnosticAnalyzer was developed by
>>>> a team at the Jet Propulsion Laboratory (JPL). The project was
>>>> initiated as a NASA-sponsored project (ROSES-CMAC) in 2011.
>>>>
>>>> == Meritocracy ==
>>>>
>>>> The current developers are not familiar with meritocratic open source
>>>> development at Apache, but would like to encourage this style of
>>>> development for the project.
>>>>
>>>> == Community ==
>>>>
>>>> While ClimateModelDiagnosticAnalyzer started as a JPL research
>>>> project, it has been used in the 2014 Caltech Summer School sponsored
>>>> by the JPL Center for Climate Sciences. Some 23 students from
>>>> different institutions around the world participated. We deployed the
>>>> tool to the Amazon Cloud and gave each student his or her own virtual
>>>> machine. Students gave positive feedback, mostly on the usability and
>>>> speed of our web services. We also collected a number of enhancement
>>>> requests. We seek to further grow the developer and user communities
>>>> using the Apache open source venue. During incubation we will
>>>> explicitly seek increased academic collaborations (e.g., with Carnegie
>>>> Mellon University) as well as industrial participation.
>>>>
>>>> One instance of our web services can be found at:
>>>> http://cmacws.jpl.nasa.gov:8080/cmac/
>>>>
>>>> == Core Developers ==
>>>>
>>>> The core developers of the project are JPL scientists and software
>>>> developers.
>>>>
>>>> == Alignment ==
>>>>
>>>> Apache is the most natural home for taking the
>>>> ClimateModelDiagnosticAnalyzer project forward. It is well-aligned with
>>>> some Apache projects such as Apache Open Climate Workbench.
>>>> ClimateModelDiagnosticAnalyzer also seeks to achieve an Apache-style
>>>> development model; it is seeking a broader community of contributors
>>>> and users in order to achieve its full potential and value to the
>>>> Climate Science and Big Data community.
>>>>
>>>> There are also a number of dependencies that will be mentioned below in
>>>> the Relationships with Other Apache Products section.
>>>>
>>>>
>>>> == Known Risks ==
>>>>
>>>> === Orphaned products ===
>>>>
>>>> Given the current level of intellectual investment in
>>>> ClimateModelDiagnosticAnalyzer, the risk of the project being
>>>> abandoned is very small. Carnegie Mellon University and JPL are
>>>> collaborating (2014-2015) to build a service for climate analytics
>>>> workflow recommendation using funding from NASA. A two-year NASA AIST
>>>> project (2015-2016) will soon start to add diagnostic analysis
>>>> methodologies such as the conditional sampling method, conditional
>>>> probability density functions, data co-location, and random forests.
>>>> We will also infuse provenance technology into CMDA so that the
>>>> history of the data products and workflows will be automatically
>>>> collected and saved. This information will also be indexed so that the
>>>> products and workflows are searchable by the community of climate
>>>> scientists and students.
>>>>
>>>> === Inexperience with Open Source ===
>>>>
>>>> The current developers of ClimateModelDiagnosticAnalyzer are
>>>> inexperienced with Open Source. However, our Champion Chris Mattmann
>>>> is experienced (Champion of Apache Open Climate Workbench and
>>>> AsterixDB) and will be working closely with us, also as the Chief
>>>> Architect of our JPL section.
>>>>
>>>> === Relationships with Other Apache Products ===
>>>>
>>>> Clearly there is a direct relationship between this project and
>>>> Apache Open Climate Workbench, which is already a top-level Apache
>>>> project and was also brought to the ASF by its Champion (and ours),
>>>> Chris Mattmann. We plan on directly collaborating with the Open
>>>> Climate Workbench community via our Champion, and we also welcome ASF
>>>> mentors familiar with the OCW project to help mentor our project. In
>>>> addition, our team is extremely welcoming of ASF projects, and if
>>>> there are synergies with them we invite participation in the proposal
>>>> and in the discussion.
>>>>
>>>> === Homogeneous Developers ===
>>>>
>>>> The current community is within JPL but we would like to increase the
>>>> heterogeneity.
>>>>
>>>> === Reliance on Salaried Developers ===
>>>>
>>>> The initial committers are full-time JPL staff from 2013 to 2014. The
>>>> other committers from 2014 to 2015 are a mix of CMU faculty, students,
>>>> and JPL staff.
>>>>
>>>> === An Excessive Fascination with the Apache Brand ===
>>>>
>>>> We believe in the processes, systems, and framework Apache has put in
>>>> place. Apache is also known to foster a great community around its
>>>> projects and provide exposure. While brand is important, our
>>>> fascination with it is not excessive. We believe that the ASF is the
>>>> right home for ClimateModelDiagnosticAnalyzer and that having
>>>> ClimateModelDiagnosticAnalyzer inside of the ASF will lead to a better
>>>> long-term outcome for the Climate Science and Big Data community.
>>>>
>>>> === Documentation ===
>>>>
>>>> The ClimateModelDiagnosticAnalyzer services and documentation can be
>>>> found at: http://cmacws.jpl.nasa.gov:8080/cmac/.
>>>>
>>>> === Initial Source ===
>>>>
>>>> Current source resides in ...
>>>>
>>>> === External Dependencies ===
>>>>
>>>> ClimateModelDiagnosticAnalyzer depends on a number of open source
>>>> projects:
>>>>
>>>> * Flask
>>>> * Gunicorn
>>>> * Tornado Web Server
>>>> * GNU Octave
>>>> * EPD Python
>>>> * NOAA Ferret
>>>> * gnuplot
>>>> == Required Resources ==
>>>>
>>>> === Developer and user mailing lists ===
>>>>
>>>> * private@cmda.incubator.apache.org (with moderated subscriptions)
>>>> * commits@cmda.incubator.apache.org
>>>> * dev@cmda.incubator.apache.org
>>>> * users@cmda.incubator.apache.org
>>>>
>>>> A git repository
>>>>
>>>> https://git-wip-us.apache.org/repos/asf/incubator-cmda.git
>>>>
>>>> A JIRA issue tracker
>>>>
>>>> https://issues.apache.org/jira/browse/CMDA
>>>>
>>>> === Initial Committers ===
>>>>
>>>> The following is a list of the planned initial Apache committers (the
>>>> active subset of the committers for the current repository at Google
>>>> Code).
>>>>
>>>> * Seungwon Lee (seungwon.lee@jpl.nasa.gov)
>>>> * Lei Pan (lei.pan@jpl.nasa.gov)
>>>> * Chengxing Zhai (chengxing.zhai@jpl.nasa.gov)
>>>> * Benyang Tang (benyang.tang@jpl.nasa.gov)
>>>>
>>>>
>>>> === Affiliations ===
>>>>
>>>> JPL
>>>>
>>>> * Seungwon Lee
>>>> * Lei Pan
>>>> * Chengxing Zhai
>>>> * Benyang Tang
>>>>
>>>> CMU
>>>>
>>>> * Jia Zhang
>>>> * Wei Wang
>>>> * Chris Lee
>>>> * Xing Wei
>>>>
>>>> == Sponsors ==
>>>>
>>>> NASA
>>>>
>>>> === Champion ===
>>>>
>>>> Chris Mattmann (NASA/JPL)
>>>>
>>>> === Nominated Mentors ===
>>>>
>>>> TBD
>>>>
>>>> === Sponsoring Entity ===
>>>>
>>>> The Apache Incubator
>>>>
>>>>
>>>> ---------------------------------------------------------------------
>>>> To unsubscribe, e-mail: general-unsubscribe@incubator.apache.org
>>>> For additional commands, e-mail: general-help@incubator.apache.org
>>>>


---------------------------------------------------------------------
To unsubscribe, e-mail: general-unsubscribe@incubator.apache.org
For additional commands, e-mail: general-help@incubator.apache.org