commons-user mailing list archives

From Ted Dunning <ted.dunn...@gmail.com>
Subject Re: proposal: numerical models [physics?]
Date Sun, 19 Apr 2009 18:08:43 GMT
I think that this project is at once too limited and too grandiose.

Too limited because it restricts distribution to the model-building tool plus
data, rather than also allowing models that need no underlying data
(Fahrenheit to Celsius or MKS to CGS, for instance) or documentation
describing methods of derivation and limitations of range and applicability
(i.e. the academic paper where the data was analyzed).

Too grandiose because you could just as well break it down into four (or
more) big projects.  If you do break this down, then I think you will find
that many parts of what you want to address are already done.

So, for instance, there is:

a) data repository a la wiki.  This is being done, and done very well, by
things like manyeyes <http://manyeyes.alphaworks.ibm.com/manyeyes/>,
verifiable <http://verifiable.com/> and swivel <http://www.swivel.com/>.
Combine that with the dictates of modern academic publishing to publish
on-line and to provide the raw data in addition to the primary article and
it is clear that one of these efforts is going to succeed.

b) data visualization for these public data sets.  Manyeyes and verifiable
both provide this and very nicely indeed.

c) data mining / curve fitting.  Commons Math is providing these tools at
one end, and Apache Mahout is providing differently focused versions of this
same sort of thing.

d) a compendium of well founded physical correlations.  This is the only
part of what you are talking about that isn't already readily available and
this might be of some interest, but I don't understand how to attack it.  My
own tendency would be to say that the derivation of the law should not be up
to the reader since that so often leads to serious error.  Instead, there
should be some simple way to export an executable form of a correlation from
more or less academically rigorous publications.  This is because it is
otherwise hard to state the limitations and requirements of these
relationships.
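
As a minimal sketch of what an "executable form of a correlation" could look
like, the snippet below bundles a formula with its source and its range of
applicability, and refuses to extrapolate outside that range.  The
Correlation class and its field names are hypothetical and chosen only for
illustration; the Fahrenheit to Celsius rule is used because it is exact and
needs no underlying data.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class Correlation:
    """Hypothetical container pairing a formula with its stated limits."""
    name: str
    source: str                        # where the rule comes from
    lower: float                       # smallest input the rule is valid for
    upper: float                       # largest input the rule is valid for
    formula: Callable[[float], float]

    def __call__(self, x: float) -> float:
        # Refuse to extrapolate outside the stated range of applicability.
        if not (self.lower <= x <= self.upper):
            raise ValueError(
                f"{self.name}: input {x} is outside [{self.lower}, {self.upper}]"
            )
        return self.formula(x)


# Illustrative instance: an exact conversion with no data set behind it.
fahrenheit_to_celsius = Correlation(
    name="Fahrenheit to Celsius",
    source="exact by definition; no underlying data set",
    lower=-459.67,                     # absolute zero in degrees Fahrenheit
    upper=float("inf"),
    formula=lambda f: (f - 32.0) * 5.0 / 9.0,
)
```

Calling fahrenheit_to_celsius(212.0) evaluates the rule, while an input below
-459.67 raises ValueError instead of silently returning a meaningless number.
An empirical correlation would carry the citation of its paper in the source
field and the fitted range in lower and upper.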

I also worry that you will have a serious long-tail problem very, very
quickly where the interest base of each rule is too small for there to be a
significant audience.  Compounded with inevitable difficulties in finding
the correlation rule you need, the effective social benefit could easily be
essentially nil.  In order to address the long-tail problem and this search
problem, it might be best if there were some way to effectively link from
the original academic publication to an executable form of the conclusions
of the paper.  Then, if you could strike a deal with arXiv or JMLR or PLoS,
you might have something useful.  Approaching manyeyes or verifiable might
also be useful.

Good luck with your idea!

On Sun, Apr 19, 2009 at 8:55 AM, Bear Giles <bgiles@coyotesong.com> wrote:

> Luc Maisonobe wrote:
>
>> Bear Giles wrote:
>>
>>
>>> Hi, I was wondering if there would be interest in numerical models of
>>> physical constants.  For instance, saturation pressure of water vapor in
>>> air at a particular temperature.  It would also be appropriate to
>>> provide a method to get relative humidity from wet and dry bulb
>>> temperatures since it directly relates to this saturation pressure.  All
>>> of the models should be time-invariant, e.g., no historical weather
>>> observations.
>>>
>>>
>>
>> ...
>>
> It's not an all-or-nothing situation where the project is only usable after
> thousands of models exist.  Instead I would see it slowly adding models as
> people 1) discover the tools and 2) scratch their own itch.  There may only
> be 10-20 methods added at a time, but that could be enough to significantly
> enhance the project.  I've worked with meteorological models in the past so
> it's a natural place for me to use as a seed.
>
> Also, I might not have been clear earlier that I'm thinking -solely- of
> curve-fitting observational data, either directly or via simple calculations
> of the same.  As a model, we might have observational data like:
> ...



-- 
Ted Dunning, CTO
DeepDyve
