couchdb-commits mailing list archives

From Apache Wiki <wikidi...@apache.org>
Subject [Couchdb Wiki] Update of "View_Snippets" by MarcaJames
Date Tue, 27 Oct 2009 18:00:38 GMT
Dear Wiki user,

You have subscribed to a wiki page or wiki category on "Couchdb Wiki" for change notification.

The "View_Snippets" page has been changed by MarcaJames.
The comment on this change is: Clarified why one would *want* to use the complicated algorithm..
http://wiki.apache.org/couchdb/View_Snippets?action=diff&rev1=28&rev2=29

--------------------------------------------------

  <<Anchor(summary_stats)>>
  == Computing simple summary statistics (min,max,mean,standard deviation)  ==
  
- Implementation in {{{JavaScript}}} by MarcaJames.  Mistakes in coding are my fault, algorithms
are from others, as noted.  To the best of my knowledge the algorithms are public domain,
and my implementation freely available to all (Perl Artistic License if you really need a
license to consult)
+ This implementation of standard deviation is more complex than the algorithm above, which
Chan, Golub, and LeVeque call the "textbook one-pass algorithm".  Although the textbook algorithm
is mathematically equivalent to the standard two-pass computation of standard deviation, it can
be numerically unstable under certain conditions.  Specifically, if the square of the sum and
the sum of the squares are both large, then each is computed with some rounding error.  If the
variance of the data set is small, then subtracting those two large, slightly rounded numbers
can wipe out the variance entirely.  See http://www.jstor.org/stable/2683386,
http://people.xiph.org/~tterribe/notes/homs.html, and the Wikipedia description of Knuth's
algorithm at http://en.wikipedia.org/wiki/Algorithms_for_calculating_variance.
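
The cancellation problem described above can be reproduced with a small sketch.  This is not the wiki page's code: the function names and the data set are hypothetical, chosen only so that the values are large while their spread is small.  The true sample variance of the data is 30.

```javascript
// Textbook one-pass variance: (sum of squares - (sum)^2 / n) / (n - 1).
// Both terms are huge when the values are large, so their difference
// can be dominated by rounding error.
function textbookVariance(values) {
  var n = values.length, sum = 0, sumSq = 0;
  for (var i = 0; i < n; i++) {
    sum += values[i];
    sumSq += values[i] * values[i];
  }
  return (sumSq - (sum * sum) / n) / (n - 1);
}

// Standard two-pass variance: compute the mean first, then sum the
// squared deviations from it.  The deviations are small, so nothing
// large is ever subtracted from something large.
function twoPassVariance(values) {
  var n = values.length, sum = 0, i;
  for (i = 0; i < n; i++) {
    sum += values[i];
  }
  var mean = sum / n, sumSqDev = 0;
  for (i = 0; i < n; i++) {
    var d = values[i] - mean;
    sumSqDev += d * d;
  }
  return sumSqDev / (n - 1);
}

// Hypothetical data: a large offset (1e9) with a small spread.
var data = [1e9 + 4, 1e9 + 7, 1e9 + 13, 1e9 + 16];

var textbook = textbookVariance(data);
var twoPass = twoPassVariance(data);
```

With this data the two-pass result is exact, while the textbook result is far from 30 because the squares exceed the range in which double-precision numbers can represent every integer.  A reduce function cannot make two passes over its input, which is why a more careful one-pass update is worth the extra complexity.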
  
- Here is some code I have developed to compute standard deviation.  I do it two ways, both
of which are different from jchris' github version (add link?).  In practice of course you
wouldn't need both ways.  The view is specialized to my dataset, but the reduce function might
be useful to others.
+ The implementation below, in {{{JavaScript}}}, is by MarcaJames.  Any mistakes in the js coding
are my fault.  The algorithms are from others (all smarter than I), as noted in the comments
in the code.  To the best of my knowledge the algorithms are public domain, and my implementation
is freely available to all.
  
- I've only ever tested it on futon, and have no idea what the "group" parameter does to the
output.  Probably nothing!
+ Note that the view is specialized to my dataset, but the reduce function is written to be
fairly generic.  I kept the view as is because I'm too lazy to write up a generic view, and
also because when I wrote it I wasn't sure one could use Date, Math, and Reg``Exp in Couch``DB
Java``Script.  
  
  {{{
  // Map function
