From: Kim van der Linde
Reply-To: kim@kimvdlinde.com
To: Jakarta Commons Developers List, mark_diggory@harvard.edu
Subject: Re: [math] Fwd: Re: Median in stats
Date: Thu, 12 Aug 2004 14:25:37 -0700 (PDT)
Message-ID: <20040812212537.40303.qmail@web11001.mail.yahoo.com>
In-Reply-To: <411BDBEB.9020305@latte.harvard.edu>
List-Id: "Jakarta Commons Developers List"

I just downloaded it, so if anyone wants a copy, let me know....

Theory and Methods
The Remedian: A Robust Averaging Method for Large Data Sets
Peter J. Rousseeuw; Gilbert W. Bassett, Jr.
Journal of the American Statistical Association, Vol. 85, No. 409. (Mar., 1990), pp. 97-104.
Stable URL: http://links.jstor.org/sici?sici=0162-1459%28199003%2985%3A409%3C97%3ATRARAM%3E2.0.CO%3B2-R

Abstract

It is often assumed that to compute a robust estimator on n data values one needs at least n storage elements (contrary to the sample average, that may be calculated with an updating mechanism). This is one of the main reasons why robust estimators are seldom used for large data sets and why they are not included in most statistical packages. We introduce a new estimator that takes up little storage space, investigate its statistical properties, and provide an example on real-time curve "averaging" in a medical context. The remedian with base b proceeds by computing medians of groups of b observations, and then medians of these medians, until only a single estimate remains. This method merely needs k arrays of size b (where n = b^k), so the total storage is O(log n) for fixed b or, alternatively, O(n^(1/k)) for fixed k. Its storage economy makes it useful for robust estimation in large data bases, for real-time engineering applications in which the data themselves are not stored, and for resistant "averaging" of curves or images. The method is equivariant for monotone transformations. Optimal choices of b with respect to storage and finite-sample breakdown are derived. The remedian is shown to be a consistent estimator of the population median, and it converges at a nonstandard rate to a median-stable distribution.

---------

Cheers,

Kim

--- "Mark R. Diggory" wrote:
> This is about as close as I can get to a citation in JSTOR without
> authentication:
>
> http://links.jstor.org/sici?sici=0162-1459%28199003%2985%3A409%3C97%3ATRARAM%3E2.0.C0%3B2-R
>
> Here's an abstract of it I found via google.
>
> http://www.agoras.ua.ac.be/abstract/Remrob90.htm
> http://www.inomics.com/cgi/repec?handle=RePEc:boc:bocode:R141101
>
> Phil Steitz wrote:
>
> > Posting a reference to the paper would be better ;-)
> >
> > -Phil
> >
> > -----Original Message-----
> > From: Mark R. Diggory [mailto:mdiggory@latte.harvard.edu]
> > Sent: Thu 8/12/2004 1:52 PM
> > To: Jakarta Commons Developers List
> > Cc: David Aubespin
> > Subject: Re: [math] Fwd: Re: Median in stats
> >
> > Sorry about that, JSTOR just gets more and more anal about even
> > searching their contents.
> >
> > I'll send it to anyone privately who wishes to see it. -Mark
> >
> > Phil Steitz wrote:
> >
> > > I agree that this looks interesting. The link below is to an
> > > authenticated site. Can you (David) provide a full reference to
> > > the paper?
> > >
> > > Phil
> > >
> > > -----Original Message-----
> > > From: Mark R. Diggory [mailto:mdiggory@latte.harvard.edu]
> > > Sent: Thu 8/12/2004 1:31 PM
> > > To: David Aubespin; Jakarta Commons Developers List
> > > Cc:
> > > Subject: [math] Fwd: Re: Median in stats
> > >
> > > I think we would be very interested in such an approach. I'll
> > > forward this onto the list so others may know about it.
> > >
> > > Here's one of the authors:
> > >
> > > http://tigger.uic.edu/~gib/vit.htm#Statistics
> > >
> > > Their paper:
> > >
> > > http://www.jstor.org.ezp2.harvard.edu/view/01621459/di985983/98p0207b/0?config=jstor&frame=noframe&userID=80673cc8@harvard.edu/018dd5533b005013e31f0&dpi=3
> > >
> > > David Aubespin wrote:
> > >
> > > > Hi,
> > > >
> > > > I have an implementation of a median computation based on the
> > > > Remedian algorithm (a paper had been published in 1990 about
> > > > it). Its main advantage is that it takes up a fixed amount of
> > > > memory regardless of the data sample size (very useful when
> > > > dealing with large datasets like in astronomy).
> > > >
> > > > I was wondering if I could contribute it to the Math project.
> > > >
> > > > Let me know,
> > > >
> > >
> > > thanks,
> > > Mark
> > >
> > > --
> > > Mark R. Diggory
> > > Software Developer
> > > Harvard MIT Data Center
> > > http://www.hmdc.harvard.edu
> > >
> > > ---------------------------------------------------------------------
> > > To unsubscribe, e-mail: commons-dev-unsubscribe@jakarta.apache.org
> > > For additional commands, e-mail: commons-dev-help@jakarta.apache.org
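
For anyone following the thread without access to the paper: the procedure in the abstract above (medians of groups of b observations, then medians of those medians, using k arrays of size b) could be sketched in Java roughly as follows. This is only an illustration under stated assumptions, not David's implementation and not code from the paper. The class name Remedian and its methods are made up here, b is assumed to be odd so that each group median is a single element, and the sketch only reports a value once exactly b^k observations have been added.

```java
/**
 * Illustrative remedian with base b and k levels (hypothetical class, not
 * taken from the paper or from any Commons Math code).
 */
final class Remedian {
    private final int base;            // b: capacity of every buffer
    private final double[][] buffers;  // k buffers of b values each
    private final int[] fill;          // how many values each buffer holds
    private boolean complete;          // true once b^k values were consumed
    private double estimate;

    Remedian(int base, int levels) {
        // Assumption: b is odd so that each group median is one element.
        if (base < 3 || base % 2 == 0 || levels < 1) {
            throw new IllegalArgumentException("need odd base >= 3 and levels >= 1");
        }
        this.base = base;
        this.buffers = new double[levels][base];
        this.fill = new int[levels];
    }

    /** Feeds one observation; full buffers pass their median one level up. */
    void add(double value) {
        if (complete) {
            throw new IllegalStateException("already consumed b^k values");
        }
        double carry = value;
        for (int level = 0; level < buffers.length; level++) {
            buffers[level][fill[level]++] = carry;
            if (fill[level] < base) {
                return;                 // this buffer still has room
            }
            carry = medianOf(buffers[level]);
            fill[level] = 0;            // reuse the buffer for the next group
        }
        estimate = carry;               // the top buffer filled up: n = b^k
        complete = true;
    }

    /** Returns the remedian; only defined here after exactly b^k values. */
    double result() {
        if (!complete) {
            throw new IllegalStateException("fewer than b^k values so far");
        }
        return estimate;
    }

    // Median of a full buffer of odd length: the middle element after sorting.
    private double medianOf(double[] fullBuffer) {
        double[] sorted = fullBuffer.clone();
        java.util.Arrays.sort(sorted);
        return sorted[sorted.length / 2];
    }
}
```

With b = 3 and k = 2 the sketch keeps two arrays of three doubles while consuming nine observations. Handling sample sizes other than b^k is deliberately left out here, since the abstract does not say how that case is treated.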