Message-ID: <4E1E6300.2030209@gmail.com>
Date: Wed, 13 Jul 2011 20:31:12 -0700
From: Phil Steitz
To: Commons Developers List
Subject: Re: [math] Refactoring multiple regression classes

On 7/13/11 7:14 PM, Greg Sterijevski wrote:
> Phil,
>
> "How exactly do interfaces make the hierarchy flatter in this case?
> I agree we should aim for as simple a structure as possible. The
> question is, what is that structure?"
>
> They may or may not make the structure different. Any design we come up with
> today is likely to be outmoded in 6 months. (In war throw your battle plans
> out the window after the first five minutes.) What I propose is an interface
> which is the most minimal set of functionality (identifiable now) that
> comprise regression. Over time, as we define more and more implementations
> of regression we might see further functionality which is common across
> regressions. These methods will migrate to the interface. The interface will
> grow organically. More importantly any dependency which is not too picky can
> use the interface reference, instead of referencing the concrete class.
> Dependencies which care, will and should have intimate knowledge of the
> class. Most pieces of code which depend on regression will not. The
> interface will not preclude abstract classes.

Fortunately for users, maybe less fortunately for developers, we
can't really "evolve" our API rapidly and incrementally, unless that
evolution avoids backward-incompatible change.
The reason for this is that we combine bug fixes and API changes in
point releases and users need to be able to upgrade to point releases
without having to make code changes. We make incompatible changes in
major releases only. The good news is that we are in the run-up right
now to a major release of [math], so we have a once-every-few-years
opportunity to make incompatible changes. The maybe less wonderful
news is that what we design for 3.0 we will need to live with for a
couple of years, so we need to be careful not to lock ourselves into
design constraints that will be hard to innovate within. This is why
we favor abstract classes over interfaces.

>
> The way I see it, you would have a core interface:
>
> public interface RegressionIface{
>     boolean hasIntercept();
>     long getN();
>     void addObservation(double[] x, double y);
>     void addObservation(double[] xy);
>     RegressionResults regress();
>     RegressionResults regress(int[] vars);
> }
>
> You would then have a subinterface
> public interface UpdatingRegression extends RegressionIface{
>     void clear();
>     void addObservations( double[][] x, double[] y);
> }

I thought about that model, but the "fixed model" versions may not
need to or want to support the "addAll" semantics - just setData. I
was thinking that addObservations above would be included in the
base, since it could always be implemented serially.

> Why should code which is running a regression need to know more than this?
> If for example, the QR regression and the SVD based regression share common
> functionality for manipulating the data incore, then they can inherit from
> an abstract base class which implements RegressionIface. The user in most
> cases will not care. He/she may care whether the data is incore or not, but
> that's about it.

Exactly, which is why I like your design at the top level.

>
> The real action, in my opinion, is in the RegressionResults class. Here you
> might need a bushy, thick tree. All regressions must generate an immutable
> RegressionResults.
> However, that is the minimum info that would be
> generated. We might, for example, have ConstrainedRegressionResults.
>
> public class ConstrainedRegressionResults extends RegressionResults{
>     private double[] lagrangian;
>
>
> }

Agree here again. RegressionResults should include only the basic
stuff that every model will include, and subclasses will extend it.
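
For illustration only, here is a minimal sketch of how the points agreed on in this exchange might fit together: an abstract base class (rather than an interface) whose bulk addObservations is implemented serially on top of addObservation, an immutable RegressionResults, and a subclass carrying extra model-specific output. Every name other than those quoted in the thread (AbstractRegression, SimpleUpdatingRegression, getParameters, getLagrangian) is hypothetical, the one-regressor least-squares fit is just a stand-in to keep the example short, and none of this is the actual [math] API.

```java
// Hypothetical sketch; these are not the real Commons Math classes.
public class RegressionSketch {

    /** Immutable basic results; assumed here to hold only the fitted parameters. */
    public static class RegressionResults {
        private final double[] parameters;

        public RegressionResults(double[] parameters) {
            this.parameters = parameters.clone(); // defensive copy keeps the object immutable
        }

        public double[] getParameters() {
            return parameters.clone();
        }
    }

    /** Model-specific extension, following the ConstrainedRegressionResults idea. */
    public static class ConstrainedRegressionResults extends RegressionResults {
        private final double[] lagrangian;

        public ConstrainedRegressionResults(double[] parameters, double[] lagrangian) {
            super(parameters);
            this.lagrangian = lagrangian.clone();
        }

        public double[] getLagrangian() {
            return lagrangian.clone();
        }
    }

    /**
     * Abstract base instead of an interface: a concrete method can be added
     * here in a later point release without breaking existing subclasses.
     */
    public abstract static class AbstractRegression {
        public abstract boolean hasIntercept();
        public abstract long getN();
        public abstract void addObservation(double[] x, double y);
        public abstract RegressionResults regress();

        /** Bulk add, implemented serially in the base class. */
        public void addObservations(double[][] x, double[] y) {
            if (x.length != y.length) {
                throw new IllegalArgumentException("x and y must have the same length");
            }
            for (int i = 0; i < x.length; i++) {
                addObservation(x[i], y[i]);
            }
        }
    }

    /** Toy updating model: one regressor plus intercept, fitted from running sums. */
    public static class SimpleUpdatingRegression extends AbstractRegression {
        private long n;
        private double sumX, sumY, sumXX, sumXY;

        @Override public boolean hasIntercept() { return true; }
        @Override public long getN() { return n; }

        @Override public void addObservation(double[] x, double y) {
            double xi = x[0]; // only the first regressor is used in this toy model
            n++;
            sumX += xi;
            sumY += y;
            sumXX += xi * xi;
            sumXY += xi * y;
        }

        @Override public RegressionResults regress() {
            double denom = n * sumXX - sumX * sumX;
            if (n < 2 || denom == 0.0) {
                throw new IllegalStateException("not enough distinct observations");
            }
            double slope = (n * sumXY - sumX * sumY) / denom;
            double intercept = (sumY - slope * sumX) / n;
            return new RegressionResults(new double[] {intercept, slope});
        }
    }

    public static void main(String[] args) {
        AbstractRegression reg = new SimpleUpdatingRegression();
        reg.addObservations(new double[][] {{0}, {1}, {2}, {3}}, new double[] {1, 3, 5, 7});
        double[] p = reg.regress().getParameters();
        System.out.println("n=" + reg.getN() + " intercept=" + p[0] + " slope=" + p[1]);
    }
}
```

Client code in this sketch depends only on AbstractRegression and RegressionResults, so a "fixed model" implementation could be swapped in without callers changing, which is the property both sides of the thread seem to want.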