From: Ted Dunning
Date: Mon, 13 Jul 2009 10:45:31 -0700
Subject: Re: Basic Maths Help
To: Commons Users List, sujit.pal@comcast.net

And if you are really working on time series for stocks, you will likely
have explosively bad results applying a simple polynomial fit.

You should, at least, remove the long-term exponential trend. This is
probably best done using something like lowess smoothing. If you are
looking at long-term data, you should also rescale as a percentage of
long-term trend.

Then for modeling the data, you have to be very careful to avoid
over-fitting to noise. Simply throwing polynomials at the problem is the
road to ruin. Without significant math skills it will be difficult to get
really good results. You might try penalizing your fit by also minimizing
the summed squares of your coefficients. This is equivalent to weight
decay in neural networks.

Commons math is probably a very nice way to implement such algorithms in
production. For exploratory development, I would recommend R instead.

On Mon, Jul 13, 2009 at 10:26 AM, Sujit Pal wrote:

> Hi Graham,
>
> You want multiple linear regression. Check out this page from the
> commons-math docs.
>
> http://commons.apache.org/math/userguide/stat.html#a1.5_Multiple_linear_regression
>
> HTH
> Sujit
>
> On Mon, 2009-07-13 at 17:25 +0100, Graham Smith wrote:
> > Hi,
> >
> > I'm hoping that someone with a bit more maths knowledge than I have
> > can help me with my current problem. I've got a data set that contains
> > the daily closing price for a number of different stocks. What I want
> > to do is find an equation that fits those points and then use it to
> > predict the future price.
> >
> > In the past I've written an application that did a simple least
> > squares linear regression (what is handled by the SimpleRegression
> > class I believe) e.g. finding a line of best fit with the formula
> > y = mx + c. What I need now is something that can give me a formula of
> > y = ax^n + bx^(n-1) + ... + mx + c where I can choose n, the number of
> > terms.
> >
> > I think this can be handled by general least squares but the simple
> > case I implemented in the past was already pushing my understanding of
> > maths. Is this what the GLSMultipleLinearRegression class does? If so
> > what do I need to read up on to understand it?
> >
> > Many thanks,
> > Graham

-- 
Ted Dunning, CTO
DeepDyve

111 West Evelyn Ave. Ste. 202
Sunnyvale, CA 94086
http://www.deepdyve.com
858-414-0013 (m)
408-773-0220 (fax)
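
A rough sketch of the two suggestions in the reply above, for anyone who
wants to experiment before reaching for Commons Math or R. It is
illustrative only and not taken from either library. The function names
(solve_linear_system, ridge_polyfit, log_linear_detrend) are made up for
this example. Two assumptions are baked in: the exponential trend is
removed with a plain straight-line fit to log(price), which is a simpler
stand-in for the lowess smoothing recommended above, and the penalty on
the summed squares of the coefficients leaves the intercept unpenalized,
which is a common convention but not something the thread specifies.

```python
import math


def solve_linear_system(a, b):
    """Solve a*x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    # Work on an augmented copy so the caller's matrix is left untouched.
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("singular system")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    # Back substitution on the upper-triangular system.
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def ridge_polyfit(xs, ys, degree, penalty):
    """Fit y ~ c[0] + c[1]*x + ... + c[degree]*x**degree.

    Minimizes (sum of squared residuals) + penalty * sum(c[k]**2, k >= 1).
    Assumption: the intercept c[0] is not penalized.
    Returns the coefficients, lowest power first.
    """
    k = degree + 1
    rows = [[x ** p for p in range(k)] for x in xs]
    # Normal equations: (X^T X + penalty * I') c = X^T y,
    # where I' is the identity with a zero in the intercept position.
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * y for r, y in zip(rows, ys)) for i in range(k)]
    for i in range(1, k):
        xtx[i][i] += penalty
    return solve_linear_system(xtx, xty)


def log_linear_detrend(prices):
    """Express each price as a ratio to a fitted exponential trend.

    Assumption: the trend is a*exp(b*t), estimated by an unpenalized
    straight-line fit to log(price). This stands in for lowess smoothing.
    Returns (ratios, (intercept, slope)) of the log-linear fit.
    """
    ts = list(range(len(prices)))
    a, b = ridge_polyfit(ts, [math.log(p) for p in prices], 1, 0.0)
    trend = [math.exp(a + b * t) for t in ts]
    return [p / tr for p, tr in zip(prices, trend)], (a, b)
```

A typical experiment would detrend first and then fit the ratios with
something like ridge_polyfit(ts, ratios, degree=3, penalty=1.0), choosing
the penalty by checking predictions on data held out from the fit. For
high degrees or large x values it helps to rescale x (for example to the
range -1 to 1) before fitting, since raw powers of x make the normal
equations badly conditioned.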