Thanks guys, that's really useful, and it tells me I'm at least looking in
the right area. I understand some of what you are talking about, but the
rest leaves me scratching my head in bewilderment. Do you happen to know of
any good sites where I could learn about this? Many years ago I did advanced
mathematics, but it was focused on engineering rather than statistics, so
the ideas aren't completely alien to me.

The solution I have at the moment gives really poor results, so throwing
polynomials at it would probably be an improvement, but I understand what
you are saying about overfitting to noise.
2009/7/13 Ted Dunning <ted.dunning@gmail.com>
> And if you are really working on time series for stocks, you will likely
> have explosively bad results applying a simple polynomial fit.
>
> You should, at least, remove the long-term exponential trend. This is
> probably best done using something like lowess smoothing. If you are
> looking at long-term data, you should also rescale as a percentage of the
> long-term trend.
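
A rough sketch of the detrending step described above, in case it helps
make the idea concrete. It does NOT use lowess: as a simpler stand-in it
fits a single straight line to log(price) (that is, one global exponential
trend) and then expresses each price as a percentage of that trend. The
function names and the prices are made up for illustration.

```python
import math

def log_linear_trend(prices):
    """Fit log(price) = a + b*t by ordinary least squares, return the trend."""
    n = len(prices)
    ts = range(n)
    logs = [math.log(p) for p in prices]
    t_mean = sum(ts) / n
    l_mean = sum(logs) / n
    sxx = sum((t - t_mean) ** 2 for t in ts)
    sxy = sum((t - t_mean) * (l - l_mean) for t, l in zip(ts, logs))
    b = sxy / sxx                 # growth rate per time step (in log units)
    a = l_mean - b * t_mean       # log of the trend value at t = 0
    return [math.exp(a + b * t) for t in ts]

def percent_of_trend(prices):
    """Rescale each price as a percentage of the fitted exponential trend."""
    trend = log_linear_trend(prices)
    return [100.0 * p / tr for p, tr in zip(prices, trend)]

# Made-up data: a price growing by exactly 1% per day.
prices = [50.0 * 1.01 ** t for t in range(10)]
rescaled = percent_of_trend(prices)
```

For a series that is exactly exponential, every rescaled value should sit
at about 100; real prices would wander above and below that, and those
deviations are what you would then try to model.
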
>
> Then for modeling the data, you have to be very careful to avoid
> overfitting to noise. Simply throwing polynomials at the problem is the
> road to ruin. Without significant math skills it will be difficult to get
> really good results. You might try penalizing your fit by also minimizing
> the summed squares of your coefficients. This is equivalent to weight
> decay in neural networks.
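
A minimal sketch of the penalized fit described above (usually called
ridge regression), assuming a plain polynomial in x. It is written in pure
Python rather than against Commons Math, and the names `solve`,
`ridge_polyfit` and `lam` are illustrative. One choice made here that is
not in the text above: the constant term is left unpenalized.

```python
def solve(a, b):
    """Solve the square system a x = b by Gaussian elimination with pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x

def ridge_polyfit(xs, ys, degree, lam):
    """Minimise sum (y - p(x))^2 + lam * (sum of squared non-constant coeffs).

    Returns [c0, c1, ..., c_degree] for p(x) = c0 + c1*x + ... + c_degree*x^degree.
    """
    k = degree + 1
    rows = [[x ** j for j in range(k)] for x in xs]
    # Normal equations: (X^T X + lam * I') c = X^T y, I' skips the constant term.
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * y for r, y in zip(rows, ys)) for i in range(k)]
    for i in range(1, k):
        xtx[i][i] += lam
    return solve(xtx, xty)

# Made-up data lying exactly on y = 1 + 2x, fitted with a cubic.
xs = [i / 10 for i in range(11)]
ys = [1.0 + 2.0 * x for x in xs]
plain = ridge_polyfit(xs, ys, 3, 0.0)   # no penalty: ordinary least squares
shrunk = ridge_polyfit(xs, ys, 3, 1.0)  # penalty pulls coefficients toward 0
```

With `lam = 0` this is ordinary least squares; raising `lam` trades a
slightly worse fit to the data for smaller coefficients, which is the
guard against fitting noise. How large `lam` should be is something you
would have to tune, for example on data held back from the fit.
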
>
> Commons Math is probably a very nice way to implement such algorithms in
> production. For exploratory development, I would recommend R instead.
>
> On Mon, Jul 13, 2009 at 10:26 AM, Sujit Pal <sujit.pal@comcast.net> wrote:
>
> > Hi Graham,
> >
> > You want multiple linear regression. Check out this page from the
> > commons-math docs.
> >
> > http://commons.apache.org/math/userguide/stat.html#a1.5_Multiple_linear_regression
> >
> > HTH
> > Sujit
> >
> > On Mon, 2009-07-13 at 17:25 +0100, Graham Smith wrote:
> > > Hi,
> > >
> > > I'm hoping that someone with a bit more maths knowledge than I have can
> > > help me with my current problem. I've got a data set that contains the
> > > daily closing price for a number of different stocks. What I want to do
> > > is find an equation that fits those points and then use it to predict
> > > the future price.
> > >
> > > In the past I've written an application that did a simple least squares
> > > linear regression (which I believe is what the SimpleRegression class
> > > handles), i.e. finding a line of best fit with the formula y = mx + c.
> > > What I need now is something that can give me a formula of the form
> > > y = ax^n + bx^(n-1) + ... + mx + c, where I can choose n, which sets
> > > the number of terms.
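
A small sketch of why a polynomial like this still counts as multiple
linear regression: the formula is linear in the coefficients a, b, ..., c,
so each power of x can be treated as a separate regressor. The helper
names below are made up for illustration and are not part of any library.

```python
def design_row(x, n):
    """Regressors for one observation: [x^n, x^(n-1), ..., x, 1]."""
    return [x ** p for p in range(n, -1, -1)]

def predict(coeffs, x):
    """Evaluate y = a*x^n + b*x^(n-1) + ... + m*x + c for coeffs [a, b, ..., c]."""
    row = design_row(x, len(coeffs) - 1)
    return sum(c * r for c, r in zip(coeffs, row))

# One row per data point; a multiple regression is then fitted on these rows.
rows = [design_row(x, 2) for x in [0.0, 1.0, 2.0]]
```

Note that a polynomial of degree n has n + 1 coefficients once the
constant is counted, so choosing n = 2 gives three columns here. A
regression routine that fits its own intercept would normally be given
the rows without the trailing 1.
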
> > >
> > > I think this can be handled by general least squares, but the simple
> > > case I implemented in the past was already pushing my understanding of
> > > maths. Is this what the GLSMultipleLinearRegression class does? If so,
> > > what do I need to read up on to understand it?
> > >
> > > Many thanks,
> > > Graham
> 
> Ted Dunning, CTO
> DeepDyve
>
> 111 West Evelyn Ave. Ste. 202
> Sunnyvale, CA 94086
> http://www.deepdyve.com
> 8584140013 (m)
> 4087730220 (fax)
>
