mahout-user mailing list archives

From Sean Owen <sro...@gmail.com>
Subject Re: Finding best NearestNUserNeighborhood size
Date Wed, 23 Jan 2013 13:16:24 GMT
The evaluation is stochastic: each run holds out a different random
sample, so your results will vary from run to run. To my eye, that
accounts for most of the variation you see. You probably want to
average over many runs.
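
As a rough sketch of what averaging over runs could look like, the
snippet below takes the two runs quoted further down and reports the
mean and sample standard deviation of the RMSE for each neighborhood
size. The class and method names (RmseAveraging, mean, sampleStdDev)
are made up for illustration and are not part of Mahout; in practice
the inner arrays would be filled by calling the evaluator many more
times, since two runs say very little.

```java
import java.util.Locale;

// Hypothetical helper, not part of Mahout: summarizes RMSE values
// collected from repeated evaluation runs.
final class RmseAveraging {

    /** Arithmetic mean of the given values. */
    static double mean(double[] xs) {
        double sum = 0.0;
        for (double x : xs) {
            sum += x;
        }
        return sum / xs.length;
    }

    /** Sample standard deviation (n - 1 denominator); needs at least two values. */
    static double sampleStdDev(double[] xs) {
        double m = mean(xs);
        double squares = 0.0;
        for (double x : xs) {
            squares += (x - m) * (x - m);
        }
        return Math.sqrt(squares / (xs.length - 1));
    }

    public static void main(String[] args) {
        int[] sizes = {2, 3, 4, 5, 6, 7, 8, 9, 10};
        // One row per run, one column per neighborhood size (values from the quoted message).
        double[][] runs = {
            {0.5523623146152608, 0.5425283201773704, 0.669846658662311,
             0.5956616542334392, 0.6033911039809353, 0.6135206544496685,
             0.5740444208649034, 0.642798288443049, 0.6266535555651472},
            {0.5415411343523825, 0.6784589323396696, 0.6347069968141124,
             0.6968820296725008, 0.5953849874479478, 0.6791828191904128,
             0.6072462830257853, 0.6461346217476011, 0.6043919119341171},
        };
        for (int j = 0; j < sizes.length; j++) {
            // Collect the RMSE of every run for this neighborhood size.
            double[] column = new double[runs.length];
            for (int i = 0; i < runs.length; i++) {
                column[i] = runs[i][j];
            }
            System.out.printf(Locale.ROOT, "%2d  mean=%.4f  sd=%.4f%n",
                    sizes[j], mean(column), sampleStdDev(column));
        }
    }
}
```

If the standard deviation is about as large as the differences between
neighborhood sizes, the differences are probably noise and more runs
are needed.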

You will probably find that accuracy peaks around some neighborhood size:
adding more useful neighbors helps, but at some point the next-nearest
neighbor isn't very similar, and the additional data harms the result more
than it helps.
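
One way to look for that peak, sketched under the assumption that you
already have RMSE values from several runs: average per neighborhood
size, then pick the size with the lowest mean RMSE (lower RMSE means
better accuracy). The names below (BestNeighborhoodSize, columnMeans,
argMin) are hypothetical and not Mahout API. The data is the two runs
from the quoted message, which is far too few to trust the answer.

```java
// Hypothetical helper, not part of Mahout: picks the neighborhood size
// whose RMSE, averaged over runs, is lowest.
final class BestNeighborhoodSize {

    /** Mean of each column, where runs[i][j] is the RMSE of run i at size index j. */
    static double[] columnMeans(double[][] runs) {
        int columns = runs[0].length;
        double[] means = new double[columns];
        for (double[] run : runs) {
            for (int j = 0; j < columns; j++) {
                means[j] += run[j];
            }
        }
        for (int j = 0; j < columns; j++) {
            means[j] /= runs.length;
        }
        return means;
    }

    /** Index of the smallest value; the first one wins on ties. */
    static int argMin(double[] values) {
        int best = 0;
        for (int j = 1; j < values.length; j++) {
            if (values[j] < values[best]) {
                best = j;
            }
        }
        return best;
    }

    public static void main(String[] args) {
        int[] sizes = {2, 3, 4, 5, 6, 7, 8, 9, 10};
        double[][] runs = {
            {0.5523623146152608, 0.5425283201773704, 0.669846658662311,
             0.5956616542334392, 0.6033911039809353, 0.6135206544496685,
             0.5740444208649034, 0.642798288443049, 0.6266535555651472},
            {0.5415411343523825, 0.6784589323396696, 0.6347069968141124,
             0.6968820296725008, 0.5953849874479478, 0.6791828191904128,
             0.6072462830257853, 0.6461346217476011, 0.6043919119341171},
        };
        double[] means = columnMeans(runs);
        int best = argMin(means);
        System.out.println("Lowest mean RMSE at neighborhood size " + sizes[best]);
    }
}
```

With enough runs it is also worth trying sizes well beyond 10, since
the minimum may lie outside the range tested so far.
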
On Jan 23, 2013 1:01 PM, "Zia mel" <ziad.kamel25@gmail.com> wrote:

> Hi,
> I used NearestNUserNeighborhood with RMSE in a user recommender that
> uses PearsonCorrelationSimilarity. I found that changing the
> neighborhood size has no clear pattern or effect: sometimes the RMSE
> increases, other times it decreases. Precision, on the other hand,
> shows a clearer pattern as the neighborhood size changes. Any reason?
> Another point is that the RMSE changes on every run, since it chooses
> a different sample. Would running the code 10 or 20 times and taking
> the average be a good idea, or is there something better to do?
>
> //-- RUN 1
>  2,  0.5523623146152608
>  3,  0.5425283201773704
>  4,  0.669846658662311
>  5,  0.5956616542334392
>  6,  0.6033911039809353
>  7,  0.6135206544496685
>  8,  0.5740444208649034
>  9,  0.642798288443049
>  10,  0.6266535555651472
>
> //-- RUN 2
>  2,  0.5415411343523825
>  3,  0.6784589323396696
>  4,  0.6347069968141124
>  5,  0.6968820296725008
>  6,  0.5953849874479478
>  7,  0.6791828191904128
>  8,  0.6072462830257853
>  9,  0.6461346217476011
>  10,  0.6043919119341171
>
> Thanks !
>
