Message-Id: <F7029153-4C5B-4EF0-9C6D-AC55043608A1@gmail.com>
Cc: "user@mahout.apache.org" <user@mahout.apache.org>
From: Ted Dunning <ted.dunning@gmail.com>
Subject: Re: Problems with Mahout's RecommenderIRStatsEvaluator
Date: Sat, 16 Feb 2013 16:12:11 -0700
To: "user@mahout.apache.org" <user@mahout.apache.org>

There are a variety of common time-based effects which make time splits
best in many practical cases. Having the training data all be from the
past emulates those cases better than random splits.

For one thing, you can have the same user under different names in
training and test. For another, in real life you only get data from the
past of the user under consideration. As a third consideration, topical
events can influence all users in common.

Taken together, these effects mean that random training splits can give
very large errors in estimated performance.
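
A minimal sketch of the contrast between the two kinds of split, assuming
ratings are available as (user, item, rating, timestamp) tuples. The
function names and the sample data are hypothetical and are not part of
Mahout's API:

```python
import random


def time_split(ratings, cutoff):
    """Train on ratings strictly before `cutoff`, test on the rest."""
    train = [r for r in ratings if r[3] < cutoff]
    test = [r for r in ratings if r[3] >= cutoff]
    return train, test


def random_split(ratings, test_fraction, seed=0):
    """Hold out a random fraction of ratings, ignoring timestamps."""
    rng = random.Random(seed)
    shuffled = list(ratings)
    rng.shuffle(shuffled)
    n_test = int(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]


def leaks_future(train, test):
    """True if any training rating is newer than the oldest test rating."""
    if not train or not test:
        return False
    return max(r[3] for r in train) > min(r[3] for r in test)


# Hypothetical sample data: (user, item, rating, timestamp)
ratings = [
    ("u1", "a", 5.0, 100),
    ("u2", "a", 3.0, 200),
    ("u1", "b", 4.0, 300),
    ("u2", "c", 2.0, 400),
    ("u1", "c", 5.0, 500),
    ("u3", "b", 1.0, 600),
]

train, test = time_split(ratings, cutoff=400)
print(leaks_future(train, test))  # False by construction
```

With a time split, every training rating predates every test rating, so
the check above can never fire. With random_split the training set will
usually contain ratings newer than some test ratings, which is one way
the estimate can drift from what a deployed system would see.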


On Feb 16, 2013, at 1:41 PM, Tevfik Aytekin <tevfik.aytekin@gmail.com> wrote:

> What I mean is you can choose ratings randomly and try to recommend
> the ones above the threshold
>
> On Sat, Feb 16, 2013 at 10:32 PM, Sean Owen <srowen@gmail.com> wrote:
>> Sure, if you were predicting ratings for one movie given a set of ratings
>> for that movie and the ratings for many other movies. That isn't what the
>> recommender problem is. Here, the problem is to list N movies most likely
>> to be top-rated. The precision-recall test is, in turn, a test of top N
>> results, not a test over prediction accuracy. We aren't talking about RMSE
>> here or even any particular means of generating top N recommendations. You
>> don't even have to predict ratings to make a top N list.
>>
>>
>> On Sat, Feb 16, 2013 at 9:28 PM, Tevfik Aytekin <tevfik.aytekin@gmail.com> wrote:
>>
>>> No, rating prediction is clearly a supervised ML problem
>>>
>>> On Sat, Feb 16, 2013 at 10:15 PM, Sean Owen <srowen@gmail.com> wrote:
>>>> This is a good answer for evaluation of supervised ML, but, this is
>>>> unsupervised. Choosing randomly is choosing the 'right answers' randomly,
>>>> and that's plainly problematic.
>>>>
>>>>
>>>> On Sat, Feb 16, 2013 at 8:53 PM, Tevfik Aytekin <tevfik.aytekin@gmail.com> wrote:
>>>>
>>>>> I think, it is better to choose ratings of the test user in a random
>>>>> fashion.
>>>>>
>>>>> On Sat, Feb 16, 2013 at 9:37 PM, Sean Owen <srowen@gmail.com> wrote:
>>>>>> Yes. But: the test sample is small. Using 40% of your data to test is
>>>>>> probably quite too much.
>>>>>>
>>>>>> My point is that it may be the least-bad thing to do. What test are you
>>>>>> proposing instead, and why is it coherent with what you're testing?
>>>>>>
>>>>>
>>>