mahout-user mailing list archives

From Rafal Lukawiecki
Subject Re: RecommenderJob Recommending an Item Already Preferred by a User
Date Thu, 01 Aug 2013 15:56:26 GMT
Hi Sebastian,

I've rechecked the results and, I'm afraid, the issue has not gone away, contrary to
my enthusiastic response yesterday. Using 0.8, I have retested with and without the
--maxPrefsPerUser 9000 parameter (no user has more than 5000 prefs). I have also supplied
the prefs file, without the preference value, that is, as user,item (one per line), as a
--filterFile, with and without --maxPrefsPerUser, and we are still seeing recommendations
for items the user has already expressed a preference for.
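
For anyone wanting to reproduce the check, here is a minimal sketch in Python of the three steps described above: finding the largest number of preferences any user has, writing the user,item filter lines, and listing recommended items the user had already expressed a preference for. The function names are hypothetical, and the recommendation output format (userID, a tab, then [itemID:score,itemID:score,...]) is an assumption about what the job writes, so adjust the parsing if your output differs.

```python
import csv
from collections import defaultdict


def load_prefs(lines):
    """Parse 'user,item[,value]' lines into {user: set(items)}."""
    prefs = defaultdict(set)
    for row in csv.reader(lines):
        if len(row) >= 2:  # skip blank or malformed lines
            prefs[row[0].strip()].add(row[1].strip())
    return prefs


def max_prefs_per_user(prefs):
    """Largest number of distinct items preferred by any single user."""
    return max((len(items) for items in prefs.values()), default=0)


def to_filter_lines(prefs):
    """Build 'user,item' lines (no preference value), one pair per line."""
    return [f"{user},{item}"
            for user in sorted(prefs)
            for item in sorted(prefs[user])]


def parse_recs(lines):
    """Parse recommendation lines into {user: [items]}.

    Assumed format: 'user<TAB>[item:score,item:score,...]'.
    """
    recs = {}
    for line in lines:
        line = line.strip()
        if not line:
            continue
        user, _, payload = line.partition("\t")
        items = []
        for pair in payload.strip("[]").split(","):
            if pair:
                items.append(pair.rsplit(":", 1)[0])
        recs[user] = items
    return recs


def already_preferred(prefs, recs):
    """Return {user: [recommended items the user already preferred]}."""
    offenders = {}
    for user, items in recs.items():
        seen = prefs.get(user, set())
        repeated = [item for item in items if item in seen]
        if repeated:
            offenders[user] = repeated
    return offenders
```

To run it on real files, pass open file handles, for example load_prefs(open("prefs.csv")) and parse_recs(open("part-r-00000")), where both file names are placeholders. An empty result from already_preferred means no previously preferred item was recommended.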

I suppose I need to file a bug report. 

Rafal Lukawiecki
Pardon my brevity, sent from a telephone.

On 31 Jul 2013, at 22:35, "Rafal Lukawiecki" wrote:

> Dear Sebastian,
> It looks like setting --maxPrefsPerUser 10000 has resolved the issue in our case. It
> seems that the most preferences any user had was just about 5000, so I doubled it just in
> case, but when I operationalise this model, I will make sure to calculate the actual max
> number of preferences and set the parameter accordingly. I will double-check the result set
> to make sure the issue is really gone, as I have only checked the few cases where we had
> spotted a recommendation of a previously preferred item.
> Would you like me to file a bug, and would you like me to test it on 0.8 or another version?
> I am using 0.7.
> Thanks for your kind support.
> Rafal
> --
> Rafal Lukawiecki
> Strategic Consultant and Director 
> Project Botticelli Ltd
> On 31 Jul 2013, at 06:22, Sebastian Schelter wrote:
> Hi Rafal,
> can you try to set the option --maxPrefsPerUser to the maximum number of
> interactions per user and see if you still get the error?
> Best,
> Sebastian
> On 30.07.2013 19:29, Rafal Lukawiecki wrote:
>> Thank you Sebastian. The data set is not that large, as we are running tests on a
>> subset. It is about 24k users, 40k items, the preference file has 65k preferences as triples.
>> This was using Similarity Cooccurrence.
>> I can see if I could anonymise the data set to share if that would be helpful.
>> Thanks for your kind help. 
>> Rafal
>> --
>> Rafal Lukawiecki
>> Pardon my brevity, sent from a telephone.
>> On 30 Jul 2013, at 18:18, "Sebastian Schelter" wrote:
>>> Hi Rafal,
>>> can you issue a ticket for this problem at
>>> ? We have unit-tests that
>>> check whether this happens and currently they work fine. I can only imagine
>>> that the problem occurs in larger datasets where we sample the data in some
>>> places. Can you describe a scenario/dataset where this happens?
>>> Best,
>>> Sebastian
>>> 2013/7/30 Rafal Lukawiecki
>>>> I'm new here, just registered. Many thanks to everyone for working on an
>>>> amazing piece of software, thank you for building Mahout and for your
>>>> support. My apologies if this is not the right place to ask the question—I
>>>> have searched for the issue, and I can see this problem has been reported
>>>> here:
>>>> Unfortunately, the trail leads to the newsgroups, and I have not found a
>>>> way, yet, to get an answer from them, without asking you.
>>>> Essentially, I am running a Hadoop RecommenderJob from Mahout 0.7, and I
>>>> am finding that it is recommending items that the user has already
>>>> expressed a preference for in their input file. I understand that this
>>>> should not be happening, and I am not sure if there is a known fix or if I
>>>> should be looking for a workaround (such as using the entire input as the
>>>> filterFile).
>>>> I will double-check that there is no error on my side, but so far it does
>>>> not seem that way.
>>>> Many thanks and my regards from Ireland,
>>>> Rafal Lukawiecki
>>>> --
>>>> Rafal Lukawiecki
>>>> Strategic Consultant and Director
>>>> Project Botticelli Ltd
