mahout-user mailing list archives

From Niklas Ekvall <niklas.ekv...@gmail.com>
Subject Re: Mahout - Recommenditemvalue with magnitude of 1
Date Tue, 24 Nov 2015 20:21:23 GMT
Okay!

So no pre-filtering, and the user/item ids should start from 0 and run
consecutively up to the number of users/items minus 1. That means all the
data we have should go into Mahout and we filter inside Mahout... correct?

We do the same pre-filtering for Spark item-similarity. Is that wrong too?

Best regards, Niklas
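
One way to meet the ordinal-id requirement discussed in this thread without changing the underlying data is to remap the raw ids to consecutive integers starting at 0 before handing the file to hadoop-mahout, and to keep the mapping so the output can be translated back. The following is only a minimal sketch under that assumption: the helper names `build_index` and `to_ordinal` are hypothetical and are not part of Mahout, and the rows are a few (user, item) pairs copied from the sample input quoted below.

```python
from typing import Dict, Iterable, List, Tuple


def build_index(raw_ids: Iterable[int]) -> Dict[int, int]:
    """Map each distinct raw id to an ordinal 0..n-1, in sorted order."""
    return {raw: idx for idx, raw in enumerate(sorted(set(raw_ids)))}


def to_ordinal(
    pairs: Iterable[Tuple[int, int]],
) -> Tuple[List[Tuple[int, int]], Dict[int, int], Dict[int, int]]:
    """Rewrite (user, item) pairs with gap-free ids that start at 0.

    Hypothetical helper, not a Mahout API. Returns the remapped pairs plus
    the user and item maps (raw id -> ordinal id).
    """
    pairs = list(pairs)
    user_index = build_index(u for u, _ in pairs)
    item_index = build_index(i for _, i in pairs)
    remapped = [(user_index[u], item_index[i]) for u, i in pairs]
    return remapped, user_index, item_index


# A few (user, item) rows taken from the sample input in this thread.
sample = [(3, 7414), (3, 12682), (5, 3), (5, 2123), (6, 3), (6, 456)]
remapped, user_index, item_index = to_ordinal(sample)

# Invert the maps to translate recommender output back to the original ids.
ordinal_to_user = {idx: raw for raw, idx in user_index.items()}
ordinal_to_item = {idx: raw for raw, idx in item_index.items()}

# Tab-separated "user<TAB>item" lines, the same shape as the sample input.
for user, item in remapped:
    print(f"{user}\t{item}")
```

With this mapping, raw users 3, 5 and 6 become 0, 1 and 2, and the inverted dictionaries recover the original ids from whatever the recommender returns.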

On Tuesday, November 24, 2015, Pat Ferrel <pat@occamsmachete.com> wrote:

> I wouldn’t pre-filter but in any case the ids input to hadoop-mahout need
> to follow those rules.
>
> The new recommender I mentioned has no such requirements, it uses string
> IDs.
>
> On Nov 24, 2015, at 11:44 AM, Niklas Ekvall <niklas.ekvall@gmail.com> wrote:
>
> No, they do not start from 0 and do not cover all numbers between 0 and
> the number of items/users. We pre-filter the dataset before we use Mahout
> on it (a user must have bought at least 5 products and a product must have
> been bought by 3 users). Therefore we start with user 3, then it jumps
> to user 5, etc.
>
> Is this wrong? Should we use all data as input to Mahout and do the
> filtering inside Mahout?
>
> We use the second latest version of Mahout!
>
> Best regards, Niklas
>
> On Tuesday, November 24, 2015, Pat Ferrel <pat@occamsmachete.com> wrote:
>
> > Do your ids start with 0 and cover all numbers between 0 and the number of
> > items - 1 (same for user ids)?
> > The old hadoop-mahout code required ordinal ids starting at 0.
> >
> >
> > On Nov 24, 2015, at 8:19 AM, Niklas Ekvall <niklas.ekvall@gmail.com> wrote:
> >
> > Hi Pat,
> >
> > Here is some input:
> >
> > 3       7414
> > 3       12682
> > 3       18947
> > 3       19980
> > 3       26975
> > 3       54635
> > 3       67789
> > 3       73212
> > 3       118932
> > 3       138846
> > 3       141268
> > 5       3
> > 5       2123
> > 5       37955
> > 5       39975
> > 5       113289
> > 6       3
> > 6       456
> > 6       2188
> > 6       2496
> > 6       6194
> > 6       6361
> > 6       6768
> > 6       6919
> > 6       6920
> > 6       7257
> > 6       7705
> > 6       7706
> > 6       11788
> >
> > And some output:
> >
> > 3       [122086:1.0,1846:1.0,74638:1.0,63240:1.0,87540:1.0,2742:1.0,2981:1.0,8325:1.0,145598:1.0,49675:1.0,131388:1.0,72113:1.0,3493:1.0,56131:1.0,30422:1.0,87829:1.0,111190:1.0,13597:1.0,83436:1.0,61772:1.0]
> > 5       [32349:1.0,29413:1.0,111896:1.0,61845:1.0,50016:1.0,1607:1.0,15237:1.0,133229:1.0,65805:1.0,34034:1.0,133071:1.0,28894:1.0,18658:1.0,32095:1.0,4402:1.0,47522:1.0,31022:1.0,23936:1.0,6243:1.0,53214:1.0]
> > 6       [40756:1.0,34420:1.0,31153:1.0,114717:1.0,53945:1.0,71148:1.0,26095:1.0,112941:1.0,55284:1.0,111346:1.0,112201:1.0,65759:1.0,133127:1.0,61378:1.0,16413:1.0,113289:1.0,49675:1.0,14995:1.0,141028:1.0,27506:1.0]
> >
> > Best regards, Niklas
> >
> > 2015-11-24 16:48 GMT+01:00 Pat Ferrel <pat@occamsmachete.com>:
> >
> >> Sounds like you may not have the input right. Recommendations should be
> >> sorted by the strength and so shouldn’t all be 1 unless the data is very
> >> odd.
> >>
> >> Can you give us a small sample of the input?
> >>
> >>
> >> BTW a newer recommender using Mahout’s Spark-based code and a search
> >> engine is here:
> >> https://github.com/PredictionIO/template-scala-parallel-universal-recommendation
> >> A single-machine install script is here:
> >> https://docs.prediction.io/start/
> >>
> >> On Nov 24, 2015, at 2:16 AM, Niklas Ekvall <niklas.ekvall@gmail.com> wrote:
> >>
> >> Hello Mahout Users!
> >>
> >> I currently use Mahout - Recommenditembased with Log-similarity to produce
> >> personal recommendations for trigger emails in an offline mode. But when I
> >> produce e.g. 50 recommendations, the rank values of the recommendations are
> >> always of magnitude 1. Why is this so? And is the first recommendation in
> >> this list the best one, or is there some randomness in this list?
> >>
> >> Best regards,
> >>
> >> Niklas Ekvall
