mahout-user mailing list archives

From Pat Ferrel
Subject [B'A] h_v cross recommender
Date Tue, 19 Mar 2013 13:47:50 GMT
To pick up an old thread…

A = views, users x items
B = purchases, users x items

The cross recommender is [B'A] h_v + [B'B] h_p = r_p
The [B'B] h_p term is the basic boolean Mahout recommender trained on purchases, and I assume
we'll use that implementation.

B'A gives the cooccurrences of views and purchases. Multiplying it by the user's history of
views, h_v, gives a prediction of purchase preferences cross recommended by view. The same can
be done for other non-purchase actions. The partial vectors are then summed and sorted, and the
top item-value pairs are returned as recs.
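
To make the arithmetic concrete, here is a minimal sketch of r_p = [B'A] h_v + [B'B] h_p on toy
data. It assumes A and B hold one row per user and one column per item, so that B'A and B'B are
items x items. The function names and the data are made up for illustration; this is not the
Mahout implementation.

```python
# Toy dense version of r_p = [B'A] h_v + [B'B] h_p.
# Assumption: A (views) and B (purchases) have one row per user, one column per item.

def transpose(m):
    return [list(col) for col in zip(*m)]

def matmul(x, y):
    yt = transpose(y)
    return [[sum(a * b for a, b in zip(row, col)) for col in yt] for row in x]

def matvec(m, v):
    return [sum(a * b for a, b in zip(row, v)) for row in m]

def cross_recommend(A, B, h_v, h_p, top_n=2):
    """Return the top_n (item_index, score) pairs of r_p."""
    Bt = transpose(B)
    BtA = matmul(Bt, A)  # view/purchase cooccurrences
    BtB = matmul(Bt, B)  # purchase/purchase cooccurrences
    r_p = [x + y for x, y in zip(matvec(BtA, h_v), matvec(BtB, h_p))]
    # Sort by descending score, breaking ties by item index.
    ranked = sorted(enumerate(r_p), key=lambda kv: (-kv[1], kv[0]))
    return ranked[:top_n]

# 3 users x 3 items, boolean actions (hypothetical data)
A = [[1, 1, 0],
     [0, 1, 1],
     [1, 0, 1]]   # views
B = [[1, 0, 0],
     [0, 1, 0],
     [1, 0, 1]]   # purchases

h_v = [1, 0, 0]   # this user viewed item 0
h_p = [0, 1, 0]   # this user purchased item 1

recs = cross_recommend(A, B, h_v, h_p)
print(recs)
```

With more action types, each one would contribute another [B'X] h_x term to the sum before the
sort.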

Hopefully I'm OK so far. Now on to implementation.

We'd like both user-history-based recs and, perhaps more importantly, item-based recs: items
that are similar in purchase actions or, in this case, in the views that cooccur with purchases.

[B'A] h_v is a model, built from the two action matrices and stored as a sparse matrix, times a
user's view history, a sparse column vector. This seems like something to pre-calculate, because
the calculation would be time consuming to repeat for each vector.
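
A rough sketch of that split between pre-calculation and runtime, using plain dicts as the sparse
structures. It assumes the input is a per-user set of item ids for each action; the names
build_cross_cooccurrence and score, and the data, are hypothetical.

```python
from collections import defaultdict

def build_cross_cooccurrence(purchases, views):
    """Pre-calculate B'A as {purchased_item: {viewed_item: count}}.

    purchases, views: dict mapping user id -> set of item ids.
    """
    model = defaultdict(lambda: defaultdict(int))
    for user, bought in purchases.items():
        for p in bought:
            for v in views.get(user, ()):
                model[p][v] += 1
    return {p: dict(row) for p, row in model.items()}

def score(model, h_v):
    """Runtime step: multiply the sparse model by a sparse view history.

    h_v: dict mapping viewed item id -> weight. Only non-zero scores are returned.
    """
    scores = {}
    for p, row in model.items():
        s = sum(w * row.get(v, 0) for v, w in h_v.items())
        if s:
            scores[p] = s
    return scores

# Hypothetical action logs
purchases = {"u1": {"ipad"}, "u2": {"ipad", "case"}, "u3": {"phone"}}
views = {"u1": {"ipad", "case"}, "u2": {"case"}, "u3": {"phone", "case"}}

model = build_cross_cooccurrence(purchases, views)  # done once, offline
scores = score(model, {"case": 1})                  # done per request
print(scores)
```

The model is built once and only the sparse multiply runs per user, which is the cheap part.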

But how do we calculate the item-to-item similarity? Precalculate all pairwise similarities so
they are just a runtime lookup? That is also quite time consuming to build, but fast at runtime.
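
As an illustration of the precalculation option, here is a small sketch that compares every pair
of rows of B'A and keeps the top k neighbours per item. Cosine similarity is my assumption; the
choice of measure is open. The function names and the data are made up.

```python
import math

def cosine(r1, r2):
    """Cosine similarity between two sparse rows stored as dicts."""
    dot = sum(w * r2.get(k, 0) for k, w in r1.items())
    n1 = math.sqrt(sum(w * w for w in r1.values()))
    n2 = math.sqrt(sum(w * w for w in r2.values()))
    return dot / (n1 * n2) if n1 and n2 else 0.0

def precompute_similar(rows, k=2):
    """Build {item: [(other_item, similarity), ...]} with the top k per item.

    This compares every pair of rows, so the cost grows with the square of the
    number of items; that is the time-consuming part.
    """
    table = {}
    for i, ri in rows.items():
        sims = [(j, cosine(ri, rj)) for j, rj in rows.items() if j != i]
        sims = [(j, s) for j, s in sims if s > 0]
        sims.sort(key=lambda js: (-js[1], js[0]))
        table[i] = sims[:k]
    return table

# Hypothetical rows of B'A: {purchased_item: {viewed_item: count}}
rows = {
    "ipad": {"ipad": 1, "case": 2},
    "case": {"case": 1},
    "phone": {"phone": 1, "case": 1},
}

similar = precompute_similar(rows)  # offline
print(similar["ipad"])              # runtime is a dict lookup
```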

Here is where I'm fuzzy. To use Lucene, it seems we would take B'A and index it by row (or is
it by column?), perhaps with a field per value. We would then take the original row of B'A
corresponding to the item in question and use it as the query. Lucene would find the most
similar rows and should be pretty fast, so we would not need to pre-calculate.
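
To check my understanding of the idea, here is a toy inverted index standing in for Lucene. It
does not use the Lucene API, and the scoring is a plain weighted overlap rather than Lucene's own
scoring. It assumes each row of B'A is indexed as one document whose terms are the ids of its
non-zero columns. All names and data are hypothetical.

```python
from collections import defaultdict

def build_index(rows):
    """Index each row as a document: term (column id) -> [(doc_id, weight), ...]."""
    index = defaultdict(list)
    for doc_id, row in rows.items():
        for term, weight in row.items():
            index[term].append((doc_id, weight))
    return index

def query(index, query_row, exclude=None, top_n=2):
    """Score documents that share terms with query_row; only postings are visited."""
    scores = defaultdict(float)
    for term, q_w in query_row.items():
        for doc_id, d_w in index.get(term, ()):
            if doc_id != exclude:
                scores[doc_id] += q_w * d_w
    return sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))[:top_n]

# Hypothetical rows of B'A: {purchased_item: {viewed_item: count}}
rows = {
    "ipad": {"ipad": 1, "case": 2},
    "case": {"case": 1},
    "phone": {"phone": 1, "case": 1},
}

index = build_index(rows)
# Use the item's own row as the query and leave the item itself out of the hits.
hits = query(index, rows["phone"], exclude="phone")
print(hits)
```

Only documents sharing at least one term with the query are touched, which is why this kind of
lookup can stay fast without a pairwise precalculation.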

Any corrections are appreciated.