mahout-user mailing list archives

From Pat Ferrel <>
Subject Re: implementation of context-aware recommender in Mahout
Date Sun, 08 Mar 2015 18:32:12 GMT
Either architecture will work, even if you want to pre-filter the data. The search engine can
post-filter in the query. The pre-filter means creating a separate model for each day of the
week, right? That works with either one.

If you are relying on the evaluator implemented in Mahout, then use the old Java code, because
the new code doesn’t supply an evaluator.
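
To make "post-filter in the query" a little more concrete, here is a minimal sketch that only builds a query string in Lucene classic query-parser syntax (`field:(terms)`, `^boost`, and `+` for a required clause). The field names `video_view_indicators` and `day_of_week_indicators`, the helper names, and the boost value are assumptions invented for this illustration; they are not part of Mahout or of any particular index schema, and which field the current day should be matched against is one reading of the description quoted further down the thread.

```java
import java.util.Arrays;
import java.util.List;

public class ContextQuerySketch {

    /** Joins terms into a Lucene-style field clause, e.g. field:(a b c). */
    public static String clause(String field, List<String> terms) {
        return field + ":(" + String.join(" ", terms) + ")";
    }

    /**
     * Builds one query string from the user's histories and the current context.
     * If useFilter is true, the current day becomes a required (+) clause,
     * otherwise it is an optional clause boosted by the given factor.
     * All field names here are hypothetical.
     */
    public static String buildQuery(List<String> viewHistory, List<String> dayHistory,
                                    String currentDay, boolean useFilter, double boost) {
        String views = clause("video_view_indicators", viewHistory);
        String days = clause("day_of_week_indicators", dayHistory);
        String context = useFilter
                ? "+day_of_week_indicators:" + currentDay
                : "day_of_week_indicators:" + currentDay + "^" + boost;
        return views + " " + days + " " + context;
    }

    public static void main(String[] args) {
        List<String> views = Arrays.asList("v1", "v2");
        List<String> days = Arrays.asList("friday", "saturday");
        // Bias towards the current day of week.
        System.out.println(buildQuery(views, days, "saturday", false, 5.0));
        // Restrict results to the current day of week.
        System.out.println(buildQuery(views, days, "saturday", true, 5.0));
    }
}
```

Switching between boosting and filtering is then a query-time decision, with no change to the indexed model.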

On Mar 8, 2015, at 10:21 AM, Efi Koulouri <> wrote:

Thanks for your help!

Actually, I want to build a recommender for experimental purposes following
the pre-filtering and post-filtering approaches that I described. I already
have two datasets and I want to show the benefits of using a
"context-aware" recommender. So the recommender is going to work offline.

I saw that the search engine approach is very interesting, but in my case I
think that building the recommender using the Java classes is more
appropriate, as I need to use both approaches (post-filtering and
pre-filtering). Am I right?
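
For the post-filtering half of such an offline experiment, a minimal sketch in plain Java might look like the following. It has no Mahout dependency: `ScoredItem`, `postFilter`, and the map from item ID to the days on which the item tends to be viewed are hypothetical names made up for this illustration. In Mahout's Java API the corresponding hook would be the IDRescorer mentioned in the original question.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.Comparator;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

public class PostFilterSketch {

    /** A recommended item with its context-free score (hypothetical type). */
    public static class ScoredItem {
        public final long itemId;
        public final double score;

        public ScoredItem(long itemId, double score) {
            this.itemId = itemId;
            this.score = score;
        }
    }

    /**
     * Post-filtering: items that match the current context get their score
     * multiplied by boost; non-matching items are either kept unchanged or
     * dropped, depending on dropNonMatching. The result is sorted by score,
     * highest first.
     */
    public static List<ScoredItem> postFilter(List<ScoredItem> recs,
                                              Map<Long, Set<String>> itemDays,
                                              String currentDay,
                                              double boost,
                                              boolean dropNonMatching) {
        List<ScoredItem> out = new ArrayList<>();
        for (ScoredItem r : recs) {
            Set<String> days = itemDays.getOrDefault(r.itemId, Collections.<String>emptySet());
            if (days.contains(currentDay)) {
                out.add(new ScoredItem(r.itemId, r.score * boost));
            } else if (!dropNonMatching) {
                out.add(r);
            }
        }
        out.sort(Comparator.comparingDouble((ScoredItem s) -> s.score).reversed());
        return out;
    }

    public static void main(String[] args) {
        List<ScoredItem> recs = Arrays.asList(
                new ScoredItem(1L, 0.9), new ScoredItem(2L, 0.8), new ScoredItem(3L, 0.5));
        Map<Long, Set<String>> itemDays = new HashMap<>();
        itemDays.put(2L, new HashSet<>(Arrays.asList("SATURDAY")));
        itemDays.put(3L, new HashSet<>(Arrays.asList("SATURDAY")));
        for (ScoredItem s : postFilter(recs, itemDays, "SATURDAY", 2.0, false)) {
            System.out.println(s.itemId + " " + s.score);
        }
    }
}
```

The recommender itself is untouched here; only its output list is rescored or trimmed, which is what distinguishes post-filtering from pre-filtering.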

On 8 March 2015 at 16:08, Ted Dunning <> wrote:

> By far the easiest way to build a recommender (especially for production)
> is to use the search engine approach (what Pat was recommending).
> Post filtering can be done using the search engine far more easily than
> using Java classes.
> On Sat, Mar 7, 2015 at 8:44 AM, Pat Ferrel <> wrote:
>> Oops, several typos corrected below
>> On Mar 7, 2015, at 7:05 AM, Pat Ferrel <> wrote:
>> The new cooccurrence recommender can use context as part of the user
>> history, or as a method to bias or filter results. In any case you want to
>> record all actions. Filtering results is easy, and tossing all data but for
>> one day of the week will reduce your cooccurrences and the quality of your
>> data.
>> 1) Treat interaction with items on a given day of the week as a secondary action.
>> The primary action is video-view; it is what you want to recommend. Record
>> each video-view and _when_ it happens for each user. Create a cooccurrence
>> indicator for video-view and a cross-cooccurrence indicator for day-of-week.
>> 2) Your query will be:
>> history of video-views -> video-view indicators: this contributes in a
>> non-contextual way
>> history of day-of-week -> day-of-week cross-cooccurrence indicator: this
>> contributes in a non-contextual way
>> current day-of-week -> day-of-week boosted by some fairly high value OR
>> used as a filter: this biases towards or filters by the current day-of-week.
>> In this case you have 2 indicators, 1 of which is day-of-week. You can
>> use it as just another indicator (non-contextual) and/or use it to boost or
>> filter results. The search engine supports all three methods in a
>> single query if you have both indicators indexed.
>> BTW this won’t work on the demo site since all you are doing is “Liking”
>> and the day-of-week you liked something is not likely to be correlated with
>> “Liking”. So I changed the example above to _viewing_ a video. Not all
>> context will benefit your recommendations. But it is usually easy to
>> experiment since the context is applied at query time and, as long as the data
>> was used in training, no changes need to be made to the model (no
>> pre-filtering).
>> On Mar 7, 2015, at 5:16 AM, Efi Koulouri <> wrote:
>> Thanks for your reply!
>> Actually, the context in pre-filtering serves as a query for selecting
>> relevant data. An example of a contextual data filter for a movie
>> recommender system would be: if a person wants to see a movie on Saturday,
>> only the Saturday rating data is used to recommend movies.
>> So what I need for the rating prediction is the data relevant to the
>> specific context.
>> Regards,
>> Efi
>> On 7 March 2015 at 01:32, Pat Ferrel <> wrote:
>>> The new Spark-based recommender can easily handle context in many forms.
>>> See the top references section here
>>> It does not use the IDRescorer approach at all so perhaps you should
>>> describe what you want to use as context.
>>> In the demo site for the new stuff (a guide to online video)
>>> you’ll see a couple examples of “context”.
>>> For instance you are viewing a video that has several genre tags. You’ll
>>> see at least 3 lists of recommendations:
>>> 1) people who like the video you are looking at also like these other
>>> videos (non-personalized recs)
>>> 2) people who like this video liked these, from similar genres
>>> 3) personalized recs from all genres based on your “liking” history
>>> Many other things can be used as context like time of day, location,
>>> mobile or desktop, user profile attributes, etc. The way it does this is
>>> through the search engine, which can take filters and boost certain item
>>> attributes. So I could show only recommendations made in the same year as
>>> the viewed movie or use the year to bias recommendations by boosting the
>>> “release-date” field in the recommender query. The recommender is also
>>> multimodal and so can use many user actions to better the quality of recs.
>>> Removing some of your data, in what you call pre-filtering, may not get you
>>> what you want. Removing data that is actual user behavior can reduce the
>>> quality of recommendations so please give an example.
>>> On Mar 6, 2015, at 4:45 AM, Efi Koulouri <> wrote:
>>> Hi all,
>>> I am trying to implement a context-aware recommender in Mahout. As I
>>> haven't used the library before, I don't have a lot of experience. So, I would
>>> really appreciate your response!
>>> What I want to do is to implement the two context-aware approaches that
>>> have been proposed, pre-filtering and post-filtering. The former filters
>>> the dataset based on the value of the contextual factor before the
>>> collaborative filtering, while the latter rescores the recommendations after
>>> the collaborative filtering.
>>> I have already read older similar questions regarding the context-aware
>>> recommender implementation in Mahout and I know that the post-filtering
>>> method can be implemented using the IDRescorer. For the pre-filtering
>>> approach there is the option to use the CandidateItemsStrategy in the case of
>>> the item-based recommender. On the other hand, if we want to implement this
>>> approach using the user-based recommender, no such option is available.
>>> In order to implement the pre-filtering using the user-based recommender, I
>>> was thinking of filtering out the unrelated (user, item) pairs from the dataset
>>> before the creation of the data model. This means that the data model will
>>> take as input a subset of the initial dataset.
>>> Does this approach sound correct? There are some concerns regarding the
>>> evaluation of the recommender. Does it have any impact on this?
>>> Thank you in advance!
>>> Regards,
>>> Efi
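
The pre-filtering idea in the original question (feeding the data model only the subset of the dataset that matches a context) can be sketched in plain Java as follows. The `Rating` class, its fields, and `splitByDay` are hypothetical names invented for this illustration and are not Mahout classes; the sketch only partitions the raw records by day of week, so that each partition could then be written out and loaded as the input of its own data model.

```java
import java.time.DayOfWeek;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.EnumMap;
import java.util.List;
import java.util.Map;

public class PreFilterSketch {

    /** One (user, item, rating) record tagged with its context (hypothetical type). */
    public static class Rating {
        public final long userId;
        public final long itemId;
        public final double value;
        public final DayOfWeek day;

        public Rating(long userId, long itemId, double value, DayOfWeek day) {
            this.userId = userId;
            this.itemId = itemId;
            this.value = value;
            this.day = day;
        }
    }

    /**
     * Pre-filtering: groups the full dataset by day of week. Days with no
     * ratings get no entry in the returned map.
     */
    public static Map<DayOfWeek, List<Rating>> splitByDay(List<Rating> all) {
        Map<DayOfWeek, List<Rating>> byDay = new EnumMap<>(DayOfWeek.class);
        for (Rating r : all) {
            byDay.computeIfAbsent(r.day, d -> new ArrayList<>()).add(r);
        }
        return byDay;
    }

    public static void main(String[] args) {
        List<Rating> all = Arrays.asList(
                new Rating(1L, 10L, 4.0, DayOfWeek.SATURDAY),
                new Rating(1L, 11L, 3.0, DayOfWeek.MONDAY),
                new Rating(2L, 10L, 5.0, DayOfWeek.SATURDAY));
        Map<DayOfWeek, List<Rating>> byDay = splitByDay(all);
        // Only the Saturday subset would feed the "Saturday" model.
        System.out.println(byDay.get(DayOfWeek.SATURDAY).size() + " Saturday ratings");
    }
}
```

Each per-day subset is smaller than the full dataset, which is the loss of cooccurrence data that the replies in this thread warn about.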
