mahout-user mailing list archives

From Blade Liu <>
Subject Confused about train/test data split in recommender evaluation
Date Tue, 11 Nov 2014 06:28:52 GMT

I'm new to Mahout and am confused about how the train and test data are
split when evaluating recommenders.

I'm not sure whether the data is split by selecting a subset of the
individual item preferences, or by selecting specific users (together with
all their preferences). For example, say the training data accounts for 60%
and the test data for 40%. Does that mean 40% of all preferences will be
used for testing (regardless of which users they belong to)? In
classification, all the features associated with a user would be selected
together.
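
To make the two options concrete, here is a minimal sketch of what I mean. It is plain Python, not Mahout code, and the function names `split_by_preference` and `split_by_user` are hypothetical ones I made up for illustration; I am not claiming either one is what Mahout actually does.

```python
import random


def split_by_preference(prefs, train_fraction, seed=0):
    """Option A: assign each (user, item, rating) triple independently."""
    rng = random.Random(seed)
    train, test = [], []
    for pref in prefs:
        # Each preference lands in the training set with probability
        # train_fraction, so one user's preferences may end up on both sides.
        if rng.random() < train_fraction:
            train.append(pref)
        else:
            test.append(pref)
    return train, test


def split_by_user(prefs, train_fraction, seed=0):
    """Option B: assign whole users, taking all their preferences along."""
    rng = random.Random(seed)
    users = sorted({user for user, _, _ in prefs})
    rng.shuffle(users)
    cut = round(len(users) * train_fraction)
    train_users = set(users[:cut])
    train = [p for p in prefs if p[0] in train_users]
    test = [p for p in prefs if p[0] not in train_users]
    return train, test


# Toy data: 5 users, each with a preference for the same 4 items.
prefs = [(u, i, 1.0)
         for u in ("u1", "u2", "u3", "u4", "u5")
         for i in ("a", "b", "c", "d")]

train_a, test_a = split_by_preference(prefs, 0.6)
train_b, test_b = split_by_user(prefs, 0.6)
```

With option B no user appears in both sets, while with option A a user's preferences can be divided between training and testing.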

If the partition criterion is based on preferences, would that affect the
neighborhood similarity that is computed before the recommendation scores?

