mahout-user mailing list archives

From Marko Ciric <ciric.ma...@gmail.com>
Subject Evaluating boolean preference data sets
Date Thu, 21 Jul 2011 12:49:31 GMT
Hi guys,

I wonder whether Mahout should have a "precision and recall" evaluator that
selects the relevant items without relying on a relevance threshold. This
would suit data sets with boolean preferences. In addition, the relevant
items could be removed from the training data set at random (always removing
the first few preferred items wouldn't be a great idea).
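
A minimal sketch of the random hold-out idea, in plain Python rather than Mahout code. The function name `split_relevant`, its parameters, and the item IDs are hypothetical, made up only for illustration; this is not how RecommenderIRStatsEvaluator is implemented.

```python
import random

def split_relevant(user_items, at, seed=None):
    """Randomly hold out up to `at` of a user's boolean-preference items.

    Returns (training, relevant): the held-out `relevant` items play the
    role of the "relevant items" set, and `training` is what remains.
    Hypothetical helper, not part of Mahout.
    """
    rng = random.Random(seed)
    # Sort first so a fixed seed gives a reproducible sample from a set.
    items = sorted(user_items)
    relevant = set(rng.sample(items, min(at, len(items))))
    training = set(user_items) - relevant
    return training, relevant

# Example: a user who "likes" five items; hold out two at random.
training, relevant = split_relevant({"a", "b", "c", "d", "e"}, at=2, seed=42)
```

Because the held-out items are sampled rather than taken from the front of the list, no item is favoured by its position in the data.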

On the other hand, setting the relevance threshold of
RecommenderIRStatsEvaluator to 1.0 removes exactly "at" items. Since the
recommender also returns that many items, precision and recall end up with
the same value. Is this OK, or is it a bug, given that
  precision = intersection / num_recommended_items (where
num_recommended_items is almost always "at")
  recall = intersection / num_relevant_items (also "at", because
relevanceThreshold is 1.0, as mentioned above)?
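
To make the arithmetic concrete, here is a minimal sketch in plain Python (not Mahout code; the function name `precision_recall` and the item IDs are assumptions for illustration only). It uses the two formulas above and shows that the values coincide whenever both denominators equal "at".

```python
def precision_recall(recommended, relevant):
    """Compute precision and recall from the two formulas above."""
    hits = len(set(recommended) & set(relevant))  # the "intersection"
    precision = hits / len(recommended) if recommended else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall

at = 3
relevant = {"i1", "i2", "i3"}       # exactly `at` held-out items
recommended = ["i2", "i7", "i3"]    # recommender returns `at` items

p, r = precision_recall(recommended, relevant)
# Both denominators are `at`, so p and r are the same fraction: hits / at.

# If the recommender returns fewer than `at` items, the values diverge.
p_short, r_short = precision_recall(["i2", "i7"], relevant)
```

In the second call the denominators differ (2 recommended items versus 3 relevant ones), which is the only case in this sketch where precision and recall are not equal.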


--
Marko Ćirić
ciric.marko@gmail.com
