lucene-solr-user mailing list archives

From "Oloan, Aidan" <aol...@edmunds.com>
Subject Using large numbers of weighted tags to compare documents
Date Tue, 03 Mar 2009 22:12:16 GMT
I have a question about using a large number of weighted tags to compare documents
in Solr.

Basically, I have a set of domain objects, each of which has many properties, and from these
I'm creating documents which are added to Solr. The properties are all being turned into tags,
so the Solr document simply has a field to identify the object, and a large number of tags
describing it (on average, roughly 150 tags per document). Right now the tags are
bound to specific terms, but are sometimes accompanied by a numeric value. Each tag will
need to be weighted since some of the properties are more significant for comparison than
others.
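
The kind of comparison described above can be pictured as a weighted overlap between two tag sets. The following is only a minimal sketch, assuming a hypothetical per-tag weight table and using a weighted Jaccard ratio as the similarity measure; the function name, tag names, and weights are invented for illustration, and this is not how Solr itself scores documents.

```python
def weighted_tag_similarity(tags_a, tags_b, weights, default_weight=1.0):
    """Weighted Jaccard: weight of shared tags divided by weight of all tags."""
    a, b = set(tags_a), set(tags_b)
    union = a | b
    if not union:
        return 0.0

    def total(tags):
        # Tags missing from the weight table fall back to default_weight.
        return sum(weights.get(tag, default_weight) for tag in tags)

    return total(a & b) / total(union)


# Hypothetical example data: more significant properties get larger weights.
weights = {"body_sedan": 3.0, "color_red": 0.5, "doors_4": 1.0}
doc_a = ["body_sedan", "color_red", "doors_4"]
doc_b = ["body_sedan", "color_blue", "doors_4"]

score = weighted_tag_similarity(doc_a, doc_b, weights)
```

Because each tag appears at most once per document, the score here depends only on which tags are shared and how heavily they are weighted, not on term frequency.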

Given one document, I want to be able to find similar documents by comparing the tags. Should
I use Term Vectors and the MoreLikeThis functionality for this, or do Term Vectors only work
with term frequency (each tag will usually appear at most once in each
document)? Or should I be looking at the DisMax query handler instead, in order to apply boosts
to tag values?
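
For the boost option, one way to picture it is a query built from the source document's tags, with each clause carrying a boost in the standard Lucene query syntax (`term^boost`). This is a rough sketch under assumptions: the field name `tags`, the helper `build_boosted_query`, and the weights are all hypothetical, and it does not settle whether boosts or MoreLikeThis is the better fit.

```python
def build_boosted_query(tag_weights, field="tags"):
    """Build an OR query in which each tag clause carries its own boost."""
    clauses = []
    for tag, weight in sorted(tag_weights.items()):
        # Quote the tag and escape backslashes and quotes so the value is
        # treated as a literal term rather than as query syntax.
        escaped = tag.replace("\\", "\\\\").replace('"', '\\"')
        clauses.append(f'{field}:"{escaped}"^{weight}')
    return " OR ".join(clauses)


# Hypothetical tags and weights taken from the document being compared.
query = build_boosted_query({"body_sedan": 3.0, "doors_4": 1.0})
```

The resulting string could then be passed as the `q` parameter of a Solr search request, with the source document itself excluded from the results.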


Aidan
