lucene-solr-user mailing list archives

From Chris Fauerbach <chris.fauerb...@gmail.com>
Subject Re: Using MLT feature
Date Mon, 04 Apr 2011 09:22:17 GMT
Do you want to skip indexing when something similar already exists, or only when an exact
duplicate exists? For exact duplicates, I would look into a hash code of the document.
Similarity, though, I think has to be measured against a document already in the index.
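
To make the hash-code idea concrete, here is a minimal sketch in Python. The field names come from the original question; the normalization (lowercasing, collapsing whitespace), the choice of SHA-1, and the helper names doc_fingerprint and is_exact_duplicate are my own assumptions, not anything Solr prescribes. Solr also ships a deduplication update processor (SignatureUpdateProcessorFactory) that may cover this at index time.

```python
import hashlib

def doc_fingerprint(headline, body, medianame):
    """Return a hex digest identifying a document's exact content.

    Assumption: two docs count as "exact" duplicates when they match
    after lowercasing and collapsing runs of whitespace.
    """
    def norm(text):
        return " ".join(text.lower().split())

    # Join with a separator that should not occur in normal text, so that
    # ("ab", "c") and ("a", "bc") do not produce the same payload.
    payload = "\x1f".join(norm(f) for f in (medianame, headline, body))
    return hashlib.sha1(payload.encode("utf-8")).hexdigest()

# Hypothetical in-memory store; in practice the fingerprint could live
# in its own indexed field and be looked up before adding the doc.
seen = set()

def is_exact_duplicate(doc):
    """Record the doc's fingerprint; report whether it was seen before."""
    fp = doc_fingerprint(doc["headline"], doc["body"], doc["medianame"])
    if fp in seen:
        return True
    seen.add(fp)
    return False
```

This only catches exact (post-normalization) matches. A single changed word yields a different hash, which is why the "similar" case needs a comparison against the index instead.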

On Apr 4, 2011, at 5:16, Frederico Azeiteiro <Frederico.Azeiteiro@cision.com> wrote:

> Hi,
> 
> 
> 
> I would like to hear your opinion about the MLT feature and if it's a
> good solution to what I need to implement.
> 
> 
> 
> My index has fields like: headline, body and medianame.
> 
> What I need to do is, before adding a new doc, verify if a similar doc
> exists for this media.
> 
> 
> 
> My idea is to use the MoreLikeThisHandler
> (http://wiki.apache.org/solr/MoreLikeThisHandler) in the following way:
> 
> 
> 
> For each new doc, perform an MLT search with q=medianame and
> stream.body=headline+bodytext.
> 
> If no similar docs are found, then I can safely add the doc.
> 
> 
> 
> Is this feasible using the MLT handler? Is it a good approach? Is there
> a better way to perform this comparison?
> 
> 
> 
> Thank you for your help.
> 
> 
> 
> Best regards,
> 
> ____________________________________________
> 
> Frederico Azeiteiro
> 
> 
> 
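
For reference, the request described in the quoted message could be assembled roughly as below. This is a sketch under assumptions, not a tested recipe: the handler path in base_url, the use of fq (rather than q) to restrict matches to one medianame, and the mlt.mintf / mlt.mindf values are all my guesses for illustration, and build_mlt_url is a hypothetical helper. The parameter names stream.body, mlt.fl, mlt.mintf and mlt.mindf are the ones documented for the MoreLikeThisHandler.

```python
from urllib.parse import urlencode

def build_mlt_url(base_url, medianame, headline, bodytext, rows=5):
    """Build a MoreLikeThisHandler URL for a not-yet-indexed document.

    base_url is assumed to point at wherever the handler is registered,
    e.g. "http://localhost:8983/solr/mlt" (hypothetical).
    """
    params = {
        # Text of the new doc, sent as a content stream instead of a doc id.
        "stream.body": headline + " " + bodytext,
        # Assumption: restrict candidates to the same media via a filter query.
        "fq": 'medianame:"%s"' % medianame.replace('"', '\\"'),
        # Fields from the question that the similarity is computed on.
        "mlt.fl": "headline,body",
        # Illustrative thresholds; these would need tuning on real data.
        "mlt.mintf": 1,
        "mlt.mindf": 1,
        "rows": rows,
        "wt": "json",
    }
    return base_url + "?" + urlencode(params)
```

An empty result list would then be the "safe to add" signal from the question. For long article bodies, sending the text in a POST body is probably preferable to a long query string.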
