lucene-java-user mailing list archives

From Grant Ingersoll <>
Subject Re: MoreLikeThis for multiple documents
Date Thu, 26 Jul 2007 15:23:00 GMT
I have some sample code for doing relevance feedback across multiple  
documents at

It could be modified to provide more of the MoreLikeThis
functionality (i.e. determining important terms via tf/idf); for now
it just takes the top X terms.
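
For illustration, the tf/idf variant could look roughly like the sketch below. This is not the sample code referred to above and it does not use the Lucene API: SignificantTerms, tokenize and topTerms are hypothetical names, documents are plain strings held in memory, and the tokenizer is a naive stand-in for a real Analyzer. Term frequency is summed over the chosen subset, and idf is taken as log(N / df) over the whole collection.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical helper, not part of Lucene: ranks the terms of a document
// subset by summed tf * idf, with idf computed over the whole collection.
class SignificantTerms {

    // Naive tokenizer (assumption): lowercase, split on runs of non-letter/digit.
    static List<String> tokenize(String text) {
        List<String> tokens = new ArrayList<>();
        for (String t : text.toLowerCase().split("[^\\p{L}\\p{N}]+")) {
            if (!t.isEmpty()) {
                tokens.add(t);
            }
        }
        return tokens;
    }

    // collection: all documents; subset: indexes of the documents of interest.
    static List<String> topTerms(List<String> collection, List<Integer> subset, int maxTerms) {
        // Document frequency: number of documents in the collection containing the term.
        Map<String, Integer> df = new HashMap<>();
        for (String doc : collection) {
            Set<String> distinct = new HashSet<>(tokenize(doc));
            for (String term : distinct) {
                df.merge(term, 1, Integer::sum);
            }
        }

        // Term frequency: occurrences summed over the subset only.
        Map<String, Integer> tf = new HashMap<>();
        for (int id : subset) {
            for (String term : tokenize(collection.get(id))) {
                tf.merge(term, 1, Integer::sum);
            }
        }

        // Score each subset term as tf * log(N / df).
        int n = collection.size();
        Map<String, Double> score = new HashMap<>();
        for (Map.Entry<String, Integer> e : tf.entrySet()) {
            double idf = Math.log((double) n / df.get(e.getKey()));
            score.put(e.getKey(), e.getValue() * idf);
        }

        // Highest score first; ties broken alphabetically for a stable order.
        List<String> terms = new ArrayList<>(score.keySet());
        terms.sort((a, b) -> {
            int c = Double.compare(score.get(b), score.get(a));
            return c != 0 ? c : a.compareTo(b);
        });
        return new ArrayList<>(terms.subList(0, Math.min(maxTerms, terms.size())));
    }
}
```

Against a real index, the per-document counts would more likely come from stored term vectors and the document frequencies from the index reader, rather than from re-tokenizing the text as done here.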


On Jul 25, 2007, at 3:04 PM, Jens Grivolla wrote:

> Hello,
> I'm looking to extract significant terms characterizing a set of  
> documents (which in turn relate to a topic).
> This basically comes down to functionality similar to determining  
> the terms with the greatest offer weight (as used for blind  
> relevance feedback), or maximizing tf.idf (as is done in  
> MoreLikeThis).
> Is there anything like this already implemented, or do I need to  
> iterate through all documents in the set "manually", re-tokenize  
> each one (or maybe use TermVectors), and then calculate the weight  
> for each term?
> Thanks,
>    Jens
> ---------------------------------------------------------------------
> To unsubscribe, e-mail:
> For additional commands, e-mail:
