lucene-solr-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Yonik Seeley (JIRA)" <j...@apache.org>
Subject [jira] Commented: (SOLR-69) PATCH:MoreLikeThis support
Date Thu, 25 Jan 2007 15:00:49 GMT

    [ https://issues.apache.org/jira/browse/SOLR-69?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12467399
] 

Yonik Seeley commented on SOLR-69:
----------------------------------

> MoreLikeThis queries should work irrelevant of whether fields are stored or not, as it's
based on what's indexed

I haven't looked at the lucene-code for more-like-this, but it's just like highlighting...
to get the tokens for a specific document, you need to either get it's stored field and re-analyze
or store term vectors and use them.
Looking up those terms in other documents is then fast (that's where the inverted index comes
in)


> PATCH:MoreLikeThis support
> --------------------------
>
>                 Key: SOLR-69
>                 URL: https://issues.apache.org/jira/browse/SOLR-69
>             Project: Solr
>          Issue Type: Improvement
>          Components: search
>            Reporter: Bertrand Delacretaz
>            Priority: Minor
>         Attachments: lucene-queries-2.0.0.jar, SOLR-69.patch, SOLR-69.patch, SOLR-69.patch
>
>
> Here's a patch that implements simple support of Lucene's MoreLikeThis class.
> The MoreLikeThisHelper code is heavily based on (hmm..."lifted from" might be more appropriate
;-) Erik Hatcher's example mentioned in http://www.mail-archive.com/solr-user@lucene.apache.org/msg00878.html
> To use it, add at least the following parameters to a standard or dismax query:
>   mlt=true
>   mlt.fl=list,of,fields,which,define,similarity
> See the MoreLikeThisHelper source code for more parameters.
> Here are two URLs that work with the example config, after loading all documents found
in exampledocs in the index (just to show that it seems to work - of course you need a larger
corpus to make it interesting):
> http://localhost:8983/solr/select/?stylesheet=&q=apache&qt=standard&mlt=true&mlt.fl=manu,cat&mlt.mindf=1&mlt.mindf=1&fl=id,score
> http://localhost:8983/solr/select/?stylesheet=&q=apache&qt=dismax&mlt=true&mlt.fl=manu,cat&mlt.mindf=1&mlt.mindf=1&fl=id,score
> Results are added to the output like this:
> <response>
>   ...
>   <lst name="moreLikeThis">
>     <result name="UTF8TEST" numFound="1" start="0" maxScore="1.5293242">
>       <doc>
>         <float name="score">1.5293242</float>
>         <str name="id">SOLR1000</str>
>       </doc>
>     </result>
>     <result name="SOLR1000" numFound="1" start="0" maxScore="1.5293242">
>       <doc>
>         <float name="score">1.5293242</float>
>         <str name="id">UTF8TEST</str>
>       </doc>
>     </result>
>   </lst>
> I haven't tested this extensively yet, will do in the next few days. But comments are
welcome of course.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


Mime
View raw message