lucene-solr-user mailing list archives

From "Srisatya Pyla" <srisp...@in.ibm.com>
Subject RE: Need more info on MLT (More Like This) feature
Date Fri, 13 Sep 2019 17:55:02 GMT
Thank you very much for the quick response. This is very helpful to us.
While analyzing the results for some jobs, we found that it returns a high
score for a document that is not very relevant to the base document.
Is there any way we can improve the results and scoring?
How exactly does it compute the score for a matching document based on the
matching field? Knowing this would help us understand why it gives the
highest matching score to specific documents.


Regards,

SST  Narasimha Rao Pyla
IBM Talent Management Solutions
Mobile : +91 9849315546
E-mail : srispyla@in.ibm.com


IBM Visakha Hills
Visakhapatnam, AP 530045
India




From:   Chee Yee Lim <cheeyee.lim@gmail.com>
To:     Srisatya Pyla <srispyla@in.ibm.com>
Cc:     solr-user@lucene.apache.org, Rajeev Kasarabada1 
<kasarab1@in.ibm.com>, Archana Gavini1 <agavini1@in.ibm.com>
Date:   13/09/2019 04:32 PM
Subject:        [EXTERNAL] Re: Need more info on MLT (More Like This) 
feature



To use knnSearch, you need to submit a POST request to the Stream request 
handler.

Using your example query, you will need to rewrite it from this:

http://[SOLR URL]/mlt?q=sjkey:1414462-25600-5258&wt=json&indent=true&mlt=true&rows=100&mlt.fl=jobdescription&mlt.mindf=1&mlt.mintf=1&fl=jobtitle,jobdescription&fq=siteid:5258

to this (using curl as an example of sending a POST request):

curl --data-urlencode 'expr=knnSearch([collection_name],
id="1414462-25600-5258",
qf="jobdescription",
k=100,
fl="jobtitle,jobdescription,score",
sort="score desc",
fq="siteid:5258",
mintf=1, 
mindf=1)' http://[SOLRURL]/stream 

Note that this assumes sjkey is your document ID field.
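
A minimal sketch of how the same request could be wrapped in a small shell function, so that the document ID, field and filter can be varied between calls. The names build_knn_expr, knn_search, SOLR_URL and COLLECTION are hypothetical, and the default values are placeholders rather than anything from your setup. As in the command above, SOLR_URL is assumed to already point at the collection, so that /stream resolves to its Stream handler.

```shell
#!/usr/bin/env bash
# Placeholders (assumptions): replace with your own Solr URL and collection name.
SOLR_URL="${SOLR_URL:-http://localhost:8983/solr/jobs}"
COLLECTION="${COLLECTION:-jobs}"

# Build the knnSearch streaming expression for one source document.
# Arguments: document id, field to compare, number of neighbours,
# and an optional filter query.
build_knn_expr() {
  local doc_id="$1" field="$2" k="$3" fq="${4:-}"
  local expr
  expr="knnSearch(${COLLECTION}, id=\"${doc_id}\", qf=\"${field}\", k=${k}, fl=\"jobtitle,jobdescription,score\", sort=\"score desc\", mintf=1, mindf=1"
  # Append the filter query only when one was supplied.
  if [ -n "$fq" ]; then
    expr="${expr}, fq=\"${fq}\""
  fi
  printf '%s)\n' "$expr"
}

# POST the expression to the Stream handler; curl URL-encodes it.
knn_search() {
  curl --silent --data-urlencode "expr=$(build_knn_expr "$@")" \
    "${SOLR_URL}/stream"
}
```

With those placeholders set, the request above would correspond to calling: knn_search 1414462-25600-5258 jobdescription 100 siteid:5258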

More detailed documentation on how the Stream handler works is available here:
https://lucene.apache.org/solr/guide/8_1/streaming-expressions.html

Best wishes,
Chee Yee

On Fri, 13 Sep 2019 at 17:57, Srisatya Pyla <srispyla@in.ibm.com> wrote:
Hi Chee Yee Lim,


Thank you for your quick response.  
We could not find much documentation on knnSearch or on how to use it.
Could you please guide us with more information on how it can be used?

Can we use it the way we use Solr now, by querying a Solr URL like
http://[SOLR URL]/mlt...., or is there some other way?
Please also share any more detailed documentation you may have.


Regards,

SST  Narasimha Rao Pyla
 
----- Original message -----
From: Chee Yee Lim <cheeyee.lim@gmail.com>
To: solr-user@lucene.apache.org
Cc: Archana Gavini1 <agavini1@in.ibm.com>, Rajeev Kasarabada1 <kasarab1@in.ibm.com>
Subject: [EXTERNAL] Re: Need more info on MLT (More Like This) feature
Date: Thu, Sep 12, 2019 6:43 PM
 
I've been working with the MLT handler (Solr 8.1.1), calling it the same way
you did, via http://[SOLRURL]/mlt. However, the response is very unreliable:
90% of the same queries result in a Java null pointer exception, and only
10% return the expected response. I do not know the cause of this.
 
I overcame this problem by using knnSearch via the Stream handler
(https://lucene.apache.org/solr/guide/8_1/stream-source-reference.html#knnsearch).
It is just a wrapper around MLT, and it works brilliantly. It is worth
checking out if you are running Solr in cloud mode.
 
If you pass fl="score" and sort="score desc" to knnSearch, you will get
the results sorted by matching score.
 
Best wishes,
Chee Yee
  
On Thu, 12 Sep 2019 at 19:44, Srisatya Pyla <srispyla@in.ibm.com> wrote:
Hi Solr Search Team,

I am a developer from IBM Kenexa Brassring. We use the Solr search engine
to search jobs in our applications.
We are planning to use the MLT feature to get similar documents (jobs)
based on one document (job).

While exploring this option, we used the job's JobDescription as the
matching field, and we are getting some unexpected, unrelated documents in
the MLT results.

The query looks like this:

http://[SOLR URL]/mlt?q=sjkey:1414462-25600-5258&wt=json&indent=true&mlt=true&rows=100&mlt.fl=jobdescription&mlt.mindf=1&mlt.mintf=1&fl=jobtitle,jobdescription&fq=siteid:5258



We have a few questions:
1) Is there any way to get the matching score for each matching document
in the MLT results, so that we can sort on the score and have the
best-matching document at the top of the results?

2) Are there any best practices for using the MLT handler?


Regards, 

SST  Narasimha Rao Pyla