lucene-solr-user mailing list archives

From Dave <hastings.recurs...@gmail.com>
Subject Re: Need more info on MLT (More Like This) feature
Date Fri, 13 Sep 2019 20:09:00 GMT
As a side note, if you use shingles with the MLT handler, I believe you will get better scores and more relevant results. So “to be free” gets indexed as “to_be”, “to_be_free” and “be_free”, but also as each individual word. It makes the index significantly larger but creates better “unique terms” in my opinion, and it improved the results for me at least.
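
To make the shingle idea above concrete, here is a minimal sketch in Python of how a phrase expands into single words plus word shingles. It is only an illustration: the function name `shingles` and its parameters are made up for this example, and in Solr itself this expansion would normally happen in the field's analysis chain at index time (for example with a shingle filter), not in client code.

```python
def shingles(text, min_size=2, max_size=3, sep="_"):
    """Return the single words of `text` followed by its word shingles.

    Hypothetical helper for illustration only; it mimics the
    "to be free" example from the message above.
    """
    words = text.lower().split()
    terms = list(words)  # keep each individual word as its own term
    for size in range(min_size, max_size + 1):
        # slide a window of `size` words across the phrase
        for start in range(len(words) - size + 1):
            terms.append(sep.join(words[start:start + size]))
    return terms


print(shingles("to be free"))
# ['to', 'be', 'free', 'to_be', 'be_free', 'to_be_free']
```

The extra terms are why the index grows: a three-word phrase yields six terms instead of three, but multi-word terms such as "to_be_free" are much rarer than the individual words, so they tend to carry more weight when MLT picks "interesting" terms.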

> On Sep 13, 2019, at 2:51 PM, Srisatya Pyla <srispyla@in.ibm.com> wrote:
> 
> Thank you very much for the quick response. This is very helpful to us.
> While analyzing the results for some jobs, we see it returning a high score for a document that is not very relevant to the base document.
> Is there any way we can improve the results and scoring?
> How exactly does it compute the score for a matching document based on the matching field? Knowing this would help us understand why it gives the highest matching score to specific documents.
> 
> 
> Regards,
> SST  Narasimha Rao Pyla
> IBM Talent Management Solutions
> Mobile :+91 9849315546
> E-mail :srispyla@in.ibm.com	
> 
> 
> IBM Visakha Hills
> Visakhapatnam, AP 530045
> India
> 
> From:        Chee Yee Lim <cheeyee.lim@gmail.com>
> To:        Srisatya Pyla <srispyla@in.ibm.com>
> Cc:        solr-user@lucene.apache.org, Rajeev Kasarabada1 <kasarab1@in.ibm.com>, Archana Gavini1 <agavini1@in.ibm.com>
> Date:        13/09/2019 04:32 PM
> Subject:        [EXTERNAL] Re: Need more info on MLT (More Like This) feature
> 
> 
> 
> To use knnSearch, you need to submit a POST request to the Stream request handler.
> 
> Using your example query, you will need to rewrite it from this:
> 
> http://[SOLRURL]/mlt?q=sjkey:1414462-25600-5258&wt=json&indent=true&mlt=true&rows=100&mlt.fl=jobdescription&mlt.mindf=1&mlt.mintf=1&fl=jobtitle,jobdescription&fq=siteid:5258
> 
> to this (using curl as an example to send the POST request):
> 
> curl --data-urlencode 'expr=knnSearch([collection_name],
> id="1414462-25600-5258",
> qf="jobdescription",
> k=100,
> fl="jobtitle,jobdescription,score",
> sort="score desc",
> fq="siteid:5258",
> mintf=1, 
> mindf=1)' http://[SOLRURL]/stream
> 
> Note that this assumes sjkey is your document ID field.
> 
> More detailed documentation on how the Stream handler works is available here: https://lucene.apache.org/solr/guide/8_1/streaming-expressions.html
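
For readers who would rather build the request in code than with curl, the following is a rough Python sketch that assembles the same knnSearch expression and form-encodes it for a POST to the Stream handler. It does not contact Solr. The helper name `knn_search_expr` and the collection name `jobs` are assumptions made for this example; the parameter names (id, qf, k, fl, sort, fq, mintf, mindf) are the ones used in the curl command above.

```python
from urllib.parse import urlencode


def knn_search_expr(collection, doc_id, qf, k, fl, fq=None, mintf=1, mindf=1):
    """Build a knnSearch streaming expression string (no request is sent).

    Hypothetical helper; it mirrors the parameters of the curl example.
    """
    params = [
        collection,
        f'id="{doc_id}"',
        f'qf="{qf}"',
        f"k={k}",
        f'fl="{fl}"',
        'sort="score desc"',  # highest-scoring documents first
    ]
    if fq:
        params.append(f'fq="{fq}"')
    params += [f"mintf={mintf}", f"mindf={mindf}"]
    return "knnSearch(" + ", ".join(params) + ")"


# "jobs" is a placeholder collection name, not taken from the thread.
expr = knn_search_expr(
    "jobs",
    "1414462-25600-5258",
    qf="jobdescription",
    k=100,
    fl="jobtitle,jobdescription,score",
    fq="siteid:5258",
)
body = urlencode({"expr": expr})  # form-encoded body for the POST
print(expr)
print(body)
```

The encoded `body` could then be sent to http://[SOLRURL]/stream with any HTTP client, which is what `curl --data-urlencode` does in the example above.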
> 
> Best wishes,
> Chee Yee
> 
> On Fri, 13 Sep 2019 at 17:57, Srisatya Pyla <srispyla@in.ibm.com> wrote:
> Hi Chee Yee Lim,
> 
> 
> Thank you for your quick response.  
> We could not find much documentation on how to use knnSearch.
> Could you please guide us with more information on how it can be used?
> 
> Can we use it the way we use Solr today, by querying a Solr URL like http://[SOLR URL]/mlt.... ? Or is there another way?
> Please also share any more detailed documentation you may have.
> 
> 
> Regards,
> SST  Narasimha Rao Pyla
> IBM Talent Management Solutions
> Mobile :+91 9849315546
> E-mail :srispyla@in.ibm.com	
> 
> 
> IBM Visakha Hills
> Visakhapatnam, AP 530045
> India
> 
> ----- Original message -----
> From: Chee Yee Lim <cheeyee.lim@gmail.com>
> To: solr-user@lucene.apache.org
> Cc: Archana Gavini1 <agavini1@in.ibm.com>, Rajeev Kasarabada1 <kasarab1@in.ibm.com>
> Subject: [EXTERNAL] Re: Need more info on MLT (More Like This) feature
> Date: Thu, Sep 12, 2019 6:43 PM
>  
> I've been working with the MLT handler (Solr 8.1.1) by calling it the same way you did, http://[SOLRURL]/mlt. But the response is very unreliable: 90% of the same queries result in a Java null pointer exception, and only 10% return the expected response. I do not know the cause of this.
>  
> I overcame this problem by using knnSearch via the Stream handler (https://lucene.apache.org/solr/guide/8_1/stream-source-reference.html#knnsearch). It is just a wrapper around MLT, and it works brilliantly. It is worth checking out if you are running Solr in cloud mode.
>  
> If you pass fl="score" and sort="score desc" to knnSearch, you will get the results sorted by matching score.
>  
> Best wishes,
> Chee Yee
>   
> On Thu, 12 Sep 2019 at 19:44, Srisatya Pyla <srispyla@in.ibm.com> wrote:
> Hi Solr Search Team,
> 
> I am a developer from IBM Kenexa Brassring. We use the Solr search engine for searching jobs in our applications.
> We are planning to use the MLT feature to get similar documents (jobs) based on one document (job).
> 
> While exploring this option, we are using the job's JobDescription as the matching field, and we are getting some unexpected, unrelated documents in the MLT results.
> 
> The query looks like this:
> 
> http://[SOLRURL]/mlt?q=sjkey:1414462-25600-5258&wt=json&indent=true&mlt=true&rows=100&mlt.fl=jobdescription&mlt.mindf=1&mlt.mintf=1&fl=jobtitle,jobdescription&fq=siteid:5258
> 
> 
> We have a few questions:
> 1) Is there any way to get the matching score for each matching document in the MLT results, so that we can sort on the score and have the best-matching document at the top of the results?
> 
> 2) Are there any best practices for using the MLT handler?
> 
> 
> Regards,
> SST  Narasimha Rao Pyla
> IBM Talent Management Solutions
> Mobile :+91 9849315546
> E-mail :srispyla@in.ibm.com	
> 
> 
> IBM Visakha Hills
> Visakhapatnam, AP 530045
> India
> 
