lucene-solr-user mailing list archives

From Markus Jelsma <markus.jel...@openindex.io>
Subject RE: Solr 6 and IDF
Date Tue, 08 Aug 2017 20:53:02 GMT
Yes: extend the default Similarity, return 1.0f from idf (and probably override the idfExplain
methods as well), and configure it in your schema, either globally or per field.
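
For illustration, here is a minimal, self-contained sketch of that override. It is a standalone model of the usual BM25 per-term formula, not Lucene's code: Bm25Model, NoIdfBm25Model and NoIdfDemo are hypothetical names, and k1 = 1.2, b = 0.75 are the commonly used defaults. In a real plugin the class would instead extend org.apache.lucene.search.similarities.BM25Similarity and override its idf(long docFreq, long docCount) hook; please check that signature against your Lucene version.

```java
// Standalone model of BM25 per-term scoring (an assumption for illustration,
// not Lucene's implementation).
class Bm25Model {
    protected final float k1 = 1.2f;  // term-frequency saturation
    protected final float b = 0.75f;  // length-normalization strength

    // Inverse document frequency: rare terms get a larger weight.
    protected float idf(long docFreq, long docCount) {
        return (float) Math.log(1 + (docCount - docFreq + 0.5d) / (docFreq + 0.5d));
    }

    // Score contribution of one term in one document.
    public float score(float freq, float docLen, float avgDocLen,
                       long docFreq, long docCount) {
        float norm = k1 * (1 - b + b * docLen / avgDocLen);
        return idf(docFreq, docCount) * (freq * (k1 + 1)) / (freq + norm);
    }
}

// Same model with IDF switched off: every term gets the same weight.
class NoIdfBm25Model extends Bm25Model {
    @Override
    protected float idf(long docFreq, long docCount) {
        return 1.0f;
    }
}

class NoIdfDemo {
    public static void main(String[] args) {
        Bm25Model standard = new Bm25Model();
        Bm25Model noIdf = new NoIdfBm25Model();

        // One occurrence in a field of average length, in a 10,000-document index.
        // The "rare" term occurs in 5 documents, the "common" one in 5,000.
        System.out.println("standard rare:   " + standard.score(1f, 10f, 10f, 5L, 10000L));
        System.out.println("standard common: " + standard.score(1f, 10f, 10f, 5000L, 10000L));
        System.out.println("no-idf rare:     " + noIdf.score(1f, 10f, 10f, 5L, 10000L));
        System.out.println("no-idf common:   " + noIdf.score(1f, 10f, 10f, 5000L, 10000L));
    }
}
```

With idf pinned to 1, a rare term and a common term contribute the same score for the same term frequency and field length, which is the behaviour asked for in this thread. Whether the idfExplain methods also need overriding depends on how your Lucene version builds its explanations, so it is worth checking the debugQuery output after deploying.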

If you think this is a good idea, why not also return 1.0f for tf? And while you're at it,
why not also set omitNorms on all fields?

I am curious whether this is going to help you. Please let us know!
 
-----Original message-----
> From:Webster Homer <webster.homer@sial.com>
> Sent: Tuesday 8th August 2017 22:44
> To: solr-user@lucene.apache.org
> Subject: Re: Solr 6 and IDF
> 
> It appears that all I need to do is create a class that
> extends BM25Similarity, and have the new class return 1 as the idf. Is that
> correct?
> 
> On Tue, Aug 8, 2017 at 3:15 PM, Webster Homer <webster.homer@sial.com>
> wrote:
> 
> > I do want to use BM25, just disable IDF
> >
> > On Tue, Aug 8, 2017 at 2:58 PM, Peter Lancaster <
> > peter.lancaster@findmypast.com> wrote:
> >
> >> Hi Webster,
> >>
> >> If you're not set on using the BM25 similarity, you should be able to
> >> continue as before: provide your own similarity class that extends
> >> ClassicSimilarity, override the idf method to always return 1, and then
> >> reference that class in your schema, e.g.
> >> <similarity class="brightsolid.solr.plugins.MyCustomSimilarity" />
> >>
> >> As far as I know you've been able to have different similarities per
> >> field in solr for a while now.
> >> https://wiki.apache.org/solr/SchemaXml#Similarity
> >>
> >> Cheers,
> >> Peter Lancaster.
> >>
> >>
> >> -----Original Message-----
> >> From: Webster Homer [mailto:webster.homer@sial.com]
> >> Sent: 08 August 2017 20:39
> >> To: solr-user@lucene.apache.org
> >> Subject: Solr 6 and IDF
> >>
> >> Our most common use for Solr is searching for products, not text search.
> >> My company is in the process of migrating away from an Endeca search
> >> engine. To keep the business happy, the goal is to make sure that search
> >> results from the two engines are fairly similar. One factor we have found
> >> that keeps a result from ranking as well as it did in the old system is
> >> the idf.
> >>
> >> We are using Solr 6. After moving to it, a lot of our results got better,
> >> but idf still seems to deaden some results. Given that our focus is product
> >> searching, I really don't see a need for idf at all. Before Solr 6 you
> >> could suppress idf by providing a custom similarity class. Looking over the
> >> newer documentation, a lot of things have improved, but I'm not sure I see a
> >> simple way to turn off idf in Solr 6's BM25 similarity.
> >>
> >> How do I disable IDF in Solr 6?
> >>
> >> We also have needs for text searching, so it would be nice if we could
> >> suppress IDF at the field or schema level.
> >>
> >> --
> >>
> >>
> >> This message and any attachment are confidential and may be privileged or
> >> otherwise protected from disclosure. If you are not the intended recipient,
> >> you must not copy this message or attachment or disclose the contents to
> >> any other person. If you have received this transmission in error, please
> >> notify the sender immediately and delete the message and any attachment
> >> from your system. Merck KGaA, Darmstadt, Germany and any of its
> >> subsidiaries do not accept liability for any omissions or errors in this
> >> message which may arise as a result of E-Mail-transmission or for damages
> >> resulting from any unauthorized changes of the content of this message and
> >> any attachment thereto. Merck KGaA, Darmstadt, Germany and any of its
> >> subsidiaries do not guarantee that this message is free of viruses and does
> >> not accept liability for any damages caused by any virus transmitted
> >> therewith.
> >>
> >> Click http://www.emdgroup.com/disclaimer to access the German, French,
> >> Spanish and Portuguese versions of this disclaimer.
> >> ________________________________
> >>
> >> This message is confidential and may contain privileged information. You
> >> should not disclose its contents to any other person. If you are not the
> >> intended recipient, please notify the sender named above immediately. It is
> >> expressly declared that this e-mail does not constitute nor form part of a
> >> contract or unilateral obligation. Opinions, conclusions and other
> >> information in this message that do not relate to the official business of
> >> findmypast shall be understood as neither given nor endorsed by it.
> >> ________________________________
> >>
> >>
> >
> >
> 
