lucene-java-user mailing list archives

From Brisbart Franck <Franck.Brisb...@kelkoo.net>
Subject Re: score and frequency
Date Thu, 24 Jun 2004 14:08:33 GMT
I don't think your title field can be used to search for an exact
match, because the title in that field is probably tokenized.
The other field for the title should contain the title as a keyword, i.e.
not tokenized. So, if the title is 'sea-lion', the term
("title","sea-lion") is stored.

The transformation I mentioned is there to avoid storing the raw term, i.e.
with its original case and special characters.
Something like this:
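--
import java.io.IOException;
import java.io.StringReader;
import org.apache.lucene.analysis.Token;
import org.apache.lucene.analysis.TokenStream;
import org.apache.lucene.analysis.standard.StandardAnalyzer;

// Run the title through the analyzer and join the tokens with single spaces.
StandardAnalyzer analyzer = new StandardAnalyzer();
Token token;
StringBuffer buf = new StringBuffer();
try {
   TokenStream stream = analyzer.tokenStream("title", new StringReader(title));
   while ((token = stream.next()) != null) {
     buf.append(' ');
     buf.append(token.termText());
   }
} catch(IOException ioe) {
   ioe.printStackTrace();
}
// trim() drops the leading space added before the first token
String transformedTitle = buf.toString().trim();
--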
So, if you search for "sea-lion", "sea  lion" or "Sea lion", the
transformed text will be "sea lion", the TermQuery will be new
TermQuery(new Term("newField", "sea lion")), and you can then search
for an exact match. The title stored in the keyword field has to go
through the same transformation at indexing time.
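
To see the effect of the transformation without setting up an index, here
is a minimal sketch in plain Java. It is only an illustration under
assumptions: the class ExactTitleKey, its normalize method and the HashMap
are hypothetical stand-ins for StandardAnalyzer and for the keyword field,
and the regular expression only approximates what the analyzer does
(lower-casing, splitting on punctuation and whitespace). The point is just
that the title and the search string go through the same transformation
before they are compared.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;

public class ExactTitleKey {

    // Stand-in for the analyzer pass: lower-case the text, treat every run of
    // characters that are not letters or digits as a separator, and join the
    // tokens with one space. This is NOT StandardAnalyzer, only an approximation.
    public static String normalize(String text) {
        StringBuilder buf = new StringBuilder();
        for (String token : text.toLowerCase(Locale.ROOT).split("[^\\p{L}\\p{N}]+")) {
            if (token.isEmpty()) {
                continue; // split() yields an empty first token on a leading separator
            }
            if (buf.length() > 0) {
                buf.append(' ');
            }
            buf.append(token);
        }
        return buf.toString();
    }

    // Stand-in for the untokenized keyword field: normalized title -> original titles.
    private final Map<String, List<String>> byKey = new HashMap<>();

    // "Indexing": store the title under its transformed form.
    public void add(String title) {
        byKey.computeIfAbsent(normalize(title), k -> new ArrayList<>()).add(title);
    }

    // "Searching": apply the same transformation, then look up the whole key.
    public List<String> exactMatches(String search) {
        return byKey.getOrDefault(normalize(search), new ArrayList<>());
    }

    public static void main(String[] args) {
        ExactTitleKey index = new ExactTitleKey();
        index.add("sea-lion");
        index.add("lion");
        System.out.println(index.exactMatches("Sea  lion")); // prints [sea-lion]
        System.out.println(index.exactMatches("lion"));      // prints [lion]
    }
}
```

Because the whole transformed title is the key, a search for "lion" no
longer matches the title "sea-lion", which is the behaviour the keyword
field plus TermQuery is meant to give.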

Franck

Niraj Alok wrote:
> Hi Franck,
> 
> I already had a field separately for the title. Replacing the PhraseQuery
> with the TermQuery did not help.
> I haven't tried the transformation part. What kind of transformation are you
> talking about ? How to do this transformation?
> Can you provide some more details please?
> 
> 
> Regards,
> Niraj
> ----- Original Message -----
> From: "Brisbart Franck" <Franck.Brisbart@kelkoo.net>
> To: "Lucene Users List" <lucene-user@jakarta.apache.org>
> Sent: Thursday, June 24, 2004 7:04 PM
> Subject: Re: score and frequency
> 
> 
> 
>>Forget about the PhraseQuery, I'm stupid, it can't work like that.
>>The phrase query boosts the documents which contain the search phrase,
>>not the documents which match the search exactly. So, the
>>exact matches will come down. :-/
>>
>>You need some information about the lucene documents to know if
>>it's an exact match, such as the number of terms in the document. The
>>problem is that this number is stored in the lengthNorm and, as it's
>>encoded on 1 byte, you can't get it precisely. So, you should work
>>around the problem.
>>Here's another suggestion (a good one I hope):
>>Add another field containing the title as a Keyword. Then you just have
>>to replace the PhraseQuery I told you to use by a TermQuery searching
>>for the term (newField,search).
>>Of course, it would be a bit too restrictive to store the title without
>>any transformation. You can for example store in this field the
>>concatenation of the tokens given by your analyzer. Just don't forget to
>>apply the same transformation to the search as well.
>>Sorry for the previous posts.
>>
>>Franck
>>
>>Niraj Alok wrote:
>>
>>>Hi Franck,
>>>
>>>You seem to be a genius in lucene !
>>>
>>>I have done finally all that which you have suggested, but now when I am
>>>searching for "lion", those terms are coming much below in terms of scores.
>>>This is despite me setting the boost for the phrase query. In fact, this is
>>>resulting in almost all the exact matches to come down.
>>>
>>>
>>>Regards,
>>>Niraj
>>>----- Original Message -----
>>>From: "Brisbart Franck" <Franck.Brisbart@kelkoo.net>
>>>To: "Lucene Users List" <lucene-user@jakarta.apache.org>
>>>Sent: Thursday, June 24, 2004 6:02 PM
>>>Subject: Re: score and frequency
>>>
>>>
>>>
>>>
>>>>It may come from the boolean clauses. If you add your sub-queries with a
>>>>'required' flag, you'll only get the results matching all the words in
>>>>your query.
>>>>It can also come from the score which is different. If you set up a
>>>>threshold to return the results, it can be the problem.
>>>>
>>>>Franck
>>>>
>>>>
>>>>Niraj Alok wrote:
>>>>
>>>>
>>>>>Hi Franck,
>>>>>
>>>>>Thank you so much for the detailed explanation.
>>>>>However, when I tried to break up my MultiFieldQueryParser into a series of
>>>>>BooleanQueries, the result set has got reduced drastically.
>>>>>Any idea why this could be happening?
>>>>>
>>>>>Regards,
>>>>>Niraj
>>>>>----- Original Message -----
>>>>>From: "Brisbart Franck" <Franck.Brisbart@kelkoo.net>
>>>>>To: "Lucene Users List" <lucene-user@jakarta.apache.org>
>>>>>Sent: Thursday, June 24, 2004 2:54 PM
>>>>>Subject: Re: score and frequency
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>>The MultiFieldQueryParser gives you a BooleanQuery containing 1 query for
>>>>>>each field.
>>>>>>Something like:
>>>>>>          BooleanQuery
>>>>>>          /   |   |   \
>>>>>>        QF1  QF2 QF3  QF4    (QFx=Query for field x)
>>>>>>
>>>>>>You can still use the MultiFieldQueryParser and create a BooleanQuery to
>>>>>>encapsulate the one parsed + the PhraseQuery, ie:
>>>>>>           BooleanQuery(created by you)
>>>>>>            /       \
>>>>>>          BQ      PhraseQuery
>>>>>>
>>>>>>Or create the whole query (I think you should do that) and have
>>>>>>something like that:
>>>>>>           _BooleanQuery__
>>>>>>          /   |   |   \   \
>>>>>>        QF1  QF2 QF3  QF4  PhraseQuery      (QFx=Query for field x)
>>>>>>
>>>>>>It's like parsing the following query:
>>>>>>(field1:query) (field2:query) (field3:query)...(fieldx:query)
>>>>>>(title:"query")^boost
>>>>>>
>>>>>>
>>>>>>Franck
>>>>>>
>>>>>>
>>>>>>Niraj Alok wrote:
>>>>>>
>>>>>>
>>>>>>
>>>>>>>I asked the previous question since I do not know how to use PhraseQuery.
>>>>>>>I have one booleanquery and one query.
>>>>>>>The query is Query query =  MultiFieldQueryParser.parse( qs, searchLoc,
>>>>>>>flags, new StandardAnalyzer(stop));
>>>>>>>
>>>>>>>where qs is the word to be searched upon and searchLoc contains all the four
>>>>>>>fields.
>>>>>>>
>>>>>>>How do I insert a PhraseQuery here for title field only, and that too with
>>>>>>>its boosted value?
>>>>>>>
>>>>>>>
>>>>>>>Regards,
>>>>>>>Niraj
>>>>>>>----- Original Message -----
>>>>>>>From: "Niraj Alok" <niraj@emacmillan.com>
>>>>>>>To: "Lucene Users List" <lucene-user@jakarta.apache.org>
>>>>>>>Sent: Thursday, June 24, 2004 2:00 PM
>>>>>>>Subject: Re: score and frequency
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>>Does it mean that I would need to abandon MultiFieldQueryParser?
>>>>>>>>
>>>>>>>>Regards,
>>>>>>>>Niraj
>>>>>>>>----- Original Message -----
>>>>>>>>From: "Brisbart Franck" <Franck.Brisbart@kelkoo.net>
>>>>>>>>To: "Lucene Users List" <lucene-user@jakarta.apache.org>
>>>>>>>>Sent: Thursday, June 24, 2004 1:22 PM
>>>>>>>>Subject: Re: score and frequency
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>>Hi,
>>>>>>>>>first, what do you consider as an 'exact matching' ? It seems that you
>>>>>>>>>treat the search word by word, so 'lion sea' will be an 'exact match' of
>>>>>>>>>'sea-lion'.
>>>>>>>>>I think you should add a PhraseQuery to your query containing the title
>>>>>>>>>and with a big boost. So, you don't need to boost your title field. Only
>>>>>>>>>the results matching exactly (for the PhraseQuery) will be boosted.
>>>>>>>>>
>>>>>>>>>Franck
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>Niraj Alok wrote:
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>>Hi Guys,
>>>>>>>>>>
>>>>>>>>>>I seem to have run into rough weather again.
>>>>>>>>>>To describe the problem as concisely as possible, I have four fields to
>>>>>>>>>>search upon : title , first para, rest of the paras and content (equal to
>>>>>>>>>>title + first para + rest of the para) .  I am doing this by using
>>>>>>>>>>MultiFieldQueryParser.
>>>>>>>>>>
>>>>>>>>>>Now there is a very complicated ranking algorithm specified by the
>>>>>>>>>>client and I have met most of them except one or two and really need your
>>>>>>>>>>help as all my other efforts have failed.
>>>>>>>>>>
>>>>>>>>>>The most important rule is that exact matching titles should come first,
>>>>>>>>>>i.e. get higher scores.
>>>>>>>>>>
>>>>>>>>>>I have given the highest boost factor to the title than the rest but the
>>>>>>>>>>problem comes up when there is some other title which has got just one word
>>>>>>>>>>matching. For e.g., if I search for lion, there is a title sea-lion which
>>>>>>>>>>also has the same boost factor as that of "lion" in the index. Also,
>>>>>>>>>>sea-lion has got some more "lion" in its first para or rest of the paras
>>>>>>>>>>etc. such that its score comes higher than "lion".
>>>>>>>>>>
>>>>>>>>>>Is there some way to get the exact matching titles higher scores?
>>>>>>>>>>Please reply soon.
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>Regards,
>>>>>>>>>>Niraj
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>----- Original Message -----
>>>>>>>>>>From: "Brisbart Franck" <Franck.Brisbart@kelkoo.net>
>>>>>>>>>>To: "Lucene Users List" <lucene-user@jakarta.apache.org>
>>>>>>>>>>Sent: Monday, June 07, 2004 12:50 PM
>>>>>>>>>>Subject: Re: score and frequency
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>>It seems that you don't want the length norm to be used. It's a factor
>>>>>>>>>>>which normalizes the score of a doc depending on the size of the searched
>>>>>>>>>>>field of the doc. It's this factor which makes 'ground ice' score higher
>>>>>>>>>>>than 'ice hockey: British Sekonda Superleague Play-Off
>>>>>>>>>>>Championship: finals', because it only has 2 terms.
>>>>>>>>>>>So, I suggest you override the lengthNorm method and ignore the
>>>>>>>>>>>numTokens parameter.
>>>>>>>>>>>NB: The length norm is computed at indexing time and the norms are
>>>>>>>>>>>stored in the index (in the _aaa.f# files). So, you need to re-index
>>>>>>>>>>>your data, and use this similarity during indexing.
>>>>>>>>>>>
>>>>>>>>>>>Cheers,
>>>>>>>>>>>Franck
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>Niraj Alok wrote:
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>>I have set the searcher.setSimilarity as well as also tried setting the
>>>>>>>>>>>>coord factor to 1.
>>>>>>>>>>>>
>>>>>>>>>>>>The problem as given by an example is : Lets say I have titles to be
>>>>>>>>>>>>displayed depending upon the search.
>>>>>>>>>>>>E.g if i have "ice hockey" as the search item and if it is default
>>>>>>>>>>>>similarity, my results are :
>>>>>>>>>>>>
>>>>>>>>>>>>ice hockey  0.99999994
>>>>>>>>>>>>ice hockey  0.75
>>>>>>>>>>>>ice hockey  0.75
>>>>>>>>>>>>winter Olympics: hockey, ice, medallists  0.17402513
>>>>>>>>>>>>ice age  0.073680125
>>>>>>>>>>>>National Hockey League  0.020266924
>>>>>>>>>>>>Cracking the Ice Age  0.018420031
>>>>>>>>>>>>ground-ice  0.011512519
>>>>>>>>>>>>ice hockey: British Sekonda Superleague Play-Off Championship: finals  0.0069075115
>>>>>>>>>>>>(the numbers indicating the score).
>>>>>>>>>>>>
>>>>>>>>>>>>But if i set the similarity as my overridden one, the results become:
>>>>>>>>>>>>
>>>>>>>>>>>>ice hockey  0.99999994
>>>>>>>>>>>>ice hockey  0.75
>>>>>>>>>>>>ice hockey  0.75
>>>>>>>>>>>>ice age  0.22104037
>>>>>>>>>>>>winter Olympics: hockey, ice, medallists  0.17402513
>>>>>>>>>>>>National Hockey League  0.060800765
>>>>>>>>>>>>Cracking the Ice Age  0.055260092
>>>>>>>>>>>>ground-ice  0.034537554
>>>>>>>>>>>>ice hockey: British Sekonda Superleague Play-Off Championship: finals  0.020722535
>>>>>>>>>>>>
>>>>>>>>>>>>I want all the titles which have both "ice" and "hockey" to come above the
>>>>>>>>>>>>rest (to have higher scores)
>>>>>>>>>>>>Meaning i would wish the results to appear like:
>>>>>>>>>>>>
>>>>>>>>>>>>ice hockey
>>>>>>>>>>>>ice hockey
>>>>>>>>>>>>ice hockey
>>>>>>>>>>>>winter Olympics: hockey, ice, medallists
>>>>>>>>>>>>ice hockey: British Sekonda Superleague Play-Off Championship: finals
>>>>>>>>>>>>ice age
>>>>>>>>>>>>National Hockey League
>>>>>>>>>>>>Cracking the Ice Age
>>>>>>>>>>>>ground-ice
>>>>>>>>>>>>
>>>>>>>>>>>>My overridden similarity class contains just this method:
>>>>>>>>>>>>public float coord(int overlap, int maxOverlap) {
>>>>>>>>>>>>return 1.0f;
>>>>>>>>>>>>}
>>>>>>>>>>>>
>>>>>>>>>>>>I feel it is the weight factor which is producing undesirable results. Any
>>>>>>>>>>>>help in this regard would be highly appreciated.
>>>>>>>>>>>>
>>>>>>>>>>>>Regards,
>>>>>>>>>>>>Niraj
>>>>>>>>>>>>
>>>>>>>>>>>>----- Original Message -----
>>>>>>>>>>>>From: "Brisbart Franck" <Franck.Brisbart@kelkoo.net>
>>>>>>>>>>>>To: "Lucene Users List" <lucene-user@jakarta.apache.org>
>>>>>>>>>>>>Sent: Friday, June 04, 2004 8:46 PM
>>>>>>>>>>>>Subject: Re: score and frequency
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>>Hi,
>>>>>>>>>>>>>
>>>>>>>>>>>>>Be careful to set the default similarity
>>>>>>>>>>>>>'Similarity.setDefault(similarity)' before creating your search instance
>>>>>>>>>>>>>(IndexSearcher).
>>>>>>>>>>>>>If you change the default similarity after, you'll still use the old one.
>>>>>>>>>>>>>You'd better use the 'searcher.setSimilarity' method on your searcher.
>>>>>>>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>>>>>>>Franck
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>>Phil brunet wrote:
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>>>Hi to all.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>Maybe the term frequency is not the only parameter you need to override
>>>>>>>>>>>>>>to "customize" the score attributed by Lucene.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>Maybe you should consider the normalisation factor, the idf and the
>>>>>>>>>>>>>>coord factor ?
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>Philippe
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>From: "Niraj Alok" <niraj@emacmillan.com>
>>>>>>>>>>>>>>>Reply-To: "Lucene Users List" <lucene-user@jakarta.apache.org>
>>>>>>>>>>>>>>>To: "Lucene Users List" <lucene-user@jakarta.apache.org>
>>>>>>>>>>>>>>>Subject: Re: score and frequency
>>>>>>>>>>>>>>>Date: Fri, 4 Jun 2004 15:13:32 +0530
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>Hi Erik,
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>Thanks for the suggestion.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>I tried this:
>>>>>>>>>>>>>>>public class RelevanceSimilarity extends DefaultSimilarity
>>>>>>>>>>>>>>>{
>>>>>>>>>>>>>>>public float tf(float freq) {
>>>>>>>>>>>>>>>System.out.println("discounting frequency");
>>>>>>>>>>>>>>>return (float)1;
>>>>>>>>>>>>>>>}
>>>>>>>>>>>>>>>}
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>and in my query class, I used :
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>Similarity.setDefault(similarity);
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>Hits hits = is.search(query);
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>for(i = 0; i < hits.length(); i ++)
>>>>>>>>>>>>>>>result = result + hits.score(i);
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>However, this is still not giving me the expected result. Do I need to do
>>>>>>>>>>>>>>>something else?
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>Regards,
>>>>>>>>>>>>>>>Niraj
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>----- Original Message -----
>>>>>>>>>>>>>>>From: "Erik Hatcher" <erik@ehatchersolutions.com>
>>>>>>>>>>>>>>>To: "Lucene Users List" <lucene-user@jakarta.apache.org>
>>>>>>>>>>>>>>>Sent: Friday, June 04, 2004 1:55 PM
>>>>>>>>>>>>>>>Subject: Re: score and frequency
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>On Jun 4, 2004, at 2:52 AM, Niraj Alok wrote:
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>Hi,
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>I am having some problems with the score of lucene.
>>>>>>>>>>>>>>>>>I am trying to get the results displayed according to hits.score and
>>>>>>>>>>>>>>>>>it is giving the results correctly.
>>>>>>>>>>>>>>>>>However I do not want the frequency factor to be used for the
>>>>>>>>>>>>>>>>>computation of the score.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>Is it possible to get the score which does not have the frequency
>>>>>>>>>>>>>>>>>factor in it ?
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>Have a look at the javadocs for Similarity.  DefaultSimilarity is used
>>>>>>>>>>>>>>>>unless otherwise specified.  You could subclass that and override this:
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>public float tf(float freq) {
>>>>>>>>>>>>>>>>return (float)Math.sqrt(freq);
>>>>>>>>>>>>>>>>}
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>and return 1.0.  This might give you the effect you want.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>Erik
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>


-- 
Franck Brisbart
R&D
http://www.kelkoo.com


---------------------------------------------------------------------
To unsubscribe, e-mail: lucene-user-unsubscribe@jakarta.apache.org
For additional commands, e-mail: lucene-user-help@jakarta.apache.org

