lucene-java-user mailing list archives

From A Z <4azfri...@gmail.com>
Subject Re: weightage of each word according to precedence in document
Date Sat, 04 Feb 2012 10:11:44 GMT
Hi Ian,

Sorry for the late reply.

It is a simple search with the default Similarity only.
It gives the same score to every document that contains both tokens, abcd and pqrst;
a document in which abcd comes earlier gets no extra weight.
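
To make the desired behaviour concrete, here is a minimal sketch in plain Java of a position-sensitive score. It is not Lucene code and not how the default Similarity works: the class name, the method, and the decay rule boost / (position + 1) are all assumptions chosen only for illustration. Within Lucene itself, the span queries suggested earlier in this thread (for example SpanFirstQuery) are probably the place to look.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical illustration only: none of these names exist in Lucene.
class PositionWeightSketch {

    // A query term found at 0-based position p contributes boost / (p + 1),
    // so the same term counts for more the earlier it appears in the document.
    static double score(List<String> docTokens, String[] queryTerms, double[] boosts) {
        double total = 0.0;
        for (int i = 0; i < queryTerms.length; i++) {
            int pos = docTokens.indexOf(queryTerms[i]); // first occurrence, -1 if absent
            if (pos >= 0) {
                total += boosts[i] / (pos + 1);
            }
        }
        return total;
    }

    public static void main(String[] args) {
        String[] terms = {"abcd", "pqrst"};
        double[] boosts = {10.0, 5.0}; // same boosts as in the query of this message

        List<List<String>> docs = Arrays.asList(
            Arrays.asList("pqrst", "uvwx", "abcd"),
            Arrays.asList("abcd", "pqrst", "uvwx"),
            Arrays.asList("pqrst", "uvwx", "lmn"),
            Arrays.asList("pqrst", "uvwx", "lmn", "abcd"),
            Arrays.asList("pqrst", "abcd", "uvwx", "lmn"));

        for (List<String> doc : docs) {
            System.out.println(doc + " -> " + score(doc, terms, boosts));
        }
    }
}
```

With this rule the document "abcd pqrst uvwx" scores higher than "pqrst uvwx abcd", which is the ordering being asked for.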

Here is the actual Lucene output, with the score and the searcher.explain() result for each document:


Query content:abcd^10.0 content:pqrst^5.0

*title ->pqrst uvwx abcd ::: content -> pqrst uvwx abcd::: Score ->0.6175326*

Searcher.explain -> 0.6175326 = (MATCH) sum of:
  0.46281427 = (MATCH) weight(content:abcd^10.0 in 0), product of:
    0.92562854 = queryWeight(content:abcd^10.0), product of:
      10.0 = boost
      1.0 = idf(docFreq=4, maxDocs=5)
      0.092562854 = queryNorm
    0.5 = (MATCH) fieldWeight(content:abcd in 0), product of:
      1.0 = tf(termFreq(content:abcd)=1)
      1.0 = idf(docFreq=4, maxDocs=5)
      0.5 = fieldNorm(field=content, doc=0)
  0.15471835 = (MATCH) weight(content:pqrst^5.0 in 0), product of:
    0.37843326 = queryWeight(content:pqrst^5.0), product of:
      5.0 = boost
      0.81767845 = idf(docFreq=5, maxDocs=5)
      0.092562854 = queryNorm
    0.40883923 = (MATCH) fieldWeight(content:pqrst in 0), product of:
      1.0 = tf(termFreq(content:pqrst)=1)
      0.81767845 = idf(docFreq=5, maxDocs=5)
      0.5 = fieldNorm(field=content, doc=0)

*title ->abcd pqrst uvwx ::: content -> abcd pqrst uvwx::: Score ->0.6175326*

Searcher.explain -> 0.6175326 = (MATCH) sum of:
  0.46281427 = (MATCH) weight(content:abcd^10.0 in 1), product of:
    0.92562854 = queryWeight(content:abcd^10.0), product of:
      10.0 = boost
      1.0 = idf(docFreq=4, maxDocs=5)
      0.092562854 = queryNorm
    0.5 = (MATCH) fieldWeight(content:abcd in 1), product of:
      1.0 = tf(termFreq(content:abcd)=1)
      1.0 = idf(docFreq=4, maxDocs=5)
      0.5 = fieldNorm(field=content, doc=1)
  0.15471835 = (MATCH) weight(content:pqrst^5.0 in 1), product of:
    0.37843326 = queryWeight(content:pqrst^5.0), product of:
      5.0 = boost
      0.81767845 = idf(docFreq=5, maxDocs=5)
      0.092562854 = queryNorm
    0.40883923 = (MATCH) fieldWeight(content:pqrst in 1), product of:
      1.0 = tf(termFreq(content:pqrst)=1)
      0.81767845 = idf(docFreq=5, maxDocs=5)
      0.5 = fieldNorm(field=content, doc=1)

*title ->pqrst uvwx lmn abcd ::: content -> pqrst uvwx lmn abcd::: Score ->0.6175326*

Searcher.explain -> 0.6175326 = (MATCH) sum of:
  0.46281427 = (MATCH) weight(content:abcd^10.0 in 3), product of:
    0.92562854 = queryWeight(content:abcd^10.0), product of:
      10.0 = boost
      1.0 = idf(docFreq=4, maxDocs=5)
      0.092562854 = queryNorm
    0.5 = (MATCH) fieldWeight(content:abcd in 3), product of:
      1.0 = tf(termFreq(content:abcd)=1)
      1.0 = idf(docFreq=4, maxDocs=5)
      0.5 = fieldNorm(field=content, doc=3)
  0.15471835 = (MATCH) weight(content:pqrst^5.0 in 3), product of:
    0.37843326 = queryWeight(content:pqrst^5.0), product of:
      5.0 = boost
      0.81767845 = idf(docFreq=5, maxDocs=5)
      0.092562854 = queryNorm
    0.40883923 = (MATCH) fieldWeight(content:pqrst in 3), product of:
      1.0 = tf(termFreq(content:pqrst)=1)
      0.81767845 = idf(docFreq=5, maxDocs=5)
      0.5 = fieldNorm(field=content, doc=3)

*title ->pqrst abcd uvwx lmn ::: content -> pqrst abcd uvwx lmn::: Score ->0.6175326*

Searcher.explain -> 0.6175326 = (MATCH) sum of:
  0.46281427 = (MATCH) weight(content:abcd^10.0 in 4), product of:
    0.92562854 = queryWeight(content:abcd^10.0), product of:
      10.0 = boost
      1.0 = idf(docFreq=4, maxDocs=5)
      0.092562854 = queryNorm
    0.5 = (MATCH) fieldWeight(content:abcd in 4), product of:
      1.0 = tf(termFreq(content:abcd)=1)
      1.0 = idf(docFreq=4, maxDocs=5)
      0.5 = fieldNorm(field=content, doc=4)
  0.15471835 = (MATCH) weight(content:pqrst^5.0 in 4), product of:
    0.37843326 = queryWeight(content:pqrst^5.0), product of:
      5.0 = boost
      0.81767845 = idf(docFreq=5, maxDocs=5)
      0.092562854 = queryNorm
    0.40883923 = (MATCH) fieldWeight(content:pqrst in 4), product of:
      1.0 = tf(termFreq(content:pqrst)=1)
      0.81767845 = idf(docFreq=5, maxDocs=5)
      0.5 = fieldNorm(field=content, doc=4)

*title ->pqrst uvwx lmn ::: content -> pqrst uvwx lmn::: Score ->0.07735918*

Searcher.explain -> 0.07735918 = (MATCH) product of:
  0.15471835 = (MATCH) sum of:
    0.15471835 = (MATCH) weight(content:pqrst^5.0 in 2), product of:
      0.37843326 = queryWeight(content:pqrst^5.0), product of:
        5.0 = boost
        0.81767845 = idf(docFreq=5, maxDocs=5)
        0.092562854 = queryNorm
      0.40883923 = (MATCH) fieldWeight(content:pqrst in 2), product of:
        1.0 = tf(termFreq(content:pqrst)=1)
        0.81767845 = idf(docFreq=5, maxDocs=5)
        0.5 = fieldNorm(field=content, doc=2)
  0.5 = coord(1/2)
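
The explain trees above can be checked by hand. The sketch below recomputes them in plain Java, assuming the classic TF-IDF formulas that the numbers imply: tf = sqrt(freq), idf = 1 + ln(maxDocs / (docFreq + 1)), and queryNorm = 1 / sqrt(sum of (idf * boost)^2). The class and method names are illustrative and not part of the Lucene API, and the fieldNorm of 0.5 is copied from the output rather than derived. No term position enters any factor, which would explain why all four documents containing both terms tie.

```java
// Illustrative recomputation of the explain output; not Lucene code.
class ExplainCheck {
    static final int MAX_DOCS = 5;          // maxDocs from the explain output
    static final double FIELD_NORM = 0.5;   // copied from the explain output

    static double idf(int docFreq) {
        return 1.0 + Math.log((double) MAX_DOCS / (docFreq + 1));
    }

    static double tf(int freq) {
        return Math.sqrt(freq);
    }

    // queryNorm = 1 / sqrt(sum over query terms of (idf * boost)^2)
    static double queryNorm(double[] boosts, int[] docFreqs) {
        double sumOfSquares = 0.0;
        for (int i = 0; i < boosts.length; i++) {
            double w = idf(docFreqs[i]) * boosts[i];
            sumOfSquares += w * w;
        }
        return 1.0 / Math.sqrt(sumOfSquares);
    }

    // Contribution of one matching term: queryWeight * fieldWeight
    static double termScore(double boost, int docFreq, int freq, double qNorm) {
        double queryWeight = boost * idf(docFreq) * qNorm;
        double fieldWeight = tf(freq) * idf(docFreq) * FIELD_NORM;
        return queryWeight * fieldWeight;
    }

    // Score for the query content:abcd^10.0 content:pqrst^5.0
    static double score(boolean hasAbcd, boolean hasPqrst) {
        double[] boosts = {10.0, 5.0};
        int[] docFreqs = {4, 5};            // docFreq of abcd and pqrst
        double qNorm = queryNorm(boosts, docFreqs);
        double sum = 0.0;
        int matched = 0;
        if (hasAbcd)  { sum += termScore(10.0, 4, 1, qNorm); matched++; }
        if (hasPqrst) { sum += termScore(5.0, 5, 1, qNorm);  matched++; }
        double coord = matched / 2.0;       // fraction of query terms matched
        return sum * coord;
    }

    public static void main(String[] args) {
        // Should come out close to the 0.6175326 and 0.07735918 reported above.
        System.out.println("both terms: " + score(true, true));
        System.out.println("pqrst only: " + score(false, true));
    }
}
```

Small differences in the last digits are expected, since this sketch uses double arithmetic and the output above shows float values.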


On Mon, Jan 30, 2012 at 2:59 PM, Ian Lea <ian.lea@gmail.com> wrote:

> They all give exactly the same score, even the 3rd doc which doesn't
> contain abcd at all?  Surprising.  What does searcher.explain() say?
> Is this a simple search with default Similarity or is there stuff
> you're not telling us?
>
> --
> Ian.
>
>
> On Sat, Jan 28, 2012 at 4:44 AM, A Z <4azfriend@gmail.com> wrote:
> > Hi Ian,
> >
> > Thanks for your reply.
> >
> > Even when I boost each term while searching, with abcd boosted by a
> > factor of 10 and pqrst boosted by a factor of 5,
> > it still gives the same score for all documents.
> >
> > *Query content:abcd^10.0 content:pqrst^5.0*
> >
> > title ->pqrst uvwx abcd ::: content -> pqrst uvwx abcd::: Score ->0.40883923
> >
> > title ->abcd pqrst uvwx ::: content -> abcd pqrst uvwx::: Score ->0.40883923
> >
> > title ->pqrst uvwx lmn ::: content -> pqrst uvwx lmn::: Score ->0.40883923
> >
> > title ->pqrst uvwx lmn abcd ::: content -> pqrst uvwx lmn abcd::: Score ->0.40883923
> >
> > title ->pqrst abcd uvwx lmn ::: content -> pqrst abcd uvwx lmn::: Score ->0.40883923
> >
> > Thanks
> >
> > On Wed, Jan 25, 2012 at 8:38 PM, Ian Lea <ian.lea@gmail.com> wrote:
> >
> >> If you want particular search terms to be more important than others
> >> you can use boosting.  See
> >> http://lucene.apache.org/java/3_5_0/queryparsersyntax.html#Boosting a
> >> Term
> >>
> >> If you want the order of matched terms to matter, see PhraseQuery or
> >> SpanQuery.  The latter is more flexible. See
> >> http://www.lucidimagination.com/blog/2009/07/18/the-spanquery/ for a
> >> good writeup.
> >>
> >> And you can of course use combinations of everything.
> >>
> >>
> >> --
> >> Ian.
> >>
> >>
> >>
> >> On Tue, Jan 24, 2012 at 5:08 PM, A Z <4azfriend@gmail.com> wrote:
> >> > Hi
> >> >
> >> >
> >> >
> >> > How can we assign a custom score to each token/word?
> >> >
> >> >
> >> >
> >> > For example,
> >> >
> >> > I have these documents:
> >> >
> >> >
> >> >
> >> > 1    pqrst uvwx abcd
> >> >
> >> > 2    abcd pqrst uvwx
> >> >
> >> > 3    pqrst uvwx lmn
> >> >
> >> > 4    pqrst uvwx lmn abcd
> >> >
> >> > 5    pqrst abcd uvwx lmn
> >> >
> >> >
> >> >
> >> > *Now I am searching for ---> abcd pqrst*
> >> >
> >> > So it should give a higher score to the 2nd document than to the 1st document.
> >> >
> >> >
> >> >
> >> > So what I want is:
> >> >
> >> > *document 1:* pqrst has more weight than uvwx, and uvwx more than abcd
> >> >
> >> > *document 2:* abcd has more weight than pqrst, and pqrst more than uvwx
> >>
> >> ---------------------------------------------------------------------
> >> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
> >> For additional commands, e-mail: java-user-help@lucene.apache.org
> >>
> >>