lucene-java-user mailing list archives

From Sahin Buyrukbilen <sahin.buyrukbi...@gmail.com>
Subject Re: How to get the score of a term in a document?
Date Sat, 02 Oct 2010 13:42:59 GMT
Hi Mike,

I am already done with walking through the terms, frequencies, and docs
using TermEnum, TermDocs, and IndexReader. The only thing left is the
scores. I will try your suggestion; I hope it works.

Thank you.

Sahin.

On Sat, Oct 2, 2010 at 5:30 AM, Michael McCandless <
lucene@mikemccandless.com> wrote:

> It sounds like you can just use Lucene's enum APIs (IndexReader.terms,
> IndexReader.termDocs) to walk the entire index, converting it to your
> format?
>
> I'm not sure how Luke computes the "score"... but maybe you could, for
> every term, make a TermQuery and then directly walk its matching docs
> & scores?  You'd have to do something like:
>
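>  // term is the Term being examined; searcher and reader wrap the same index
>  Scorer s = new TermQuery(term).weight(searcher).scorer(reader, true, false);
>
>  // walk every matching document and read its score
>  int docID;
>  while ((docID = s.nextDoc()) != Scorer.NO_MORE_DOCS) {
>    float score = s.score();
>  }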
>
> I think?
>
> Mike
>
> On Fri, Oct 1, 2010 at 11:49 PM, Sahin Buyrukbilen
>  <sahin.buyrukbilen@gmail.com> wrote:
> > Hi Erick,
> >
> > I mean the score of a term in a document (we can think of this as a
> > one-word query), calculated using "Default Similarity". When I walk
> > through my index term by term, Luke shows me the number of documents in
> > which the term exists, and for each document there is a score field.
> > Please check the attachment for the screenshot. I am very new to the
> > jargon of Lucene, so I am sorry if I explain things incorrectly.
> >
> > My question is: for a term in the index, can we retrieve the value (which
> > I call the score here) calculated using the default similarity? Is this
> > value already stored in the index, or is it calculated on the fly by Luke
> > (since I can only see it by using Luke)?
> >
> > My goal is to create an inverted index and write it into a text file in
> > the following form:
> >
> > Term t    ft    Inverted list for t
> > ----------------------------------------------------------------------
> > big       2     <2, 0.148> <3, 0.088>
> > in        5     <6, 0.159> <2, 0.143> <5, 0.088> <1, 0.076> <4, 0.065>
> > -
> > -
> > -
> > -
> > -
> > and so on for all terms. Here ft is the total frequency of term t in the
> > whole index, and each <docID, score> pair gives the ID of a document in
> > which term t has a score; the pairs are listed in decreasing order of
> > score.
> >
> >
> > I checked through the documentation and found the Scorer class, but
> > couldn't understand how to use it.
> >
> > I hope this is a better explanation.
> >
> > Best.
> > Sahin.
> >
> >
> > On Fri, Oct 1, 2010 at 9:22 PM, Erick Erickson <erickerickson@gmail.com>
> > wrote:
> >>
> >> I'm not sure what you're asking for. "Score of a term in a document"? Do
> >> you mean the amount a term contributed to a search for a particular
> >> document? The frequency of a term in a document? ???
> >>
> >> Could you elaborate on what you're trying to do? If you describe the
> >> problem you're trying to solve, people can provide better answers.
> >>
> >> Best
> >> Erick
> >>
> >> On Fri, Oct 1, 2010 at 11:33 AM, Sahin Buyrukbilen <
> >> sahin.buyrukbilen@gmail.com> wrote:
> >>
> >> > Hi all,
> >> >
> >> > I need to retrieve the score of a term in a document. I don't want to
> >> > play with different scoring schemes. I just checked my index with Luke,
> >> > and it shows me a score for each term in each document in which the
> >> > term exists. So I just need to get that score.
> >> >
> >> > Can anybody help me?
> >> >
> >> > Thank you in advance.
> >> >
> >> > Sahin.
> >> >
> >
> >
> >
> > ---------------------------------------------------------------------
> > To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
> > For additional commands, e-mail: java-user-help@lucene.apache.org
> >
>
>
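
The output format requested in the thread (term, ft, then <docID, score> pairs in decreasing order of score) can be sketched in plain Java without touching a Lucene index. This is only an illustration under stated assumptions, not Luke's or Lucene's actual code path: the class InvertedListSketch and its methods are hypothetical; ft is taken to be the number of documents containing the term, which is what the example rows in the thread suggest; and the score is assumed to be the single-term form sqrt(freq) * idf * lengthNorm with idf = 1 + ln(numDocs / (docFreq + 1)), ignoring boosts and the lossy one-byte norm encoding, so the numbers will not match a real index exactly.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.TreeMap;

/** Hypothetical helper, not part of Lucene: builds "term ft <doc, score>..." lines. */
final class InvertedListSketch {

    /** One <docID, score> pair. */
    static final class Posting {
        final int docId;
        final double score;
        Posting(int docId, double score) { this.docId = docId; this.score = score; }
    }

    /**
     * Assumed single-term score: sqrt(freq) * idf * lengthNorm, with
     * idf = 1 + ln(numDocs / (docFreq + 1)) and lengthNorm = 1 / sqrt(docLength).
     * Boosts and the lossy norm encoding are ignored (assumption).
     */
    static double score(int freq, int docFreq, int numDocs, int docLength) {
        double tf = Math.sqrt(freq);
        double idf = 1.0 + Math.log((double) numDocs / (docFreq + 1));
        double lengthNorm = 1.0 / Math.sqrt(docLength);
        return tf * idf * lengthNorm;
    }

    /** Tokenized documents in, one formatted line per term out (terms in sorted order). */
    static List<String> build(List<List<String>> docs) {
        // term -> (docID -> frequency of the term in that document)
        Map<String, Map<Integer, Integer>> freqs = new TreeMap<>();
        for (int docId = 0; docId < docs.size(); docId++) {
            for (String token : docs.get(docId)) {
                freqs.computeIfAbsent(token, k -> new TreeMap<>()).merge(docId, 1, Integer::sum);
            }
        }
        List<String> lines = new ArrayList<>();
        for (Map.Entry<String, Map<Integer, Integer>> e : freqs.entrySet()) {
            Map<Integer, Integer> perDoc = e.getValue();
            List<Posting> postings = new ArrayList<>();
            for (Map.Entry<Integer, Integer> p : perDoc.entrySet()) {
                int docId = p.getKey();
                postings.add(new Posting(docId,
                        score(p.getValue(), perDoc.size(), docs.size(), docs.get(docId).size())));
            }
            // Highest score first, as in the format from the thread.
            postings.sort(Comparator.comparingDouble((Posting x) -> x.score).reversed());
            // ft is taken to be the number of documents containing the term (assumption).
            StringBuilder sb = new StringBuilder(e.getKey()).append('\t').append(perDoc.size());
            for (Posting p : postings) {
                sb.append(String.format(Locale.ROOT, " <%d, %.3f>", p.docId, p.score));
            }
            lines.add(sb.toString());
        }
        return lines;
    }
}
```

Calling InvertedListSketch.build with a list of tokenized documents returns one line per term, ready to be written to a text file. Against a real index, the per-document scores would presumably come from the Scorer loop quoted above rather than from the score method here.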
