lucene-java-user mailing list archives

From Yonik Seeley <ysee...@gmail.com>
Subject Re: TermDocs.freq()
Date Mon, 03 Oct 2005 16:04:04 GMT
See IndexWriter.setMaxFieldLength()

-Yonik
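
To illustrate why a token cap produces the low frequencies reported below, here is a minimal sketch in plain Java. It is a model of the behavior, not Lucene code: the class MaxFieldLengthSketch, its methods, and the "needle"/"filler" test document are all hypothetical names invented for this example. It assumes the indexer simply stops adding tokens for a field once the cap is reached, so occurrences past the cap never count toward freq().

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Plain-Java model of a per-field token cap. This is NOT Lucene code;
// every name here is hypothetical and exists only for illustration.
class MaxFieldLengthSketch {

    // Positions of `term` among the first `maxFieldLength` tokens only,
    // mimicking an indexer that ignores tokens beyond the cap.
    static List<Integer> indexedPositions(String[] tokens, String term, int maxFieldLength) {
        List<Integer> positions = new ArrayList<>();
        int limit = Math.min(tokens.length, maxFieldLength);
        for (int i = 0; i < limit; i++) {
            if (tokens[i].equals(term)) {
                positions.add(i);
            }
        }
        return positions;
    }

    // Builds a test document of `length` filler tokens with `term`
    // placed at each of the given positions.
    static String[] buildDocument(int length, String term, int... termPositions) {
        String[] tokens = new String[length];
        Arrays.fill(tokens, "filler");
        for (int p : termPositions) {
            tokens[p] = term;
        }
        return tokens;
    }

    public static void main(String[] args) {
        // Four occurrences, two of them beyond token 10000.
        String[] doc = buildDocument(20000, "needle", 5, 9000, 15000, 19999);
        int capped = indexedPositions(doc, "needle", 10000).size();
        int uncapped = indexedPositions(doc, "needle", Integer.MAX_VALUE).size();
        System.out.println("freq with a cap of 10000 tokens: " + capped);
        System.out.println("freq with no effective cap: " + uncapped);
    }
}
```

In Lucene itself the corresponding change would be a call along the lines of writer.setMaxFieldLength(Integer.MAX_VALUE) on the IndexWriter before the documents are added. Documents indexed under the old limit would need to be re-indexed, since the truncated tokens were never written to the index.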

On 10/3/05, Tricia Williams <pgwillia@student.cs.uwaterloo.ca> wrote:
>
> To follow up on my post from Thursday: I have written a very basic test
> for TermPositions. The test shows that only the first 10001 tokens are
> considered when determining term frequency (i.e., with the search term
> at a position greater than 10001, my test fails).
>
> Is this by design? Is there an obvious work-around so that the frequency
> that I receive is correct for my document?
>
> Thank you for your consideration,
> Tricia
>
> On Thu, 29 Sep 2005, Tricia Williams wrote:
>
> > I am finding that the TermDocs.freq() method is returning an incorrect
> > value. I was wondering if anyone else had experienced this problem.
> >
> > I am using tp = IndexReader.termPositions( queryTerm ) to return an
> > object which implements TermPositions. I then use tp.skipTo( docid ) to
> > go directly to the document from which I wish to retrieve term
> > positions. The following for loop adds the positions to my ArrayList,
> > which I use later:
> >
> > for( int k = 0; k < tp.freq(); k++ )
> > {
> >     // call nextPosition() exactly freq() times for this document
> >     positionMatches.add( new Integer( tp.nextPosition() ) );
> > }
> >
> > In a document which I know has 48 references to the term, a frequency of
> > 23 is returned. There doesn't seem to be a pattern to this as some other
> > documents have (frequency, actual): (25, 48), (36, 43), (30, 149).
> >
> > These frequencies are from results within my code and confirmed in Luke,
> > so I'm pretty certain that this isn't an error on my part.
> >
> > I've been trying to find the origin of this issue, without luck thus
> > far. Any help or advice would be appreciated.
> >
> > Thanks,
> > Tricia
> >
> > ---------------------------------------------------------------------
> > To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
> > For additional commands, e-mail: java-user-help@lucene.apache.org
> >
>
