lucene-java-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Tricia Williams <pgwil...@student.cs.uwaterloo.ca>
Subject Re: TermDocs.freq()
Date Mon, 03 Oct 2005 15:24:43 GMT
To follow up on my post from Thursday.  I have written a very basic test
for TermPositions.  This test allows me to identify that only the
first 10001 tokens are considered to determine term frequency (ie with
the searching term in a position greater than 10001 my test fails).

Is this by design?  Is there an obvious work-around so that the frequency
that I receive is correct for my document?

Thank you for your consideration,
Tricia

On Thu, 29 Sep 2005, Tricia Williams wrote:

> I am finding that TermDocs.freq() method is returning an incorrect value.
> I was wondering if anyone else had experienced this problem.
>
> I am using tp = IndexReader.termPositions( queryTerm ) to return a object
> which implements TermPositions.  I then use tp.skipTo( docid ) to go
> directly to the document from which I wish to retrieve term positions. The
> following for loop adds the positions to my ArrayList which I use later:
>
> for( 	int pos = tp.nextPosition(), k = 0;
> 	k < tp.freq();
> 	pos = tp.nextPosition(), k++ )
> {
> 	positionMatches.add( new Integer( pos ) );
> }
>
> In a document which I know has 48 references to the term, a frequency of
> 23 is returned.  There doesn't seem to be a pattern to this as some other
> documents have (frequency, actual): (25, 48), (36, 43), (30, 149).
>
> These frequencies are from results within my code and confirmed in Luke,
> so I'm pretty certain that this isn't an error on my part.
>
> I've been trying to find out where the origin of this issue is without
> luck thus far.  Any help or advice would be appreciated.
>
> Thanks,
> Tricia
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
> For additional commands, e-mail: java-user-help@lucene.apache.org
>

---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org


Mime
View raw message