lucene-java-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Jason Rutherglen <jason.rutherg...@gmail.com>
Subject Re: Last/max term in Lucene 4.x
Date Sat, 19 Feb 2011 13:42:29 GMT
> Instead of docFreq, did you mean numUniqueTerms?

Right.

> But you have to
> use a terms index impl that supports ord (eg FixedGap).

Ok, and the VariableGap is the new standard because the FST is much
more efficient as a terms index?  Perhaps I'd need to create a codec
(or patch the existing) to automatically store the max term?

On Sat, Feb 19, 2011 at 3:33 AM, Michael McCandless
<lucene@mikemccandless.com> wrote:
> I don't quite understand your question Jason...
>
> Seeking to the first term of the field just gets you the smallest term
> (in unsigned byte[] order, ie Unicode order if the byte[] is UTF8)
> across all docs.
>
> Instead of docFreq, did you mean numUniqueTerms?  Ie, you want to seek
> to the largest term for that field?  In which case, yes seeking by
> term ord to numUniqueTerms-1 gets you to that term.  But you have to
> use a terms index impl that supports ord (eg FixedGap).
>
> Mike
>
> On Fri, Feb 18, 2011 at 9:24 PM, Jason Rutherglen
> <jason.rutherglen@gmail.com> wrote:
>> This could be a rhetorical question.  The way to find the last/max
>> term that is a unique per document is to use TermsEnum to seek to the
>> first term of a field, then call seek to the docFreq-1 for the last
>> ord, then get the term, or is there a better/faster way?
>>
>> ---------------------------------------------------------------------
>> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
>> For additional commands, e-mail: java-user-help@lucene.apache.org
>>
>>
>
>
>
> --
> Mike
>
> http://blog.mikemccandless.com
>

---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org


Mime
View raw message