lucene-java-user mailing list archives

From Daniel Noll <dan...@nuix.com>
Subject Re: Finding the highest term in a field
Date Thu, 19 Nov 2009 06:04:56 GMT
On Thu, Nov 19, 2009 at 16:01, Yonik Seeley <yonik@lucidimagination.com> wrote:
> On Wed, Nov 18, 2009 at 10:48 PM, Daniel Noll <daniel@nuix.com> wrote:
>> But what if I want to find the highest?  TermEnum can't step backwards.
>
> I've also wanted to do the same. It's coming with the new flexible
> indexing patch:
> https://issues.apache.org/jira/browse/LUCENE-1458?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12764020#action_12764020

This sounds interesting.

I take it the existing numeric fields can't already do something like
this?  (We can't use them yet anyway, for backwards compatibility
reasons; otherwise I would have looked into it.  Maybe in the next
major version...)

For now I am writing a routine which subdivides the term space until
the remaining range looks small enough to iterate over instead of
seeking (the break-even point seems to be somewhere between 100,000
and 1,000,000 terms).  The hard part is guessing how many terms fall
on either side of each split.
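
Roughly, the idea could be sketched as below.  This is not the actual
routine, only an illustration under some loud assumptions: a
NavigableSet of non-negative longs stands in for the terms of one
field, seek() is a hypothetical stand-in for opening a forward-only
term enumeration at a given term, and the width of the value range is
used as a crude upper bound on how many terms are left.  That last
shortcut only works because the terms here are integers; for
arbitrary string terms the count on either side of a split would
still have to be guessed.

```java
import java.util.Iterator;
import java.util.NavigableSet;
import java.util.TreeSet;

/** Sketch only: a sorted set of non-negative longs stands in for one field's terms. */
final class HighestTermSketch {

    /** Hypothetical stand-in for a forward-only enumeration positioned at the first term >= from. */
    static Iterator<Long> seek(NavigableSet<Long> terms, long from) {
        return terms.tailSet(from, true).iterator();
    }

    /**
     * Returns the highest term in [min, max], or null if there is none.
     * Bisects the value range with seeks until it is at most iterateBelow wide,
     * then steps forward through whatever is left.
     */
    static Long findHighest(NavigableSet<Long> terms, long min, long max, long iterateBelow) {
        if (min < 0 || max < min || iterateBelow < 0) {
            throw new IllegalArgumentException("expected 0 <= min <= max and iterateBelow >= 0");
        }
        Iterator<Long> first = seek(terms, min);
        if (!first.hasNext()) {
            return null;
        }
        long lo = first.next();   // a term known to exist, so the answer is >= lo
        if (lo > max) {
            return null;
        }
        long hi = max;            // no term of interest lies above hi

        while (hi - lo > iterateBelow) {
            long mid = lo + (hi - lo) / 2 + 1;           // always lo < mid <= hi
            Iterator<Long> probe = seek(terms, mid);
            long found = probe.hasNext() ? probe.next() : Long.MAX_VALUE;
            if (probe != null && found <= max && terms.contains(found)) {
                lo = found;       // something exists in the upper half: jump to it
            } else {
                hi = mid - 1;     // upper half is empty: discard it
            }
        }

        // Range is now small enough: iterate forward instead of seeking.
        long highest = lo;
        Iterator<Long> scan = seek(terms, lo);
        while (scan.hasNext()) {
            long term = scan.next();
            if (term > max) {
                break;
            }
            highest = term;
        }
        return highest;
    }

    public static void main(String[] args) {
        NavigableSet<Long> terms = new TreeSet<>();
        for (long i = 0; i < 1000; i++) {
            terms.add(i * 7);     // terms 0, 7, ..., 6993
        }
        System.out.println(findHighest(terms, 0L, Long.MAX_VALUE, 100L)); // prints 6993
    }
}
```

Each probe costs one seek, and every seek either raises the lower
bound to a real term or throws away the empty upper half, so the
number of seeks grows with the logarithm of the value range rather
than with the number of terms.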

Daniel

-- 
Daniel Noll                            Forensic and eDiscovery Software
Senior Developer                              The world's most advanced
Nuix                                                email data analysis
http://nuix.com/                                and eDiscovery software