lucene-java-user mailing list archives

From Daniel Noll <dan...@nuix.com>
Subject Finding the highest term in a field
Date Thu, 19 Nov 2009 03:48:18 GMT
Hi all.

If I want to find the lowest term in a field, I can do something like this:

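    public Date computeEarliestDate(IndexReader reader)
            throws IOException, ParseException {
        // Seek to the first term at or after ("date", "00000000").
        TermEnum terms = reader.terms(new Term("date", "00000000"));
        try {
            Term first = terms.term();
            if (first == null || !"date".equals(first.field())) {
                // The "date" field has no terms at all.
                return new Date(0L); // some date before all data (epoch, as a placeholder)
            }
            return dateFormat.parse(first.text());
        } finally {
            terms.close();
        }
    }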

But what if I want to find the highest?  TermEnum can't step backwards.

I am working under these constraints:
    * It can't involve iterating every value in the TermEnum because
the number of documents is too large for that to be efficient.
    * It has to work with existing text indexes, so I can't cheat by
having another field which sorts in the other direction.

Is my best option to do a sort of binary search over dates, seeking a
TermEnum to different terms until I find a day for which the seek still
lands on a term in the field, while the same seek for the next day
finds nothing?
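
A minimal sketch of that binary search, under some loud assumptions: the terms are fixed-width yyyyMMdd strings (so string order matches date order), every term in the range parses as a date, and the seek is modelled with NavigableSet.ceiling() rather than Lucene itself. The class HighestDateTerm and the helpers seek() and findHighest() are hypothetical names for illustration. Against a real index, seek() would presumably wrap reader.terms(new Term("date", key)) and return null when the enum is exhausted or has moved on to another field.

```java
import java.time.LocalDate;
import java.time.format.DateTimeFormatter;
import java.util.NavigableSet;
import java.util.TreeSet;

final class HighestDateTerm {
    // yyyyMMdd, the format assumed for terms in the "date" field.
    private static final DateTimeFormatter FMT = DateTimeFormatter.BASIC_ISO_DATE;

    /**
     * Stand-in for a term dictionary seek: returns the first term >= key,
     * or null if there is none. (Assumption: this mirrors what
     * reader.terms(new Term("date", key)) positions on.)
     */
    static String seek(NavigableSet<String> terms, String key) {
        return terms.ceiling(key);
    }

    /**
     * Returns the highest term within [lo, hi], or null if the range is empty.
     * Each iteration costs one seek; the day range shrinks by at least half.
     */
    static String findHighest(NavigableSet<String> terms, LocalDate lo, LocalDate hi) {
        String upperBound = hi.format(FMT);
        String best = null;
        long low = lo.toEpochDay();
        long high = hi.toEpochDay();
        while (low <= high) {
            long mid = low + (high - low) / 2;
            String found = seek(terms, LocalDate.ofEpochDay(mid).format(FMT));
            if (found == null || found.compareTo(upperBound) > 0) {
                // No term in [mid, hi]: the highest term must be below mid.
                high = mid - 1;
            } else {
                // A term exists at or after mid: remember it, search above it.
                best = found;
                low = LocalDate.parse(found, FMT).toEpochDay() + 1;
            }
        }
        return best;
    }
}
```

Calling findHighest(terms, LocalDate.of(1970, 1, 1), LocalDate.of(2100, 1, 1)) needs roughly log2(number of days in the range) seeks, about 16 for that span, regardless of how many terms the field holds. The bounds are a guess the caller has to supply; a term outside them is never found.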

Daniel


-- 
Daniel Noll                            Forensic and eDiscovery Software
Senior Developer                              The world's most advanced
Nuix                                                email data analysis
http://nuix.com/                                and eDiscovery software
