lucene-java-user mailing list archives

From "Dragon Fly" <dragon-fly...@hotmail.com>
Subject Re: Empty fields ...
Date Wed, 19 Jul 2006 14:13:09 GMT
Thank you very much.

>From: "Erick Erickson" <erickerickson@gmail.com>
>Reply-To: java-user@lucene.apache.org
>To: java-user@lucene.apache.org
>Subject: Re: Empty fields ...
>Date: Wed, 19 Jul 2006 09:48:04 -0400
>
>Try something like
>
>TermDocs         termDocs = reader.termDocs();
>termDocs.seek(new Term("<relevant field name here>", ""));
>while (termDocs.next()) {
>    bits.set(termDocs.doc());
>}
>
>I *think* (I'm remembering things folks wrote; I haven't done this myself)
>that the empty string for the Term matches all terms. If not, you might have
>to wrap it in an outer loop that loops through all the terms in the field,
>something like
>
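>        bits = new BitSet(reader.maxDoc());
>
>        TermDocs termDocs = reader.termDocs();
>        // FilteredTermEnum is abstract, so enumerate with reader.terms(),
>        // which starts at the first term >= (field, "").
>        TermEnum tEnum = reader.terms(new Term(field, ""));
>
>        try {
>            do {
>                Term term = tEnum.term();
>                // Stop once the enumeration leaves this field.
>                if (term == null || !term.field().equals(field)) break;
>
>                termDocs.seek(term);
>                while (termDocs.next()) {
>                    bits.set(termDocs.doc());
>                }
>            } while (tEnum.next());
>        } finally {
>            tEnum.close();
>            termDocs.close();
>        }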
>
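For anyone reading this in the archive: as far as I can tell, TermDocs.seek(Term) positions on one exact term, so the outer loop over the field's terms is the variant worth sketching. Below is a minimal, self-contained model of that loop in plain Java. It does not use Lucene; the TreeMap standing in for the term dictionary, the "field\u0000text" key layout, and the names FieldPresenceSketch and docsWithField are all assumptions made for illustration.

```java
import java.util.BitSet;
import java.util.Map;
import java.util.TreeMap;

public class FieldPresenceSketch {
    // Hypothetical separator: keys are "field" + SEP + "text", so all terms
    // of one field are contiguous in sorted order.
    static final char SEP = '\u0000';

    // Marks every document that has at least one term in the given field.
    static BitSet docsWithField(TreeMap<String, int[]> postings, String field, int maxDoc) {
        BitSet bits = new BitSet(maxDoc);
        String start = field + SEP; // plays the role of new Term(field, "")
        for (Map.Entry<String, int[]> e : postings.tailMap(start, true).entrySet()) {
            if (!e.getKey().startsWith(start)) {
                break; // walked past the last term of this field
            }
            for (int doc : e.getValue()) {
                bits.set(doc);
            }
        }
        return bits;
    }

    public static void main(String[] args) {
        TreeMap<String, int[]> postings = new TreeMap<>();
        postings.put("author" + SEP + "smith", new int[] {0, 2});
        postings.put("title" + SEP + "lucene", new int[] {0, 3});
        postings.put("title" + SEP + "search", new int[] {3, 4});
        postings.put("year" + SEP + "2006", new int[] {1});

        BitSet withTitle = docsWithField(postings, "title", 5);
        System.out.println(withTitle); // {0, 3, 4}

        // Documents lacking the field are the complement over [0, maxDoc).
        BitSet withoutTitle = (BitSet) withTitle.clone();
        withoutTitle.flip(0, 5);
        System.out.println(withoutTitle); // {1, 2}
    }
}
```

The early break is the important part: an enumeration that starts at (field, "") keeps going into later fields unless it is stopped when the field name changes.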
>
>
>That said, it may be best for you to loop through each document and add that
>doc to the relevant filters if it has the fields you're interested in. You'd
>only be fetching each document once, so it'd only be one loop. I don't know
>enough about relative efficiencies to make a call here; it probably depends
>upon how many docs you're dealing with. I'd stop at the first solution that
>works with acceptable performance unless you expect your corpus to grow
>significantly. And since this is done in off hours, there's no pressing
>reason to go with the very most efficient solution unless it takes too long
>or you expect to have orders of magnitude more documents in your index
>eventually.
>
>Best
>Erick
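
A rough sketch of the one-pass, per-document alternative described above, again in plain Java rather than against Lucene. Documents are modeled as simple field-to-value maps, and the names PerDocumentFilterSketch, buildFilters and doc are hypothetical. Treating an empty string the same as a missing field is an assumption, not something stated in the thread. Against a real index the loop would presumably run up to reader.maxDoc(), call reader.document(i), and skip deleted documents.

```java
import java.util.ArrayList;
import java.util.BitSet;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class PerDocumentFilterSketch {
    // Builds one BitSet per field of interest in a single pass over the docs.
    static Map<String, BitSet> buildFilters(List<Map<String, String>> docs, List<String> fields) {
        Map<String, BitSet> filters = new LinkedHashMap<>();
        for (String f : fields) {
            filters.put(f, new BitSet(docs.size()));
        }
        for (int docId = 0; docId < docs.size(); docId++) {
            Map<String, String> doc = docs.get(docId); // each document is fetched once
            for (String f : fields) {
                String value = doc.get(f);
                // Assumption: an empty value counts as "field not present".
                if (value != null && !value.isEmpty()) {
                    filters.get(f).set(docId);
                }
            }
        }
        return filters;
    }

    // Hypothetical helper: builds a document from alternating field/value pairs.
    static Map<String, String> doc(String... kv) {
        Map<String, String> m = new HashMap<>();
        for (int i = 0; i + 1 < kv.length; i += 2) {
            m.put(kv[i], kv[i + 1]);
        }
        return m;
    }

    public static void main(String[] args) {
        List<Map<String, String>> docs = new ArrayList<>();
        docs.add(doc("title", "Lucene in Action"));
        docs.add(doc("author", "smith", "title", ""));
        docs.add(doc("title", "Search", "author", "jones"));
        docs.add(doc());

        List<String> fields = new ArrayList<>();
        fields.add("title");
        fields.add("author");

        Map<String, BitSet> filters = buildFilters(docs, fields);
        System.out.println(filters.get("title"));  // {0, 2}
        System.out.println(filters.get("author")); // {1, 2}
    }
}
```

The cost here grows with the number of documents times the number of fields checked, while the term-enumeration approach grows with the number of distinct terms and postings in the field, so which one wins likely depends on the index.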
