lucene-java-user mailing list archives

From Michael McCandless <luc...@mikemccandless.com>
Subject Re: testing whether a field has terms before adding document to Index
Date Wed, 06 Feb 2013 19:55:49 GMT
You could just create the TokenStream yourself and try to read the first
token; if you don't get one (incrementToken returns false), skip the
document?

It's a bit wasteful since you'd then init a new TokenStream again if
you do index it ... but maybe it's not so bad since you only read one
token.
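
A minimal sketch of that check, assuming Lucene 4.0 or later (where
reset() must be called before incrementToken()). The class name TermCheck
and the method hasTerms are hypothetical, chosen only for illustration;
Analyzer.tokenStream(String, Reader), TokenStream.reset(),
incrementToken() and close() are the Lucene calls it relies on.

```java
import java.io.IOException;
import java.io.StringReader;

import org.apache.lucene.analysis.Analyzer;
import org.apache.lucene.analysis.TokenStream;

// Hypothetical helper, not part of Lucene.
final class TermCheck {
    private TermCheck() {}

    /**
     * Returns true if the analyzer emits at least one token for the text.
     * Only the first token is requested; the rest of the text is not analyzed.
     */
    static boolean hasTerms(Analyzer analyzer, String field, String text)
            throws IOException {
        // try-with-resources closes the stream, which the analyzer needs
        // before it can hand out a stream for the same field again.
        try (TokenStream ts = analyzer.tokenStream(field, new StringReader(text))) {
            ts.reset();                  // required before the first incrementToken()
            return ts.incrementToken();  // false means analysis produced no terms
        }
    }
}
```

With this in place, the document would only be built and added when
hasTerms(analyzer, "body", bodyString) returns true. The analyzer passed
in should be the same one the IndexWriter is configured with, otherwise
the check and the actual indexing may disagree.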

Mike McCandless

http://blog.mikemccandless.com

On Wed, Feb 6, 2013 at 2:32 PM, Jon Stewart
<jon@lightboxtechnologies.com> wrote:
> Hello,
>
> I have an application where a great many documents may not have any
> terms after StandardAnalyzer has had its way with the body. In that
> case, depending on some other metadata, I may not wish to add the
> document to the index altogether. Is there a way to tell?
>
> i.e., currently I'm doing this:
>
> Document doc = new Document();
> doc.addField(new Field("body", bodyString, INDEXED | etc));
> MyIndexWriter.add(doc);
>
> and I'd like to do this:
>
> Field body = new Field("body", bodyString, INDEXED | etc);
> if (body has terms post-analysis) {
>   Document doc = new Document();
>   doc.addField(body);
>   MyIndexWriter.add(doc);
> }
>
> Is it possible to do this? I don't mind jumping through some hoops.
>
> Thanks!
>
> Jon
> --
> Jon Stewart, Principal
> (646) 719-0317 | jon@lightboxtechnologies.com | Arlington, VA
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
> For additional commands, e-mail: java-user-help@lucene.apache.org
>


