lucene-java-user mailing list archives

From Erik Hatcher <>
Subject Re: skip document header while indexing
Date Fri, 29 Apr 2005 12:26:01 GMT

On Apr 29, 2005, at 7:50 AM, Pablo Gomes Ludermir wrote:

> Hello all,
> Is it possible to skip the first "xx" words while indexing a document?
> For instance, in the code below, I would like to skip the first "xx"
> words of "file" in the "CONTENTS_FIELD". Is that possible?
> Document doc = new Document();
> FileInputStream is = new FileInputStream(file);
> Reader reader = new BufferedReader(new InputStreamReader(is));
> doc.add(Field.Text(PATH_FIELD, artifactModel));
> doc.add(Field.Text(CONTENTS_FIELD, reader, true));

I believe your best bet is a custom Analyzer that does this.  It 
wouldn't be too hard to write a wrapper around an existing analyzer 
that discards the first "xx" tokens of the field and passes the rest 
through.
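
If "words" here simply means whitespace-delimited runs of characters, a simpler route than an Analyzer wrapper may be to advance the Reader past the first N words before handing it to the field. The sketch below shows that swapped-in technique using only java.io; it is not the Analyzer wrapper itself, and the class name WordSkipper and the method skipWords are hypothetical names invented for this illustration. It also assumes that whitespace splitting is close enough to how your analyzer defines a word, which may not hold for analyzers that split on punctuation.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;

// Hypothetical helper, not part of Lucene: skips leading words at the Reader level.
public class WordSkipper {

    /**
     * Consumes the first `count` whitespace-delimited words from `in` and
     * returns a reader positioned just after the single whitespace character
     * that ended the last skipped word. If the input holds fewer complete
     * words than `count`, the returned reader is at end of stream.
     */
    public static Reader skipWords(Reader in, int count) throws IOException {
        BufferedReader reader = (in instanceof BufferedReader)
                ? (BufferedReader) in
                : new BufferedReader(in);
        int skipped = 0;
        boolean inWord = false;
        while (skipped < count) {
            int c = reader.read();
            if (c == -1) {
                break; // ran out of input before skipping `count` words
            }
            if (Character.isWhitespace((char) c)) {
                if (inWord) {
                    // whitespace after a word: that word is now fully skipped
                    inWord = false;
                    skipped++;
                }
            } else {
                inWord = true;
            }
        }
        return reader;
    }

    public static void main(String[] args) throws IOException {
        Reader rest = skipWords(new StringReader("alpha beta gamma delta"), 2);
        StringBuilder remaining = new StringBuilder();
        int c;
        while ((c = rest.read()) != -1) {
            remaining.append((char) c);
        }
        System.out.println(remaining); // prints "gamma delta"
    }
}
```

In the code from the question, this would mean wrapping the reader, for example `Reader reader = WordSkipper.skipWords(new BufferedReader(new InputStreamReader(is)), xx);`, before the `doc.add(...)` call for CONTENTS_FIELD. The trade-off is that the skip count is measured in whitespace-separated words rather than in the tokens the analyzer would actually produce.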

