lucene-java-user mailing list archives

From Erik Hatcher <e...@ehatchersolutions.com>
Subject Re: skip document header while indexing
Date Fri, 29 Apr 2005 12:26:01 GMT

On Apr 29, 2005, at 7:50 AM, Pablo Gomes Ludermir wrote:

> Hello all,
>
> Is it possible to skip the first "xx" words while indexing a document?
> For instance, in the code below, I would like to skip the first "xx"
> words of "file" in the "CONTENTS_FIELD". Is that possible?
>
> Document doc = new Document();
> FileInputStream is = new FileInputStream(file);
> Reader reader = new BufferedReader(new InputStreamReader(is));
> doc.add(Field.Text(PATH_FIELD, artifactModel));
> doc.add(Field.Text(CONTENTS_FIELD, reader, true));

I believe your best bet is a custom Analyzer that does this. It
wouldn't be too hard to write a wrapper around an existing analyzer
that drops the first "xx" tokens before they reach the index.

	Erik
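
A rough sketch of the wrapper idea, kept free of Lucene dependencies so
it can be run on its own. Everything named here (WordStream,
whitespaceWords, skipFirst, collect) is a hypothetical stand-in and not
part of the Lucene API; in Lucene the same counting logic would
presumably sit in a TokenFilter subclass that a wrapping Analyzer
returns from tokenStream(). It assumes "words" means
whitespace-separated tokens.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;

public class SkipFirstWords {

    /** Hypothetical stand-in for a token stream: next word, or null at end of input. */
    interface WordStream {
        String next();
    }

    /** Splits the reader's content on whitespace, one word per call to next(). */
    static WordStream whitespaceWords(final Reader reader) {
        return new WordStream() {
            public String next() {
                StringBuilder word = new StringBuilder();
                try {
                    int c;
                    while ((c = reader.read()) != -1) {
                        if (Character.isWhitespace((char) c)) {
                            if (word.length() > 0) {
                                return word.toString();
                            }
                        } else {
                            word.append((char) c);
                        }
                    }
                } catch (IOException e) {
                    throw new UncheckedIOException(e);
                }
                // End of input: return the last word if there is one.
                return word.length() > 0 ? word.toString() : null;
            }
        };
    }

    /** Wraps another stream and silently discards its first `skip` words. */
    static WordStream skipFirst(final WordStream input, final int skip) {
        return new WordStream() {
            private int remaining = skip;

            public String next() {
                while (remaining > 0) {
                    remaining--;
                    if (input.next() == null) {
                        return null; // fewer than `skip` words in the input
                    }
                }
                return input.next();
            }
        };
    }

    /** Drains a stream into a list, for demonstration purposes. */
    static List<String> collect(WordStream stream) {
        List<String> words = new ArrayList<String>();
        for (String w = stream.next(); w != null; w = stream.next()) {
            words.add(w);
        }
        return words;
    }

    public static void main(String[] args) {
        Reader reader = new StringReader("From: someone Subject: test actual body text");
        // Skip the four header words; prints [actual, body, text]
        System.out.println(collect(skipFirst(whitespaceWords(reader), 4)));
    }
}
```

Counting tokens after the wrapped analyzer has produced them means the
skip is measured in whatever units that analyzer emits, so the number
may need adjusting if the analyzer drops stop words or splits on
punctuation.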


---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org

