lucene-java-user mailing list archives

From Pablo Gomes Ludermir <gom...@gmail.com>
Subject Re: skip document header while indexing
Date Fri, 29 Apr 2005 12:30:54 GMT
Could you give me some pointers (an example or a website) on how I could do that?

On 4/29/05, Erik Hatcher <erik@ehatchersolutions.com> wrote:
> 
> On Apr 29, 2005, at 7:50 AM, Pablo Gomes Ludermir wrote:
> 
> > Hello all,
> >
> > Is it possible to skip the first "xx" words while indexing a document?
> > For instance, in the code below, I would like to skip the first "xx"
> > words of "file" in the "CONTENTS_FIELD". Is that possible?
> >
> > Document doc = new Document();
> > FileInputStream is = new FileInputStream(file);
> > Reader reader = new BufferedReader(new InputStreamReader(is));
> > doc.add(Field.Text(PATH_FIELD, artifactModel));
> > doc.add(Field.Text(CONTENTS_FIELD, reader, true));
> 
> I believe your best bet will be to put in a custom Analyzer that does
> this.  It wouldn't be too hard to code a wrapper around an analyzer
> that did this.
> 
>        Erik
> 
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
> For additional commands, e-mail: java-user-help@lucene.apache.org
> 
> 
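As a starting point, here is a minimal sketch of a simpler alternative to the Analyzer wrapper suggested above: it consumes the leading words from the Reader before the Reader is handed to the field. It uses only java.io, not the Lucene API. The class name SkipWords and the method skipWords are hypothetical, and "word" is assumed to mean a whitespace-delimited run of characters, which may not match the tokens a given Analyzer would produce.

```java
import java.io.IOException;
import java.io.Reader;

// Hypothetical helper, not part of Lucene.
final class SkipWords {

    private SkipWords() {
    }

    /**
     * Consumes the first n whitespace-delimited words from the reader and
     * returns the same reader. The whitespace character that ends the
     * n-th word is consumed as well.
     */
    static Reader skipWords(Reader reader, int n) throws IOException {
        int skipped = 0;
        boolean inWord = false;
        while (skipped < n) {
            int c = reader.read();
            if (c == -1) {
                break; // fewer than n words: nothing is left to index
            }
            if (Character.isWhitespace((char) c)) {
                if (inWord) {
                    skipped++; // a word has just ended
                    inWord = false;
                }
            } else {
                inWord = true;
            }
        }
        return reader;
    }
}
```

With the snippet quoted above, the contents line might then become doc.add(Field.Text(CONTENTS_FIELD, SkipWords.skipWords(reader, xx), true)); where xx is the number of words to skip. An Analyzer wrapper would be the better fit if the words to skip have to be counted as analyzed tokens rather than as whitespace-separated strings.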


-- 
Pablo Gomes Ludermir
gomesp@gmail.com
