lucene-java-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Raymond Balmès <raymond.bal...@gmail.com>
Subject Re: Empty SinkTokenizer
Date Mon, 30 Mar 2009 08:42:56 GMT
Yes indeed confusing code... I was also very confused.
In the meantime I solved my problem by checking in the tokenStream method of
myAnalyzer which field was being looked at and applying the right stream to
the right field. No idea if this is how it is intended to be done, but it
works perfect in my case.

I found out that the fields are processed in alpha order... and not in
creation order. Is there any reason for that ?

-Ray-

On Sat, Mar 28, 2009 at 9:30 PM, Erick Erickson <erickerickson@gmail.com>wrote:

> What kind of failures do you get? And I'm confused by the code. Are
> you creating a new IndexWriter every time? Do you ever close it?
>
> It'd help to see the surrounding code...
>
> Best
> Erick
>
> On Sat, Mar 28, 2009 at 1:36 PM, Raymond Balmès <raymond.balmes@gmail.com
> >wrote:
>
> > Hi guys,
> >
> > I'm using a SinkTokenizer to collect some terms of the documents while
> > doing
> > the main document indexing
> > I attached it to a specific field (tokenized, indexed).
> >
> > *
> >
> > writer* = *new* IndexWriter(index, *my _analyzer*, create,
> > *new*IndexWriter.MaxFieldLength(1000000));
> > doc.add(new Field("content", reader));
> >
> > doc.add(*new* Field("*myField*",*my_analyzer.sinkStream**));*
> >
> > writer.addDocument(doc);
> >
> > I have a set of document which don't have those terms so the Sink is
> empty.
> >
> > writer.addDocument works fine on the first document, but it fails always
> on
> > the second ????
> >
> > Any idea what I should look for... I kind of get stuck, because
> > understanding what's done under addDocument is tough.
> >
> > -Raymond-
> >
>

Mime
  • Unnamed multipart/alternative (inline, None, 0 bytes)
View raw message