lucene-java-user mailing list archives

From Erick Erickson <erickerick...@gmail.com>
Subject Re: not indexing analyzed field
Date Thu, 25 Nov 2010 17:18:55 GMT
What is your evidence that "the result never reaches the index"?

Are you sure that:
1> you commit after adding the document?
2> you reopen the underlying reader so that it sees the commit?
3> the value is really missing? If you don't store the field, how are you checking?
4> the field is actually indexed? If you search and don't find it, did you index it?
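
To make those four checks concrete, here is a minimal sketch in plain Java. It is a toy model, not Lucene code: ToyIndex, Reader, add, commit, openReader, search and storedValue are all hypothetical names invented for illustration, and the lower-casing single-token analyzer is only an assumption standing in for a KeywordAnalyzer-style analyzer. It shows that uncommitted documents are invisible, that a reader opened before the commit stays stale, and that the analyzer output lands in the term index while the stored value keeps the original text.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.Set;
import java.util.function.Function;

/** Toy model only; none of these names are Lucene classes. */
class ToyIndex {
    /** Point-in-time view of the committed index; it never changes once opened. */
    static final class Reader {
        private final List<String> stored;
        private final Map<String, Set<Integer>> postings;

        Reader(List<String> stored, Map<String, Set<Integer>> postings) {
            this.stored = stored;
            this.postings = postings;
        }

        /** Looks up a term exactly as it was written by the analyzer. */
        Set<Integer> search(String term) {
            return postings.getOrDefault(term, Collections.<Integer>emptySet());
        }

        /** Returns the original text of the field, not the analyzed tokens. */
        String storedValue(int docId) {
            return stored.get(docId);
        }
    }

    private final Function<String, List<String>> analyzer;
    private final List<String> pending = new ArrayList<>();   // added, not yet committed
    private final List<String> stored = new ArrayList<>();    // committed raw values
    private final Map<String, Set<Integer>> postings = new HashMap<>(); // term -> doc ids

    ToyIndex(Function<String, List<String>> analyzer) {
        this.analyzer = analyzer;
    }

    /** Buffers a field value; nothing is searchable yet. */
    void add(String rawValue) {
        pending.add(rawValue);
    }

    /** Moves buffered values into the committed state. */
    void commit() {
        for (String raw : pending) {
            int docId = stored.size();
            stored.add(raw); // the stored value is the raw input
            for (String term : analyzer.apply(raw)) {
                // only the analyzer output becomes searchable terms
                postings.computeIfAbsent(term, t -> new HashSet<>()).add(docId);
            }
        }
        pending.clear();
    }

    /** Opens a snapshot of whatever has been committed so far. */
    Reader openReader() {
        Map<String, Set<Integer>> copy = new HashMap<>();
        for (Map.Entry<String, Set<Integer>> e : postings.entrySet()) {
            copy.put(e.getKey(), new HashSet<>(e.getValue()));
        }
        return new Reader(new ArrayList<>(stored), copy);
    }

    public static void main(String[] args) {
        // Assumed analyzer: the whole value as one trimmed, lower-cased token.
        ToyIndex index = new ToyIndex(
                raw -> Collections.singletonList(raw.trim().toLowerCase(Locale.ROOT)));

        Reader before = index.openReader();
        index.add("  Hello World ");
        index.commit();
        Reader after = index.openReader();

        System.out.println(before.search("hello world")); // stale reader: no hits
        System.out.println(after.search("hello world"));  // reopened reader: finds doc 0
        System.out.println(after.search("  Hello World ")); // raw text is not a term
        System.out.println("[" + after.storedValue(0) + "]"); // stored value is unanalyzed
    }
}
```

In this model, looking at the stored value tells you nothing about what the analyzer produced; you have to look up the analyzed term itself, and only through a reader opened after the commit.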

First, I'd check the document just before it is added to your
index, to see if the value you think is in there really is.
Something like Document.get() on that field will show you.

Best
Erick

On Thu, Nov 25, 2010 at 8:08 AM, Bernd Fehling <
bernd.fehling@uni-bielefeld.de> wrote:

> I used KeywordAnalyzer and KeywordTokenizer as templates for
> a new analyzer.
> The analyzer works fine but the result never reaches the index.
>
> My analyzer is called in "DocInverterPerField.processFields"
> with "stream.incrementToken()".
> ...
> try {
>    boolean hasMoreTokens = stream.incrementToken();
>
>    fieldState.attributeSource = stream;
>
>    OffsetAttribute offsetAttribute =
>        fieldState.attributeSource.addAttribute(OffsetAttribute.class);
>    PositionIncrementAttribute posIncrAttribute =
>        fieldState.attributeSource.addAttribute(PositionIncrementAttribute.class);
>
>    consumer.start(field);
> ...
>
> The result goes to "fieldState.attributeSource" but is not in "field".
> So "field.fieldsData" still has the old content before calling my
> analyzer. And when calling "consumer.start(field)" the old content
> is going to the index and not the new analyzed one.
> Does the analyzer have to take care of "Fieldable field.fieldsData",
> or who is responsible for it?
>
> Regards
> Bernd
