lucene-java-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Bernd Fehling <bernd.fehl...@uni-bielefeld.de>
Subject Re: not indexing analyzed field
Date Fri, 26 Nov 2010 07:23:29 GMT
Hi Uwe,

my fieldType and fields are as follows:

<fieldType name="text_md" class="solr.TextField" omitNorms="true" >
  <analyzer type="index" class="de.ubbielefeld.solr.analysis.TextMessageDigestAnalyzer"
/>
</fieldType>

<!-- UNIQUE ID -->
<field name="id" type="string" indexed="true" stored="true" required="true" />
<field name="dcdocid" type="text_md" indexed="true" stored="true" />
<copyField source="id" dest="dcdocid" />

So the field dcdocid has the attribute *stored* which I can also see
in the debugger.
Why should I analyze a stored field?
I don't know if I need to analyze it, I also tried a filter but also no success.

My understanding is to send something to a field and the field has a processing
chain. The processing chain analyzes, filters, ... is doing something to the
content and then stores the content to that field in the index.

May be it is a misunderstanding on my side about the field based processing
because I'm normally working with FAST search engines which is document based.

Best regards
Bernd



Am 25.11.2010 18:33, schrieb Uwe Schindler:
> field.fieldsData is used for the stored field contents and so only *stored*
> in index, of course not analyzed (why should I analyze a stored field). The
> indexed tokens go of course through your analyzer and the returned tokens
> are indexed as terms. Where is the problem?
> 
> -----
> Uwe Schindler
> H.-H.-Meier-Allee 63, D-28213 Bremen
> http://www.thetaphi.de
> eMail: uwe@thetaphi.de
> 
> 
>> -----Original Message-----
>> From: Bernd Fehling [mailto:bernd.fehling@uni-bielefeld.de]
>> Sent: Thursday, November 25, 2010 2:08 PM
>> To: java-user@lucene.apache.org
>> Subject: not indexing analyzed field
>>
>> I used KeywordAnalyzer and KeywordTokenizer as templates for a new
>> analyzer.
>> The analyzer works fine but the result never reaches the index.
>>
>> My analyzer is called in "DocInverterPerField.processFields"
>> with "stream.incrementToken()".
>> ...
>> try {
>>     boolean hasMoreTokens = stream.incrementToken();
>>
>>     fieldState.attributeSource = stream;
>>
>>     OffsetAttribute offsetAttribute =
>> fieldState.attributeSource.addAttribute(OffsetAttribute.class);
>>     PositionIncrementAttribute posIncrAttribute =
>> fieldState.attributeSource.addAttribute(PositionIncrementAttribute.class);
>>
>>     consumer.start(field);
>> ...
>>
>> The result goes to "fieldState.attributeSource" but is not in "field".
>> So "field.fieldsData" still has the old content before calling my
> analyzer. And
>> when calling "consumer.start(field)" the old content is going to the index
> and
>> not the new analyzed one.
>> Does the analyzer has to care about "Fieldable field.fieldsData"
>> or who is responsible for it?
>>
>> Regards
>> Bernd
>>
>> ---------------------------------------------------------------------
>> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
>> For additional commands, e-mail: java-user-help@lucene.apache.org
> 
> 
> 
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
> For additional commands, e-mail: java-user-help@lucene.apache.org
> 

-- 
*************************************************************
Bernd Fehling                Universit├Ątsbibliothek Bielefeld
Dipl.-Inform. (FH)                        Universit├Ątsstr. 25
Tel. +49 521 106-4060                   Fax. +49 521 106-4052
bernd.fehling@uni-bielefeld.de                33615 Bielefeld

BASE - Bielefeld Academic Search Engine - www.base-search.net
*************************************************************

---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org


Mime
View raw message