lucene-solr-user mailing list archives

From Jonathan Rochkind <rochk...@jhu.edu>
Subject Re: Facet showing MORE results than expected when its selected?
Date Wed, 10 Nov 2010 22:26:37 GMT
I've had that sort of thing happen from 'corrupting' my index, by 
changing my schema.xml without re-indexing.

If you change field types or other things in schema.xml, you need to 
re-index all your data, because the documents already in the index were 
analyzed with the old definitions. (You can add brand new fields or types 
without having to re-index, but most other changes will require a re-index.)

Could that be it?
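
One other thing that may be worth checking, separate from re-indexing: the 
filter in the quoted message below is sent as themes_raw:Hotel en Restaurant, 
without quotes. With the standard Lucene query parser, only "Hotel" is 
searched in themes_raw; "en" and "Restaurant" go to the default search field, 
which could explain the third number. Here is a minimal Python sketch of 
building the filter as a quoted phrase instead. I am assuming themes_raw is 
the "string" field, and phrase_filter and select_query_string are just 
helper names I made up for illustration, not part of Solr or any client 
library.

```python
from urllib.parse import urlencode, parse_qs


def phrase_filter(field: str, value: str) -> str:
    """Return field:"value", escaping backslashes and double quotes in value."""
    escaped = value.replace("\\", "\\\\").replace('"', '\\"')
    return f'{field}:"{escaped}"'


def select_query_string(fq: str, q: str = "*:*") -> str:
    """URL-encode the q and fq parameters for a Solr select request."""
    return urlencode({"q": q, "fq": fq})


# Assumption: themes_raw is a "string" (StrField) field, so the whole
# quoted value is matched as a single untokenized term.
fq = phrase_filter("themes_raw", "Hotel en Restaurant")
print(fq)  # themes_raw:"Hotel en Restaurant"

qs = select_query_string(fq)
print(qs)

# Decoding the query string gives back the same filter, quotes included.
assert parse_qs(qs)["fq"] == [fq]
```

If the count with the quoted filter matches the 321 from the database, the 
problem is in the query rather than in the index.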

PeterKerk wrote:
> LOL, very clever indeed ;)
>
> The thing is: when I count the records matching the theme 'Hotel
> en Restaurant' in my db, I end up with 321 records. So that is correct. I
> don't know where the 370 is coming from.
>
> Now when I change the query to this: &fq=themes_raw:Hotel en Restaurant 
> I end up with 110 records...(another number even :s)
>
> What I did notice is that this only happens on multi-word facets, "Hotel en
> Restaurant" being a three-word facet. The facets work correctly for a facet
> named "Cafe", so I suspect it has something to do with the tokenization.
>
> As you can see, I'm using "text" and "string".
> For completeness, I'm posting the definitions of those from my schema.xml as well:
>
>     <fieldType name="text" class="solr.TextField" positionIncrementGap="100">
>       <analyzer type="index">
>         <tokenizer class="solr.WhitespaceTokenizerFactory"/>
>
>         <!-- in this example, we will only use synonyms at query time
>         <filter class="solr.SynonymFilterFactory" synonyms="index_synonyms.txt" ignoreCase="true" expand="false"/>
>         -->
>         <filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords_dutch.txt"/>
>         <filter class="solr.WordDelimiterFilterFactory" generateWordParts="1" generateNumberParts="1" catenateWords="1" catenateNumbers="1" catenateAll="0" splitOnCaseChange="1"/>
>         <filter class="solr.LowerCaseFilterFactory"/>
>         <filter class="solr.EnglishPorterFilterFactory" protected="protwords.txt"/>
>         <filter class="solr.RemoveDuplicatesTokenFilterFactory"/>
>       </analyzer>
>       <analyzer type="query">
>         <tokenizer class="solr.WhitespaceTokenizerFactory"/>
>         <filter class="solr.SynonymFilterFactory" synonyms="synonyms.txt" ignoreCase="true" expand="true"/>
>         <filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords_dutch.txt"/>
>         <filter class="solr.WordDelimiterFilterFactory" generateWordParts="1" generateNumberParts="1" catenateWords="0" catenateNumbers="0" catenateAll="0" splitOnCaseChange="1"/>
>         <filter class="solr.LowerCaseFilterFactory"/>
>         <filter class="solr.EnglishPorterFilterFactory" protected="protwords.txt"/>
>         <filter class="solr.RemoveDuplicatesTokenFilterFactory"/>
>       </analyzer>
>     </fieldType>
>
>
>     <fieldType name="string" class="solr.StrField" sortMissingLast="true" omitNorms="true" />
>   
