lucene-solr-user mailing list archives

From Bny Jo <bny...@yahoo.com>
Subject Re: How to avoid space on facet field
Date Wed, 03 Jun 2009 11:51:43 GMT
Anshuman, thanks for your input. I understand what you are suggesting and will try it.

Marc, I did not understand how your KeywordTokenizer approach works. Do I have to define a separate
field type, like the ones in the example schema, and use it for that field? This is what I came up with.

<fieldType name="facet_tex" class="solr.TextField" sortMissingLast="true" omitNorms="true">
  <analyzer>
    <tokenizer class="solr.KeywordTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <!-- The TrimFilter removes any leading or trailing whitespace -->
    <filter class="solr.TrimFilterFactory"/>
    <filter class="solr.PatternReplaceFilterFactory"
            pattern="([^a-z])" replacement="" replace="all"/>
  </analyzer>
</fieldType>
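
To see what a chain like this does to a value, here is a rough sketch in plain Java. It does not use any Solr classes; it only imitates the four steps (keyword tokenizer, lowercase, trim, pattern replace) with String methods, so the class name FacetAnalyzerSketch and the method analyze are made up for illustration. Assuming Solr applies the filters in the order listed, the pattern ([^a-z]) would also strip the space and the punctuation, so the indexed term (which is what faceting reports) loses its word boundary.

```java
import java.util.Locale;

public class FacetAnalyzerSketch {
    // Plain-Java imitation of the analyzer chain above. This is NOT Solr code;
    // it only mirrors the filter order to show the resulting term.
    static String analyze(String raw) {
        String token = raw;                        // KeywordTokenizer: the whole value is one token
        token = token.toLowerCase(Locale.ROOT);    // LowerCaseFilter
        token = token.trim();                      // TrimFilter (approximated with String.trim)
        token = token.replaceAll("([^a-z])", "");  // PatternReplaceFilter with replace="all"
        return token;
    }

    public static void main(String[] args) {
        System.out.println(analyze("Dell, Inc."));         // prints "dellinc"
        System.out.println(analyze("  ViewSonic Corp. ")); // prints "viewsoniccorp"
    }
}
```

If the facet should keep the space (for example "dell inc"), the pattern would probably need to allow it, e.g. ([^a-z ]), but I have not tried that against a real index.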



Thanks

Boney


________________________________
From: Marc Sturlese <marc.sturlese@gmail.com>
To: solr-user@lucene.apache.org
Sent: Wednesday, June 3, 2009 3:45:49 AM
Subject: Re: How to avoid space on facet field


You can configure a "facet_text" type instead of the normal "text" type. There you
use KeywordTokenizer instead of StandardTokenizer. One of the advantages of
using it instead of "string" is that it still lets you use synonyms,
stopwords, filters, and all the other properties of an analyzer.
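
As a rough illustration of the difference, the plain-Java sketch below counts facet values two ways: once splitting each value on whitespace (roughly what a tokenized "text" field does) and once keeping the whole value as a single lowercased token (roughly what a keyword-tokenized field with a lowercase filter does). It uses no Solr classes; the class name FacetCountSketch, both method names, and the sample manufacturer values are made up for the example.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.TreeMap;

public class FacetCountSketch {
    // Facet counts when every value is split on whitespace (tokenized field).
    static Map<String, Integer> tokenizedFacets(List<String> values) {
        Map<String, Integer> counts = new TreeMap<>();
        for (String value : values) {
            for (String token : value.toLowerCase(Locale.ROOT).trim().split("\\s+")) {
                counts.merge(token, 1, Integer::sum);
            }
        }
        return counts;
    }

    // Facet counts when the whole value stays one token (keyword-style field).
    static Map<String, Integer> keywordFacets(List<String> values) {
        Map<String, Integer> counts = new TreeMap<>();
        for (String value : values) {
            counts.merge(value.toLowerCase(Locale.ROOT).trim(), 1, Integer::sum);
        }
        return counts;
    }

    public static void main(String[] args) {
        // Hypothetical sample data, not taken from a real index.
        List<String> manu = Arrays.asList("Dell Inc", "Dell Inc", "Sharp Corp");
        System.out.println(tokenizedFacets(manu)); // prints {corp=1, dell=2, inc=2, sharp=1}
        System.out.println(keywordFacets(manu));   // prints {dell inc=2, sharp corp=1}
    }
}
```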


Anshuman Manur wrote:
> 
> Hey,
> 
> From what you have written, I'm guessing that in your schema.xml file you
> have defined the field manu to be of type "text", which is good for keyword
> searches, as the text type tokenizes on whitespace, i.e. Dell Inc. is indexed
> as dell, inc. so a keyword search matches either dell or inc. But when you
> want to facet on a particular field, you want exact matches regardless of
> whitespace in between. In such cases it's a good idea to use the string type.
> Let me illustrate with an example based on my settings:
> 
> Here are my fields:
> 
>    <!-- Core Fields -->
>    <field name="id" type="string" indexed="true" stored="true" required="true" />
>    <field name="name" type="text" indexed="true" stored="true"/>
>    <field name="manu" type="text" indexed="true" stored="true"/>
>    <field name="sport" type="text" indexed="true" stored="true" />
>    <field name="type" type="text" indexed="true" stored="true" />
>    <field name="desc" type="text" indexed="true" stored="true" />
>    <field name="ldesc" type="text" indexed="true" stored="true" />
> 
>    <!-- default text Field for searching -->
>    <field name="text" type="text" indexed="true" stored="false" multiValued="true"/>
> 
>    <!-- exact string fields for faceting -->
>    <field name="sport_exact" type="string" indexed="true" stored="false" />
>    <field name="manu_exact" type="string" indexed="true" stored="false" />
>    <field name="type_exact" type="string" indexed="true" stored="false" />
> 
>    <copyField source="manu" dest="text"/>
>    <copyField source="name" dest="text"/>
>    <copyField source="sport" dest="text"/>
>    <copyField source="desc" dest="text"/>
>    <copyField source="ldesc" dest="text"/>
>    <copyField source="type" dest="text"/>
> 
>    <copyField source="manu" dest="manu_exact"/>
>    <copyField source="sport" dest="sport_exact"/>
>    <copyField source="type" dest="type_exact"/>
> 
> So, when doing keyword searches I use the <field name="text"...> to search
> in all the fields, as I copyField all the fields onto the field named text.
> But for faceting I use the exact fields, which are of type string and don't
> split on whitespace.
> 
> 
> Anshu
> 
> On Wed, Jun 3, 2009 at 1:50 AM, Bny Jo <bnykjo@yahoo.com> wrote:
> 
>>
>> Hello,
>>
>>  I am wondering why Solr is returning a manufacturer name field (Dell,
>> Inc) as two results, Dell as one and Inc as another. Is there a way to facet on a
>> field whose values contain spaces or delimiters?
>>
>> query.addFacetField("manu");
>> query.setFacetMinCount(1);
>> query.setIncludeScore(true);
>> List<FacetField> facetFieldList = qr.getFacetFields();
>> for (FacetField facetField : facetFieldList) {
>>     System.out.println(facetField.toString() + "Manufactures");
>> }
>> And it returns
>> -----------------
>> [manu:[dell (5), inc (5), corp (1), sharp (1), sonic (1), view (1), viewson (1), vizo (1)]]
>>
>>
>>
>>
> 
> 



      
Mime
  • Unnamed multipart/alternative (inline, None, 0 bytes)
View raw message