lucene-solr-user mailing list archives

From "Jack Krupansky"
Subject Re: Which tokenizer or analizer should use and field type
Date Fri, 12 Apr 2013 21:49:34 GMT
Unfortunately, Solr doesn't have a query parser that would give the meaning 
you want to a query like:

project assistant,manager

For now, you would need to write that query as:

(project AND assistant) OR manager

Or maybe as:

"project assistant"~5 OR manager

That would require project and assistant to occur within a few words of each 
other.

Or, if you have q.op defaulted to "OR":

"project assistant"~5 manager
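
For reference, one place q.op can be defaulted is the request handler's defaults in solrconfig.xml. This is only a sketch: the handler name /select is an assumption, and your handler will likely carry other defaults as well.

```xml
<!-- solrconfig.xml: assumed handler name; keep whatever defaults you already have -->
<requestHandler name="/select" class="solr.SearchHandler">
  <lst name="defaults">
    <!-- unquoted terms are OR'ed unless the query says otherwise -->
    <str name="q.op">OR</str>
  </lst>
</requestHandler>
```

The same parameter can also be passed per request (q.op=OR) instead of being set as a default.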

Add the HTML strip char filter to your text field type:

<charFilter class="solr.HTMLStripCharFilterFactory" />

text_general is a semi-decent place to start.
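
A sketch of where the char filter would sit, reusing the text_general analyzers from your schema. Char filters run before the tokenizer, so it goes first in the index analyzer. Since copyField copies the raw source text, the type of the destination field is the one that needs it. The positionIncrementGap value and the keyword field declaration are assumptions, since neither appears in your message.

```xml
<!-- positionIncrementGap="100" is an assumed value, not taken from the message -->
<fieldType name="text_general" class="solr.TextField" positionIncrementGap="100">
  <analyzer type="index">
    <!-- strip HTML markup before the text is tokenized -->
    <charFilter class="solr.HTMLStripCharFilterFactory"/>
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.StopFilterFactory" ignoreCase="true"
            words="stopwords.txt" enablePositionIncrements="true"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
  <analyzer type="query">
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.StopFilterFactory" ignoreCase="true"
            words="stopwords.txt" enablePositionIncrements="true"/>
    <filter class="solr.SynonymFilterFactory" synonyms="synonyms.txt"
            ignoreCase="true" expand="true"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
</fieldType>

<!-- assumed declaration of the copyField target; multiValued because
     several source fields are copied into it -->
<field name="keyword" type="text_general" indexed="true" stored="false"
       multiValued="true"/>
```

The char filter only affects the indexed tokens. Stored values of body and company_profile would still contain the original HTML.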

-- Jack Krupansky

-----Original Message----- 
From: anurag.jain
Sent: Friday, April 12, 2013 11:32 AM
Subject: Which tokenizer or analizer should use and field type

My schema file is:

<copyField source="title" dest="keyword"/>
<copyField source="body" dest="keyword"/>
<copyField source="company_name" dest="keyword"/>
<copyField source="company_profile" dest="keyword"/>

<field name="title" type="text_general" indexed="true" stored="true"/>
<field name="body" type="text_general" indexed="true" stored="true"/>
<field name="company_name" type="text_general" indexed="true"/>
<field name="company_profile" type="text_general" indexed="true"/>

<fieldType name="text_general" class="solr.TextField">
      <analyzer type="index">
        <tokenizer class="solr.StandardTokenizerFactory"/>
        <filter class="solr.StopFilterFactory" ignoreCase="true"
                words="stopwords.txt" enablePositionIncrements="true"/>
        <filter class="solr.LowerCaseFilterFactory"/>
      </analyzer>
      <analyzer type="query">
        <tokenizer class="solr.StandardTokenizerFactory"/>
        <filter class="solr.StopFilterFactory" ignoreCase="true"
                words="stopwords.txt" enablePositionIncrements="true"/>
        <filter class="solr.SynonymFilterFactory" synonyms="synonyms.txt"
                ignoreCase="true" expand="true"/>
        <filter class="solr.LowerCaseFilterFactory"/>
      </analyzer>
</fieldType>

The values look like this:

title: "Assistant Coach/ Junior Assistant"
body: "<p> <> <br /><br />Oil India Ltd.
applications for the post of <strong>Sr Medical Officer (Paediatrics)
</strong><br /><br /> <strong>Qualification</strong>
MD (Paediatrics) <br /><br /> <strong>No of Post</strong> : 1UR<br
/> <br
/><strong> Pay Scale</strong> : Rs 32900 -58000 <br /> <br /> <strong>Age
on 11.04.2013</strong> : 32 yrs<br /> </p><p><strong>Selection
Procedure :
</strong>Selection for the above post will be based on Written Test, Group
Discussion (GD), Viva-Voce and Medical Examination.<br /> </p>"

company_profile: "<p>The story of <strong>Oil India Limited (OIL)</strong>
traces and symbolises the development and growth of the Indian petroleum
industry. From the discovery of crude oil in the far east of India at
Digboi, Assam in 1889 to its present status as a fully integrated upstream
petroleum company, OIL has come far, crossing many milestones.</p>",

company_name: "Oil India Limited",

Please suggest which field type I should use.

keyword is the copyField I am using for search. I do not want to search on HTML
tags.

How should the search work?

If I give these words to search:

project assistant,manager

it should only give me documents whose keyword field has "project assistant" or
"manager".

Right now it gives me results that have project or assistant or manager,
which is the wrong behavior for my case.

Please give me a solution for this. I have to complete this task by today,
which is why I am not able to research it myself.

I need field type definitions for each field, and how I should write the search
query.

Thanks in advance.
