lucene-solr-user mailing list archives

From Erick Erickson <erickerick...@gmail.com>
Subject Re: Problem with caps and star symbol
Date Tue, 31 May 2011 14:07:39 GMT
I think you're tripping over the fact that wildcard queries aren't analyzed:
they don't go through your analysis chain, so the casing matters. Try lowercasing
the input and I believe you'll see results more like what you expect...
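
A minimal sketch of that lowercasing step, done on the client before the query is sent, in Python. The helper name `lowercase_wildcard_query` is made up for illustration and is not part of Solr or any client library; it assumes the field's index analyzer lowercases terms (as LowerCaseFilterFactory does in the quoted schema).

```python
def lowercase_wildcard_query(field: str, term: str) -> str:
    """Build a field:term query string, lowercasing terms that contain wildcards."""
    # Wildcard terms skip the analysis chain, so LowerCaseFilterFactory never
    # sees them; lowercase them by hand here. Terms without wildcards are left
    # alone because the query analyzer will handle their casing.
    if "*" in term or "?" in term:
        term = term.lower()
    return f"{field}:{term}"


print(lowercase_wildcard_query("name", "Role*"))        # name:role*
print(lowercase_wildcard_query("name", "ROLE_DELETE"))  # name:ROLE_DELETE
```

This only changes case. A wildcard term such as ROLE_DELETE* would still carry its underscore, which WordDelimiterFilterFactory processes at index time, so that would need separate handling.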

Best
Erick

On Mon, May 30, 2011 at 12:07 AM, Saumitra Chowdhury
<saumitra@smartitengineering.com> wrote:
> I am sending some XML to help explain the scenario.
> Indexed term = ROLE_DELETE
> Search Term = roledelete
> <response>
> <lst name="responseHeader">
> <int name="status">0</int>
> <int name="QTime">4</int>
> <lst name="params">
> <str name="indent">on</str>
> <str name="start">0</str>
> <str name="q">name : roledelete</str>
> <str name="version">2.2</str>
> <str name="rows">10</str>
> </lst>
> </lst>
> <result name="response" numFound="1" start="0">
> <doc>
> <str name="creationDate">Mon May 30 13:09:14 BDST 2011</str>
> <str name="displayName">Global Role for Deletion</str>
> <str name="id">role:9223372036854775802</str>
> <str name="lastModifiedDate">Mon May 30 13:09:14 BDST 2011</str>
> <str name="name">ROLE_DELETE</str>
> </doc>
> </result>
> </response>
>
>
> Indexed term = ROLE_DELETE
> Search Term = role
> <response>
> <lst name="responseHeader">
> <int name="status">0</int>
> <int name="QTime">5</int>
> <lst name="params">
> <str name="indent">on</str>
> <str name="start">0</str>
> <str name="q">name : role</str>
> <str name="version">2.2</str>
> <str name="rows">10</str>
> </lst>
> </lst>
> <result name="response" numFound="1" start="0">
> <doc>
> <str name="creationDate">Mon May 30 13:09:14 BDST 2011</str>
> <str name="displayName">Global Role for Deletion</str>
> <str name="id">role:9223372036854775802</str>
> <str name="lastModifiedDate">Mon May 30 13:09:14 BDST 2011</str>
> <str name="name">ROLE_DELETE</str>
> </doc>
> </result>
> </response>
>
>
> Indexed term = ROLE_DELETE
> Search Term = role*
> <response>
> <lst name="responseHeader">
> <int name="status">0</int>
> <int name="QTime">4</int>
> <lst name="params">
> <str name="indent">on</str>
> <str name="start">0</str>
> <str name="q">name : role*</str>
> <str name="version">2.2</str>
> <str name="rows">10</str>
> </lst>
> </lst>
> <result name="response" numFound="1" start="0">
> <doc>
> <str name="creationDate">Mon May 30 13:09:14 BDST 2011</str>
> <str name="displayName">Global Role for Deletion</str>
> <str name="id">role:9223372036854775802</str>
> <str name="lastModifiedDate">Mon May 30 13:09:14 BDST 2011</str>
> <str name="name">ROLE_DELETE</str>
> </doc>
> </result>
> </response>
>
>
> Indexed term = ROLE_DELETE
> Search Term = Role*
> <response>
> <lst name="responseHeader">
> <int name="status">0</int>
> <int name="QTime">4</int>
> <lst name="params">
> <str name="indent">on</str>
> <str name="start">0</str>
> <str name="q">name : Role*</str>
> <str name="version">2.2</str>
> <str name="rows">10</str>
> </lst>
> </lst>
> <result name="response" numFound="0" start="0"/>
> </response>
>
>
> Indexed term = ROLE_DELETE
> Search Term = ROLE_DELETE*
> <response>
> <lst name="responseHeader">
> <int name="status">0</int>
> <int name="QTime">4</int>
> <lst name="params">
> <str name="indent">on</str>
> <str name="start">0</str>
> <str name="q">name : ROLE_DELETE*</str>
> <str name="version">2.2</str>
> <str name="rows">10</str>
> </lst>
> </lst>
> <result name="response" numFound="0" start="0"/>
> </response>
> I am also attaching the HTML of the analysis page.
>
>
> On Mon, May 30, 2011 at 7:19 AM, Erick Erickson <erickerickson@gmail.com>
> wrote:
>>
>> I'd start by looking at the analysis page from the Solr admin page. That
>> will give you an idea of the transformations the various steps carry out.
>> It's invaluable!
>>
>> Best
>> Erick
>> On May 26, 2011 12:53 AM, "Saumitra Chowdhury" <
>> saumitra@smartitengineering.com> wrote:
>> > Hi all,
>> > In my schema.xml I am using WordDelimiterFilterFactory,
>> > LowerCaseFilterFactory, and StopFilterFactory for the index analyzer, plus
>> > an extra SynonymFilterFactory for the query analyzer. I am indexing a field
>> > named 'name'. If an all-caps value like "NAME_BILL" is indexed, I am able
>> > to get it as a search result with the terms "name_bill", "NAME_BILL",
>> > "namebill", "namebill*", "nameb*" ... But for terms like the following,
>> > "NAME_BILL*", "name_bill*", "namebill*", "NAME*", the result does not
>> > show this document. Can anyone please explain why this is happening?
>> > In fact the star "*" does not give any result in many cases, especially
>> > when it is used after the full value of a field.
>> >
>> > A portion of my schema is given below.
>> >
>> > <fieldType name="text_ws" class="solr.TextField"
>> > positionIncrementGap="100">
>> > <analyzer>
>> > <tokenizer class="solr.WhitespaceTokenizerFactory"/>
>> > </analyzer>
>> > </fieldType>
>> > <fieldType name="text" class="solr.TextField"
>> > positionIncrementGap="100">
>> > <analyzer type="index">
>> > <tokenizer class="solr.WhitespaceTokenizerFactory"/>
>> > <filter class="solr.WordDelimiterFilterFactory" generateWordParts="0"
>> > generateNumberParts="0" catenateWords="1" catenateNumbers="1"
>> > catenateAll="0"/>
>> > <filter class="solr.LowerCaseFilterFactory"/>
>> > <filter class="solr.StopFilterFactory" ignoreCase="true"
>> > words="stopwords.txt" enablePositionIncrements="true"/>
>> > </analyzer>
>> > <analyzer type="query">
>> > <tokenizer class="solr.WhitespaceTokenizerFactory"/>
>> > <filter class="solr.WordDelimiterFilterFactory" generateWordParts="0"
>> > generateNumberParts="0" catenateWords="1" catenateNumbers="1"
>> > catenateAll="0"/>
>> > <filter class="solr.LowerCaseFilterFactory"/>
>> > <filter class="solr.SynonymFilterFactory" synonyms="synonyms.txt"
>> > ignoreCase="true" expand="true"/>
>> > <filter class="solr.StopFilterFactory" ignoreCase="true"
>> > words="stopwords.txt" enablePositionIncrements="true"/>
>> > </analyzer>
>> > </fieldType>
>> > <fieldType name="textTight" class="solr.TextField"
>> > positionIncrementGap="100">
>> > <analyzer>
>> > <tokenizer class="solr.WhitespaceTokenizerFactory"/>
>> > <filter class="solr.WordDelimiterFilterFactory" generateWordParts="0"
>> > generateNumberParts="0" catenateWords="1" catenateNumbers="1"
>> > catenateAll="0"/>
>> > <filter class="solr.LowerCaseFilterFactory"/>
>> > <filter class="solr.SynonymFilterFactory" synonyms="synonyms.txt"
>> > ignoreCase="true" expand="false"/>
>> > <filter class="solr.StopFilterFactory" ignoreCase="true"
>> > words="stopwords.txt"/>
>> > <filter class="solr.RemoveDuplicatesTokenFilterFactory"/>
>> > </analyzer>
>> > </fieldType>
>
>
