lucene-solr-user mailing list archives

From Shyam Bhaskaran <Shyam.Bhaska...@synopsys.com>
Subject RE: Question on Reverse Indexing
Date Wed, 18 Jan 2012 11:09:51 GMT
Dmitry,

We are using Solr 4.0. To rule out server-side caching, I restarted our Tomcat web server
after performing a re-index.

For reverse indexing we defined a fieldType "text_rev", and this fieldType was used for
the fields.

  <fieldType name="text_rev" class="solr.TextField" sortMissingLast="true" omitNorms="true">
     <analyzer type="index">
                <tokenizer class="com.es.solr.backend.analysis.standard.SolvNetTokenizerFactory"/>
                <filter class="solr.StopFilterFactory" words="stopwords.txt" ignoreCase="true"/>
                <filter class="com.es.solr.backend.analysis.standard.SolvNetFilterFactory"/>
                <filter class="solr.SynonymFilterFactory" synonyms="synonyms.txt" ignoreCase="true"
expand="true"/>
                <filter class="com.es.solr.backend.analysis.standard.SpecialCharSynonymFilterFactory"/>
                <filter class="solr.LowerCaseFilterFactory"/>
                <filter class="solr.ReversedWildcardFilterFactory" withOriginal="true"
                maxPosAsterisk="3" maxPosQuestion="2" maxFractionAsterisk="0.33"/>
     </analyzer>
     <analyzer type="query">
                <tokenizer class="com.es.solr.backend.analysis.standard.SolvNetTokenizerFactory"/>
                <filter class="solr.StopFilterFactory" words="stopwords.txt" ignoreCase="true"/>
                <filter class="com.es.solr.backend.analysis.standard.SolvNetFilterFactory"/>
                <filter class="solr.LowerCaseFilterFactory"/>
                <filter class="solr.StopFilterFactory" words="stopwords.txt" ignoreCase="true"/>
     </analyzer>
  </fieldType>

But when we found that ReversedWildcardFilterFactory was adding a performance burden, we removed
the ReversedWildcardFilterFactory filter
                <filter class="solr.ReversedWildcardFilterFactory" withOriginal="true"
                maxPosAsterisk="3" maxPosQuestion="2" maxFractionAsterisk="0.33"/>
and the whole collection was re-indexed.
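
For context, here is a rough sketch of what the removed filter contributes at index time. It assumes (from memory of the Solr source, not from anything in this thread) that the reversed token is prefixed with the marker character U+0001. The class and method names are hypothetical, no Solr API is used, and the query rewrite handles only a single leading * for illustration.

```java
import java.util.ArrayList;
import java.util.List;

public class ReversedTokenSketch {
    // Assumed marker (U+0001) that keeps reversed terms apart from ordinary terms.
    static final char MARKER = '\u0001';

    // Reverse the token and put the marker in front of it.
    static String reverseWithMarker(String token) {
        return MARKER + new StringBuilder(token).reverse().toString();
    }

    // Terms emitted for one input token; withOriginal mirrors the config attribute.
    static List<String> indexTerms(String token, boolean withOriginal) {
        List<String> terms = new ArrayList<>();
        if (withOriginal) {
            terms.add(token);
        }
        terms.add(reverseWithMarker(token));
        return terms;
    }

    // Turn a leading-wildcard query into a prefix query on the reversed terms.
    // Only the simplest case is handled: exactly one '*' at the start.
    static String rewriteLeadingWildcard(String query) {
        if (!query.startsWith("*")) {
            return query;
        }
        return reverseWithMarker(query.substring(1)) + "*";
    }

    public static void main(String[] args) {
        // Two terms per token when withOriginal is true.
        System.out.println(indexTerms("unlock", true).size());
        // The rewritten query is a prefix of the reversed form of "unlock".
        String rewritten = rewriteLeadingWildcard("*lock");
        String prefix = rewritten.substring(0, rewritten.length() - 1);
        System.out.println(reverseWithMarker("unlock").startsWith(prefix));
    }
}
```

With withOriginal="true" every token yields two terms, so the term dictionary grows accordingly; without the filter only the original term is indexed, and a query such as *lock can no longer be answered as a prefix lookup on reversed terms.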

But even after removing the ReversedWildcardFilterFactory, leading wildcard searches like *lock
still work.

-Shyam

-----Original Message-----
From: Dmitry Kan [mailto:dmitry.kan@gmail.com] 
Sent: Wednesday, January 18, 2012 4:26 PM
To: solr-user@lucene.apache.org
Subject: Re: Question on Reverse Indexing

OK. Not sure what your system architecture is there, but could your queries
stay cached in some server caches even after you have re-indexed your data?
The way the index level leading wildcard works (reading SOLR 3.4 code, but
seems to be true circa 1.4) is that the following check is done for the
analysis chain:

[code src=SolrQueryParser.java]
boolean allow = false;
...
          if (factory instanceof ReversedWildcardFilterFactory) {
            allow = true;
            ...
          }
...
    if (allow) {
      setAllowLeadingWildcard(true);
    }
[/code]

so practically what you described can happen if
the ReversedWildcardFilterFactory is still mentioned in one of your shards.
A weird question, but have you reindexed your data to a clean index or on
top of the existing one?
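
The check above can be sketched in isolation as follows. This is a minimal illustration, not the actual SolrQueryParser code: the interface and the two factory classes are hypothetical stand-ins that only share their simple names with the Solr classes, and allowsLeadingWildcard is an invented helper. It also assumes the loop runs over the index analysis chain of every field type, which the elided snippet does not show.

```java
import java.util.Arrays;
import java.util.List;

public class LeadingWildcardCheckSketch {
    // Hypothetical stand-ins; these are NOT the real Solr classes.
    interface TokenFilterFactory {}
    static class LowerCaseFilterFactory implements TokenFilterFactory {}
    static class ReversedWildcardFilterFactory implements TokenFilterFactory {}

    // Returns true as soon as any chain contains the reversing factory.
    static boolean allowsLeadingWildcard(List<List<TokenFilterFactory>> indexChains) {
        for (List<TokenFilterFactory> chain : indexChains) {
            for (TokenFilterFactory factory : chain) {
                if (factory instanceof ReversedWildcardFilterFactory) {
                    return true;
                }
            }
        }
        return false;
    }

    public static void main(String[] args) {
        List<TokenFilterFactory> plain =
                Arrays.asList(new LowerCaseFilterFactory());
        List<TokenFilterFactory> reversed =
                Arrays.asList(new LowerCaseFilterFactory(), new ReversedWildcardFilterFactory());

        // One chain that still mentions the factory is enough to set the flag.
        System.out.println(allowsLeadingWildcard(Arrays.asList(plain, reversed)));
        System.out.println(allowsLeadingWildcard(Arrays.asList(plain)));
    }
}
```

Under that assumption, a single field type (or a stale schema on one shard) that still lists the factory would be enough to keep leading wildcards enabled.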

On Wed, Jan 18, 2012 at 12:35 PM, Shyam Bhaskaran <
Shyam.Bhaskaran@synopsys.com> wrote:

> Dmitry,
>
> Using http://localhost:7070/solr/docs/admin/analysis.jsp, I passed the query
> *lock and did not find ReversedWildcardFilterFactory in the index analysis, or any
> other filter that could do the reversing.
>
> -Shyam
>
> -----Original Message-----
> From: Dmitry Kan [mailto:dmitry.kan@gmail.com]
> Sent: Wednesday, January 18, 2012 2:26 PM
> To: solr-user@lucene.apache.org
> Subject: Re: Question on Reverse Indexing
>
> Just to play safe here, can you double check that the reversing is no longer
> happening by issuing a query through the admin analysis page?
>
> Dmitry
>
> On Wed, Jan 18, 2012 at 4:23 AM, Shyam Bhaskaran <
> Shyam.Bhaskaran@synopsys.com> wrote:
>
> > Hi Francois,
> >
> > I understand that disabling of ReversedWildcardFilterFactory has improved
> > the performance.
> >
> > But I am puzzled over how the leading wild card search like *lock is
> > working even though I have now disabled the ReversedWildcardFilterFactory
> > and the indexes have been created without ReversedWildcardFilter ?
> >
> > How does reverse indexing work even after disabling
> > ReversedWildcardFilterFactory?
> >
> > Can anyone explain to me how this feature works?
> >
> > -Shyam
> >
> > -----Original Message-----
> > From: François Schiettecatte [mailto:fschiettecatte@gmail.com]
> > Sent: Wednesday, January 18, 2012 7:49 AM
> > To: solr-user@lucene.apache.org
> > Subject: Re: Question on Reverse Indexing
> >
> > Using ReversedWildcardFilterFactory will double the size of your
> > dictionary (more or less), maybe the drop in performance that you are
> > seeing is a result of that?
> >
> > François
> >
> > On Jan 17, 2012, at 9:01 PM, Shyam Bhaskaran wrote:
> >
> > > Hi,
> > >
> > > For reverse indexing we are using the ReversedWildcardFilterFactory on
> > Solr 4.0
> > >
> > >
> > > <filter class="solr.ReversedWildcardFilterFactory" withOriginal="true"
> > >
> > > maxPosAsterisk="3" maxPosQuestion="2" maxFractionAsterisk="0.33"/>
> > >
> > >
> > > ReversedWildcardFilterFactory was helping us to perform leading wild
> > card searches like *lock.
> > >
> > > But it was observed that the performance of the searches was not good
> > after introducing ReversedWildcardFilterFactory filter.
> > >
> > > Hence we disabled ReversedWildcardFilterFactory filter and re-created
> > the indexes and this time we found the performance of Solr query to be
> > faster.
> > >
> > > But surprisingly it is observed that leading wild card searches were
> > > still working in spite of disabling the ReversedWildcardFilterFactory
> > > filter.
> > >
> > >
> > > This behavior is puzzling everyone and wanted to know how this behavior
> > of reverse indexing works?
> > >
> > > Can anyone share with me on this Solr behavior.
> > >
> > > -Shyam
> > >
> >
> >
>
>
> --
> Regards,
>
> Dmitry Kan
>



-- 
Regards,

Dmitry Kan