lucene-solr-user mailing list archives

From "Teague James" <teag...@insystechinc.com>
Subject RE: Partial Word Search
Date Thu, 06 Feb 2014 16:11:31 GMT
Jack,

Thanks for responding! I had tried configuring this asymmetrically before
with no luck, so I tried it again, and still no luck. My understanding is
that the default behavior for Solr is "OR" and I do not have a 'q.op='
anywhere that would change that behavior. Since it is only a single-term
search for 'exam', the operator shouldn't matter, right? So here's my asymmetric
config:

NOTE: Every record in my test environment has the same value for
PartialSubject: "Example"

<field name="PartialSubject" type="partialWord" indexed="true" stored="true"
multiValued="true" />

<copyField source="PartialSubject" dest="text"/>

<fieldType name="partialWord" class="solr.TextField"
positionIncrementGap="100">
<analyzer type="index">
<tokenizer class="solr.StandardTokenizerFactory"/>
<filter class="solr.LowerCaseFilterFactory"/>
<filter class="solr.EdgeNGramFilterFactory" minGramSize="3"
maxGramSize="10" side="front"/>
</analyzer>
<analyzer type="query">
<tokenizer class="solr.StandardTokenizerFactory"/>
<filter class="solr.LowerCaseFilterFactory"/>
</analyzer>
</fieldType>

Searching for 'exam' yields 0 results, even though every record has
'Example' in the PartialSubject field. Any thoughts on what my configuration
might be missing?

-Teague

-----Original Message-----
From: Jack Krupansky [mailto:jack@basetechnology.com] 
Sent: Wednesday, February 05, 2014 6:07 PM
To: solr-user@lucene.apache.org
Subject: Re: Partial Word Search

1. The ngramming occurs in the index, but does not modify the original,
"stored" value that a query will return. So, "Example" will be returned even
though the index will have all the sub-terms indexed (but not stored.)

2. You need the ngram filters to be asymmetric with regard to indexing and
query - the index analyzer does the ngramming, but the query analyzer does not.
You have a single analyzer, which means that the query will be expanded into
a sequence of sub-terms, which will be ORed or ANDed depending on your
default query operator. OR will generally work since it will query for all
the sub-terms, but AND will only work if all the sub-terms occur in the
document field.
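
To make the two points above concrete, here is a rough Python sketch. It is only a simplified model of the behaviour described, not Solr's actual analysis chain: whitespace splitting stands in for the StandardTokenizerFactory, and the helper names (analyze_index, analyze_query, matches) are made up for illustration.

```python
def analyze_index(text, min_gram=3, max_gram=10):
    """Hypothetical stand-in for tokenizer + lowercase + front edge n-grams."""
    terms = []
    for token in text.lower().split():
        upper = min(max_gram, len(token))
        # Emit every prefix of the token from min_gram up to max_gram characters.
        terms.extend(token[:n] for n in range(min_gram, upper + 1))
    return terms


def analyze_query(text):
    """Hypothetical query-side analysis: tokenize and lowercase, no n-grams."""
    return text.lower().split()


def matches(index_terms, query_terms, operator="OR"):
    """OR needs any query term in the index; AND needs all of them."""
    indexed = set(index_terms)
    hits = [term in indexed for term in query_terms]
    return any(hits) if operator == "OR" else all(hits)


stored_value = "Example"                    # what a query returns (point 1)
index_terms = analyze_index(stored_value)   # what is actually searchable
print(stored_value, index_terms)

# Asymmetric setup (point 2): the query 'exam' stays a single term.
print(matches(index_terms, analyze_query("exam"), "AND"))

# Single shared analyzer: the query 'examples' is itself ngrammed,
# so the result depends on the default operator.
symmetric_query = analyze_index("examples")
print(matches(index_terms, symmetric_query, "OR"))
print(matches(index_terms, symmetric_query, "AND"))
```

In this model the stored value stays "Example" while the searchable terms are its prefixes, and the ngrammed query for "examples" matches under OR but not under AND, because the sub-term "examples" never occurs in the field.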

-- Jack Krupansky

-----Original Message-----
From: Teague James
Sent: Wednesday, February 5, 2014 4:52 PM
To: solr-user@lucene.apache.org
Subject: Partial Word Search

I cannot get Solr 4.6.0 to do partial word search on a particular field that
is used for faceting. Most of the information I have found suggests
modifying the fieldType "text" to include either the NGramFilterFactory or
EdgeNGramFilterFactory in the filter. However, since I am copying many other
fields to "text" for searching, my expectation is that the NGramFilterFactory
would create ngrams for everything sent to it, which is unnecessary and
probably costly - right?

In an effort to try and troubleshoot the issue I created a new field in the
schema and stored it so that I could see what was getting populated.
However, what I'm finding is that no ngrams are being generated, just the
actual data that gets indexed from the database.

Here's what my setup looks like:
NOTE: Every record in my test environment has the same value "Example"

<field name="PartialSubject" type="partialWord" indexed="true" stored="true"
multiValued="true" />

<copyField source="PartialSubject" dest="text"/>

<fieldType name="partialWord" class="solr.TextField"
positionIncrementGap="100">
<analyzer>
<tokenizer class="solr.StandardTokenizerFactory"/>
<filter class="solr.LowerCaseFilterFactory"/>
<filter class="solr.EdgeNGramFilterFactory" minGramSize="3"
maxGramSize="10" side="front"/>
</analyzer>
</fieldType>

When I query Solr it reports:
<arr name="PartialSubject">
<str>Example</str>
</arr>

I was expecting exa, exam, examp, exampl, example to be the values for
PartialSubject so that a search for "exam" would turn up all of the records
in this test index. Instead I get 0 results.

Can anyone provide any guidance on this please? 

