lucene-solr-user mailing list archives

From: Kydryavtsev Andrey
Subject: Re: Solr - Match whole word only in text fields
Date: Fri, 27 Dec 2013 04:18:27 GMT
Hi everybody!

Ahmet, do I understand correctly: if I use this text_char_norm field type, then for the input "myName=aaa
bbb" the indexed terms will be "myName", "aaa", "bbb"? So a query like "myName" or "bbb" will match,
but "myName aaa" will not. I can use the same type for the query value, so that "myName aaa" is split
into ("myName" && "aaa"), and that will work. But this approach will also give a false positive
match for "myName bbb". How do you think I can handle this? One approach is to use
KeywordTokenizer+ShingleFilter in this field type instead of WhitespaceTokenizerFactory, so that tokens
like "myName", "myName aaa", "myName aaa bbb", "aaa", "aaa bbb", "bbb" are indexed, but that
significantly increases index size in case of long

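To make the shingle idea concrete, here is a rough sketch of such a field type. It is not a tested configuration. It assumes WhitespaceTokenizerFactory stays in front of solr.ShingleFilterFactory, because KeywordTokenizer emits the whole value as a single token and would leave the shingle filter nothing to combine. The type name text_char_norm_shingle and the maxShingleSize value are made up for illustration.

```xml
<!-- Hypothetical variant of text_char_norm; name and shingle sizes are assumptions -->
<fieldType name="text_char_norm_shingle" class="solr.TextField" positionIncrementGap="100">
  <analyzer>
    <!-- Same character mapping as before: ":" and "=" become spaces -->
    <charFilter class="solr.MappingCharFilterFactory" mapping="mappings.txt"/>
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <!-- Emit single words plus word n-grams of 2 to 3 adjacent tokens -->
    <filter class="solr.ShingleFilterFactory"
            minShingleSize="2" maxShingleSize="3" outputUnigrams="true"/>
  </analyzer>
</fieldType>
```

For "myName=aaa bbb" this chain should produce the six tokens listed above, lowercased. Each position emits at most maxShingleSize tokens, so the cap bounds index growth, at the cost that word sequences longer than the cap are no longer indexed as a single token.
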
26.12.2013, 03:20, "Ahmet Arslan":
> Hi Haya,
> With MappingCharFilter you can have full control over character set that you want to
> in mappings.txt you will have
> ":" => " "
> "=" => " "
> Use the following type and see if it suits your needs. Update mappings.txt according
> to your needs.
>     <fieldType name="text_char_norm" class="solr.TextField" positionIncrementGap="100">
>       <analyzer>
>         <charFilter class="solr.MappingCharFilterFactory" mapping="mappings.txt"/>
>         <tokenizer class="solr.WhitespaceTokenizerFactory"/>
>         <filter class="solr.LowerCaseFilterFactory" />
>       </analyzer>
>     </fieldType>
> On Sunday, December 22, 2013 9:19 PM, haya.axelrod wrote:
> I have a text field that can contain very long values (like text files). I
> want to create a field type for it (text, not string), in order to have
> something like "Match whole word only" in Notepad++, but the delimiter
> should not be only whitespace. If I have:
> myName=aaa bbb
> I would like it to be found for the search strings "aaa", "bbb", "aaa
> bbb", "myName=aaa bbb", "myName", but not for "aa" or "ame=a" or "a bb".
> Another example is:
> <myName>aaa bbb</myName>
> Can I do this somehow?
> What should be my field type definition?
> The text can contain any character. Before search I'm escaping the search
> string using
> Thanks