lucene-solr-user mailing list archives

From Erick Erickson <erickerick...@gmail.com>
Subject Re: Solr special characters like '(' and '&'?
Date Tue, 08 Apr 2014 14:14:34 GMT
I'd seriously consider filtering these characters out both when you index
and when you search; matching on them is likely to be very brittle. The same
item, say from two different vendors, might have D (E & F) or D E & F. If you
just stripped all of the non-alphanumeric characters, you'd likely get less
brittle results.
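
A minimal sketch of that kind of normalization, assuming it is applied
to each category path both before indexing and before querying. The
function name normalize_category is hypothetical, and keeping '|' as the
hierarchy separator is an assumption based on the paths quoted below:

```python
import re

def normalize_category(path: str) -> str:
    """Strip non-alphanumeric characters from each '|'-separated segment."""
    cleaned = []
    for segment in path.split("|"):
        # Collapse every run of non-alphanumeric characters into one space.
        # ASCII only: accented letters would be stripped by this pattern too.
        segment = re.sub(r"[^0-9A-Za-z]+", " ", segment).strip()
        cleaned.append(segment)
    # Rejoin with the same separator so the hierarchy is preserved.
    return "|".join(cleaned)

# Both vendor spellings collapse to the same value.
print(normalize_category("A|B|D (E & F)"))
print(normalize_category("A|B|D E & F"))
```

With this, the two spellings above end up as the same indexed value, so
one query matches both.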

You know your problem domain better than I do though, so whatever
makes most sense.

Best,
Erick

On Tue, Apr 8, 2014 at 6:55 AM, Ahmet Arslan <iorixxx@yahoo.com> wrote:
> Hi Peter,
>
> The term query parser ({!term}) is useful in your case, since it does not parse query syntax in the value.
> q={!term f=categories_string}A|B|D (E & F)
>
>
>
> On Tuesday, April 8, 2014 4:37 PM, Peter Kirk <pk@alpha-solutions.dk> wrote:
> Hi
>
> How do I search for Solr special characters like '(' and '&'?
>
> I am trying to execute searches for "products" in my Solr (3.6.1) index, based on the
> "categories" to which these products belong.
> The categories are stored in a multivalued string field for the products, are hierarchical,
> and are fed to the index like:
> A
> A|B
> A|B|C
>
> So this product would actually belong to the category named "C", which is a child of "B",
> which is a child of "A".
>
> I am able to execute queries for simple category names like this (e.g. fq=categories_string:A|B|C).
>
> But some categories have Solr special characters in their names, like: "D (E & F)"
> (Real example: "Power supplies (Battery and Solar)").
>
> A query like fq=categories_string:A|B|D (E & F) simply fails.
> But even if I try
> fq=categories_string:A|B|D%20\(E%20%26amp%3B%20F\)
> (where I try to escape the special characters), it does not find the products in this category,
> and actually finds other unrelated categories.
>
> What am I doing wrong?
>
> Thanks,
> Peter
>
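
The two approaches discussed in the thread can be sketched as follows.
This is only an illustration under stated assumptions: the helper names
term_filter_params and escape_query_chars are hypothetical, q=*:* is an
assumed main query, and the list of escaped characters is an assumption
modeled on the usual Lucene query-syntax specials plus whitespace. Note
that %26amp%3B in the attempt above decodes to the literal text "&amp;",
not "&"; a plain "&" is encoded as %26.

```python
from urllib.parse import urlencode

# Characters treated as special by the Lucene/Solr query syntax (assumed list).
SPECIAL_CHARS = set('\\+-!():^[]"{}~*?|&;/')

def escape_query_chars(text: str) -> str:
    """Backslash-escape special characters and whitespace for field:value queries."""
    out = []
    for ch in text:
        if ch in SPECIAL_CHARS or ch.isspace():
            out.append("\\")
        out.append(ch)
    return "".join(out)

def term_filter_params(field: str, value: str) -> str:
    """Build a URL query string using the {!term} parser for the filter."""
    # The value needs no backslash escaping here, but it still must be
    # URL-encoded so that '&' does not end the fq parameter early.
    fq = "{!term f=%s}%s" % (field, value)
    return urlencode({"q": "*:*", "fq": fq})

category = "A|B|D (E & F)"
print("categories_string:" + escape_query_chars(category))
print(term_filter_params("categories_string", category))
```

The first line printed is the backslash-escaped form for the standard
query parser (it would still need URL encoding before being sent); the
second is a ready-to-append query string for the {!term} variant.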
