lucene-solr-user mailing list archives

From: Modassar Ather <modather1...@gmail.com>
Subject: Re: Clarification on WordDelimiterFilter.
Date: Fri, 07 Aug 2015 04:21:21 GMT

Hi,

Any suggestions would be really helpful. Kindly share your input.

Thanks,
Modassar

On Thu, Aug 6, 2015 at 2:06 PM, Modassar Ather <modather1981@gmail.com>
wrote:

> I am using WordDelimiterFilter for both indexing and searching, with the
> following attributes. The parser is edismax and the Solr version is 5.2.1.
>
> <filter class="solr.WordDelimiterFilterFactory" generateWordParts="1"
> generateNumberParts="1" catenateWords="1" catenateNumbers="1"
> catenateAll="1" splitOnCaseChange="1" preserveOriginal="1"/>
>
> During search, some unwanted results are returned. For example:
>
> Search query: "3d image"
> Results containing 3-d image, 3 d image, and 1d image are also returned.
> According to the analysis page, this happens because of the position
> increment applied to the tokens, as explained below.
>
> The analysis page shows the following four tokens for 3d, with their
> positions, followed by the token for image:
>
> token    position
> 3d       1
> 3        1
> 3d       1
> d        2
> image    3
>
> Another example is "1d obj*", which returns results containing "d-object".
> This can bring back a completely unrelated item.
>
> Here the token d is at position 2, which causes the above matches.
> Please help me understand why this position increment is applied.
> The position increment will also cause a "3d image" search to fail on a
> document containing "3d image", as the "d" comes at position 2.
>
> Kindly help me understand the best practices for using WordDelimiterFilter,
> or share your input on how we can resolve the issue.
>
> Thanks,
> Modassar
>
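
To make the quoted token table easier to reason about, here is a minimal Python sketch of a simplified model of the splitting described above. It is not Lucene's or Solr's actual implementation: the functions `split_word` and `analyze` are hypothetical, case-change splitting is ignored, and the position rule is simply chosen to reproduce the positions reported on the analysis page (original and catenated forms at the word's first position, each part at the next position in turn). It only illustrates which (term, position) pairs a query and a document would share under that assumption.

```python
import re


def split_word(word, start_pos):
    """Return ((term, position) pairs, next free position) for one word.

    Hypothetical simplification: split on non-alphanumerics and on
    letter/digit boundaries, keep the original, add the catenated form.
    """
    parts = re.findall(r"[A-Za-z]+|[0-9]+", word)
    if len(parts) <= 1:
        # Nothing to split: one token occupying one position.
        return {(word, start_pos)}, start_pos + 1
    tokens = {(word, start_pos)}               # like preserveOriginal
    tokens.add(("".join(parts), start_pos))    # like catenateAll
    for offset, part in enumerate(parts):      # like generate*Parts
        tokens.add((part, start_pos + offset))
    return tokens, start_pos + len(parts)


def analyze(text):
    """Whitespace-tokenize, lowercase, and collect (term, position) pairs."""
    tokens, pos = set(), 1
    for word in text.lower().split():
        word_tokens, pos = split_word(word, pos)
        tokens |= word_tokens
    return tokens


def by_position(tokens):
    return sorted(tokens, key=lambda t: (t[1], t[0]))


query = analyze("3d image")
print("query:", by_position(query))
for doc in ("3-d image", "3 d image", "1d image"):
    print(doc, "shares", by_position(query & analyze(doc)))
```

Under this model, "1d image" shares only the `d` and `image` tokens with the query, while "3-d image" and "3 d image" also share `3`. Whether such overlaps turn into actual hits depends on how edismax builds the query from these tokens, which the sketch does not attempt to model.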
