lucene-solr-user mailing list archives

From Markus Jelsma <markus.jel...@openindex.io>
Subject RE: Term no longer matches if PositionLengthAttr is set to two
Date Mon, 01 May 2017 10:33:00 GMT
Hello again, apologies for cross-posting and having to get back to this unsolved problem.

Initially I thought this was a problem I have with, or in, Lucene. Maybe not, so is this a problem
in Solr? Has anyone here seen this problem before?

Many thanks,
Markus

-----Original message-----
> From: Markus Jelsma <markus.jelsma@openindex.io>
> Sent: Tuesday 25th April 2017 13:40
> To: java-user@lucene.apache.org
> Subject: Term no longer matches if PositionLengthAttr is set to two
> 
> Hello,
> 
> We have a decompounder and recently implemented the PositionLengthAttribute in it, setting
> it to 2 for a two-word compound such as drinkwater (drinking water in Dutch). The decompounder
> runs at both index time and query time on Solr 6.5.0.
> 
> The problem is, q=content_nl:drinkwater no longer returns documents containing drinkwater
> when posLenAtt = 2 at query time.
> 
> This is Solr's debug output for drinkwater with posLenAtt = 2:
> 
>     <str name="rawquerystring">content_nl:drinkwater</str>
>     <str name="querystring">content_nl:drinkwater</str>
>     <str name="parsedquery">SynonymQuery(Synonym())</str>
>     <str name="parsedquery_toString">Synonym()</str>
> 
> This is the output after I reverted the decompounder, so posLenAtt = 1:
> 
>     <str name="rawquerystring">content_nl:drinkwater</str>
>     <str name="querystring">content_nl:drinkwater</str>
>     <str name="parsedquery">SynonymQuery(Synonym(content_nl:drink content_nl:drinkwater)) content_nl:water</str>
>     <str name="parsedquery_toString">Synonym(content_nl:drink content_nl:drinkwater) content_nl:water</str>
> 
> The indexed terms still have posLenAtt = 2, but having posLenAtt = 2 at query time
> seems to be a problem.
> 
> Any thoughts on this issue? Is it a bug? Do I not understand PositionLengthAttribute?
> Why does it affect term/document matching, and why at query time but not at index time?
> 
> Many thanks,
> Markus
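
To make the two token layouts in the quoted message easier to compare, here is a minimal sketch in plain Java. It does not use Lucene or reproduce the decompounder. The Token class, the compound() and arcs() helpers, and the emission order (drinkwater first, then drink at the same position, then water one position later) are all assumptions made for illustration. It only computes which graph nodes each token would span under the usual reading of position increment and position length, and it says nothing about why the parsed query comes out empty.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical model of a token stream; not Lucene API.
class TokenGraphSketch {

    // Stand-in for a term plus its position increment and position length.
    static final class Token {
        final String term;
        final int posInc;
        final int posLen;

        Token(String term, int posInc, int posLen) {
            this.term = term;
            this.posInc = posInc;
            this.posLen = posLen;
        }
    }

    // Assumed output of the decompounder for "drinkwater": the compound,
    // then "drink" stacked on the same position, then "water" one position on.
    static List<Token> compound(int compoundPosLen) {
        List<Token> tokens = new ArrayList<>();
        tokens.add(new Token("drinkwater", 1, compoundPosLen));
        tokens.add(new Token("drink", 0, 1));
        tokens.add(new Token("water", 1, 1));
        return tokens;
    }

    // Each token becomes an arc from its start node to start + posLen.
    // The start node is the running sum of position increments, minus one.
    static List<String> arcs(List<Token> tokens) {
        List<String> out = new ArrayList<>();
        int pos = -1;
        for (Token t : tokens) {
            pos += t.posInc;
            out.add(t.term + ":" + pos + "->" + (pos + t.posLen));
        }
        return out;
    }

    public static void main(String[] args) {
        System.out.println("posLenAtt = 1: " + arcs(compound(1)));
        System.out.println("posLenAtt = 2: " + arcs(compound(2)));
    }
}
```

Under these assumptions, posLenAtt = 1 puts drinkwater on the same arc as drink (0->1), which is consistent with the Synonym(content_nl:drink content_nl:drinkwater) clause in the reverted output. With posLenAtt = 2, drinkwater spans 0->2 and runs parallel to the two-token path drink, water.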
