lucene-java-user mailing list archives

From baris.ka...@oracle.com
Subject Re: ComplexPhraseQueryParser class question
Date Thu, 30 Jan 2020 00:16:32 GMT
Thanks, Zhixiang. Yes, it cannot find a match when the phrase contains an unrelated term
in the middle that is not indexed.

Similar to what you suggested:
I can try the queryText with one term excluded at a time using the ComplexPhraseQueryParser
and see which variant gives the best matches.
But I would rather have this built into a Lucene API.
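
To make the workaround concrete, here is a minimal sketch that only builds the query strings, one per excluded term. The class and method names are made up for illustration, and it assumes the terms are already split on whitespace. Each resulting string would then be parsed with the ComplexPhraseQueryParser and searched separately.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of Lucene: builds leave-one-out phrase queries.
class LeaveOneOutQueries {

    /** Returns one quoted phrase per omitted term, each with the same slop. */
    static List<String> variants(List<String> terms, int slop) {
        List<String> queries = new ArrayList<>();
        for (int skip = 0; skip < terms.size(); skip++) {
            List<String> kept = new ArrayList<>(terms);
            kept.remove(skip); // remove(int) drops the element at index 'skip'
            queries.add("\"" + String.join(" ", kept) + "\"~" + slop);
        }
        return queries;
    }

    public static void main(String[] args) {
        List<String> terms = List.of("term1", "term2", "abcd", "term3*");
        for (String queryText : variants(terms, 2)) {
            // Each string would be handed to the parser and run as its own search.
            System.out.println(queryText);
        }
    }
}
```

The cost is one search per term, which is part of why I would prefer something built in.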

I am also asking whether ComplexPhraseQueryParser has a way to support partial phrase
matching.

Elasticsearch offers this capability, controlled by a percentage.

I am surprised that Lucene Core does not have this; I hope I am wrong.

Best regards


> On Jan 29, 2020, at 7:02 PM, 陈志祥 <zhixiang.czx@alibaba-inc.com> wrote:
> 
> The standard PhraseQuery cannot do this, but you can prefilter the invalid term (abcd)
> out by using the MultiTerms API.
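
A rough sketch of that prefiltering idea follows. The index lookup is abstracted as a Predicate so the example runs on its own. With a real index the predicate could be backed by a term lookup such as IndexReader.docFreq, which is an assumption about how one might wire it up and not something stated in this thread. Keeping wildcard terms unchecked is also a simplification, as is the assumption that the terms are already analyzed the same way as the indexed text.

```java
import java.util.List;
import java.util.Set;
import java.util.function.Predicate;
import java.util.stream.Collectors;

// Hypothetical helper, not part of Lucene: drops terms the index does not contain.
class PrefilterTerms {

    /** Rebuilds the phrase from the terms that pass the lookup; wildcard terms are kept as-is. */
    static String buildPhrase(List<String> terms, Predicate<String> inIndex, int slop) {
        String kept = terms.stream()
                .filter(t -> t.endsWith("*") || inIndex.test(t))
                .collect(Collectors.joining(" "));
        return "\"" + kept + "\"~" + slop;
    }

    public static void main(String[] args) {
        // Stand-in for the real index: only these terms are "indexed".
        Set<String> indexed = Set.of("term1", "term2", "term3");
        List<String> terms = List.of("term1", "term2", "abcd", "term3*");
        System.out.println(buildPhrase(terms, indexed::contains, 2));
    }
}
```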
> 
> Also, I have found that an "a b c"~2 phrase query does not actually match
> "a x x b x x c" in its implementation.
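
One way to see why: the slop is, roughly, a budget for the whole phrase and not for each gap. The sketch below uses a simplified model that assumes the terms appear in query order with no repeated terms. It is not Lucene's actual matcher, and the class name is made up. Each term's document position is shifted by its offset in the query, and the needed slop is the spread of those shifted positions.

```java
// Simplified model, not Lucene code: estimates the slop an in-order match needs.
class RequiredSlop {

    /** positions[i] is the document position of the i-th query term. */
    static int requiredSlop(int[] positions) {
        int min = Integer.MAX_VALUE;
        int max = Integer.MIN_VALUE;
        for (int i = 0; i < positions.length; i++) {
            int shifted = positions[i] - i; // position minus the term's offset in the phrase
            min = Math.min(min, shifted);
            max = Math.max(max, shifted);
        }
        return max - min;
    }

    public static void main(String[] args) {
        // "a x x b x x c": a is at position 0, b at 3, c at 6.
        System.out.println(requiredSlop(new int[] {0, 3, 6})); // prints 4
    }
}
```

Under this model the document needs a slop of 4, so a slop of 2 is not enough even though each individual gap is only two terms wide.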
> 
> 陈志祥
> Alibaba, Map Engine Core Algorithm Engineer
> Phone: 057128223456-81124100
> Email: zhixiang.czx@alibaba-inc.com
> Address: Shentong Information Plaza, Changning, Shanghai
> Information Security Notice: The information contained in this mail is solely property
> of the sender's organization.
> This mail communication is confidential. Recipients named above are obligated to maintain
> secrecy and are not permitted to disclose the contents of this communication to others.
> ------------------------------------------------------------------
> From: <baris.kazar@oracle.com>
> Date: January 30, 2020, 05:02:50
> To: java-user@lucene.apache.org <java-user@lucene.apache.org>
> Cc: baris.kazar <baris.kazar@oracle.com>
> Subject: ComplexPhraseQueryParser class question
> 
> Hi,
> 
> I hope everyone is doing great.
> 
> I have a question regarding the ComplexPhraseQueryParser class.
> 
> This class can handle this queryText case very well:
> 
> "term1 term2 abcd term3*"~2
> 
> (the last term, term3, ends with * and the whole phrase has a slop of 2)
> 
> 
> term1, term2 and term3 are all in the Lucene index, but abcd is not.
> 
> In other words, there is no "term1 term2 abcd term3" in the Lucene index,
> 
> but I would still like to find the following in my results:
> 
> "term1 term2 term3", despite abcd being in the query.
> 
> How can I achieve this?
> 
> 
> I call setInOrder(true) and setPhraseSlop(2) on the ComplexPhraseQueryParser.
> 
> 
> Best regards
> 
> baris
> 
> 
> 
