lucene-java-user mailing list archives

From Tomoko Uchida <tomoko.uchida.1...@gmail.com>
Subject Re: How to query for 'any word' in a phrase
Date Thu, 09 Jan 2020 17:09:55 GMT
Hi,
Did you try or consider SpanNearQuery?
You might need to append some kind of special token (e.g., <EOS>) to the
end of the text field to match the "end of the sentence" anyway.
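
To make the intended semantics concrete, here is a minimal plain-Java sketch. It is independent of Lucene and is not how SpanNearQuery is implemented. It only shows ordered, position-by-position matching in which "*" consumes exactly one real word, plus one possible reading of the <EOS> idea: a marker appended to the token stream that a wildcard is not allowed to match. The class name, the method names and the marker itself are all hypothetical. In Lucene itself, I believe the ordered SpanNearQuery builder has an addGap(int) method for this kind of fixed-width hole, but please check the javadoc for your version.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Illustration only: ordered phrase matching with single-word wildcard slots. */
final class AnyWordPhrase {
    /** Hypothetical end-of-text marker appended to every token stream. */
    static final String EOS = "<EOS>";
    /** Pattern slot that matches exactly one real token of any value. */
    static final String ANY = "*";

    /** Lowercases, splits on whitespace and appends the EOS marker. */
    static List<String> tokenize(String text) {
        List<String> tokens =
            new ArrayList<>(Arrays.asList(text.toLowerCase().trim().split("\\s+")));
        tokens.add(EOS);
        return tokens;
    }

    /** Splits a pattern such as "the quick * fox" into slots. */
    static List<String> pattern(String p) {
        return Arrays.asList(p.trim().split("\\s+"));
    }

    /** Returns the start position of the first match, or -1 if there is none. */
    static int find(List<String> tokens, List<String> slots) {
        for (int start = 0; start + slots.size() <= tokens.size(); start++) {
            boolean ok = true;
            for (int i = 0; i < slots.size() && ok; i++) {
                String want = slots.get(i);
                String have = tokens.get(start + i);
                // A wildcard needs a real word, so it must not consume EOS.
                ok = want.equals(ANY) ? !have.equals(EOS) : want.equals(have);
            }
            if (ok) {
                return start;
            }
        }
        return -1;
    }
}
```

With this sketch, the pattern "the quick brown * jumps over the * *" matches "The quick brown fox jumps over the lazy dog" but not the same text with "dog" cut off, because the last wildcard would have to consume the marker. Putting <EOS> as the last slot of a pattern anchors the match at the end of the text.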

On Fri, Jan 10, 2020 at 1:30, 陈志祥 <zhixiang.czx@alibaba-inc.com> wrote:

> To be clearer, I think you need to build a custom PhraseQuery class that
> can set a separate slop value between each pair of sub-terms. You would also
> need a special WildcardTerm that matches any term and is used only in the
> context of this custom PhraseQuery.
>
> Or just use a grep tool or a regex automaton to scan?
>
> 陈志祥
>
> ------------------------------------------------------------------
> From: Jeroen Lauwers <Jeroen.Lauwers@CTLO.NET>
> Date: 2020-01-09 23:41:37
> To: java-user@lucene.apache.org
> Subject: RE: Re: How to query for 'any word' in a phrase
>
> I don't understand your question.
>
> In general, can it be set? Yes: PhraseQuery(int slop, String field, BytesRef... terms)
> <https://lucene.apache.org/core/7_7_2/core/org/apache/lucene/search/PhraseQuery.html#PhraseQuery-int-java.lang.String-org.apache.lucene.util.BytesRef...->
>
> In my specific case: also yes. I'm parsing the query myself in a custom parser, so yes, I can do it.
>
> As far as I understand, the slop is not specific to a position.
> Please explain how this could help.
>
> Jeroen
>
> From: 陈志祥 <zhixiang.czx@alibaba-inc.com>
> Sent: Thursday, 9 January 2020 16:31
> To: java-user@lucene.apache.org
> Subject: Re: How to query for 'any word' in a phrase
>
> Could the slop parameter in PhraseQuery be set dynamically?
>
> ------------------------------------------------------------------
> From: Jeroen Lauwers <Jeroen.Lauwers@CTLO.NET>
> Date: 2020-01-09 23:17:37
> To: java-user@lucene.apache.org
> Subject: How to query for 'any word' in a phrase
>
> Dear all,
>
> Is there a way to construct a phrase search (with spans?) like the following:
> the quick brown * jumps over the * *
> where * = any word but exactly 1 word
>
>
> I introduced these *'s at specific positions, so a PhraseQuery with a slop of 2 is
> just not good enough, and the two *'s at the end must be matched as well.
>
> Is there such a thing as a Term or BytesRef that always matches everything?
>
> Thanks,
> Jeroen
>
>
