lucene-java-user mailing list archives

From 小鱼儿 <ctengc...@gmail.com>
Subject Re: Question about PhraseQuery's capacity...
Date Fri, 10 Jan 2020 11:14:48 GMT
The explain API helps! Thanks for the hint!
I found that one case failed because I had carelessly added another
filter condition, but the other case (which is analyzed into 30 terms)
still fails, and I don't know why.
I guess I need to write a unit test that uses the MultiTerms.getTerms API to
find out whether there is a mismatch in the analyzer's processing or a
capacity limit in PhraseQuery...
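
The comparison step of such a unit test could look roughly like the sketch below. It is plain Java with made-up placeholder terms, and `TermMismatchCheck` and `missingFromIndex` are hypothetical names, not Lucene APIs. It only assumes that the indexed terms have already been collected into a set and that the query terms come from the same analysis call used to build the query.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical helper (not part of Lucene): reports which query-time terms
// never appear among the terms stored in the index for a field.
class TermMismatchCheck {

    // Returns the query terms that are absent from the indexed term set,
    // in the order they occur in the query.
    static List<String> missingFromIndex(Set<String> indexedTerms, List<String> queryTerms) {
        List<String> missing = new ArrayList<>();
        for (String term : queryTerms) {
            if (!indexedTerms.contains(term)) {
                missing.add(term);
            }
        }
        return missing;
    }

    public static void main(String[] args) {
        // Placeholder data standing in for real analyzer output.
        Set<String> indexed = new HashSet<>(Arrays.asList("alpha", "beta", "gamma"));
        List<String> query = Arrays.asList("alpha", "beta", "delta");
        System.out.println("Terms missing from the index: " + missingFromIndex(indexed, query));
    }
}
```

In a real test the indexed set would presumably be filled by iterating the TermsEnum obtained from MultiTerms.getTerms(reader, fieldName). An empty result would rule out a term-level mismatch, though it says nothing about term positions.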

Mikhail Khludnev <mkhl@apache.org> wrote on Fri, Jan 10, 2020 at 6:21 PM:

> Hello,
> Sometimes IndexSearcher.explain(Query, int) helps to analyse mismatches.
>
> On Fri, Jan 10, 2020 at 1:13 PM 小鱼儿 <ctengctsh@gmail.com> wrote:
>
> > After directly calling the Analyzer.tokenStream() method to extract terms from
> > the query, I still cannot get results. I don't know why...
> >
> > Code used to build the index:
> >            IndexWriterConfig iwc = new IndexWriterConfig(analyzer); // new SmartChineseAnalyzer();
> >
> > Code used to run the query:
> > (1) Extract terms from the query text:
> >
> > public List<String> analysis(String fieldName, String text) {
> >     List<String> terms = new ArrayList<String>();
> >     // try-with-resources closes the stream so the analyzer can be reused
> >     try (TokenStream stream = analyzer.tokenStream(fieldName, text)) {
> >         CharTermAttribute termAtt = stream.getAttribute(CharTermAttribute.class);
> >         stream.reset();
> >         while (stream.incrementToken()) {
> >             terms.add(termAtt.toString());
> >         }
> >         stream.end();
> >     } catch (IOException e) {
> >         log.error(e.getMessage(), e);
> >     }
> >     return terms;
> > }
> >
> > (2) Code to construct a PhraseQuery:
> >
> > private Query buildPhraseQuery(Analyzer analyzer, String fieldName, String queryText, int slop) {
> >     PhraseQuery.Builder builder = new PhraseQuery.Builder();
> >     builder.setSlop(2); //? max is 2; note: the slop parameter is not used here
> >     List<String> terms = analyzer.analysis(fieldName, queryText);
> >     for (String termKeyword : terms) {
> >         Term term = new Term(fieldName, termKeyword);
> >         builder.add(term); // appended at the next consecutive position
> >     }
> >     Query query = builder.build();
> >     return query;
> > }
> >
> > Using a BooleanQuery also failed:
> >
> > private Query buildBooleanANDQuery(Analyzer analyzer, String fieldName, String queryText) {
> >     BooleanQuery.Builder builder = new BooleanQuery.Builder();
> >     List<String> terms = analyzer.analysis(fieldName, queryText);
> >     log.info("terms: " + StringUtils.join(terms, ", "));
> >     for (String termKeyword : terms) {
> >         Term term = new Term(fieldName, termKeyword);
> >         builder.add(new TermQuery(term), BooleanClause.Occur.MUST);
> >     }
> >     return builder.build();
> > }
> >
> > Adrien Grand <jpountz@gmail.com> wrote on Fri, Jan 10, 2020 at 4:53 PM:
> >
> > > It should match. My guess is that you might not be reusing the same positions
> > > as set by the analysis chain when creating the phrase query. Can you show
> > > us how you build the phrase query?
> > >
> > > On Fri, Jan 10, 2020 at 9:24 AM 小鱼儿 <ctengctsh@gmail.com> wrote:
> > >
> > > > I use SmartChineseAnalyzer for indexing and add a document with a
> > > > TextField whose value is a long sentence that, when analyzed, yields 18
> > > > terms.
> > > >
> > > > I then use the same value to construct a PhraseQuery, setting the slop to 2
> > > > and adding the 18 terms one after another...
> > > >
> > > > I expect the search API to find this document, but it returns nothing.
> > > >
> > > > Where am I going wrong?
> > > >
> > >
> > >
> > > --
> > > Adrien
> > >
> >
>
>
> --
> Sincerely yours
> Mikhail Khludnev
>

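Adrien's point about positions, quoted above, can be illustrated with a toy model. The sketch below is plain Java, not Lucene code: `PhrasePositionsDemo`, `Token` and `matchesExactly` are hypothetical names, and it only models an exact (slop 0) phrase match. It assumes the analysis chain dropped one token (for example a stop word or punctuation mark) but kept its position, leaving a gap in the indexed positions.

```java
import java.util.Arrays;
import java.util.List;

// Toy model in plain Java, not Lucene code: a token is a term plus its position.
class PhrasePositionsDemo {

    static final class Token {
        final String term;
        final int position;
        Token(String term, int position) { this.term = term; this.position = position; }
    }

    // Exact (slop 0) phrase match: every query token must occur in the document
    // at the same offset from some occurrence of the first query term.
    static boolean matchesExactly(List<Token> doc, List<Token> query) {
        if (query.isEmpty()) return false;
        int first = query.get(0).position;
        for (Token start : doc) {
            if (!start.term.equals(query.get(0).term)) continue;
            boolean all = true;
            for (Token q : query) {
                int wanted = start.position + (q.position - first);
                if (!contains(doc, q.term, wanted)) { all = false; break; }
            }
            if (all) return true;
        }
        return false;
    }

    static boolean contains(List<Token> doc, String term, int position) {
        for (Token t : doc) {
            if (t.position == position && t.term.equals(term)) return true;
        }
        return false;
    }

    public static void main(String[] args) {
        // Indexed text "a , b c": "," was dropped but its position kept, so "b" sits at 2.
        List<Token> doc = Arrays.asList(new Token("a", 0), new Token("b", 2), new Token("c", 3));
        // Query built by appending terms one after another (positions 0, 1, 2).
        List<Token> consecutive = Arrays.asList(new Token("a", 0), new Token("b", 1), new Token("c", 2));
        // Query built with the positions produced by analysis (0, 2, 3).
        List<Token> preserved = Arrays.asList(new Token("a", 0), new Token("b", 2), new Token("c", 3));
        System.out.println(matchesExactly(doc, consecutive)); // false
        System.out.println(matchesExactly(doc, preserved));   // true
    }
}
```

In Lucene itself, the gap is exposed through PositionIncrementAttribute on the token stream, and PhraseQuery.Builder has an add(Term, int position) overload for supplying explicit positions. A small slop may absorb one or two gaps, but a long sentence with several dropped tokens could exceed it. Whether SmartChineseAnalyzer actually leaves such gaps for the text in question would need to be checked.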