lucene-dev mailing list archives

From Earwin Burrfoot <ear...@gmail.com>
Subject Re: [jira] Commented: (LUCENE-2458) queryparser shouldn't generate phrasequeries based on term count
Date Sun, 23 May 2010 18:06:33 GMT
> The QP should work like that:
> (1) It parses the query, creating fragments
> (2) It does some out-of-the-box handling of those fragments
>
> People should be able to override that handling of fragments. But people
> should not touch (1).

In fact, the QP should work like this:
(1) The Tokenizer processes the query as if it were a plain string of text.
Care must be taken to preserve query language operators, since this stage
essentially replaces the current QP's lexer stage.
(2) The QP's syntax parser kicks in, identifies the operators (those that
the Tokenizer didn't treat as part of word tokens), and does overridable
out-of-the-box handling for them and the tokens around them.
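
A minimal sketch of the two stages above, in plain Python rather than
against Lucene's API. Everything in it is an assumption made for
illustration: the names tokenize, QuerySketch and the handle_* hooks are
hypothetical, the operator set is cut down to '-' and '"', and lowercasing
stands in for real analysis.

```python
import re

# Hypothetical, cut-down operator set; a real query syntax has many more.
OPERATORS = {"-", '"'}

# A hyphen between word characters stays inside the word token ("wi-fi");
# a hyphen anywhere else, or a double quote, comes out as its own token.
TOKEN_RE = re.compile(r'\w+(?:-\w+)*|[-"]')


def tokenize(query):
    """Stage 1: tokenize the query like text, but preserve operators."""
    tokens = []
    for match in TOKEN_RE.finditer(query):
        text = match.group()
        if text in OPERATORS:
            tokens.append(("OP", text))
        else:
            # Lowercasing is a stand-in for whatever analysis would do.
            tokens.append(("TERM", text.lower()))
    return tokens


class QuerySketch:
    """Stage 2: find operators and apply overridable default handling."""

    def handle_term(self, term):
        return ("TERM", term)

    def handle_phrase(self, terms):
        return ("PHRASE", terms)

    def handle_prohibited(self, clause):
        return ("NOT", clause)

    def parse(self, query):
        tokens = tokenize(query)
        clauses = []
        i = 0
        while i < len(tokens):
            kind, text = tokens[i]
            i += 1
            if kind == "TERM":
                clauses.append(self.handle_term(text))
            elif text == '"':
                # Collect terms up to the closing quote (or end of input).
                terms = []
                while i < len(tokens) and tokens[i] != ("OP", '"'):
                    if tokens[i][0] == "TERM":
                        terms.append(tokens[i][1])
                    i += 1
                i += 1  # step past the closing quote, if there was one
                clauses.append(self.handle_phrase(terms))
            elif text == "-" and i < len(tokens) and tokens[i][0] == "TERM":
                # '-' directly before a term marks that term as prohibited.
                term_clause = self.handle_term(tokens[i][1])
                clauses.append(self.handle_prohibited(term_clause))
                i += 1
        return clauses
```

Under these assumptions the hyphen in Wi-Fi stays part of the word token,
while the one in -router reaches the parser as an operator. A subclass
changes how phrases are built by overriding handle_phrase, without touching
the tokenizer.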

The point is, it's hard to do correctly. That's why Lucene resorts to the
upside-down approach: the QP parses first, and analysis runs on the
resulting fragments.
