lucene-dev mailing list archives

From Peter Carlson <carl...@bookandhammer.com>
Subject Re: Status of proximity in query language
Date Mon, 18 Feb 2002 23:45:05 GMT
Sounds Great,

Let's hook something up to the phrase query, but I would suggest a different
character so that the same operator is not confusingly used for two
different concepts.

Some thoughts

"foo bar"#3
"foo bar"!3
"foo bar"n3
"foo bar"$3

Really, I would just suggest excluding what is currently used, plus ? (used in
URLs), & (used in URLs), > (would have to encode for XML), < (would have to
encode for XML), and % (can be used to escape characters).
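
To make the suggestion concrete, here is a minimal sketch in Python of how a quoted phrase followed by a marker character and a number could be split into terms and a slop value. It is a standalone illustration, not the Lucene query parser; the names `EXCLUDED`, `PHRASE_WITH_SLOP`, and `parse_phrase_slop` are hypothetical, and the excluded set covers only the characters named above, not the ones the query syntax already uses.

```python
import re

# Characters named above as poor choices: URL and XML metacharacters,
# plus the character that can be used for escaping.
EXCLUDED = set("?&<>%")

# Hypothetical pattern: a double-quoted phrase, one marker character, an integer.
PHRASE_WITH_SLOP = re.compile(r'^"([^"]+)"(.)(\d+)$')


def parse_phrase_slop(text, marker="#"):
    """Return (terms, slop) for input such as '"foo bar"#3', or None if it does not match."""
    if marker in EXCLUDED:
        raise ValueError(f"marker {marker!r} clashes with URL, XML, or escape usage")
    m = PHRASE_WITH_SLOP.match(text)
    if m is None or m.group(2) != marker:
        return None
    # Split the phrase on whitespace and convert the trailing number to an int.
    return m.group(1).split(), int(m.group(3))
```

Swapping the `marker` argument is enough to try each of the candidate characters listed above.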

--Peter


On 2/18/02 3:21 PM, "Brian Goetz" <brian@quiotix.com> wrote:

>> These are situations where the end user who is using this syntax has to know
>> the limitations and options.
> 
> Right, but that's no excuse for creating more of these situations,
> especially one as egregious as introducing an infix operator that
> _looks_ like it should work with arbitrary operands but doesn't.
> That's like offering a desk calculator with a + button that only adds
> even numbers.  
> 
> Let's not lose sight of something: the query parser is a peripheral
> element of Lucene; it converts the text representation of queries into the
> internal representation.  No one _has_ to use it.  It's supposed to be
> a convenient first-order approximation that is good enough for most
> applications.
> 
>> In my user documentation
> 
> We can't assume every end user will have access to good documentation,
> or any for that matter.  The Yahoo search engine has a doc page, but
> few users ever look at it.
> 
> Having NEAR as an infix operator is simply confusing.  Let's not add
> confusing features.
> 
>> For Doug's case
>>   ((a AND b) OR (c AND d)) NEAR20 ((e AND f) OR (g AND h))
>> I understand that this is a difficult case to process, but I also think it
>> is a somewhat impractical case in reality.
> 
> OK, what about combinations like:
> Foo* NEAR Bar
> The way this is processed internally, it's basically the same (I think).
> 
>> What about constraining the NEAR operator to Term Queries only
>> (at least at first)?
> 
> Let's find a better solution.
> 
>> I think this is how most users will use this type of search anyway. I agree
>> that it is difficult to solve the general case, but for a limited case, I
>> think this would be valuable to users.
> 
> It IS valuable.  But let's add it in a way that is not confusing.
> 
> Since the slop is tied to the PhraseQuery mechanism, let's think about
> syntax that operates only on that.
> 
> Ideas:
> "foo bar"(3)
> "foo bar"[3]
> "foo bar"~3
> 
> The latter makes some sense, as the ~ already indicates fuzzy, and slop
> is a similar concept to fuzzy (searching for an approximate match).
> 
> I can make the latter work pretty easily, too.
> 
> 
> 
> --
> To unsubscribe, e-mail:   <mailto:lucene-dev-unsubscribe@jakarta.apache.org>
> For additional commands, e-mail: <mailto:lucene-dev-help@jakarta.apache.org>
> 
> 
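
As a rough illustration of why slop on a phrase reads as "approximate match", the following Python sketch checks whether one term follows another with at most a given number of extra positions between them. It is a simplified, ordered, two-term proximity test written for this note, not necessarily how Lucene computes slop; `term_positions` and `within_slop` are hypothetical names.

```python
def term_positions(tokens, term):
    """Return the token positions at which `term` occurs."""
    return [i for i, t in enumerate(tokens) if t == term]


def within_slop(positions_a, positions_b, slop):
    """True if some occurrence of B follows an occurrence of A
    with at most `slop` other positions in between."""
    return any(0 < b - a <= slop + 1 for a in positions_a for b in positions_b)


tokens = "foo x y bar".split()
foo = term_positions(tokens, "foo")
bar = term_positions(tokens, "bar")
```

With these tokens, `within_slop(foo, bar, 2)` holds, while a slop of 0 or 1 does not; a slop of 0 corresponds to the exact phrase.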

