lucene-java-user mailing list archives

From Mark Miller <markrmil...@gmail.com>
Subject Re: Test new query parser?
Date Wed, 23 Aug 2006 20:20:41 GMT
You are not kidding. I keep finding this tougher and tougher. Originally 
my method worked for most simple queries, but not all. I improved it to 
cover some more ground, allowing some modestly complex queries, but now 
even that improvement seems woefully inadequate for solving the general 
case. I was handling some pretty complex queries, but losing the order of 
operations in a proximity query if things got too exciting, among other 
small bugs.

The parse tree handles the boolean stuff fine on its own, but the 
proximity operator seems to need distributing over the boolean operators 
(think algebra) in a way that still maintains order of operations. It is 
a bit nasty.

consider:
cop | fowl & (fowl | priest & man) ! helicopter ~8 (hillary | tom)

this must distribute to (roughly)
cop | ( (fowl & ( (fowl ~8 hillary | (priest ~8 hillary & man ~8 
hillary) ) ! helicopter ~8 hillary) | (fowl ~8 tom | (priest ~8 tom & 
man ~8 tom) ) ! helicopter ~8 tom))

I have gotten close, but instead of (fowl | (priest & man)) I might get 
((fowl | priest) & man). A naive distribution ignores order of ops and 
simply groups left to right.
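One way to keep the grouping intact is to distribute over the parse tree rather than over the token stream. The following is only a minimal sketch of that idea in Python, not the parser discussed in this thread: the tuple node shapes and the names distribute and render are made up for illustration, and the choice to split on the right-hand operand first is an assumption taken from the rough expansion above (a hillary branch and a tom branch).

```python
# Hypothetical node shapes, for illustration only:
#   ("term", word)
#   ("and" | "or" | "butnot", left, right)
#   ("near", distance, left, right)
BOOLEAN_OPS = {"and", "or", "butnot"}
SYMBOLS = {"and": "&", "or": "|", "butnot": "!"}


def distribute(node):
    """Push every proximity node down until its operands contain no boolean nodes."""
    kind = node[0]
    if kind == "term":
        return node
    if kind in BOOLEAN_OPS:
        return (kind, distribute(node[1]), distribute(node[2]))
    # kind == "near": normalise both operands first.
    _, dist, left, right = node
    left, right = distribute(left), distribute(right)
    # Assumption: split on the right operand first, as in the expansion above.
    if right[0] in BOOLEAN_OPS:
        return (right[0],
                distribute(("near", dist, left, right[1])),
                distribute(("near", dist, left, right[2])))
    # Then split on the left operand, reusing the boolean node's own operator
    # so the grouping decided by the parser is preserved.
    if left[0] in BOOLEAN_OPS:
        return (left[0],
                distribute(("near", dist, left[1], right)),
                distribute(("near", dist, left[2], right)))
    return ("near", dist, left, right)


def render(node):
    """Render a tree with explicit parentheses around every boolean node."""
    kind = node[0]
    if kind == "term":
        return node[1]
    if kind == "near":
        return f"{render(node[2])} ~{node[1]} {render(node[3])}"
    return f"({render(node[1])} {SYMBOLS[kind]} {render(node[2])})"


def term(word):
    return ("term", word)


# (fowl | priest & man) ~8 (hillary | tom), already parsed with & binding tighter than |
example = ("near", 8,
           ("or", term("fowl"), ("and", term("priest"), term("man"))),
           ("or", term("hillary"), term("tom")))

print(render(distribute(example)))
# ((fowl ~8 hillary | (priest ~8 hillary & man ~8 hillary)) | (fowl ~8 tom | (priest ~8 tom & man ~8 tom)))
```

Because each boolean node is rebuilt with its own operator and its own two children, (fowl | (priest & man)) cannot turn into ((fowl | priest) & man) along the way. Whether butnot should distribute the same way as the other two operators is left open here; the sketch simply treats it like them.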

my order of ops:
& "and"
| "or"
~ "within"
! "butnot"
<space between words>
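
To make the grouping problem concrete, here is a small precedence-climbing sketch in Python next to a naive left-to-right fold. It covers only & and |, and the only thing taken from this post is that & binds tighter than | (from the fowl/priest/man example). The numeric levels and the names tokenize, parse, parse_left_to_right and show are assumptions for illustration; the rest of the list above is deliberately left out.

```python
import re

# Assumption: only the relative order of & and | is taken from the example in this post.
PRECEDENCE = {"&": 2, "|": 1}


def tokenize(text):
    """Split a query into words, operators and parentheses."""
    return re.findall(r"[A-Za-z]+|[&|()]", text)


def parse_atom(tokens):
    """Parse a single word or a parenthesised group."""
    tok = tokens.pop(0)
    if tok == "(":
        node = parse(tokens)
        tokens.pop(0)  # discard the closing ")"
        return node
    return tok


def parse(tokens, min_prec=1):
    """Precedence climbing: a tighter operator grabs its operands first."""
    left = parse_atom(tokens)
    while tokens and tokens[0] in PRECEDENCE and PRECEDENCE[tokens[0]] >= min_prec:
        op = tokens.pop(0)
        right = parse(tokens, PRECEDENCE[op] + 1)
        left = (op, left, right)
    return left


def parse_left_to_right(tokens):
    """Naive fold that ignores precedence (groups in parentheses still use parse)."""
    left = parse_atom(tokens)
    while tokens and tokens[0] in PRECEDENCE:
        op = tokens.pop(0)
        left = (op, left, parse_atom(tokens))
    return left


def show(node):
    """Render a tree with explicit parentheses."""
    if isinstance(node, str):
        return node
    op, left, right = node
    return f"({show(left)} {op} {show(right)})"


print(show(parse(tokenize("fowl | priest & man"))))
# (fowl | (priest & man))
print(show(parse_left_to_right(tokenize("fowl | priest & man"))))
# ((fowl | priest) & man)
```

The second result is the wrong grouping described earlier: it is what falls out when operators are applied strictly in the order they appear.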

I have begun a new plan of attack, but who knows where it will lead. I 
thought I was so close, but apparently it was just a tease: that method 
could only take me so far. I hate to put so much work into this, since I 
doubt anyone will even use such complex queries (the queries I monitor 
are always so basic), but I may give it a go just to see if my new idea 
will solve the general case.

We will see if this parser actually has any life in it. Maybe I am no 
closer than you were -- I am very new at this.

- Mark
> Mark --
>
> Yes please! I'm very interested in the mixing of boolean and proximity
> operators. I have also worked on a parser (using JavaCC) but haven't
> managed to crack queries such as:
>
>     ((a OR b) AND c) NEAR (d NOT e)
>
> I can get the parse tree okay, but haven't figured out how to translate
> that into a valid Lucene Query object. Simple queries such as:
>
>     (a OR b) NEXT (c OR d)   // note the use of OR exclusively!
>
> are okay, but nothing more complex. So: bring it on!
>
> -- Robert
> rwatkins at foo-bar.org
>
> On Mon, 21 Aug 2006, Mark Miller wrote:
>
>> Is anyone interested in helping me test out a new query parser (i.e., is
>> anyone interested in using this, thereby helping me test it)?
>>
>> The parser uses an intermediate parse tree representation, unlike Lucene's
>> Query Filter.
>>
>> [ snipped ]
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
> For additional commands, e-mail: java-user-help@lucene.apache.org