lucene-java-user mailing list archives

From Mark Miller <markrmil...@gmail.com>
Subject Re: New Lucene QueryParser
Date Wed, 03 Jan 2007 13:00:38 GMT
Hey Laurent,

I am actually pretty much ready for a beta/preview release right about 
now. All of the features are in and I am pretty happy with most of the 
work. Over the past month I have been squashing bugs and could certainly 
use as much help as I can get making sure this thing is as perfect as it 
can be. I am currently in the middle of migrating to a new laptop, so I 
may take a couple days to get a distribution jar together with some 
simple documentation, but I plan on doing that as soon as I get a chance.

>
>> Query-time thesaurus expansion / General token to query expansion : 
>> Takes advantage of a general find/replace feature, "expand" might map 
>> to "(expander | expanded)" ... or any other valid syntax. 
> I could also use this, if it can also do the following:
> right now I have a little utility class which expands special strings 
> (syntax still to be discussed) to all combinations:
> "fest[,e] hypothek[,en,a]"
> -> fest hypothek;fest hypotheken;fest hypotheka;feste hypothek;feste 
> hypotheken;feste hypotheka
>
I require a similar feature, although in the form mark{s es ing} -> 
marks markes marking. Unfortunately, the way I have done it (in the 
JavaCC grammar) is not easily configurable.
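
For illustration only, here is a rough sketch of how both expansion 
forms could be handled outside the grammar. This is not the parser's 
code: the class name SuffixExpander and both method names are made up 
for this example, and it only handles a single stem followed by one 
group of suffixes.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of the parser or of Lucene.
public class SuffixExpander {

    // stem{suffix suffix ...}, e.g. mark{s es ing}
    private static final Pattern BRACES =
            Pattern.compile("(\\p{L}+)\\{([^}]*)\\}");

    // stem[suffix,suffix,...], e.g. fest[,e]; an empty suffix keeps the bare stem
    private static final Pattern BRACKETS =
            Pattern.compile("(\\p{L}+)\\[([^\\]]*)\\]");

    /** Expands one token of the form stem{a b c}; other tokens come back unchanged. */
    public static List<String> expandBraces(String token) {
        List<String> out = new ArrayList<>();
        Matcher m = BRACES.matcher(token);
        if (!m.matches()) {
            out.add(token);
            return out;
        }
        for (String suffix : m.group(2).trim().split("\\s+")) {
            out.add(m.group(1) + suffix);
        }
        return out;
    }

    /** Expands every stem[a,b] word in a phrase and returns all combinations. */
    public static List<String> expandBrackets(String phrase) {
        List<String> results = new ArrayList<>();
        results.add("");
        for (String word : phrase.trim().split("\\s+")) {
            List<String> variants = new ArrayList<>();
            Matcher m = BRACKETS.matcher(word);
            if (m.matches()) {
                // limit -1 keeps empty suffixes, so "[,e]" yields "" and "e"
                for (String suffix : m.group(2).split(",", -1)) {
                    variants.add(m.group(1) + suffix);
                }
            } else {
                variants.add(word);
            }
            // cross every phrase built so far with every variant of this word
            List<String> next = new ArrayList<>();
            for (String prefix : results) {
                for (String variant : variants) {
                    next.add(prefix.isEmpty() ? variant : prefix + " " + variant);
                }
            }
            results = next;
        }
        return results;
    }

    public static void main(String[] args) {
        System.out.println(expandBraces("mark{s es ing}"));
        System.out.println(String.join(";", expandBrackets("fest[,e] hypothek[,en,a]")));
    }
}
```

With the two inputs from this thread, the first call gives marks, 
markes, marking and the second gives the six fest/feste combinations 
in the order listed above.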

>> Note that there may be some limitations...but so far this has proved 
>> to be pretty powerful
> Would still be good to know the limitations you see right now...
>
I mentioned there might be limitations because I kept running into new 
difficult problems, and I just didn't know whether something would come 
up that I could not get around, or whether something would be too slow, 
etc. Not to mention I am still a little (or a lot, depending on who you 
talk to) wet behind the ears. So far I have not run into any 
limitations. That certainly does not mean they don't exist though :) 
I'm still crossing my fingers. My goal is to make this thing as perfect 
as I can. It's basically my new hobby.


- Mark



> Mark Miller wrote:
>> I have finally delved back into the Lucene Query parser that I 
>> started a few months back. I am very close to wrapping up its 
>> initial development. I am currently looking for anybody willing to 
>> help me out with a little testing and maybe some design consultation 
>> (I am not happy with the current range query syntax, for one). If you 
>> have any interest in using this parser and have a little time to 
>> help out, please do. The parser is extremely customizable and you can 
>> basically mold it into whatever you want. A brief outline of the 
>> feature set:
>>
>> The basics from Lucene query parser are covered: escaping operators, 
>> handling tokens at the same position, range queries, etc.
>>
>> Default Operators are: & | ! ~ ( )
>> New operators can be defined and default operators can be hidden on 
>> the fly.
>>
>> Adds a proximity operator to the standard AND, OR, and ANDNOT 
>> operators allowing for queries like:
>> (search bear) ~5 (snake & horse ~4 pope) | crazy query
>>
>> The default space operator is customizable and can be made to bind 
>> tighter than the explicit operator (it then acts like the explicit 
>> operator wrapped in parentheses).
>>
>> The order of operations for the operators is customizable. The 
>> default order is |, &, ~, !, ( )...you can change it to whatever you 
>> want.
>>
>> Query-time thesaurus expansion / General token to query expansion : 
>> Takes advantage of a general find/replace feature, "expand" might map 
>> to "(expander | expanded)" ... or any other valid syntax. There is 
>> also a slower RegEx feature so that you can match tokens with a 
>> Pattern and perform back reference enabled replacements. You can also 
>> make the replacement behave as an operator...you might map NEAR to 
>> ~10 , creating a new operator that performs within 10 word proximity 
>> searches.
>>
>> Did You Mean feature using the SpellCheck contrib: if you search for 
>> 'date(Aug 3, 1952) & mackine | rabbit' you might get a suggestion of 
>> : 'date(Aug 3, 1952) & machine | rabbit'
>>
>> Paragraph/Sentence proximity search functionality. You can inject 
>> tokens to specify paragraph and sentence markers and perform 
>> SpanNotWithin searches for paragraph/sentence proximity searches.
>>
>> Customizable date parser.
>>
>> Everything is pretty much configurable on the fly.
>>
>> Note that there may be some limitations...but so far this has proved 
>> to be pretty powerful. I could sure use some testing help making it 
>> production ready though. I will be putting a new website up for the 
>> parser soon. Please send me a note if you can help out at all. When I 
>> put up the jar you can just run it with java -jar and it will provide 
>> a console prompt to enter queries and see the Lucene Query generated.
>>
>> - Mark Miller
>>
>>
>>
>>
>>
>> ---------------------------------------------------------------------
>> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
>> For additional commands, e-mail: java-user-help@lucene.apache.org
>>
>
>
>
>
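
As a rough illustration of the token-to-query expansion described in 
the quoted feature list (a literal find/replace table plus slower 
regex rules with back references), something along these lines could 
work. This is only a sketch under assumptions: TokenRewriter and its 
methods are hypothetical names, the "ing" rule is invented for the 
example, and tokens are simply split on whitespace rather than coming 
from the real grammar.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical sketch, not the parser's actual implementation.
public class TokenRewriter {

    // exact token -> replacement syntax, checked first (cheap lookup)
    private final Map<String, String> literalRules = new LinkedHashMap<>();

    // regex rules, tried in insertion order only when no literal rule hits
    private final Map<Pattern, String> regexRules = new LinkedHashMap<>();

    public TokenRewriter addLiteral(String token, String replacement) {
        literalRules.put(token, replacement);
        return this;
    }

    /** The replacement may use back references such as $1. */
    public TokenRewriter addRegex(String regex, String replacement) {
        regexRules.put(Pattern.compile(regex), replacement);
        return this;
    }

    /** Rewrites each whitespace-separated token and joins them with single spaces. */
    public String rewrite(String query) {
        StringBuilder out = new StringBuilder();
        for (String token : query.trim().split("\\s+")) {
            if (out.length() > 0) {
                out.append(' ');
            }
            out.append(rewriteToken(token));
        }
        return out.toString();
    }

    private String rewriteToken(String token) {
        String literal = literalRules.get(token);
        if (literal != null) {
            return literal;
        }
        for (Map.Entry<Pattern, String> rule : regexRules.entrySet()) {
            Matcher m = rule.getKey().matcher(token);
            if (m.matches()) {
                // the whole token matched, so the result is just the substituted replacement
                StringBuffer sb = new StringBuffer();
                m.appendReplacement(sb, rule.getValue());
                return sb.toString();
            }
        }
        return token;
    }

    public static void main(String[] args) {
        TokenRewriter rewriter = new TokenRewriter()
                .addLiteral("expand", "(expander | expanded)")
                .addLiteral("NEAR", "~10")                     // replacement acting as an operator
                .addRegex("(\\p{L}+)ing", "($1 | $1ing)");     // invented rule for illustration
        System.out.println(rewriter.rewrite("expand NEAR marking"));
        // prints: (expander | expanded) ~10 (mark | marking)
    }
}
```

The rewritten string would then be handed to the parser as ordinary 
query syntax, which is how a word like NEAR can end up behaving as a 
new operator.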


