lucy-user mailing list archives

From Peter Karman <pe...@peknet.com>
Subject Re: [lucy-user] Complex search
Date Wed, 08 Feb 2012 03:08:08 GMT
Odisseu21 wrote on 2/3/12 8:56 AM:
> I am new to Lucy and am looking for a fast and elegant search solution that can:
> 
> - return an HTML-highlighted excerpt around the MASTER_KEY_WORD
> - match the MASTER_KEY_WORD either partially or in full
> - let me define the size of the excerpt (before and after the MASTER_KEY_WORD, maybe in terms of number of words or lines)
> - require optional keywords, called INC_KEY_WORD, to be present inside the excerpt, in any order
> - require optional keywords, called EXC_KEY_WORD, to be absent from the excerpt, in any order
> - allow combinations of INC_KEY_WORD and EXC_KEY_WORD
> 
> Example: 
>               apple (partial)            -> MASTER_KEY_WORD
>               + (bag + blue, girl)       -> INC_KEY_WORD combo
>               - (black + man, orange)    -> EXC_KEY_WORD combo
> 
> must return excerpts in which the string 'apple' occurs (apple, apples, applebees, ...)
> together with (('bag' AND 'blue') OR 'girl')
> but not (('black' AND 'man') OR 'orange') surrounding the master keyword 'apple'
> 
> Today we use Postgres queries and some Perl code to do this across millions of docs. Performance is good, for now.
> 
> Is it possible to build such an algorithm using Lucy? Fast and easy, in one step?
> Or would Lucy be used just to retrieve the excerpt surrounding the master keyword, with subsequent Perl code applying the rest?

You can do most of the above with Lucy, though not in one step. Some
post-processing for the INC_ and EXC_ keywords would be necessary.
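
A minimal sketch of what that post-processing step could look like, written
in Python for brevity (the logic should port directly to Perl). It does not
use Lucy or Search::Tools at all: it assumes you already have the text of a
matching document and only shows the excerpt filtering. The function name
`excerpts`, its parameters, the <b> highlight tag, and the reading of
"partial" as a prefix match are all assumptions made for illustration.

```python
import html
import re


def excerpts(text, master, include=(), exclude=(), before=5, after=5):
    """Yield HTML-highlighted excerpts around prefix matches of `master`.

    `include` and `exclude` are lists of keyword groups. A group matches
    when ALL of its words occur in the excerpt window (AND); a list of
    groups matches when ANY group matches (OR).
    """
    words = text.split()
    # Normalized tokens for comparison: lowercase, punctuation stripped.
    norm = [re.sub(r"\W+", "", w).lower() for w in words]
    master = master.lower()

    def any_group(groups, window):
        return any(all(k.lower() in window for k in group) for group in groups)

    for i, token in enumerate(norm):
        # Assumption: "partial" means the token starts with the master keyword.
        if not token.startswith(master):
            continue
        # Excerpt size is counted in words before and after the match.
        lo, hi = max(0, i - before), min(len(words), i + after + 1)
        window = set(norm[lo:hi])
        if include and not any_group(include, window):
            continue  # none of the INC_KEY_WORD groups is in the excerpt
        if any_group(exclude, window):
            continue  # an EXC_KEY_WORD group is in the excerpt
        parts = [html.escape(w) for w in words[lo:hi]]
        parts[i - lo] = "<b>" + parts[i - lo] + "</b>"
        yield " ".join(parts)
```

With the example from the question, the call would be
excerpts(text, "apple", include=[["bag", "blue"], ["girl"]],
exclude=[["black", "man"], ["orange"]]). Whether the window should be measured
in words, lines, or characters is a design choice; words are used here only
because they are the simplest to show.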

I use Search::Tools plus Lucy for this kind of thing, since Search::Tools will
let me highlight and excerpt from the original document as well.


-- 
Peter Karman  .  http://peknet.com/  .  peter@peknet.com
