lucy-user mailing list archives

From goran kent <gorank...@gmail.com>
Subject Re: [lucy-user] Lucy questions wrt production, ranking, etc
Date Fri, 09 Sep 2011 06:47:20 GMT
Thanks for the quick response Nathan!  See question below.

On Thu, Sep 8, 2011 at 9:58 PM, Nathan Kurz <nate@verse.com> wrote:
>
>> The environment is distributed search across a cluster with the intent
>> of keeping search-time sub-second - 3s at most (folks are spoilt by
>> the elephant in the industry, so they lose interest if the page does
>> not return in that time).
>>
>> I see from the docs that distributed search is supported, else it
>> would be a non-starter.
>
> This excites me too, but I don't know that anyone is pushing its
> limits yet.  But architecturally, I think it's well designed to allow
> really fast clusters of in-ram search.  Talking about 3 seconds makes
> it sound like you're willing to hit disk:  you might need some intense
> tuning here, depending on how you deal with really common stopwords.
>  Also, there are some limitations with custom sort ordering and the
> like:  clusters are going to deal better with floating point than with
> alphabetical, for example, and

> ... excerpts might be a little clunky to
> retrieve.  Currently it's just a DocID and a score that get returned
> efficiently.

Just to clarify: is obtaining excerpts from a distributed search a
problem?  I would expect the procedure to be the same whether you are
performing a local or a distributed search, with no coding gymnastics
required to glue things together to work as expected.
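
To make the question concrete, here is a minimal Python sketch of the
two-phase pattern the quoted text seems to describe: each node returns
only a doc ID and a score, the results are merged, and excerpts are then
fetched from the owning node for the final hits only.  This is not
Lucy's API.  `Hit`, `Shard`, `top_hits`, `excerpt` and `search` are
hypothetical names, the shards are plain in-memory dictionaries, and the
score is a naive term count used purely for illustration.

```python
import heapq
from dataclasses import dataclass


@dataclass(frozen=True)
class Hit:
    shard: str    # name of the node that owns the document
    doc_id: int
    score: float


class Shard:
    """Hypothetical in-memory stand-in for one search node."""

    def __init__(self, name, docs):
        self.name = name
        self.docs = docs  # maps doc_id -> document text

    def top_hits(self, term, k):
        # Phase 1: return only (doc_id, score) pairs, which are cheap to ship.
        hits = []
        for doc_id, text in self.docs.items():
            score = text.lower().split().count(term.lower())  # naive term count
            if score > 0:
                hits.append(Hit(self.name, doc_id, float(score)))
        return heapq.nlargest(k, hits, key=lambda h: h.score)

    def excerpt(self, doc_id, term, width=30):
        # Phase 2: build a text window around the first match.
        text = self.docs[doc_id]
        pos = text.lower().find(term.lower())
        start = max(0, pos - width)
        return text[start:pos + len(term) + width]


def search(shards, term, k):
    # Merge the per-shard candidates, keep the global top k, and only then
    # go back to the owning shard for each excerpt.
    candidates = [h for s in shards.values() for h in s.top_hits(term, k)]
    best = heapq.nlargest(k, candidates, key=lambda h: h.score)
    return [(h, shards[h.shard].excerpt(h.doc_id, term)) for h in best]
```

In this sketch the excerpt step costs one extra request per final hit,
which is the kind of glue a local search would not need.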
