From: goran kent
To: lucy-user@incubator.apache.org
Date: Fri, 9 Sep 2011 08:47:20 +0200
Subject: Re: [lucy-user] Lucy questions wrt production, ranking, etc

Thanks for the quick response Nathan!  See question below.

On Thu, Sep 8, 2011 at 9:58 PM, Nathan Kurz wrote:
>
>> The environment is distributed search across a cluster with the intent
>> of keeping search-time sub-second - 3s at most (folks are spoilt by
>> the elephant in the industry, so they lose interest if the page does
>> not return in that time).
>>
>> I see from the docs that distributed search is supported, else it
>> would be a non-starter.
>
> This excites me too, but I don't know that anyone is pushing its
> limits yet.  But architecturally, I think it's well designed to allow
> really fast clusters of in-ram search.  Talking about 3 seconds makes
> it sound like you're willing to hit disk:  you might need some intense
> tuning here, depending on how you deal with really common stopwords.
>  Also, there are some limitations with custom sort ordering and the
> like:  clusters are going to deal better with floating point than with
> alphabetical, for example, and
> ... excerpts might be a little clunky to
> retrieve.  Currently it's just a DocID and a score that get returned
> efficiently.

Just to clarify - is obtaining excerpts from a distributed search a
problem?  One would think that, irrespective of whether you're performing
a local or a distributed search, the modus operandi would be the same
(without coding gymnastics required to glue things together to work as
expected).
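
To make the concern concrete, the pattern Nathan describes (each node
returns only a doc ID and a score, and excerpts are fetched afterwards)
can be sketched roughly as below.  This is a toy Python illustration, not
Lucy's API: the in-memory SHARDS, the term-frequency score, and the
functions search_shard, fetch_excerpt and distributed_search are all
hypothetical names made up for the example.

```python
import heapq
from typing import Dict, List, Tuple

# Hypothetical in-memory "shards": each maps a local doc ID to its text.
# A real cluster would hold an index on each node instead.
SHARDS: List[Dict[int, str]] = [
    {1: "lucy is a search library", 2: "distributed search across a cluster"},
    {1: "search clusters keep indexes in ram", 2: "excerpts need the stored text"},
]

Hit = Tuple[float, int, int]  # (score, shard_id, doc_id)


def search_shard(shard_id: int, term: str, k: int) -> List[Hit]:
    """Phase 1 on one node: return at most k (score, shard_id, doc_id) hits."""
    hits: List[Hit] = []
    for doc_id, text in SHARDS[shard_id].items():
        words = text.split()
        count = words.count(term)
        if count:
            # Toy score (an assumption): term frequency over document length.
            hits.append((count / len(words), shard_id, doc_id))
    return heapq.nlargest(k, hits)


def fetch_excerpt(shard_id: int, doc_id: int, term: str, width: int = 2) -> str:
    """Phase 2: ask the node that owns the document for a short snippet."""
    words = SHARDS[shard_id][doc_id].split()
    pos = words.index(term)
    return " ".join(words[max(0, pos - width): pos + width + 1])


def distributed_search(term: str, k: int) -> List[Tuple[float, int, int, str]]:
    """Merge per-shard hits into a global top k, then fetch excerpts."""
    all_hits = (hit for sid in range(len(SHARDS))
                for hit in search_shard(sid, term, k))
    merged = heapq.nlargest(k, all_hits)
    # Only the k winners trigger a second round trip for their excerpt.
    return [(score, sid, did, fetch_excerpt(sid, did, term))
            for score, sid, did in merged]


if __name__ == "__main__":
    for score, sid, did, excerpt in distributed_search("search", 2):
        print(f"shard={sid} doc={did} score={score:.3f} excerpt={excerpt!r}")
```

In this sketch the extra work for excerpts is the second round trip to
whichever node owns each winning document, which is the kind of glue a
purely local search would not need.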