Date: Sat, 16 Mar 2013 10:36:36 +0800
Subject: Re: potential query performance issue
From: Lin Ma
To: java-user@lucene.apache.org, lukai1984@gmail.com

Hi lukai,

Thanks for the reply. Do you mean that WAND is a way to resolve this issue?

By "native support", do you mean that Lucene has no built-in module (an
existing, ready-to-use open source one) that implements WAND? If so, the
performance will be really bad.

regards,
Lin

On Sat, Mar 16, 2013 at 2:49 AM, lukai wrote:

> I have implemented WAND with Solr/Lucene. So far there is no performance
> issue. There is no native support for this functionality; you need to
> implement it yourself.
>
> On Fri, Mar 15, 2013 at 10:09 AM, Lin Ma wrote:
>
> > Hello guys,
> >
> > Suppose I have one million documents, and each document has hundreds of
> > features. A given query also has hundreds of features. I want to fetch
> > the 1000 most relevant documents, ranked by the dot product of the query
> > and document features (query and document features are in the same
> > feature space).
> >
> > I am not sure how Lucene implements this internally. If we have to go
> > through all one million documents to compute the dot product with the
> > query, then I am concerned about the performance. I would appreciate it
> > if anyone could confirm (1) how Lucene works internally for this use
> > case and (2) any smart ideas to improve query efficiency when selecting
> > the top 1000 documents.
> >
> > thanks in advance,
> > Lin
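
For reference, the exhaustive baseline that the original question worries about (score every document by its dot product with the query, keep the best k) might look roughly like the sketch below. It is plain Java, not Lucene's internals and not WAND. The class and method names are hypothetical, and storing each sparse feature vector as a map from feature id to weight is an assumption made only for illustration.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.PriorityQueue;

// Hypothetical helper, not part of Lucene: exhaustive top-k by sparse dot product.
class DotProductTopK {

    // Dot product of two sparse vectors stored as featureId -> weight.
    static double dot(Map<Integer, Double> a, Map<Integer, Double> b) {
        // Iterate over the smaller map and probe the larger one.
        Map<Integer, Double> small = a.size() <= b.size() ? a : b;
        Map<Integer, Double> large = (small == a) ? b : a;
        double sum = 0.0;
        for (Map.Entry<Integer, Double> e : small.entrySet()) {
            Double w = large.get(e.getKey());
            if (w != null) {
                sum += e.getValue() * w;
            }
        }
        return sum;
    }

    // Returns the indices of the k highest-scoring documents, best first.
    static List<Integer> topK(List<Map<Integer, Double>> docs,
                              Map<Integer, Double> query, int k) {
        // Min-heap of {score, docIndex}; the weakest kept candidate is at the head.
        PriorityQueue<double[]> heap =
                new PriorityQueue<>((x, y) -> Double.compare(x[0], y[0]));
        for (int i = 0; i < docs.size(); i++) {
            double score = dot(docs.get(i), query);
            if (heap.size() < k) {
                heap.add(new double[] {score, i});
            } else if (k > 0 && score > heap.peek()[0]) {
                // The new document beats the weakest kept candidate; swap them.
                heap.poll();
                heap.add(new double[] {score, i});
            }
        }
        List<Integer> result = new ArrayList<>();
        while (!heap.isEmpty()) {
            result.add((int) heap.poll()[1]);
        }
        Collections.reverse(result); // the heap drains weakest-first
        return result;
    }
}
```

Calling DotProductTopK.topK(docs, query, 1000) touches every document once, which is exactly the cost raised in the question. WAND-style pruning, as suggested in the reply, aims to avoid fully scoring documents that cannot reach the current top k.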