From: prasenjit mukherjee
To: java-user
Subject: Re: using lucene to find neighbouring points in an n-dimensional space
Date: Mon, 24 Oct 2011 07:03:22 +0530

Any pointers/suggestions on my approach?

On 10/22/11, prasenjit mukherjee wrote:
> My use case is the following:
> Given an n-dimensional vector (positive quadrant only), find its
> closest neighbours. I would like to try this out with Lucene's default
> ranking. Here is what a typical document looks like:
> ( or same thing
> )
>
> doc1 = 1245:15 3490:20 8856:20 etc.
>
> As the example above shows, the number of dimensions is high (~50K)
> and each vector has few entries (< 40).
>
> I am thinking of constructing a BooleanQuery in the following way
> (using doc1 as the query):
>
> BooleanQuery bq = new BooleanQuery();
> bq.add(new TermQuery(new Term("field", "1245")),
>        BooleanClause.Occur.SHOULD);
> bq.add(new TermQuery(new Term("field", "3490")),
>        BooleanClause.Occur.SHOULD);
> bq.add(new TermQuery(new Term("field", "8856")),
>        BooleanClause.Occur.SHOULD);
>
> The problem is how to pass the dimension value (15, 20, 20 etc.)
> in the TermQuery.
>
> One solution is to add as many TermQueries as the dimension value,
> but I was wondering whether there is a better way to pass the
> dimension weight. I can probably do the same during indexing, as
> latency is not an issue at indexing time.
>
> Any help is greatly appreciated.
>
> -Thanks,
> Prasenjit
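
One way to reason about the dimension-weight question: with every clause marked SHOULD, a weighted match behaves roughly like a sparse dot product between the query vector and each document vector. The sketch below is plain Java with no Lucene dependency. SparseVectorSketch, parse, expand, dot and the sample doc2 are hypothetical names and data made up for illustration, not anything from Lucene or from the original message. expand mirrors the "repeat the term as many times as the dimension value" idea, so that term frequency equals the weight. Lucene's default similarity also applies tf damping, idf and length norms, so real scores will not equal this raw dot product.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical helper for illustration only; not part of Lucene. */
final class SparseVectorSketch {

    /** Parses "dim:weight dim:weight ..." into a dimension -> weight map. */
    static Map<String, Integer> parse(String doc) {
        Map<String, Integer> vector = new HashMap<>();
        for (String token : doc.trim().split("\\s+")) {
            if (token.isEmpty()) {
                continue; // empty input yields one empty token
            }
            String[] parts = token.split(":");
            vector.put(parts[0], Integer.parseInt(parts[1]));
        }
        return vector;
    }

    /** Repeats each dimension id 'weight' times, so term frequency == weight. */
    static String expand(String doc) {
        StringBuilder sb = new StringBuilder();
        for (String token : doc.trim().split("\\s+")) {
            if (token.isEmpty()) {
                continue;
            }
            String[] parts = token.split(":");
            int weight = Integer.parseInt(parts[1]);
            for (int i = 0; i < weight; i++) {
                if (sb.length() > 0) {
                    sb.append(' ');
                }
                sb.append(parts[0]);
            }
        }
        return sb.toString();
    }

    /** Raw dot product over the dimensions shared by query and document. */
    static long dot(String query, String doc) {
        Map<String, Integer> q = parse(query);
        Map<String, Integer> d = parse(doc);
        long sum = 0;
        for (Map.Entry<String, Integer> e : q.entrySet()) {
            Integer w = d.get(e.getKey());
            if (w != null) {
                sum += (long) e.getValue() * w;
            }
        }
        return sum;
    }

    public static void main(String[] args) {
        String doc1 = "1245:15 3490:20 8856:20"; // from the message above
        String doc2 = "1245:5 7777:30";          // made-up second document
        System.out.println(dot(doc1, doc1));     // prints 1025
        System.out.println(dot(doc1, doc2));     // prints 75
        System.out.println(expand("7:2 9:1"));   // prints 7 7 9
    }
}
```

In Lucene 3.x, a per-clause weight can also be set by calling setBoost(float) on each TermQuery before adding it to the BooleanQuery, which avoids repeating clauses on the query side. Whether the resulting ranking is close enough to a true nearest-neighbour ordering would need to be checked against the data.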