Date: Fri, 15 Oct 2010 10:30:28 +0200
Subject: Overriding DefaultScore
From: Zaharije Pasalic
To: java-user@lucene.apache.org

Hi,

My original problem is to index a large number of documents, each of which contains 360 integers in the range 0-90K. Searching is a little more complicated: I need to find the most similar documents, where the query data is also 360 numbers in the range 0-90K. But (there is always a 'but') I need to compute the score using a predefined weight table.

Here is an example. The index contains:

    DOC1: 1, 3, 5
    DOC2: 1, 100
    DOC3: 1, 5

I need to find all documents that are 'like' this:

    SEARCH: 1, 5, 100

Suppose I have a table that says: "if the value is larger than 10, weight the hit as 0.5, otherwise as 1" (in the real application the weight table is more complicated). So for the query 1, 5, 100 I would get:

    DOC1: SCORE=2   [1, 5]
    DOC3: SCORE=2   [1, 5]
    DOC2: SCORE=1.5 [1, 100 (100 > 10, weight 0.5)]

The scoring rule is simply: for each query value that occurs in the field, increment the score by 1 * weight(value).

My first step was to create an index with one field that contains all 360 values, and to remove norms from it. Now when I run a search like

    F:1 F:5 F:100

I get results back fine, but the score is not correct. Of course it gives me results sorted by 'number of hits' (am I right?), but the score value is not calculated in increments of 1, nor am I using weights at all.

So my question is: is this even possible with Lucene, and if so, can you point me in some direction? (I have already looked a little at overriding DefaultSimilarity.)

Thanks
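
For reference, the scoring rule described in this message can be written down in plain Java, independent of Lucene. This is only a sketch of the desired result, useful for checking whatever scores a Lucene setup returns; it is not a Lucene API and not a solution to the question. The class and method names (WeightedOverlapScore, weight, score) are hypothetical, the weight table is the simplified example from the message, and counting a repeated query value only once is an assumption the message does not state.

```java
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Set;

// Plain-Java reference for the scoring rule in the message.
// It does not use Lucene; all names here are hypothetical.
class WeightedOverlapScore {

    // Example weight table from the message: values above 10 count 0.5, others 1.
    static double weight(int value) {
        return value > 10 ? 0.5 : 1.0;
    }

    // Sum weight(v) over every query value v that the document contains.
    // Assumption: a value repeated in the query is counted only once.
    static double score(int[] docValues, int[] queryValues) {
        Set<Integer> doc = new HashSet<>();
        for (int v : docValues) {
            doc.add(v);
        }
        Set<Integer> seen = new HashSet<>();
        double total = 0.0;
        for (int q : queryValues) {
            if (seen.add(q) && doc.contains(q)) {
                total += weight(q);
            }
        }
        return total;
    }

    public static void main(String[] args) {
        // The three example documents and the example query from the message.
        Map<String, int[]> index = new LinkedHashMap<>();
        index.put("DOC1", new int[] {1, 3, 5});
        index.put("DOC2", new int[] {1, 100});
        index.put("DOC3", new int[] {1, 5});
        int[] query = {1, 5, 100};

        for (Map.Entry<String, int[]> e : index.entrySet()) {
            System.out.println(e.getKey() + " SCORE=" + score(e.getValue(), query));
        }
    }
}
```

Run on the example data, this gives DOC1 and DOC3 a score of 2.0 and DOC2 a score of 1.5, matching the table in the message.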