From: Felipe Hummel
Date: Thu, 10 Mar 2011 15:32:21 -0400
Subject: Search one index but use IDF from another?
To: java-user@lucene.apache.org

Hi,

I'm building a system where I want to show only results indexed in the past few days. Furthermore, I don't want to maintain a giant index with millions of documents if I only want to return results from a couple of days (thousands of documents).

My system relies heavily on the terms in the indexed documents having a realistic distribution (and consequently a realistic IDF). That said, I would like to use a small index to return results, but compute document scores using an IDF taken from a much larger index (or even an external source).

The Similarity API doesn't seem to allow me to do this. The *idf* method does not receive the term being scored as a parameter.

Another possibility is to use TrieRangeQuery to make sure the documents shown are within the last couple of days. Again, I would rather not maintain a large index. Also, this kind of query is not cheap.

Am I missing something?

Thanks,
Felipe Hummel
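
To make the request concrete, here is a minimal sketch of the scoring being asked for, written in plain Java with no Lucene dependency. It is an illustration under stated assumptions, not a Lucene implementation: the class name `ExternalIdfScorer`, its methods, and the sample document frequencies are all hypothetical. The idf formula `1 + ln(numDocs / (docFreq + 1))` and the `sqrt(tf) * idf^2` term weight are loosely modelled on Lucene's classic default similarity, with norms, boosts, and coordination factors left out.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/**
 * Hypothetical sketch: documents come from a small, recent collection,
 * while document frequencies come from a much larger reference collection.
 */
final class ExternalIdfScorer {
    // term -> number of documents containing it in the LARGE reference collection
    private final Map<String, Integer> referenceDocFreq;
    // total number of documents in the LARGE reference collection
    private final int referenceNumDocs;

    ExternalIdfScorer(Map<String, Integer> referenceDocFreq, int referenceNumDocs) {
        this.referenceDocFreq = referenceDocFreq;
        this.referenceNumDocs = referenceNumDocs;
    }

    /** idf = 1 + ln(numDocs / (docFreq + 1)), computed from the reference statistics. */
    double idf(String term) {
        int docFreq = referenceDocFreq.getOrDefault(term, 0);
        return 1.0 + Math.log(referenceNumDocs / (double) (docFreq + 1));
    }

    /** Simplified tf-idf: sum of sqrt(tf) * idf^2 over the query terms present in the document. */
    double score(List<String> queryTerms, List<String> docTokens) {
        double sum = 0.0;
        for (String queryTerm : queryTerms) {
            int tf = 0;
            for (String token : docTokens) {
                if (token.equals(queryTerm)) {
                    tf++;
                }
            }
            if (tf > 0) {
                double idf = idf(queryTerm);
                sum += Math.sqrt(tf) * idf * idf;
            }
        }
        return sum;
    }

    public static void main(String[] args) {
        // Assumed statistics from the large index (made-up numbers).
        Map<String, Integer> docFreq = new HashMap<>();
        docFreq.put("lucene", 9);
        docFreq.put("the", 999_999);
        ExternalIdfScorer scorer = new ExternalIdfScorer(docFreq, 1_000_000);

        // A document from the small, recent index.
        List<String> doc = Arrays.asList("the", "lucene", "index");
        System.out.println("idf(lucene) = " + scorer.idf("lucene"));
        System.out.println("idf(the)    = " + scorer.idf("the"));
        System.out.println("score       = " + scorer.score(Arrays.asList("lucene", "the"), doc));
    }
}
```

The point of the sketch is that the idf lookup is keyed by the term itself, which is exactly the information the message says the *idf* method does not receive.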