From: Andrzej Bialecki
Date: Thu, 30 Aug 2007 22:50:53 +0200
To: java-dev@lucene.apache.org
Subject: Optimize and internal document order
Message-ID: <46D72DAD.2070207@getopt.org>

Hi all,

I have the following scenario: I want to use ParallelReader to maintain parts of the index that are changing quickly, and where changes are limited to specific fields only. Let's say I have a "main" index (many fields, slowly changing, large updates), and an "aux" index (fast changing, usually single doc and single field updates).

I'd like to "replace" documents in the "aux" index - that is, delete one doc and add another - but in a way that doesn't change the internal document numbers, so that I can keep the mapping required by ParallelReader intact.

I think this is possible to achieve by using a FilterIndexReader, which keeps a map of updated documents, and re-maps old doc ids to the new ones on the fly.

From time to time I'd like to optimize the "aux" index to get rid of deleted docs. At this time I need to figure out how to preserve the old->new mapping during the optimization.

So, here's the question: is this scenario feasible? If so, then in the trunk/ version of Lucene, is there any way to figure out (predictably) how internal document numbers are reassigned after calling optimize()?

-- 
Best regards,
Andrzej Bialecki
Information Retrieval, Semantic Web
Embedded Unix, System Integration
http://www.sigram.com  Contact: info at sigram dot com
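
To make the remapping idea above concrete, here is a minimal sketch in plain Java of the bookkeeping it would need. Nothing here is Lucene API: DocIdRemap, replace, resolve, compactedIds and afterCompaction are hypothetical names, and the wrapping into a FilterIndexReader is left out. The compaction step rests on an assumption that this message asks about and does not confirm: that optimize() drops deleted docs and keeps the surviving docs in their original relative order.

```java
import java.util.BitSet;
import java.util.HashMap;
import java.util.Map;

/** Hypothetical helper (not part of Lucene): maps stable doc ids to physical ids in "aux". */
class DocIdRemap {
    // stable id (the one ParallelReader relies on) -> current physical id in the "aux" index
    private final Map<Integer, Integer> remap = new HashMap<>();

    /** Record that the doc behind stableId was deleted and re-added at newPhysicalId. */
    void replace(int stableId, int newPhysicalId) {
        remap.put(stableId, newPhysicalId);
    }

    /** Physical id to read for a stable id; identity when there is no entry. */
    int resolve(int stableId) {
        Integer physical = remap.get(stableId);
        return physical != null ? physical : stableId;
    }

    /**
     * ASSUMPTION (not confirmed by the source): compaction removes deleted docs and
     * preserves the relative order of the survivors. Returns old id -> new id,
     * with -1 for deleted docs.
     */
    static int[] compactedIds(BitSet deleted, int maxDoc) {
        int[] newIds = new int[maxDoc];
        int next = 0;
        for (int old = 0; old < maxDoc; old++) {
            newIds[old] = deleted.get(old) ? -1 : next++;
        }
        return newIds;
    }

    /** Rebuild the map against the compacted numbering for stable ids 0..stableMaxDoc-1. */
    void afterCompaction(int[] newIds, int stableMaxDoc) {
        Map<Integer, Integer> rebuilt = new HashMap<>();
        for (int stable = 0; stable < stableMaxDoc; stable++) {
            int physical = newIds[resolve(stable)];
            if (physical != stable) {
                rebuilt.put(stable, physical); // keep only non-identity entries
            }
        }
        remap.clear();
        remap.putAll(rebuilt);
    }
}
```

With three docs in "main" and stable doc 1 replaced once in "aux" (physical doc 1 deleted, replacement added as physical doc 3), compaction under the stated assumption would leave stable ids 1 and 2 pointing at physical ids 2 and 1. The point of the sketch is that the map cannot simply be emptied after optimize(): docs that were never replaced also shift when an earlier doc is removed.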