From: Ian Lea
Date: Thu, 8 Dec 2011 17:57:38 +0000
Subject: Re: Split mutable logical document into two Lucene documents
To: java-user@lucene.apache.org

It is conceivable that nested documents might help.
https://issues.apache.org/jira/browse/LUCENE-2454. I don't know
anything about that so might be way off target.

--
Ian.

On Wed, Dec 7, 2011 at 8:46 PM, Brandon Mintern wrote:
> We have a document tagging system where documents are composed of two
> types of data:
>
> Rarely changed (hereafter: "immutable") data - document text and
> metadata that we upload and almost never change. The text can be
> hundreds of pages.
>
> User created (hereafter: "mutable") data - document properties that
> are set by users of our system. In total a document's properties are
> generally several dozen bytes at most. Even viewing a document changes
> the data (e.g. the document's "viewed" property).
>
>
> At present, all data is part of a single Lucene document. The problem
> is that when any piece of mutable data is updated (this happens
> relatively frequently), we have to reindex the entire document. We'd
> like to have two separate indexed Lucene documents per logical
> document, one containing the immutable data and the other containing
> the much smaller and more transient mutable data. When the mutable
> data changes, we can delete that document's mutable Lucene document
> and index a new one very quickly.
>
> There are two major difficulties when actually performing a search, though:
>
> 1. We are providing complex queries to retrieve logical documents
> based on information in either of its Lucene documents. It seems
> non-trivial to fetch a logical document in a BooleanQuery with
> Occur.MUST clauses referring to fields in both of the Lucene
> documents.
>
> 2. We need to sort results (logical document IDs) based on fields in
> either of its Lucene documents.
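To make the two difficulties concrete, here is a minimal sketch of an application-level join, assuming each half of the query has already been run separately and its hits collected into a map keyed by logical document ID. It deliberately uses no Lucene API; the class name, method name, and the "title" and "viewed" fields are all hypothetical stand-ins, and this is not a built-in Lucene feature or anything confirmed in this thread.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch: plain collections stand in for the hits of two
// separate searches, one per Lucene document type. No Lucene calls are used.
class LogicalJoinSketch {

    /**
     * immutableHits: logical ID -> title (assumed field on the immutable document)
     * mutableHits:   logical ID -> viewed count (assumed field on the mutable document)
     *
     * Returns the logical IDs that matched in BOTH halves (the MUST + MUST case),
     * sorted by viewed count descending, with ties broken by title ascending.
     */
    static List<String> joinAndSort(Map<String, String> immutableHits,
                                    Map<String, Long> mutableHits) {
        // Difficulty 1: a logical document qualifies only if both halves matched.
        List<String> ids = new ArrayList<>();
        for (String id : immutableHits.keySet()) {
            if (mutableHits.containsKey(id)) {
                ids.add(id);
            }
        }
        // Difficulty 2: the sort key mixes a field from each half.
        ids.sort((a, b) -> {
            int byViewed = Long.compare(mutableHits.get(b), mutableHits.get(a));
            if (byViewed != 0) {
                return byViewed;
            }
            return immutableHits.get(a).compareTo(immutableHits.get(b));
        });
        return ids;
    }

    public static void main(String[] args) {
        Map<String, String> immutableHits = new HashMap<>();
        immutableHits.put("doc1", "Contract A");
        immutableHits.put("doc2", "Memo");
        immutableHits.put("doc3", "Contract B");

        Map<String, Long> mutableHits = new HashMap<>();
        mutableHits.put("doc1", 5L);
        mutableHits.put("doc3", 9L);
        mutableHits.put("doc4", 1L);

        // doc2 and doc4 each matched only one half, so they are dropped.
        System.out.println(joinAndSort(immutableHits, mutableHits)); // prints [doc3, doc1]
    }
}
```

The obvious cost is that every hit from both searches has to be held in memory before anything can be ranked, which is why doing the join inside the index would be preferable for large result sets.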
>
> Has anyone done anything like this before? Is there functionality I'm
> overlooking that could make this easier?
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
> For additional commands, e-mail: java-user-help@lucene.apache.org