From: "Jason Rutherglen (JIRA)"
To: dev@lucene.apache.org
Date: Tue, 17 May 2011 12:13:47 +0000 (UTC)
Subject: [jira] [Commented] (LUCENE-3112) Add IW.add/updateDocuments to support nested documents
Message-ID: <431995689.18940.1305634427437.JavaMail.tomcat@hel.zones.apache.org>
In-Reply-To: <1107775081.18779.1305629387406.JavaMail.tomcat@hel.zones.apache.org>

    [
https://issues.apache.org/jira/browse/LUCENE-3112?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13034734#comment-13034734 ]

Jason Rutherglen commented on LUCENE-3112:
------------------------------------------

I think perhaps, like a Hadoop input format split, we can define meta-data
at the segment level describing where the documents live, so that if one is
'splitting' the index, as is being implemented with HBase, the 'splitter'
can be 'smart'.

> Add IW.add/updateDocuments to support nested documents
> ------------------------------------------------------
>
>                 Key: LUCENE-3112
>                 URL: https://issues.apache.org/jira/browse/LUCENE-3112
>             Project: Lucene - Java
>          Issue Type: Improvement
>            Reporter: Michael McCandless
>            Assignee: Michael McCandless
>            Priority: Minor
>             Fix For: 3.2, 4.0
>
>         Attachments: LUCENE-3112.patch
>
>
> I think nested documents (LUCENE-2454) is a very compelling addition
> to Lucene. It's also a popular (many votes) issue.
> Beyond supporting nested document querying, which is already an
> incredible addition since it preserves the relational model on
> indexing normalized content (eg, DB tables, XML docs), LUCENE-2454
> should also enable speedups in grouping implementation when you group
> by a nested field.
> For the same reason, it can also enable very fast post-group facet
> counting impl (LUCENE-3097) when you want to
> count(distinct(nestedField)), instead of unique documents, as your
> "identifier". I expect many apps that use faceting need this ability
> (to count(distinct(nestedField)) not distinct(docID)).
> To support these use cases, I believe the only core change needed is
> the ability to atomically add or update multiple documents, which you
> cannot do today since in between add/updateDocument calls a flush (eg
> due to commit or getReader()) could occur.
> This new API (addDocuments(Iterable<Document>), updateDocuments(Term
> delTerm, Iterable<Document>)) would also further guarantee that the
> documents are assigned sequential docIDs in the order the iterator
> provided them, and that the docIDs all reside in one segment.
> Segment merging never splits segments apart, so this invariant would
> hold even as merges/optimizes take place.
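
To make the invariant in the description concrete, here is a minimal,
self-contained Java sketch. It does not use Lucene at all: BlockAddSketch,
ToyWriter, Segment and this flush() are hypothetical stand-ins, and documents
are modeled as plain Strings. It only illustrates why single-document adds
can be split by an interleaved flush, while a block add under one lock keeps
the documents contiguous, in iterator order, within one segment. How the
actual patch achieves this inside IndexWriter is not shown here.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Toy model only: this is NOT Lucene's IndexWriter. Every name below is hypothetical.
final class BlockAddSketch {

    /** An immutable run of documents; a document's docID is its index in the list. */
    static final class Segment {
        final List<String> docs;

        Segment(List<String> docs) {
            // Defensive copy, so later changes to the writer's buffer do not leak in.
            this.docs = Collections.unmodifiableList(new ArrayList<>(docs));
        }
    }

    static final class ToyWriter {
        private final List<String> buffer = new ArrayList<>();
        private final List<Segment> segments = new ArrayList<>();

        /** Adds one document. A flush may run between any two calls to this method. */
        synchronized void addDocument(String doc) {
            buffer.add(doc);
        }

        /** Adds a whole block while holding the writer lock, so no flush can run inside it. */
        synchronized void addDocuments(Iterable<String> block) {
            // Drain the iterator first so a failure midway leaves the buffer untouched.
            List<String> staged = new ArrayList<>();
            for (String doc : block) {
                staged.add(doc);
            }
            buffer.addAll(staged); // iteration order becomes docID order
        }

        /** Seals buffered documents into a new segment (stands in for commit or getReader()). */
        synchronized void flush() {
            if (!buffer.isEmpty()) {
                segments.add(new Segment(buffer));
                buffer.clear();
            }
        }

        /** Returns a snapshot of the segments flushed so far. */
        synchronized List<Segment> segments() {
            return new ArrayList<>(segments);
        }
    }

    public static void main(String[] args) {
        // One document at a time: an interleaved flush splits the group across segments.
        ToyWriter single = new ToyWriter();
        single.addDocument("child-a");
        single.flush(); // simulates a commit arriving between the two adds
        single.addDocument("parent");
        single.flush();
        System.out.println("single adds, segments: " + single.segments().size());

        // As a block: the group stays contiguous inside one segment.
        ToyWriter block = new ToyWriter();
        block.addDocuments(Arrays.asList("child-a", "child-b", "parent"));
        block.flush();
        System.out.println("block add, segments: " + block.segments().size());
    }
}
```

In this toy model a merge would simply concatenate whole segments, which is
why a block that starts out contiguous in one segment would stay contiguous
afterwards, mirroring the "merging never splits segments apart" argument.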