Return-Path: Delivered-To: apmail-lucene-dev-archive@www.apache.org Received: (qmail 29996 invoked from network); 19 Nov 2010 18:50:09 -0000 Received: from unknown (HELO mail.apache.org) (140.211.11.3) by 140.211.11.9 with SMTP; 19 Nov 2010 18:50:09 -0000 Received: (qmail 55798 invoked by uid 500); 19 Nov 2010 18:50:38 -0000 Delivered-To: apmail-lucene-dev-archive@lucene.apache.org Received: (qmail 55739 invoked by uid 500); 19 Nov 2010 18:50:38 -0000 Mailing-List: contact dev-help@lucene.apache.org; run by ezmlm Precedence: bulk List-Help: List-Unsubscribe: List-Post: List-Id: Reply-To: dev@lucene.apache.org Delivered-To: mailing list dev@lucene.apache.org Received: (qmail 55614 invoked by uid 99); 19 Nov 2010 18:50:37 -0000 Received: from nike.apache.org (HELO nike.apache.org) (192.87.106.230) by apache.org (qpsmtpd/0.29) with ESMTP; Fri, 19 Nov 2010 18:50:37 +0000 X-ASF-Spam-Status: No, hits=-2000.0 required=10.0 tests=ALL_TRUSTED X-Spam-Check-By: apache.org Received: from [140.211.11.22] (HELO thor.apache.org) (140.211.11.22) by apache.org (qpsmtpd/0.29) with ESMTP; Fri, 19 Nov 2010 18:50:35 +0000 Received: from thor (localhost [127.0.0.1]) by thor.apache.org (8.13.8+Sun/8.13.8) with ESMTP id oAJIoDvl022364 for ; Fri, 19 Nov 2010 18:50:13 GMT Message-ID: <22249539.201011290192613795.JavaMail.jira@thor> Date: Fri, 19 Nov 2010 13:50:13 -0500 (EST) From: "Uwe Schindler (JIRA)" To: dev@lucene.apache.org Subject: [jira] Issue Comment Edited: (LUCENE-2771) Remove norms() support from non-atomic IndexReaders In-Reply-To: <11676492.200961290192253277.JavaMail.jira@thor> MIME-Version: 1.0 Content-Type: text/plain; charset=utf-8 Content-Transfer-Encoding: 7bit X-JIRA-FingerPrint: 30527f35849b9dde25b450d4833f0394 X-Virus-Checked: Checked by ClamAV on apache.org [ https://issues.apache.org/jira/browse/LUCENE-2771?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12933911#action_12933911 ] Uwe Schindler edited comment on LUCENE-2771 at 11/19/10 1:49 PM: ----------------------------------------------------------------- Here the relevant comments: {quote} *Robert Muir added a comment - 19/Nov/10 04:28 AM* here is a hack patch for Uwe's idea about the norms. we need to change SegmentMerger to not call norms on the top-level IR but populate its normBuffer from the subs. in my opinion it seems crazy we are currently creating these big arrays this way (yeah there is the hairy code for re-open that re-uses the big merged cache for the NRT case, but still). Maybe i am missing something. *Robert Muir added a comment - 19/Nov/10 04:28 AM* here is a hack patch for Uwe's idea about the norms. we need to change SegmentMerger to not call norms on the top-level IR but populate its normBuffer from the subs. in my opinion it seems crazy we are currently creating these big arrays this way (yeah there is the hairy code for re-open that re-uses the big merged cache for the NRT case, but still). Maybe i am missing something. *Robert Muir added a comment - 19/Nov/10 04:58 AM* here's another hacky update: but still a few tests explicitly check these norms and need to be fixed. maybe we could add an uncached "MultiNorms" or something at least in src/test for convenience, just to fill the byte arrays so these tests can assertEquals otherwise we are going to have to put a lot of SlowMultiReaderWrappers in these tests. *Robert Muir added a comment - 19/Nov/10 04:58 AM* here's another hacky update: but still a few tests explicitly check these norms and need to be fixed. maybe we could add an uncached "MultiNorms" or something at least in src/test for convenience, just to fill the byte arrays so these tests can assertEquals otherwise we are going to have to put a lot of SlowMultiReaderWrappers in these tests. *Robert Muir added a comment - 19/Nov/10 07:45 AM* here is an updated patch, with core/contrib/solr tests passing. For ParallelReader i forced it to require non-composite readers only (e.g. SlowMultiReaderWrap them if thats not the case). TODO: - ParallelReader shouldnt need multifields etc anymore - there are 5 @Ignore'd ParallelReader-related tests, because of things like reopen/isOptimized/isCurrent - merge in Uwe's improved SegmentsMerger - clean up code. *Robert Muir added a comment - 19/Nov/10 07:45 AM* here is an updated patch, with core/contrib/solr tests passing. For ParallelReader i forced it to require non-composite readers only (e.g. SlowMultiReaderWrap them if thats not the case). TODO: ParallelReader shouldnt need multifields etc anymore there are 5 @Ignore'd ParallelReader-related tests, because of things like reopen/isOptimized/isCurrent merge in Uwe's improved SegmentsMerger clean up code. {quote} was (Author: thetaphi): Here the relevant comments: {quote} *Robert Muir added a comment - 19/Nov/10 04:28 AM* here is a hack patch for Uwe's idea about the norms. we need to change SegmentMerger to not call norms on the top-level IR but populate its normBuffer from the subs. in my opinion it seems crazy we are currently creating these big arrays this way (yeah there is the hairy code for re-open that re-uses the big merged cache for the NRT case, but still). Maybe i am missing something. *Robert Muir added a comment - 19/Nov/10 04:28 AM* here is a hack patch for Uwe's idea about the norms. we need to change SegmentMerger to not call norms on the top-level IR but populate its normBuffer from the subs. in my opinion it seems crazy we are currently creating these big arrays this way (yeah there is the hairy code for re-open that re-uses the big merged cache for the NRT case, but still). Maybe i am missing something. *Robert Muir added a comment - 19/Nov/10 04:58 AM* here's another hacky update: but still a few tests explicitly check these norms and need to be fixed. maybe we could add an uncached "MultiNorms" or something at least in src/test for convenience, just to fill the byte arrays so these tests can assertEquals otherwise we are going to have to put a lot of SlowMultiReaderWrappers in these tests. *Robert Muir added a comment - 19/Nov/10 04:58 AM* here's another hacky update: but still a few tests explicitly check these norms and need to be fixed. maybe we could add an uncached "MultiNorms" or something at least in src/test for convenience, just to fill the byte arrays so these tests can assertEquals otherwise we are going to have to put a lot of SlowMultiReaderWrappers in these tests. *Uwe Schindler added a comment - 19/Nov/10 07:17 AM* Here a better patch for the segment merger. We should even apply this if we not remove top-level norms! It saves lots of memory during merging by using ReaderUtil to go down to segment level! Don't wonder about BytesRef, but we need a reference here because of the anonymous inner class. *Uwe Schindler added a comment - 19/Nov/10 07:17 AM* Here a better patch for the segment merger. We should even apply this if we not remove top-level norms! It saves lots of memory during merging by using ReaderUtil to go down to segment level! Don't wonder about BytesRef, but we need a reference here because of the anonymous inner class. *Robert Muir added a comment - 19/Nov/10 07:45 AM* here is an updated patch, with core/contrib/solr tests passing. For ParallelReader i forced it to require non-composite readers only (e.g. SlowMultiReaderWrap them if thats not the case). TODO: - ParallelReader shouldnt need multifields etc anymore - there are 5 @Ignore'd ParallelReader-related tests, because of things like reopen/isOptimized/isCurrent - merge in Uwe's improved SegmentsMerger - clean up code. *Robert Muir added a comment - 19/Nov/10 07:45 AM* here is an updated patch, with core/contrib/solr tests passing. For ParallelReader i forced it to require non-composite readers only (e.g. SlowMultiReaderWrap them if thats not the case). TODO: ParallelReader shouldnt need multifields etc anymore there are 5 @Ignore'd ParallelReader-related tests, because of things like reopen/isOptimized/isCurrent merge in Uwe's improved SegmentsMerger clean up code. {quote} > Remove norms() support from non-atomic IndexReaders > --------------------------------------------------- > > Key: LUCENE-2771 > URL: https://issues.apache.org/jira/browse/LUCENE-2771 > Project: Lucene - Java > Issue Type: Improvement > Reporter: Uwe Schindler > Fix For: 4.0 > > > Spin-off from LUCENE-2769: > Currently all IndexReaders support norms(), but the core of Lucene never uses it and its even dangerous because of memory usage. We should do the same like with MultiFields and factor it out and throw UOE on non-atomic readers. > The SlowMultiReaderWrapper can then manage the norms. Also ParallelReader needs to be fixed. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online. --------------------------------------------------------------------- To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org For additional commands, e-mail: dev-help@lucene.apache.org