lucene-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Uwe Schindler (JIRA)" <j...@apache.org>
Subject [jira] Issue Comment Edited: (LUCENE-2771) Remove norms() support from non-atomic IndexReaders
Date Fri, 19 Nov 2010 18:50:13 GMT

    [ https://issues.apache.org/jira/browse/LUCENE-2771?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12933911#action_12933911
] 

Uwe Schindler edited comment on LUCENE-2771 at 11/19/10 1:49 PM:
-----------------------------------------------------------------

Here the relevant comments:

{quote}
*Robert Muir added a comment - 19/Nov/10 04:28 AM*

here is a hack patch for Uwe's idea about the norms.
we need to change SegmentMerger to not call norms on the top-level IR but populate its normBuffer
from the subs. 
in my opinion it seems crazy we are currently creating these big arrays this way (yeah there
is the hairy code for re-open that re-uses the big merged cache for the NRT case, but still).

Maybe i am missing something.

*Robert Muir added a comment - 19/Nov/10 04:28 AM*

here is a hack patch for Uwe's idea about the norms. we need to change SegmentMerger to not
call norms on the top-level IR but populate its normBuffer from the subs. in my opinion it
seems crazy we are currently creating these big arrays this way (yeah there is the hairy code
for re-open that re-uses the big merged cache for the NRT case, but still). Maybe i am missing
something. 

*Robert Muir added a comment - 19/Nov/10 04:58 AM*

here's another hacky update: but still a few tests explicitly check these norms and need to
be fixed. 
maybe we could add an uncached "MultiNorms" or something at least in src/test for convenience,
just to fill the byte arrays so these tests can assertEquals

otherwise we are going to have to put a lot of SlowMultiReaderWrappers in these tests.

*Robert Muir added a comment - 19/Nov/10 04:58 AM*

here's another hacky update: but still a few tests explicitly check these norms and need to
be fixed. maybe we could add an uncached "MultiNorms" or something at least in src/test for
convenience, just to fill the byte arrays so these tests can assertEquals otherwise we are
going to have to put a lot of SlowMultiReaderWrappers in these tests. 

*Robert Muir added a comment - 19/Nov/10 07:45 AM*

here is an updated patch, with core/contrib/solr tests passing. 
For ParallelReader i forced it to require non-composite readers only (e.g. SlowMultiReaderWrap
them if thats not the case).

TODO: 
- ParallelReader shouldnt need multifields etc anymore 
- there are 5 @Ignore'd ParallelReader-related tests, because of things like reopen/isOptimized/isCurrent

- merge in Uwe's improved SegmentsMerger 
- clean up code. 

*Robert Muir added a comment - 19/Nov/10 07:45 AM*

here is an updated patch, with core/contrib/solr tests passing. For ParallelReader i forced
it to require non-composite readers only (e.g. SlowMultiReaderWrap them if thats not the case).
TODO: 
ParallelReader shouldnt need multifields etc anymore 
there are 5 @Ignore'd ParallelReader-related tests, because of things like reopen/isOptimized/isCurrent

merge in Uwe's improved SegmentsMerger 
clean up code. 
{quote}

      was (Author: thetaphi):
    Here the relevant comments:

{quote}
*Robert Muir added a comment - 19/Nov/10 04:28 AM*

here is a hack patch for Uwe's idea about the norms.
we need to change SegmentMerger to not call norms on the top-level IR but populate its normBuffer
from the subs. 
in my opinion it seems crazy we are currently creating these big arrays this way (yeah there
is the hairy code for re-open that re-uses the big merged cache for the NRT case, but still).

Maybe i am missing something.

*Robert Muir added a comment - 19/Nov/10 04:28 AM*

here is a hack patch for Uwe's idea about the norms. we need to change SegmentMerger to not
call norms on the top-level IR but populate its normBuffer from the subs. in my opinion it
seems crazy we are currently creating these big arrays this way (yeah there is the hairy code
for re-open that re-uses the big merged cache for the NRT case, but still). Maybe i am missing
something. 

*Robert Muir added a comment - 19/Nov/10 04:58 AM*

here's another hacky update: but still a few tests explicitly check these norms and need to
be fixed. 
maybe we could add an uncached "MultiNorms" or something at least in src/test for convenience,
just to fill the byte arrays so these tests can assertEquals

otherwise we are going to have to put a lot of SlowMultiReaderWrappers in these tests.

*Robert Muir added a comment - 19/Nov/10 04:58 AM*

here's another hacky update: but still a few tests explicitly check these norms and need to
be fixed. maybe we could add an uncached "MultiNorms" or something at least in src/test for
convenience, just to fill the byte arrays so these tests can assertEquals otherwise we are
going to have to put a lot of SlowMultiReaderWrappers in these tests. 

*Uwe Schindler added a comment - 19/Nov/10 07:17 AM*

Here a better patch for the segment merger. We should even apply this if we not remove top-level
norms! 
It saves lots of memory during merging by using ReaderUtil to go down to segment level! Don't
wonder about BytesRef, but we need a reference here because of the anonymous inner class.

*Uwe Schindler added a comment - 19/Nov/10 07:17 AM*

Here a better patch for the segment merger. We should even apply this if we not remove top-level
norms! It saves lots of memory during merging by using ReaderUtil to go down to segment level!
Don't wonder about BytesRef, but we need a reference here because of the anonymous inner class.


*Robert Muir added a comment - 19/Nov/10 07:45 AM*

here is an updated patch, with core/contrib/solr tests passing. 
For ParallelReader i forced it to require non-composite readers only (e.g. SlowMultiReaderWrap
them if thats not the case).

TODO: 
- ParallelReader shouldnt need multifields etc anymore 
- there are 5 @Ignore'd ParallelReader-related tests, because of things like reopen/isOptimized/isCurrent

- merge in Uwe's improved SegmentsMerger 
- clean up code. 

*Robert Muir added a comment - 19/Nov/10 07:45 AM*

here is an updated patch, with core/contrib/solr tests passing. For ParallelReader i forced
it to require non-composite readers only (e.g. SlowMultiReaderWrap them if thats not the case).
TODO: 
ParallelReader shouldnt need multifields etc anymore 
there are 5 @Ignore'd ParallelReader-related tests, because of things like reopen/isOptimized/isCurrent

merge in Uwe's improved SegmentsMerger 
clean up code. 
{quote}
  
> Remove norms() support from non-atomic IndexReaders
> ---------------------------------------------------
>
>                 Key: LUCENE-2771
>                 URL: https://issues.apache.org/jira/browse/LUCENE-2771
>             Project: Lucene - Java
>          Issue Type: Improvement
>            Reporter: Uwe Schindler
>             Fix For: 4.0
>
>
> Spin-off from LUCENE-2769:
> Currently all IndexReaders support norms(), but the core of Lucene never uses it and
its even dangerous because of memory usage. We should do the same like with MultiFields and
factor it out and throw UOE on non-atomic readers.
> The SlowMultiReaderWrapper can then manage the norms. Also ParallelReader needs to be
fixed.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: dev-help@lucene.apache.org


Mime
View raw message