lucene-dev mailing list archives

From "Michael McCandless (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (LUCENE-4515) Make MemoryIndex more memory efficient
Date Thu, 01 Nov 2012 12:39:12 GMT

    [ https://issues.apache.org/jira/browse/LUCENE-4515?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13488643#comment-13488643 ]

Michael McCandless commented on LUCENE-4515:
--------------------------------------------

bq. the comment is on start but it says "end". I think, given that we know the freq, we can
read the slice without storing the end, but we'd need to change SliceReader for it, and I am
not sure that is worth the trouble we could get into. Yet, that's 4 bytes per term.

Ahh I see, right!  It's not needed.  You do need the "end" per term as you build up the slices,
but once done you can rely entirely on freq.
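
To make that concrete, here is a minimal sketch of the idea, not the actual SliceReader or
ByteBlockPool code. It assumes fixed-size int chunks with a link slot at the end of each
chunk; the names FreqBoundedSlices, Term, CHUNK, add and read are all made up for
illustration. The per-term "end" is only touched while appending; reading needs just
start and freq.

```java
import java.util.Arrays;

/** Hypothetical sketch: chained int chunks read back using only (start, freq). */
class FreqBoundedSlices {
    // Payload ints per chunk; the slot after the payload links to the next chunk.
    static final int CHUNK = 4;

    int[] pool = new int[64];
    int used = 0;

    /** Build-time state for one term. "end" is the next write slot. */
    static final class Term {
        int start = -1;
        int end;
        int freq;
    }

    // Chunks always begin at multiples of CHUNK + 1, so a slot is a link slot
    // exactly when slot % (CHUNK + 1) == CHUNK.
    private int newChunk() {
        if (used + CHUNK + 1 > pool.length) {
            pool = Arrays.copyOf(pool, pool.length * 2);
        }
        int at = used;
        used += CHUNK + 1;
        return at;
    }

    /** Appending needs "end": it says where this term's next position goes. */
    void add(Term t, int position) {
        if (t.start == -1) {
            t.start = t.end = newChunk();
        }
        if (t.end % (CHUNK + 1) == CHUNK) {
            int linkSlot = t.end;
            int next = newChunk(); // may grow the pool, so call it before writing
            pool[linkSlot] = next;
            t.end = next;
        }
        pool[t.end++] = position;
        t.freq++;
    }

    /** Reading does not need "end": freq bounds the walk over the chain. */
    int[] read(int start, int freq) {
        int[] out = new int[freq];
        int at = start;
        for (int i = 0; i < freq; i++) {
            if (at % (CHUNK + 1) == CHUNK) {
                at = pool[at]; // follow the link to the next chunk
            }
            out[i] = pool[at++];
        }
        return out;
    }

    public static void main(String[] args) {
        FreqBoundedSlices slices = new FreqBoundedSlices();
        Term a = new Term();
        Term b = new Term();
        // Interleave two terms so their chunks alternate in the pool.
        for (int p = 0; p < 12; p++) {
            slices.add(p % 2 == 0 ? a : b, p);
        }
        System.out.println(Arrays.toString(slices.read(a.start, a.freq)));
    }
}
```

Once building is done, the Term.end field could be dropped and only start and freq kept,
which is where the 4 bytes per term would come from in this simplified layout.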

bq. we really rely on this in ByteBlockPool already, so it likely doesn't work at this time,
but we don't run into it since we don't reuse in DWPT? I will add a test.

Hmm if we never reuse in DWPT then we don't need to clear...

bq. I think reuse is a special use case and I guess we should allow it. Yet, I totally agree
this is risky. I suggest making this possible via subclassing: expose this stuff through a
protected API, so if you really, really want to do this, you can by subclassing.

I think if we remove reset(), and then have a protected ctor that can pass in the allocator
... maybe that's OK?  Still makes me nervous ... we should mark that ctor experimental ...
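
Roughly the shape I have in mind, as a sketch only: BlockAllocator, IndexSketch and
CountingIndexSketch are made-up names, not the MemoryIndex or ByteBlockPool API, and the
javadoc tag is just how we usually flag such things. There is no reset(); the only way to
control allocation is to subclass and hand an allocator to the protected ctor.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical stand-in for a block allocator; not Lucene's actual API. */
interface BlockAllocator {
    /** Assumed contract for this sketch: the returned block is zeroed. */
    byte[] allocate();
}

class IndexSketch {
    static final int BLOCK_SIZE = 1024;

    private final BlockAllocator allocator;
    private final List<byte[]> blocks = new ArrayList<>();

    /** Normal entry point: always allocates fresh blocks, no reuse. */
    public IndexSketch() {
        this(() -> new byte[BLOCK_SIZE]);
    }

    /**
     * Expert: lets a subclass supply its own allocator, e.g. to reuse blocks.
     * Deliberately protected, and there is no reset() on this class.
     *
     * @lucene.experimental
     */
    protected IndexSketch(BlockAllocator allocator) {
        this.allocator = allocator;
    }

    /** Takes one more block from whichever allocator was supplied. */
    void grow() {
        blocks.add(allocator.allocate());
    }

    int blockCount() {
        return blocks.size();
    }
}

/** Opting in requires an explicit subclass, which is the point of the design. */
class CountingIndexSketch extends IndexSketch {
    static final class CountingAllocator implements BlockAllocator {
        int allocations;

        @Override
        public byte[] allocate() {
            allocations++;
            return new byte[IndexSketch.BLOCK_SIZE];
        }
    }

    CountingIndexSketch(CountingAllocator allocator) {
        super(allocator);
    }
}
```

A caller who really wants reuse would write something like
new CountingIndexSketch(new CountingIndexSketch.CountingAllocator()), and everyone else
keeps getting fresh blocks through the public ctor.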
                
> Make MemoryIndex more memory efficient
> --------------------------------------
>
>                 Key: LUCENE-4515
>                 URL: https://issues.apache.org/jira/browse/LUCENE-4515
>             Project: Lucene - Core
>          Issue Type: Improvement
>          Components: modules/other
>    Affects Versions: 4.0, 4.1, 5.0
>            Reporter: Simon Willnauer
>             Fix For: 4.1, 5.0
>
>         Attachments: LUCENE-4515.patch, LUCENE-4515.patch
>
>
> Currently MemoryIndex uses BytesRef objects to represent terms and holds an int[] per
> term per field to represent postings. For highlighting this creates a ton of objects for each
> search that 1. need to be GCed and 2. can't be reused.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: dev-help@lucene.apache.org

