lucene-dev mailing list archives

From "Simon Willnauer (JIRA)" <>
Subject [jira] [Commented] (LUCENE-4515) Make MemoryIndex more memory efficient
Date Wed, 31 Oct 2012 20:49:13 GMT


Simon Willnauer commented on LUCENE-4515:

synchronizedAllocator looks unused? I guess you added that after
removing all sync from RecyclingByteBlockAllocator ... but I think we
can just add synchronizedAllocator back later if/when we need it?
Separately can you call out that RecyclingByteBlockAllocator is not
thread safe in its javadocs?

Regarding javadocs, I thought I did this... will fix. Regarding sync: yeah, let's drop it. We can
still add it back if needed; it's trivial anyway!
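
To illustrate why adding the wrapper back later is cheap, here is a minimal sketch of a synchronized decorator around an allocator that is not thread safe. The BlockAllocator interface, the RecyclingAllocator class, and the recycle method are simplified stand-ins invented for this example; they are not the actual Lucene classes or signatures.

```java
import java.util.ArrayDeque;

public class SynchronizedAllocatorSketch {

  /** Hypothetical minimal allocator interface (an assumption, not Lucene's API). */
  public interface BlockAllocator {
    byte[] getByteBlock();
    void recycle(byte[] block);
  }

  /** Recycles blocks through a free list. NOT thread safe: no locking at all. */
  public static final class RecyclingAllocator implements BlockAllocator {
    private final int blockSize;
    private final ArrayDeque<byte[]> free = new ArrayDeque<>();

    public RecyclingAllocator(int blockSize) {
      this.blockSize = blockSize;
    }

    @Override
    public byte[] getByteBlock() {
      byte[] block = free.poll();            // reuse a recycled block if one exists
      return block != null ? block : new byte[blockSize];
    }

    @Override
    public void recycle(byte[] block) {
      free.push(block);
    }

    public int freeBlocks() {
      return free.size();
    }
  }

  /** Wraps any allocator so that every call holds the wrapper's monitor. */
  public static BlockAllocator synchronizedAllocator(final BlockAllocator delegate) {
    return new BlockAllocator() {
      @Override
      public synchronized byte[] getByteBlock() {
        return delegate.getByteBlock();
      }

      @Override
      public synchronized void recycle(byte[] block) {
        delegate.recycle(block);
      }
    };
  }
}
```

The wrapper only helps if every thread goes through it; calling the wrapped allocator directly from another thread would bypass the lock.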

I think you need the start ... because if you used more than one slice
you won't know how to read "backwards" to get to the starting slice?

The comment is on start, but it says "end". I think, given that we know the freq, we
can read the slice without storing the end, but we'd need to change SliceReader for it, and
I am not sure that is worth the trouble we could get into. Still, it is 4 bytes per term.

If we do that we have to make sure that allocator clears each int[]
before returning it, in getIntBlock().

We already rely on this in ByteBlockPool, so it likely doesn't work at this time,
but we don't run into it since we don't reuse blocks in DWPT? I will add a test.
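
As a rough sketch of the clearing requirement discussed above: a recycling allocator can zero each reused int[] inside getIntBlock() before handing it out. The class below is a hypothetical, simplified allocator written for this example (only the getIntBlock name comes from the discussion); it is not the allocator from the patch.

```java
import java.util.ArrayDeque;
import java.util.Arrays;

/** Hypothetical recycling int-block allocator; not thread safe. */
public class ClearingIntBlockAllocator {
  private final int blockSize;
  private final ArrayDeque<int[]> free = new ArrayDeque<>();

  public ClearingIntBlockAllocator(int blockSize) {
    this.blockSize = blockSize;
  }

  /** Returns a block that is guaranteed to contain only zeros. */
  public int[] getIntBlock() {
    int[] block = free.poll();
    if (block == null) {
      return new int[blockSize];   // freshly allocated arrays are already zeroed
    }
    Arrays.fill(block, 0);         // wipe stale values left by the previous user
    return block;
  }

  /** Hands a block back for later reuse; its contents are left untouched here. */
  public void recycleIntBlock(int[] block) {
    free.push(block);
  }
}
```

Clearing on the way out rather than on recycle is a design choice in this sketch: blocks that are recycled but never reused are never cleared.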

The added MemoryIndex.reset method is sort of ... spooky? Like, do we
really need/want to reuse a MemoryIndex? (I guess this is because we
added passing in an allocator to the ctor ... so you want the byte[]'s
returned to it ... but that also makes me nervous: should we really
pass in an external allocator...?).

I think reuse is a special use case and I guess we should allow it. Yet I totally agree this
is risky. I suggest making this possible only via subclassing, exposing this stuff through a
protected API, so if you really, really want to do this, you can by subclassing?
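
A minimal sketch of that idea, using made-up classes rather than MemoryIndex itself: the base class keeps reset() protected, and a user who really wants reuse opts in by subclassing and widening the visibility. SimpleIndex, ReusableIndex, addTerm, and size are hypothetical names for this example.

```java
import java.util.ArrayList;
import java.util.List;

public class ProtectedResetSketch {

  /** Hypothetical stand-in for an in-memory index; not the real MemoryIndex. */
  public static class SimpleIndex {
    private final List<String> terms = new ArrayList<>();

    public void addTerm(String term) {
      terms.add(term);
    }

    public int size() {
      return terms.size();
    }

    /**
     * Clears all state so the instance can be reused. Protected, so it is
     * not part of the public API (in Java, same-package code can still call it).
     */
    protected void reset() {
      terms.clear();
    }
  }

  /** Opt-in subclass: widens reset() to public for callers that want reuse. */
  public static class ReusableIndex extends SimpleIndex {
    @Override
    public void reset() {
      super.reset();
    }
  }
}
```

With this shape, ordinary users of SimpleIndex never see reset(), while a caller that holds a ReusableIndex can add terms, call reset(), and index the next document with the same instance.
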
> Make MemoryIndex more memory efficient
> --------------------------------------
>                 Key: LUCENE-4515
>                 URL:
>             Project: Lucene - Core
>          Issue Type: Improvement
>          Components: modules/other
>    Affects Versions: 4.0, 4.1, 5.0
>            Reporter: Simon Willnauer
>             Fix For: 4.1, 5.0
>         Attachments: LUCENE-4515.patch
> Currently MemoryIndex uses BytesRef objects to represent terms and holds an int[] per
> term per field to represent postings. For highlighting this creates a ton of objects for each
> search that 1. need to be GCed and 2. can't be reused.
