lucene-dev mailing list archives

From "Simon Willnauer (JIRA)" <>
Subject [jira] [Commented] (LUCENE-4515) Make MemoryIndex more memory efficient
Date Wed, 31 Oct 2012 12:51:11 GMT


Simon Willnauer commented on LUCENE-4515:

bq. Are we sure this is the right direction to go for MemoryIndex?
We have a couple of options here. One would be a lightweight DWPT (an atomic writer, or
whatever you want to call it), but our IndexWriter has pretty significant overhead for
indexing just one document and executing a search on it. So I think that until we have all
the refactorings I want to do long term, this class should be supported.
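
To make the use case concrete: the pattern is to index exactly one document in memory, run
a query against it, and throw the index away. Below is a toy sketch of that pattern in plain
Java. It is not Lucene's MemoryIndex or its API. The class name `SingleDocIndex` and its
methods are hypothetical, and "analysis" here is assumed to be just lowercasing plus
whitespace splitting.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;

/** Toy single-document index (hypothetical, for illustration only): term -> positions. */
class SingleDocIndex {
    private final Map<String, List<Integer>> postings = new HashMap<>();

    /** Builds the index once for one document; there is no writer, commit, or segment. */
    SingleDocIndex(String text) {
        int position = 0;
        for (String token : text.toLowerCase(Locale.ROOT).split("\\s+")) {
            if (token.isEmpty()) {
                continue; // leading whitespace produces an empty first token
            }
            postings.computeIfAbsent(token, k -> new ArrayList<>()).add(position++);
        }
    }

    /** Number of occurrences of the term in the document, 0 if absent. */
    int termFreq(String term) {
        List<Integer> p = postings.get(term.toLowerCase(Locale.ROOT));
        return p == null ? 0 : p.size();
    }

    /** A minimal conjunctive "query": true if every term occurs at least once. */
    boolean matchesAll(String... terms) {
        for (String term : terms) {
            if (termFreq(term) == 0) {
                return false;
            }
        }
        return true;
    }
}
```

Note that this sketch allocates a list per term and a boxed Integer per position, which is
the kind of per-object cost the issue is about.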

bq. I think its being abused for highlighting: but it has other real use cases and we shouldn't
make it worse for its real use cases just because highlighting abuses it.
I couldn't agree more. Maybe I used highlighting as a bad example. In ElasticSearch we use
MemoryIndex for percolation, and this actually works pretty well. I have had other use cases
in the past where it was handy too. I also see folks on the mailing list opening issues, so
unless we have a similarly lightweight replacement I don't see why we should not improve
this implementation. The main reason I improved it here is that we want to reuse the
internal buffers and, if possible, move away from objects.
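
As a rough illustration of what "reuse the internal buffers and move away from objects" can
look like, here is a minimal sketch in plain Java. It is not the attached patch and not a
Lucene API: the class `PooledPostings` and its layout are assumptions made up for this
example. All positions live in one shared int[] as (position, previous entry) pairs instead
of one int[] per term, and reset() keeps the arrays for the next document. The term
dictionary is still a HashMap of Strings for brevity, so this only removes the per-term
postings arrays, not every object.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.Map;

/** Hypothetical sketch: postings for all terms stored in one reusable int[] pool. */
class PooledPostings {
    private final Map<String, Integer> termIds = new HashMap<>();
    private int[] head = new int[16]; // per term id: index of its newest entry in pool
    private int[] freq = new int[16]; // per term id: number of recorded positions
    private int[] pool = new int[64]; // entries of two ints: position, previous entry index
    private int poolUsed = 0;

    /** Records one occurrence of term at the given position. */
    void add(String term, int position) {
        Integer id = termIds.get(term);
        if (id == null) {
            id = termIds.size();
            termIds.put(term, id);
            if (id == head.length) { // grow the per-term arrays together
                head = Arrays.copyOf(head, id * 2);
                freq = Arrays.copyOf(freq, id * 2);
            }
            head[id] = -1; // overwrite any stale value left from before reset()
            freq[id] = 0;
        }
        if (poolUsed + 2 > pool.length) {
            pool = Arrays.copyOf(pool, pool.length * 2);
        }
        pool[poolUsed] = position;
        pool[poolUsed + 1] = head[id]; // link back to this term's previous entry
        head[id] = poolUsed;
        poolUsed += 2;
        freq[id]++;
    }

    /** Copies a term's positions out in insertion order (allocates only the result). */
    int[] positions(String term) {
        Integer id = termIds.get(term);
        if (id == null) {
            return new int[0];
        }
        int[] out = new int[freq[id]];
        int entry = head[id];
        for (int i = out.length - 1; i >= 0; i--) { // the chain runs newest to oldest
            out[i] = pool[entry];
            entry = pool[entry + 1];
        }
        return out;
    }

    /** Forgets all terms but keeps the allocated arrays for the next document. */
    void reset() {
        termIds.clear();
        poolUsed = 0;
    }

    int poolCapacity() {
        return pool.length;
    }
}
```

After reset() the same instance can take the next document without reallocating the pool,
which is the point of the exercise for workloads that build many short-lived indexes.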

bq. instead I think if the concern is garbage then maybe just use term vectors and compute
this stuff at index time

This might work for highlighting, I agree.

> Make MemoryIndex more memory efficient
> --------------------------------------
>                 Key: LUCENE-4515
>                 URL:
>             Project: Lucene - Core
>          Issue Type: Improvement
>          Components: modules/other
>    Affects Versions: 4.0, 4.1, 5.0
>            Reporter: Simon Willnauer
>             Fix For: 4.1, 5.0
>         Attachments: LUCENE-4515.patch
> Currently MemoryIndex uses BytesRef objects to represent terms and holds an int[] per
> term per field to represent postings. For highlighting this creates a ton of objects for each
> search that 1. need to be GCed and 2. can't be reused.

