lucene-dev mailing list archives

From "Simon Willnauer (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (LUCENE-4515) Make MemoryIndex more memory efficient
Date Wed, 31 Oct 2012 12:51:11 GMT

    [ https://issues.apache.org/jira/browse/LUCENE-4515?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13487718#comment-13487718 ]

Simon Willnauer commented on LUCENE-4515:
-----------------------------------------

bq. Are we sure this is the right direction to go for MemoryIndex?
So we have a couple of options here. One would be to have a lightweight DWPT (or atomic
writer, whatever you want to call it), but our IW has pretty significant overhead for indexing
just one document and executing a search on it. So I think that until we have all the
refactorings I want to do long term, this class should be supported.

bq. I think it's being abused for highlighting: but it has other real use cases and we shouldn't
make it worse for its real use cases just because highlighting abuses it.
I couldn't agree more. Maybe I picked a bad example. In ElasticSearch we use
it for percolation (http://www.elasticsearch.org/guide/reference/java-api/percolate.html),
and this actually works pretty well with MemoryIndex. I have had other use cases in the past
where it was handy too. I also see folks on the mailing list opening issues, so unless
we have a similarly lightweight replacement I don't see why we should not improve this impl.
The main reason for my changes here is that we want to reuse the internal buffers and,
if possible, move away from objects.
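
To make the "reuse the internal buffers and move away from objects" idea concrete, here is a minimal sketch. It is not the attached patch and not a Lucene API: the class FlatPostings and all of its methods are hypothetical names invented for illustration, and it makes the simplifying assumption that all positions of one term are added before the next term starts. Instead of one int[] per term, every term's positions live in one shared int[] that is kept and reused across documents.

```java
import java.util.Arrays;

/**
 * Hypothetical illustration only: stores the positions of all terms in one
 * shared int[] instead of allocating one int[] per term. Assumes that all
 * positions of a term are added before the next term is started.
 */
final class FlatPostings {
    private int[] buffer = new int[16]; // shared storage for every term's positions
    private int[] starts = new int[4];  // starts[t]  = offset of term t's slice in buffer
    private int[] lengths = new int[4]; // lengths[t] = number of positions stored for term t
    private int termCount = 0;
    private int used = 0;               // number of ints of buffer currently in use

    /** Starts a new term and returns its id. */
    int startTerm() {
        if (termCount == starts.length) {
            // Grow the per-term bookkeeping arrays; this happens rarely.
            starts = Arrays.copyOf(starts, termCount * 2);
            lengths = Arrays.copyOf(lengths, termCount * 2);
        }
        starts[termCount] = used;
        lengths[termCount] = 0;
        return termCount++;
    }

    /** Appends a position to the most recently started term. */
    void addPosition(int position) {
        if (termCount == 0) {
            throw new IllegalStateException("call startTerm() first");
        }
        if (used == buffer.length) {
            buffer = Arrays.copyOf(buffer, used * 2);
        }
        buffer[used++] = position;
        lengths[termCount - 1]++;
    }

    int freq(int termId) {
        return lengths[termId];
    }

    int position(int termId, int i) {
        return buffer[starts[termId] + i];
    }

    int capacity() {
        return buffer.length;
    }

    /** Forgets all terms but keeps the allocated arrays for the next document. */
    void reset() {
        termCount = 0;
        used = 0;
    }
}
```

After reset() the next document is written into the same arrays, so a steady stream of similar-sized documents would allocate nothing new per search. That is the effect being aimed at here, whatever data structure ends up providing it.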

bq. instead I think if the concern is garbage then maybe just use term vectors and compute
this stuff at index time

This might work for highlighting, I agree.

                
> Make MemoryIndex more memory efficient
> --------------------------------------
>
>                 Key: LUCENE-4515
>                 URL: https://issues.apache.org/jira/browse/LUCENE-4515
>             Project: Lucene - Core
>          Issue Type: Improvement
>          Components: modules/other
>    Affects Versions: 4.0, 4.1, 5.0
>            Reporter: Simon Willnauer
>             Fix For: 4.1, 5.0
>
>         Attachments: LUCENE-4515.patch
>
>
> Currently MemoryIndex uses BytesRef objects to represent terms and holds an int[] per
> term per field to represent postings. For highlighting this creates a ton of objects for each
> search that 1. need to be GCed and 2. can't be reused.


