lucene-dev mailing list archives

From "Robert Muir (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (LUCENE-4512) Additional memory savings in CompressingStoredFieldsIndex.MEMORY_CHUNK
Date Mon, 29 Oct 2012 20:26:12 GMT

    [ https://issues.apache.org/jira/browse/LUCENE-4512?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13486311#comment-13486311 ]

Robert Muir commented on LUCENE-4512:
-------------------------------------

I do think we should apply this per block of n chunks (n = some power of 2 or whatever),
because e.g. just testing with that geonames dataset I saw the deltas grow quite large at
points. This caused it to use 24 bits per value (still better than 29), but with a tiny bit
of effort I think it could be significantly less.

                
> Additional memory savings in CompressingStoredFieldsIndex.MEMORY_CHUNK
> ----------------------------------------------------------------------
>
>                 Key: LUCENE-4512
>                 URL: https://issues.apache.org/jira/browse/LUCENE-4512
>             Project: Lucene - Core
>          Issue Type: Improvement
>            Reporter: Adrien Grand
>            Assignee: Adrien Grand
>            Priority: Minor
>             Fix For: 4.1
>
>
> Robert had a great idea to save memory with {{CompressingStoredFieldsIndex.MEMORY_CHUNK}}:
> instead of storing the absolute start pointers we could compute the mean number of bytes per
> chunk of documents and only store the delta between the actual value and the expected value
> (avgChunkBytes * chunkNumber).
> By applying this idea to every n(=1024?) chunks, we would even:
>  - make sure to never hit the worst case (delta ~= maxStartPointer)
>  - reduce memory usage at indexing time.
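
The idea in the issue description above can be sketched roughly as follows. This is only an
illustration, not the code from the patch: the class, method, and field names are hypothetical,
and storing a per-block base pointer next to the per-block average is an assumption made so
that each block can be decoded on its own.

```java
// Sketch only: all names here are hypothetical, not Lucene APIs.
final class ChunkStartPointerSketch {

    /** Per-block base pointer and average chunk size, plus one delta per chunk. */
    static final class Encoded {
        final int blockSize;
        final long[] blockBase;      // start pointer of the first chunk of each block (assumption)
        final long[] avgChunkBytes;  // mean chunk size within each block, integer division
        final long[] deltas;         // actual start pointer minus expected start pointer

        Encoded(int blockSize, long[] blockBase, long[] avgChunkBytes, long[] deltas) {
            this.blockSize = blockSize;
            this.blockBase = blockBase;
            this.avgChunkBytes = avgChunkBytes;
            this.deltas = deltas;
        }

        /** Rebuilds the absolute start pointer of the given chunk. */
        long startPointer(int chunk) {
            int block = chunk / blockSize;
            int indexInBlock = chunk % blockSize;
            return blockBase[block] + avgChunkBytes[block] * indexInBlock + deltas[chunk];
        }
    }

    /** Encodes start pointers as deltas from a linear estimate, restarted every blockSize chunks. */
    static Encoded encode(long[] startPointers, int blockSize) {
        int numBlocks = (startPointers.length + blockSize - 1) / blockSize;
        long[] base = new long[numBlocks];
        long[] avg = new long[numBlocks];
        long[] deltas = new long[startPointers.length];
        for (int block = 0; block < numBlocks; block++) {
            int from = block * blockSize;
            int to = Math.min(from + blockSize, startPointers.length);
            base[block] = startPointers[from];
            int gaps = to - from - 1;
            avg[block] = gaps == 0 ? 0 : (startPointers[to - 1] - startPointers[from]) / gaps;
            for (int i = from; i < to; i++) {
                long expected = base[block] + avg[block] * (i - from);
                deltas[i] = startPointers[i] - expected;
            }
        }
        return new Encoded(blockSize, base, avg, deltas);
    }

    /** Bits needed per delta after zig-zag encoding the sign, with a minimum of 1. */
    static int bitsPerDelta(long[] deltas) {
        long max = 0;
        for (long d : deltas) {
            long zigZag = (d << 1) ^ (d >> 63); // maps small negative values to small positive ones
            max = Math.max(max, zigZag);
        }
        return Math.max(1, 64 - Long.numberOfLeadingZeros(max));
    }
}
```

Calling encode(startPointers, 1024) and then bitsPerDelta(encoded.deltas) gives the per-value
width for one choice of n. Comparing it with encode(startPointers, startPointers.length) shows
how much a single global average costs when chunk sizes drift over the course of the file.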


