lucene-dev mailing list archives

From "Uwe Schindler (Updated) (JIRA)" <j...@apache.org>
Subject [jira] [Updated] (LUCENE-3659) Improve Javadocs of RAMDirectory to document its limitations
Date Sat, 24 Mar 2012 17:18:25 GMT

     [ https://issues.apache.org/jira/browse/LUCENE-3659?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Uwe Schindler updated LUCENE-3659:
----------------------------------

    Attachment: LUCENE-3659.patch

I started to work on this; here is just a first step (trunk). This patch removes the BUFFER_SIZE
constant and moves it up to RAMDirectory (but for now only as a default, see below!). For now,
RAMDirectory passes its default buffer size on to its RAMFile children (newRAMFile() method), but
this can likely change (see below).

As every RAMFile has its own buffer size, optimizations are possible:
- when you open an IndexOutput, in trunk we get the IOContext, which may contain a Merge/Flush
desc containing the complete segment size (unfortunately the *complete* segment size). But
this number can be used as an order of magnitude for specifying the buffer size.

The patch does not yet implement that, but one idea would be to allocate 1/32 of the
segment size as the buffer size. That way the buffer size does not get too big, but on the other
hand the number of slices has an upper limit (approx. 32 slices per merged segment). Currently
a merged segment with a size of, say, 32 Gigabytes would have 32 million byte[] arrays; after
the change it would have only 32 byte[] arrays with a size of 1 Gigabyte each. This should make
GC happy.
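
To make the arithmetic above concrete, here is a minimal, self-contained sketch of the 1/32 idea. It is not part of the patch: the class and method names are hypothetical, the estimated segment size is passed in as a plain long instead of being read from the IOContext, and the 1 Gigabyte cap is an assumption added here because a Java byte[] cannot exceed Integer.MAX_VALUE elements.

```java
// Hypothetical sketch, not code from LUCENE-3659.patch.
final class BufferSizeSketch {
    /** Default buffer size in bytes (raised from 1024 to 8192 by the patch). */
    public static final int DEFAULT_BUFFER_SIZE = 8192;
    /** Target number of slices per segment (the "1/32" idea). */
    public static final int TARGET_SLICES = 32;
    /** Assumed upper bound: 1 GiB, so the result always fits in a byte[]. */
    public static final int MAX_BUFFER_SIZE = 1 << 30;

    /** Picks a buffer size from an estimated segment size in bytes. */
    public static int bufferSizeFor(long estimatedSegmentBytes) {
        if (estimatedSegmentBytes <= 0) {
            return DEFAULT_BUFFER_SIZE; // no estimate available
        }
        long candidate = estimatedSegmentBytes / TARGET_SLICES;
        // Never go below the default, never above the assumed cap.
        long bounded = Math.max(DEFAULT_BUFFER_SIZE, Math.min(candidate, MAX_BUFFER_SIZE));
        return (int) bounded;
    }

    /** Number of byte[] slices needed to hold fileBytes (rounded up). */
    public static long sliceCount(long fileBytes, int bufferSize) {
        return (fileBytes + bufferSize - 1) / bufferSize;
    }

    public static void main(String[] args) {
        long segment = 32L << 30; // 32 GiB
        System.out.println("slices at 1024 bytes: " + sliceCount(segment, 1024));
        System.out.println("slices at 1/32 size:  " + sliceCount(segment, bufferSizeFor(segment)));
    }
}
```

With a 32 GiB estimate, the 1024-byte buffers give 33,554,432 slices (the "32 million" above), while the 1/32 rule gives 32 slices of 1 GiB each. Small segments fall back to the default so that tiny files do not end up with tiny buffers.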

When backporting to 3.x, the IOContext is not yet available, so RAMDirectory always uses the
default buffer size (maybe randomize in tests). Raising the buffer size should bring improvements
here.

We should still add some warnings to the Javadocs that for *large* indexes it is often
preferable to use MMapDir, especially when you store the index on disk. We should also tell
people that new RAMDirectory(OtherDirectory) may be a bad idea...

The new default buffer size was raised from 1024 to 8192.
                
> Improve Javadocs of RAMDirectory to document its limitations
> ------------------------------------------------------------
>
>                 Key: LUCENE-3659
>                 URL: https://issues.apache.org/jira/browse/LUCENE-3659
>             Project: Lucene - Java
>          Issue Type: Task
>    Affects Versions: 3.5, 4.0
>            Reporter: Uwe Schindler
>            Assignee: Uwe Schindler
>             Fix For: 3.6, 4.0
>
>         Attachments: LUCENE-3659.patch
>
>
> Spinoff from several dev@lao issues:
> - [http://mail-archives.apache.org/mod_mbox/lucene-dev/201112.mbox/%3C001001ccbf1c%2471845830%24548d0890%24%40thetaphi.de%3E]
> - issue LUCENE-3653
> The use cases for RAMDirectory are very limited and to prevent users from using it for
> e.g. loading a 50 Gigabyte index from a file on disk, we should improve the javadocs.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        


