lucene-dev mailing list archives

From "Jason Rutherglen (JIRA)" <>
Subject [jira] Updated: (LUCENE-2575) Concurrent byte and int block implementations
Date Sun, 12 Sep 2010 03:22:32 GMT


Jason Rutherglen updated LUCENE-2575:

    Attachment: LUCENE-2575.patch

This patch includes a basic implementation of the term enum based
on sorted term ids. We'll want to over-allocate the sorted term id
array so that merging in new term ids later does not require
allocating a new, larger array. Overall, I don't think RAM buffer
based searching will require much additional RAM. The merging of
new term ids could occur in a background thread if it proves
expensive; for now we can simply merge them in on demand as new
RAM readers are created.
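
To illustrate the over-allocation idea, here is a minimal sketch, not
the code in the attached patch. The class name SortedTermIds, the 50%
growth factor, and the use of a String[] as the term id to term text
lookup are all assumptions made for the example. New ids are merged in
from the back of the array, so no temporary array is needed as long as
spare capacity remains.

```java
import java.util.Arrays;

/** Hypothetical sketch: term ids kept sorted by term text, with spare capacity. */
final class SortedTermIds {
    private final String[] terms; // assumed lookup: term id -> term text
    private int[] ids;            // ids[0..size) are ordered by their term text
    private int size;

    SortedTermIds(String[] terms, int initialCapacity) {
        this.terms = terms;
        this.ids = new int[initialCapacity];
    }

    int size() { return size; }
    int get(int i) { return ids[i]; }
    int capacity() { return ids.length; }

    /** Merges new term ids (given in any order) into the sorted array. */
    void merge(int[] newIds) {
        // Sort the incoming ids by term text first.
        int[] incoming = Arrays.stream(newIds).boxed()
            .sorted((a, b) -> terms[a].compareTo(terms[b]))
            .mapToInt(Integer::intValue)
            .toArray();
        int needed = size + incoming.length;
        if (needed > ids.length) {
            // Grow with 50% headroom (assumed factor) so later merges fit in place.
            ids = Arrays.copyOf(ids, needed + (needed >> 1));
        }
        // Merge from the back so existing entries are never overwritten early.
        int i = size - 1, j = incoming.length - 1, k = needed - 1;
        while (j >= 0) {
            if (i >= 0 && terms[ids[i]].compareTo(terms[incoming[j]]) > 0) {
                ids[k--] = ids[i--];
            } else {
                ids[k--] = incoming[j--];
            }
        }
        size = needed;
    }
}
```

A reader created on demand would call merge with the term ids added
since the previous reader; a reallocation only happens when the
headroom is exhausted.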

Seek is implemented as a binary search over the sorted term ids.
If this is not efficient enough, we can implement a terms index
similar to the one in the current system.
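
The seek described above could look roughly like the following
sketch. It is an illustration under assumptions, not the patch's
code: the class TermSeek and method seekCeil are hypothetical names,
and terms are plain Strings here rather than Lucene's byte-based term
representation. It returns the position of the first term greater
than or equal to the target.

```java
/** Hypothetical sketch: seek by binary search over term ids sorted by term text. */
final class TermSeek {
    private TermSeek() {}

    /**
     * Returns the index in sortedIds of the first term >= target,
     * or size if every term is smaller than target.
     */
    static int seekCeil(int[] sortedIds, int size, String[] terms, String target) {
        int lo = 0;
        int hi = size; // search window is [lo, hi)
        while (lo < hi) {
            int mid = (lo + hi) >>> 1;
            if (terms[sortedIds[mid]].compareTo(target) < 0) {
                lo = mid + 1; // term at mid is too small, look right
            } else {
                hi = mid;     // term at mid is a candidate, look left
            }
        }
        return lo;
    }
}
```

Each probe costs one indirection from term id to term text, which is
the overhead a separate terms index would aim to reduce.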

For now, the conversion from the CSLM to the sorted term id array
can be triggered at a percentage of the total number of terms,
which I'll default to 10%. We may want to make this a function
(e.g., a percentage) of RAM consumption in the future.
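
One possible reading of the percentage rule is sketched below. This
is an assumption about the intent, not the patch's logic: it treats
the threshold as "convert once the terms not yet in the sorted array
reach N% of all terms". The class ConversionPolicy and its method
names are hypothetical. Integer arithmetic is used to avoid floating
point rounding at the boundary.

```java
/** Hypothetical sketch: decide when to fold unsorted terms into the sorted array. */
final class ConversionPolicy {
    private final int percent; // e.g. 10 for the 10% default mentioned above

    ConversionPolicy(int percent) {
        if (percent < 0 || percent > 100) {
            throw new IllegalArgumentException("percent must be in [0, 100]");
        }
        this.percent = percent;
    }

    /** True once unsorted terms make up at least `percent` of all terms. */
    boolean shouldConvert(int unsortedTerms, int totalTerms) {
        if (totalTerms <= 0) {
            return false; // nothing indexed yet
        }
        return unsortedTerms * 100L >= (long) percent * totalTerms;
    }
}
```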

> Concurrent byte and int block implementations
> ---------------------------------------------
>                 Key: LUCENE-2575
>                 URL:
>             Project: Lucene - Java
>          Issue Type: Improvement
>          Components: Index
>    Affects Versions: Realtime Branch
>            Reporter: Jason Rutherglen
>             Fix For: Realtime Branch
>         Attachments: LUCENE-2575.patch, LUCENE-2575.patch
> The current *BlockPool implementations aren't quite concurrent.
> We really need something that has a locking flush method, where
> flush is called at the end of adding a document. Once flushed,
> the newly written data would be available to all other reading
> threads (i.e., postings etc.). I'm not sure I understand the
> slices concept; it seems like it'd be easier to implement a
> seekable, random-access-file-like API. One would seek to a given
> position, then read or write from there. The underlying
> management of byte arrays could then be hidden?
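
As a rough illustration of the idea in the issue description, the
sketch below hides block management behind a position-based API and
publishes data to readers only on flush. It is not the *BlockPool
code and makes several assumptions: a single writer thread, append
only writes, a tiny block size for demonstration, and the
hypothetical names SeekableBytePool, writeByte, readByte, and flush.

```java
import java.util.Arrays;
import java.util.concurrent.locks.ReentrantReadWriteLock;

/** Hypothetical sketch: byte blocks behind a seekable API with a locking flush. */
final class SeekableBytePool {
    private static final int BLOCK_BITS = 4; // 16-byte blocks, demo size only
    private static final int BLOCK_SIZE = 1 << BLOCK_BITS;
    private static final int BLOCK_MASK = BLOCK_SIZE - 1;

    private final ReentrantReadWriteLock lock = new ReentrantReadWriteLock();

    // Writer-private state (a single writer thread is assumed).
    private byte[][] blocks = new byte[2][];
    private long writePos;

    // Published state, only accessed while holding the lock.
    private byte[][] visibleBlocks = blocks;
    private long visibleLength;

    /** Appends one byte; not visible to readers until flush() is called. */
    void writeByte(byte b) {
        int block = (int) (writePos >>> BLOCK_BITS);
        if (block == blocks.length) {
            blocks = Arrays.copyOf(blocks, blocks.length * 2);
        }
        if (blocks[block] == null) {
            blocks[block] = new byte[BLOCK_SIZE];
        }
        blocks[block][(int) (writePos & BLOCK_MASK)] = b;
        writePos++;
    }

    /** Publishes everything written so far, e.g. after adding a document. */
    void flush() {
        lock.writeLock().lock();
        try {
            visibleBlocks = blocks;
            visibleLength = writePos;
        } finally {
            lock.writeLock().unlock();
        }
    }

    /** Number of bytes readers may currently access. */
    long length() {
        lock.readLock().lock();
        try {
            return visibleLength;
        } finally {
            lock.readLock().unlock();
        }
    }

    /** Reads the byte at an absolute position within the flushed region. */
    byte readByte(long pos) {
        lock.readLock().lock();
        try {
            if (pos < 0 || pos >= visibleLength) {
                throw new IndexOutOfBoundsException("position " + pos + " is not flushed");
            }
            return visibleBlocks[(int) (pos >>> BLOCK_BITS)][(int) (pos & BLOCK_MASK)];
        } finally {
            lock.readLock().unlock();
        }
    }
}
```

Readers never look past the flushed length, so the writer can keep
filling the tail of the last block without taking the lock; only the
flush itself is a synchronization point.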
