lucene-dev mailing list archives

From "Jason Rutherglen (JIRA)" <>
Subject [jira] [Updated] (LUCENE-3245) Realtime terms dictionary
Date Mon, 27 Jun 2011 04:46:47 GMT


Jason Rutherglen updated LUCENE-3245:

    Attachment: LUCENE-3245.patch

Here's a basic initial patch implementing a single-writer, multiple-reader skip list backed
by an atomic integer array.

The next step is to tie in the ByteBlockPool to store terms, i.e., implement an RTTermsDictAIA
class and an RTTermsDictCSLM class.

We can then load the same Wiki-EN terms into each and compare their write speeds.

Then create a set of terms to look up in each terms dict and measure the difference in lookup time.
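
A rough sketch of how such a comparison could be timed is below. It is not part of the
patch: the TermsDict interface, the CslmDict baseline and the synthetic term generator are
all hypothetical stand-ins (the real test would load Wiki-EN terms and use the
RTTermsDict* classes), and a single System.nanoTime() pass without JIT warm-up only gives
a coarse number.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ConcurrentSkipListMap;

class TermsDictTiming {
    /** Hypothetical minimal interface; the classes in the patch may look different. */
    interface TermsDict {
        void add(String term);
        boolean contains(String term);
    }

    /** Baseline terms dict backed by java.util.concurrent.ConcurrentSkipListMap. */
    static final class CslmDict implements TermsDict {
        private final ConcurrentSkipListMap<String, Boolean> map = new ConcurrentSkipListMap<>();
        public void add(String term) { map.putIfAbsent(term, Boolean.TRUE); }
        public boolean contains(String term) { return map.containsKey(term); }
    }

    /** Returns the elapsed nanoseconds for adding every term once. */
    static long timeWrites(TermsDict dict, List<String> terms) {
        long start = System.nanoTime();
        for (String term : terms) {
            dict.add(term);
        }
        return System.nanoTime() - start;
    }

    /** Returns {elapsedNanos, hits}; counting hits keeps the lookups from being optimized away. */
    static long[] timeLookups(TermsDict dict, List<String> probes) {
        long hits = 0;
        long start = System.nanoTime();
        for (String probe : probes) {
            if (dict.contains(probe)) {
                hits++;
            }
        }
        return new long[] { System.nanoTime() - start, hits };
    }

    /** Synthetic unique terms, standing in for the Wiki-EN term set. */
    static List<String> syntheticTerms(int count) {
        List<String> terms = new ArrayList<>(count);
        for (int i = 0; i < count; i++) {
            terms.add("term" + i);
        }
        return terms;
    }
}
```

Running the same term list and probe list through each implementation of TermsDict would
give the two pairs of numbers to compare.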

I am not yet sure how the speed of AtomicIntegerArray will compare with CSLM's use of
AtomicReferenceFieldUpdater. Of note: because of DWPTs, we do not need a skip list that
supports concurrent writes. And because we only add new unique terms, we do not need delete
functionality. That is, AIA could be faster, though we may need to inline code and perform
various tuning tricks.
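
To make the idea concrete, here is a minimal sketch of a single-writer, multiple-reader
skip list stored in an AtomicIntegerArray. It is an illustration under stated assumptions,
not the code in the attached patch: keys are plain ints (the real structure would compare
term bytes held in the ByteBlockPool), the array has a fixed capacity, and the class name,
node layout and MAX_LEVEL value are invented for the example. There is no delete, and
add() must only ever be called from one thread.

```java
import java.util.Random;
import java.util.concurrent.atomic.AtomicIntegerArray;

class IntArraySkipList {
    private static final int MAX_LEVEL = 8;      // assumed cap on node height
    private static final int NIL = 0;            // offset 0 is the head, so it is never a link target
    private static final int KEY = 0, NEXT = 1;  // node layout: [key, next_0 .. next_(height-1)]

    private final AtomicIntegerArray slots;
    private final Random random = new Random(42);
    private int free;                            // next unused offset; touched by the writer only

    IntArraySkipList(int capacityInInts) {
        slots = new AtomicIntegerArray(capacityInInts);
        free = NEXT + MAX_LEVEL;                 // the head node occupies the first 1 + MAX_LEVEL ints
    }

    /** Writer thread only. Returns false if the key is already present. */
    boolean add(int key) {
        int[] preds = new int[MAX_LEVEL];
        int node = 0;
        for (int level = MAX_LEVEL - 1; level >= 0; level--) {
            int next = slots.get(node + NEXT + level);
            while (next != NIL && slots.get(next + KEY) < key) {
                node = next;
                next = slots.get(node + NEXT + level);
            }
            preds[level] = node;
        }
        int candidate = slots.get(preds[0] + NEXT);
        if (candidate != NIL && slots.get(candidate + KEY) == key) {
            return false;
        }
        int height = randomHeight();
        if (free + NEXT + height > slots.length()) {
            throw new IllegalStateException("skip list is full");
        }
        int fresh = free;
        free += NEXT + height;
        // Fill in the new node completely before any reader can reach it.
        slots.set(fresh + KEY, key);
        for (int level = 0; level < height; level++) {
            slots.set(fresh + NEXT + level, slots.get(preds[level] + NEXT + level));
        }
        // Publish bottom-up; each set() is a volatile write that readers observe via get().
        for (int level = 0; level < height; level++) {
            slots.set(preds[level] + NEXT + level, fresh);
        }
        return true;
    }

    /** Safe to call from any number of reader threads while the writer is adding. */
    boolean contains(int key) {
        int node = 0;
        for (int level = MAX_LEVEL - 1; level >= 0; level--) {
            int next = slots.get(node + NEXT + level);
            while (next != NIL && slots.get(next + KEY) < key) {
                node = next;
                next = slots.get(node + NEXT + level);
            }
            if (next != NIL && slots.get(next + KEY) == key) {
                return true;
            }
        }
        return false;
    }

    private int randomHeight() {
        int height = 1;
        while (height < MAX_LEVEL && random.nextBoolean()) {
            height++;
        }
        return height;
    }
}
```

Because there is only one writer and nodes are never removed, no compare-and-set is needed:
the writer fully initializes a node and then links it in with ordinary volatile writes, so
a reader either sees the complete node or does not see it yet.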

> Realtime terms dictionary
> -------------------------
>                 Key: LUCENE-3245
>                 URL:
>             Project: Lucene - Java
>          Issue Type: Improvement
>          Components: core/index
>    Affects Versions: 4.0
>            Reporter: Jason Rutherglen
>            Priority: Minor
>         Attachments: LUCENE-3245.patch
> For LUCENE-2312 we need a realtime terms dictionary.  While ConcurrentSkipListMap may
> be used, it has drawbacks in terms of high object overhead which can impact GC collection
> times and heap memory usage.
> If we implement a skip list that uses primitive backing arrays, we can hopefully have
> a data structure that is [as] fast and memory efficient.

This message is automatically generated by JIRA.