hbase-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Jason Rutherglen <jason.rutherg...@gmail.com>
Subject Re: ConcurrentSkipListMap performance in the MemStore
Date Wed, 15 Jun 2011 03:40:16 GMT
Maybe we should open an issue, I may not have time soon.  Also using
AtomicIntegerArray's makes more sense as the memory footprint goes
down, as does the garbage generated.  CSLM implements hairy delete
handling code, we don't need that.

Skip lists are sorted Log(n) seek/insert-able linked lists.  Using
AIA's one can allocate 3 ints per Index (skip), and 2 ints per Node.
One appends new 'objects' to the end of the array, and then changes
the 'next' int/pointer to the newly appended 'object'.  Simple eh?
Building the skips is interesting however the CSLM provides a nice
base to work with.

On Tue, Jun 14, 2011 at 7:19 PM, Stack <stack@duboce.net> wrote:
> Woah. That looks promising Jason.  Can you hack up TestMemStore.java
> and have it see if above will even pass the CSLM hbase tests?
>
> St.Ack
>
> On Tue, Jun 14, 2011 at 5:23 PM, Jason Rutherglen
> <jason.rutherglen@gmail.com> wrote:
>> Here's the implementation we can presumably drop in and benchmark:
>>
>> https://github.com/mspiegel/lockfreeskiptree
>>
>> On Tue, Jun 14, 2011 at 4:56 PM, Jason Rutherglen
>> <jason.rutherglen@gmail.com> wrote:
>>> Here's a paper on a ConcurrentSkipTree:
>>>
>>> http://www.cs.virginia.edu/~ms6ep/publications/michael-spiegel-dissertation.pdf
>>>
>>> On Tue, Jun 14, 2011 at 4:40 PM, Jason Rutherglen
>>> <jason.rutherglen@gmail.com> wrote:
>>>> Right, I think one naive approach would be to take CSLM, convert it to
>>>> use AtomicIntegerArray instead of objects (basically emulate objects
>>>> in blocks of ints), and of course not implement delete functionality.
>>>>
>>>> The first challenge is making an AtomicIntegerArray based linked list
>>>> where the value is pointed to by an int into a byte block data
>>>> structure.  The next is a 2nd AIA that emulates the internal 'Index'
>>>> class of CSLM.
>>>>
>>>> Thoughts?
>>>>
>>>> On Tue, Jun 14, 2011 at 3:51 PM, Stack <stack@duboce.net> wrote:
>>>>> Or do as Todd suggested a while back and pull in CSLM and hack it to
>>>>> suit our needs; its public domain.
>>>>> St.Ack
>>>>>
>>>>> On Tue, Jun 14, 2011 at 3:23 PM, Jason Rutherglen
>>>>> <jason.rutherglen@gmail.com> wrote:
>>>>>>> See this J: http://www.cloudera.com/blog/2011/03/avoiding-full-gcs-in-hbase-with-memstore-local-allocation-buffers-part-3/
>>>>>>
>>>>>> Interesting.
>>>>>>
>>>>>> In Lucene for realtime search we have the same problem where the
>>>>>> best/only option right now for the realtime terms dictionary is to
use
>>>>>> ConcurrentSkipListMap.  However the Node, key, value, and Index
>>>>>> objects all are pointers that consume RAM and need to be cleaned
up.
>>>>>>
>>>>>> Lucene right now during indexing does not generate much object pointer
>>>>>> garbage.  We have the same problem in the two projects.
>>>>>>
>>>>>> Maybe there's a way to create an AtomicIntegerArray based system
that
>>>>>> uses int pointers to perform the same function as a the concurrent
>>>>>> skip list.  It would greatly reduce garbage generation.
>>>>>>
>>>>>> On Tue, Jun 14, 2011 at 10:14 AM, Stack <stack@duboce.net>
wrote:
>>>>>>> See this J: http://www.cloudera.com/blog/2011/03/avoiding-full-gcs-in-hbase-with-memstore-local-allocation-buffers-part-3/
>>>>>>> St.Ack
>>>>>>>
>>>>>>> On Mon, Jun 13, 2011 at 10:58 PM, Jason Rutherglen
>>>>>>> <jason.rutherglen@gmail.com> wrote:
>>>>>>>> Is there an issue with garbage generated from using CSLM
in the MemStore?
>>>>>>>>
>>>>>>>
>>>>>>
>>>>>
>>>>
>>>
>>
>

Mime
View raw message