directory-dev mailing list archives

From Emmanuel Lécharny <elecha...@gmail.com>
Subject Re: Mavibot cache experiment
Date Wed, 11 Sep 2013 06:14:37 GMT
On 9/11/13 4:44 AM, Kiran Ayyagari wrote:
> On Tue, Sep 10, 2013 at 9:35 PM, Emmanuel Lécharny <elecharny@gmail.com> wrote:
>
>> On 9/10/13 5:40 PM, Emmanuel Lécharny wrote:
>>> Hi guys,
>>>
>>> yesterday and today, I was implementing a cache in Mavibot to replace the
>>> WeakReferences we were using previously. The rationale was that with the
>>> WeakReferences, we were unable to inject more than 50 000 entries in the
>>> server (at this point, the GC is just going crazy trying to free some
>>> WeakReferences, and it slows down the server to the point where it's unusable...)
>>>
>>> So after a day fighting with an EhCache implementation in Mavibot, I was
>>> able to load 100K entries in the server. So far, so good, except that
>>> the performance is anything but good. I can add 26 entries per second,
>>> and fetch 555 entries per second. Worse than JDBM...
>>>
>>> Why is it so slow, especially for a search operation? There are three
>>> reasons:
>>> - first, I configured the cache to store only 1000 elements (mainly nodes)
>>> - second, when we try to update a leaf, we mostly have to load it from
>>> disk, as we rarely have it in memory
>>> - third, a leaf contains from 8 to 16 entries, and every time we fetch a
>>> leaf from disk, we have to deserialize the Entries, which is extremely
>>> costly
>>>
>>> Fixing this third problem would save us a lot of time, and it's a matter
>>> of adding one level of indirection (the entries would be kept as byte[],
>>> and deserialized only when needed).
>>>
>>> If anyone has a better idea...
>> indeed: the problem is that we serialize/deserialize too late. It would be
>>
> I think you mean 'too _early_'

No, we should serialize the entry (or the values in general) before we
store it into the Leaf. Leaves should only store binary values.

That has 2 consequences :
- we should never compare 2 values
- we should add a cache to avoid fetching a value from the BTree, for
better performance

In other words, the Managed BTree cache will cache binary values, not
plain java objects.
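
To illustrate what "storing binary values" could look like, here is a minimal sketch of a holder that keeps the serialized bytes and only deserializes on first access. LazyValue and its deserializer parameter are hypothetical names made up for this example, not Mavibot classes, and the sketch is not thread-safe.

```java
import java.util.function.Function;

/**
 * Hypothetical holder (not a Mavibot class): keeps a value in its serialized
 * form and deserializes it only the first time it is actually requested.
 * Not thread-safe; a sketch only.
 */
final class LazyValue<V> {
    private final byte[] raw;                        // serialized form, as stored in the leaf
    private final Function<byte[], V> deserializer;  // assumed to be supplied by the caller
    private V value;                                 // deserialized form, filled on demand
    private boolean loaded;

    LazyValue(byte[] raw, Function<byte[], V> deserializer) {
        this.raw = raw;
        this.deserializer = deserializer;
    }

    /** Returns the deserialized value, paying the deserialization cost at most once. */
    V get() {
        if (!loaded) {
            value = deserializer.apply(raw);
            loaded = true;
        }
        return value;
    }

    /** Returns the serialized bytes without triggering any deserialization. */
    byte[] raw() {
        return raw;
    }
}
```

With something like this, reading a leaf from disk would only copy bytes around; the cost of deserializing an entry would be paid only for the entries a request actually touches.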

Fetching an element from a BTree will then follow this algorithm:
- first check if the value is present in the cache using its key
- if not, fetch the value from the BTree. If the leaf containing this
entry is in cache, deserialize the value and return it.
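
The lookup above could be sketched roughly as follows. This is only an illustration under assumptions: CachedLookup is a hypothetical name, a TreeMap of byte[] stands in for the BTree leaves, and an access-ordered LinkedHashMap stands in for the value cache (the real cache could just as well be EhCache). What happens when the leaf itself is not in memory is not covered here.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.TreeMap;
import java.util.function.Function;

/** Hypothetical sketch: a value cache in front of a tree that only holds binary values. */
final class CachedLookup<K extends Comparable<K>, V> {
    private final Map<K, V> valueCache;                       // small LRU cache, looked up by key
    private final TreeMap<K, byte[]> tree = new TreeMap<>();  // stand-in for leaves storing byte[]
    private final Function<V, byte[]> serializer;
    private final Function<byte[], V> deserializer;
    int deserializations;                                     // counter, for illustration only

    CachedLookup(final int maxCached, Function<V, byte[]> serializer,
                 Function<byte[], V> deserializer) {
        this.serializer = serializer;
        this.deserializer = deserializer;
        // Access-ordered LinkedHashMap that evicts the least recently used entry.
        this.valueCache = new LinkedHashMap<K, V>(16, 0.75f, true) {
            @Override
            protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
                return size() > maxCached;
            }
        };
    }

    /** Serialize before storing: the tree never sees a plain Java object. */
    void put(K key, V value) {
        tree.put(key, serializer.apply(value));
        valueCache.remove(key); // drop any stale cached copy
    }

    V get(K key) {
        // 1. first check if the value is present in the cache using its key
        V cached = valueCache.get(key);
        if (cached != null) {
            return cached;
        }
        // 2. if not, fetch the binary value from the tree
        byte[] raw = tree.get(key);
        if (raw == null) {
            return null; // no such key
        }
        // 3. deserialize only now, remember the result, and return it
        V value = deserializer.apply(raw);
        deserializations++;
        valueCache.put(key, value);
        return value;
    }
}
```

Reading the same key twice in a row then deserializes it only once; the second read is served from the value cache.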


-- 
Regards,
Cordialement,
Emmanuel Lécharny
www.iktek.com 

