On Tue, Sep 10, 2013 at 9:35 PM, Emmanuel Lécharny <elecharny@gmail.com> wrote:
On 9/10/13 5:40 PM, Emmanuel Lécharny wrote:
> Hi guys,
> yesterday and today, I was implementing a cache in Mavibot to replace the
> WeakReferences we were using previously. The rationale was that with the
> WeakReferences, we were unable to inject more than 50 000 entries in the
> server (at this point, the GC is just going crazy trying to free some
> WeakReferences, and it slows down the server to a point where it's unusable...)
> So after a day fighting with an EhCache implementation in Mavibot, I was
> able to load 100K entries in the server. So far, so good, except that
> the performance is anything but good. I can add 26 entries per second,
> and fetch 555 entries per second. Worse than JDBM...
> Why is it so slow, especially for a search operation? There are three
> reasons:
> - first, I configured the cache to store only 1000 elements (mainly nodes)
> - second, when we try to update a leaf, we mostly have to load it from
> disk, as we rarely have it in memory
> - third, a leaf contains from 8 to 16 entries, and every time we fetch a
> leaf from disk, we have to deserialize the Entries, which is extremely
> costly
> Fixing this third problem would save us a lot of time, and it's a matter
> of adding one level of indirection (the entries would be kept as byte[],
> and deserialized only when needed).
> If anyone has a better idea...
Indeed: the problem is that we serialize/deserialize too late. It would be
I think you mean 'too _early_'
way better if we process serialized values until the point we return the
result to the user :

- addition : we receive an object, we immediately serialize it into a
byte[] and process the addition using this byte[]. We no longer have
to deserialize all the values of the page to which we add the new value,
since they are all byte[]
- search : same thing, we don't deserialize the values until we return
them to the user.

The gain will be huge !
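
To make the idea concrete, here is a minimal sketch of the extra level of indirection described above: the value stays a byte[] inside the page and is deserialized only when it is handed back to the user. The names LazyValue, getRaw() and the Function-based deserializer are hypothetical and chosen for illustration only; this is not Mavibot's actual API.

```java
import java.nio.charset.StandardCharsets;
import java.util.function.Function;

/**
 * Hypothetical holder (not a Mavibot class): keeps the serialized form
 * and deserializes it lazily, on the first call to get().
 */
final class LazyValue<V> {
    private final byte[] raw;                        // serialized form, as read from disk
    private final Function<byte[], V> deserializer;  // assumed to be supplied by the caller
    private V value;                                 // cached deserialized value
    private boolean loaded;                          // true once deserialization has happened

    LazyValue(byte[] raw, Function<byte[], V> deserializer) {
        this.raw = raw;
        this.deserializer = deserializer;
    }

    /** Used by page-level operations (add, split, write): no deserialization. */
    byte[] getRaw() {
        return raw;
    }

    /** Used only when the value is returned to the user. */
    V get() {
        if (!loaded) {
            value = deserializer.apply(raw);
            loaded = true;
        }
        return value;
    }

    boolean isLoaded() {
        return loaded;
    }
}

final class LazyValueDemo {
    public static void main(String[] args) {
        byte[] onDisk = "cn=test,dc=example".getBytes(StandardCharsets.UTF_8);
        LazyValue<String> v =
            new LazyValue<String>(onDisk, b -> new String(b, StandardCharsets.UTF_8));

        // The page can be manipulated through getRaw() without paying for deserialization.
        System.out.println("loaded before get(): " + v.isLoaded());
        System.out.println("value: " + v.get());
        System.out.println("loaded after get(): " + v.isLoaded());
    }
}
```

With such a holder, loading a leaf from disk only copies byte[] slices; the deserialization cost is paid once per value, and only for the values a search actually returns.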

Emmanuel Lécharny

Kiran Ayyagari