directory-dev mailing list archives

From Alex Karasulu <akaras...@apache.org>
Subject Re: Mavibot vs JDBM results
Date Mon, 11 Nov 2013 15:51:12 GMT
Hi Emmanuel,

On Mon, Nov 11, 2013 at 4:09 PM, Emmanuel Lecharny <elecharny@apache.org> wrote:

> On Mon, Nov 11, 2013 at 2:31 PM, Alex Karasulu <akarasulu@apache.org>
> wrote:
> >
> > On Mon, Nov 11, 2013 at 11:01 AM, Emmanuel Lécharny <elecharny@gmail.com>
> > wrote:
> >>
> >> Hi,
> >>
> >> now that the Mavibot partition is working, here's a quick bench I ran
> >> this morning:
> >>
> >> Addition of 10 000 entries, then 400 000 random searches :
> >>
> >> JDBM
> >> 10000 : Add = 74/s, Search : 4 729/s
> >>
> >> Mavibot
> >> 10000 : Add = 120/s, Search : 11 056/s
> >>
> >> As we can see, Mavibot is 2.1x faster for additions, and 2.33x faster
> >> for searches... Will run a test with 100 000 entries.
> >
> >
> > Is this benchmark involving the entire network ADS stack?
>
> Yes, except the network layer (which is irrelevant in this test). Also
> note that they are run on my poor laptop.
>
>
That's really cool. If I remember correctly you have not bought a new Mac
in a while. Now you have to keep this machine as the reference
configuration for all historic metrics comparisons :).


> The last results I get are the following  :
>
>
> JDBM (1Gb)
> 1000 : Add = 56/s, Search = 14 845/s
> Mavibot (1Gb)
> 1000 : Add = 111/s, Search = 17 586/s
>
> JDBM (1Gb)
> 10000 : Add = 57/s, Search = 4 729/s
> Mavibot (1Gb)
> 10000 : Add = 120/s, Search = 11 056/s
>
> JDBM (2Gb)
> 50000 : Add = 51/s, Search = 3 515/s
> Mavibot (2Gb)
> 50000 : Add = 134/s, Search = 10 335/s
>
>
Impressive! These are by far the best numbers we've seen in all of ADS
history.


>
> Note that if we hit the disk (i.e., the cache and memory are not big
> enough), then performance drops immediately:
>
>
> JDBM (2Gb)
> 100000 : Add = 44/s, Search = 2 957/s
> Mavibot (2Gb)
> 100000 : Add = 100/s, Search = 3 308/s
>
>
> This is even more visible for Mavibot than for JDBM, most certainly
> due to the cache we are using in Mavibot (Ehcache), which is probably
> overkill compared to the very basic but efficient LRU used by JDBM.
>
>
Yeah, JDBM's cache was uber simple. Perhaps a similar KISS cache may be
right for Mavibot, though maybe one tunable to various common access
scenarios, or even one that is adaptive.
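
For what it's worth, the kind of KISS LRU cache I have in mind can be
sketched in a few lines on top of LinkedHashMap's access-order mode; this
is only an illustration, not the actual JDBM or Mavibot cache code:

```java
import java.util.LinkedHashMap;
import java.util.Map;

// A minimal LRU cache in the spirit of JDBM's simple approach
// (an illustrative sketch; class and field names are mine).
public class SimpleLruCache<K, V> extends LinkedHashMap<K, V> {
    private final int maxEntries;

    public SimpleLruCache(int maxEntries) {
        // accessOrder = true: iteration order becomes least-recently-used first
        super(16, 0.75f, true);
        this.maxEntries = maxEntries;
    }

    @Override
    protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
        // Evict the least recently used entry once capacity is exceeded
        return size() > maxEntries;
    }
}
```

A `get()` counts as an access, so frequently-read pages stay cached while
cold pages fall off the tail.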


>
> Suffice it to say that given enough memory, Mavibot in its current
> state (i.e. we are still using locks all over, as revisions are not yet
> implemented) is already more than 2x faster for additions and 3x
> faster for searches...
>
> This is not the end of the story though. There are many possible
> optimizations in Mavibot :
> - first of all, remove the locks that block concurrent access in
> searches (but that requires the handling of revisions in Mavibot,
> which is just a matter of implementing the free page collection)
> - second, we are doing way too many findPos calls (probably twice as
> many as needed). We can get rid of this.
>
>
Looking forward to seeing those stats when these changes take place. I'd
love to see us at least come close to the C-based servers out there.
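
On the duplicate findPos point: one common way to avoid a second lookup is
a single binary search that encodes the insertion point in its return
value, following the java.util.Arrays.binarySearch convention. A sketch
(illustrative only, not the actual Mavibot findPos):

```java
// One binary search answers both "is the key present?" and "where would
// it be inserted?", so callers never need a second findPos.
public class FindPos {
    /**
     * Returns the index of key if found, otherwise -(insertionPoint) - 1,
     * as java.util.Arrays.binarySearch does.
     */
    public static int findPos(long[] keys, long key) {
        int lo = 0;
        int hi = keys.length - 1;
        while (lo <= hi) {
            int mid = (lo + hi) >>> 1; // unsigned shift avoids int overflow
            if (keys[mid] < key) {
                lo = mid + 1;
            } else if (keys[mid] > key) {
                hi = mid - 1;
            } else {
                return mid; // exact match
            }
        }
        return -(lo + 1); // not found: negative value encodes insertion point
    }
}
```

A negative result r gives the insertion point as -(r + 1), so an add can
go straight to the right slot without searching again.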


>
> Now, this is an approach where we use plain keys (i.e., keys can have
> various sizes), which is not really efficient, as we may have to
> allocate more pages than necessary to store nodes and leaves.
> OpenLDAP uses another approach, which is smarter: they use the hash
> value of each key to retrieve the element. Obviously, this requires
> comparing the keys when we reach the leaf, as we may have more than one
> key with the same hash value, and it also destroys the ordering (one
> can't compare two hash values, as their ordering will be different), but
> in most cases it's not really a big deal.
> The main advantage of such an approach is that suddenly, nodes have a
> fixed size (a hash can be stored as an int, and the references to a
> page are longs), so in a fixed page size, we can store a fixed number
> of elements. Assuming that a node needs at least 28 bytes to store its
> header and PageIO, in a 512-byte page we can store (512 - 28) /
> ((nbValues + 1) x (8 + 8) + nbKeys x 4) elements, so 16 keys (64 bytes)
> and 17 values (272 bytes). We have 148 bytes remaining in this case.
> Atm, we store 16 elements per node, which requires many physical pages,
> i.e., many disk accesses.
>
> This is something worth investigating in the near future.
>
>
Sounds like we need a minimal perfect order preserving hash function.
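
The node-sizing arithmetic above is easy to double-check with a small
sketch. The constant names below (PAGE_SIZE, HEADER_SIZE, etc.) are mine,
introduced for illustration; only the numbers come from the mail:

```java
// Verifies the fixed-size node arithmetic: 16 int hashes (4 bytes each)
// plus 17 child references (two longs each) in a 512-byte page with a
// 28-byte header leave 148 bytes free. Illustrative, not Mavibot code.
public class NodeCapacity {
    static final int PAGE_SIZE = 512;    // one physical page
    static final int HEADER_SIZE = 28;   // node header + PageIO overhead
    static final int KEY_SIZE = 4;       // a key hash stored as an int
    static final int REF_SIZE = 8 + 8;   // a child page reference: two longs

    /** Bytes left in a page after storing nbKeys keys and nbKeys+1 refs. */
    public static int bytesRemaining(int nbKeys) {
        int used = nbKeys * KEY_SIZE + (nbKeys + 1) * REF_SIZE;
        return PAGE_SIZE - HEADER_SIZE - used;
    }
}
```

With nbKeys = 16 this yields 484 - (64 + 272) = 148, matching the figures
in the mail.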

-- 
Best Regards,
-- Alex
