jackrabbit-oak-dev mailing list archives

From Michael Dürig <mdue...@apache.org>
Subject Re: Oak benchmarks (Was: [jr3] Index on randomly distributed data)
Date Fri, 09 Mar 2012 11:11:14 GMT

On 8.3.12 14:17, Jukka Zitting wrote:
> So what should we benchmark then? Here's one idea to get us started:
> * Large, flat hierarchy (selected pages-articles dump from Wikipedia)
>    * Time it takes to load all articles (ideally a single transaction)
>    * Amount of disk space used
>    * Time it takes to iterate over all articles
>    * Number of reads by X clients in Y seconds (power-law distribution)
>    * Number of writes by X clients in Y seconds (power-law distribution)
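The power-law read/write mix proposed above could be driven by a simple Zipf sampler over article ranks. A minimal, self-contained sketch (the class name, seed, and exponent are illustrative choices, not part of any proposed harness):

```java
import java.util.Random;

/** Samples article ranks from a Zipf (power-law) distribution,
 *  so a few "hot" articles receive most of the simulated traffic. */
public class ZipfSampler {
    private final double[] cdf;                 // cumulative distribution over ranks
    private final Random random = new Random(42);

    public ZipfSampler(int n, double exponent) {
        cdf = new double[n];
        double sum = 0;
        for (int i = 0; i < n; i++) {
            sum += 1.0 / Math.pow(i + 1, exponent);  // unnormalized Zipf weight
            cdf[i] = sum;
        }
        for (int i = 0; i < n; i++) cdf[i] /= sum;   // normalize to [0, 1]
    }

    /** Returns a rank in [0, n); low ranks are drawn most often. */
    public int next() {
        double u = random.nextDouble();
        int lo = 0, hi = cdf.length - 1;
        while (lo < hi) {                       // first index with cdf >= u
            int mid = (lo + hi) / 2;
            if (cdf[mid] < u) lo = mid + 1; else hi = mid;
        }
        return lo;
    }

    public static void main(String[] args) {
        ZipfSampler sampler = new ZipfSampler(1000, 1.0);
        int[] hits = new int[1000];
        for (int i = 0; i < 100_000; i++) hits[sampler.next()]++;
        System.out.println("rank 0 hits:   " + hits[0]);
        System.out.println("rank 999 hits: " + hits[999]);
    }
}
```

Each simulated client would map the sampled rank to an article path and issue the read or write against it.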

Ack. In addition we should add tests which check that large numbers of 
direct child nodes (millions) work. That is, adding a child node should 
take constant time irrespective of how many child nodes already exist. 
This use case seems to be quite important to us.
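The constant-time property could be checked with a batch-timing harness along these lines. This is only a sketch: the Map-backed parent is a stand-in so the code runs on its own; a real benchmark would call Node.addNode() and Session.save() against an Oak or Jackrabbit repository instead, and flag any upward trend across batches.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Consumer;

/**
 * Sketch of a flat-hierarchy benchmark: add child nodes in batches and
 * report the wall-clock time per batch. If child addition is constant
 * time, the per-batch timings should stay roughly flat as the child
 * count grows into the millions.
 */
public class FlatHierarchyBenchmark {

    /** Runs `batches` batches of `batchSize` adds; returns ms per batch. */
    public static long[] run(int batches, int batchSize, Consumer<String> addChild) {
        long[] millis = new long[batches];
        for (int b = 0; b < batches; b++) {
            long start = System.nanoTime();
            for (int i = 0; i < batchSize; i++) {
                addChild.accept("child-" + b + "-" + i);  // unique child name
            }
            millis[b] = (System.nanoTime() - start) / 1_000_000;
        }
        return millis;
    }

    public static void main(String[] args) {
        // Stand-in for a JCR parent node; replace the lambda with
        // parentNode.addNode(name) + periodic session.save() in practice.
        Map<String, Object> parent = new HashMap<>();
        long[] millis = run(10, 100_000, name -> parent.put(name, Boolean.TRUE));
        for (int b = 0; b < millis.length; b++) {
            System.out.println("batch " + b + ": " + millis[b] + " ms");
        }
    }
}
```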


> Ideally we'd design the benchmarks so that they can be run against not
> just different configurations of Oak, but also Jackrabbit 2.x and
> other databases (SQL and NoSQL) like Oracle, PostgreSQL, CouchDB and
> MongoDB.
> To start with, I'd target the following basic deployment configurations:
> * 1 node, MB-range test sets (small embedded or development/testing deployment)
> * 4 nodes, GB-range test sets (mid-size non-cloud deployment)
> * 16 nodes, TB-range test sets (low-end cloud deployment)

Sounds like a good idea to me. Having such deployment configurations and 
testing infrastructure ready from the beginning should help a lot during 
further development.


> BR,
> Jukka Zitting
