jackrabbit-oak-dev mailing list archives

From Jukka Zitting <jukka.zitt...@gmail.com>
Subject Large flat commit problems
Date Mon, 25 Feb 2013 15:24:04 GMT
Hi,

Two of our goals for Oak are support for large transactions and for
flat hierarchies. I combined these two goals into a simple benchmark
that tries to import the contents of a Wikipedia dump into an Oak
repository using just a single save() call.
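The interesting part of such a benchmark is that nothing is persisted until the very end: the importer streams the dump, builds one transient child node per page, and calls save() exactly once. A minimal self-contained sketch of that pattern, using a stdlib StAX pull parser on an inline stand-in for the dump file (the JCR calls are shown only as comments, and all node/property names here are hypothetical, not the actual benchmark code):

```java
import java.io.StringReader;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamReader;

public class WikipediaImportSketch {
    public static void main(String[] args) throws Exception {
        // Inline stand-in for a real dump file such as
        // fowiki-20130213-pages-articles.xml
        String dump =
            "<mediawiki>"
          + "<page><title>Page A</title><revision><text>aaa</text></revision></page>"
          + "<page><title>Page B</title><revision><text>bbb</text></revision></page>"
          + "</mediawiki>";

        XMLStreamReader reader = XMLInputFactory.newInstance()
                .createXMLStreamReader(new StringReader(dump));

        int pages = 0;
        while (reader.hasNext()) {
            if (reader.next() == XMLStreamConstants.START_ELEMENT
                    && "page".equals(reader.getLocalName())) {
                pages++;
                // In a JCR-based importer something like the following
                // would happen here (hypothetical names):
                //   Node page = importRoot.addNode("page-" + pages);
                //   page.setProperty("title", ...);
            }
        }
        // Only after the whole dump has been parsed:
        //   session.save();   // the single large commit under test
        System.out.println("Parsed " + pages + " pages");
    }
}
```

Because every page ends up as a child of one import root, this exercises both the large-transaction and the flat-hierarchy goal at once.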

Here are some initial numbers using the fairly small Faroese
Wikipedia, which has only about 12k pages.

The default H2 MK starts to slow down after 5k transient nodes and
fails after 6k:

$ java -DOAK-652=true -jar oak-run/target/oak-run-0.7-SNAPSHOT.jar \
      benchmark --wikipedia=fowiki-20130213-pages-articles.xml \
      WikipediaImport Oak-Default
Apache Jackrabbit Oak 0.7-SNAPSHOT
Wikipedia import (fowiki-20130213-pages-articles.xml)
Oak-Default: importing Wikipedia...
Imported 1000 pages in 1 seconds (1271us/page)
Imported 2000 pages in 2 seconds (1465us/page)
Imported 3000 pages in 4 seconds (1475us/page)
Imported 4000 pages in 6 seconds (1749us/page)
Imported 5000 pages in 11 seconds (2219us/page)
Imported 6000 pages in 28 seconds (4815us/page)
Exception in thread "main" java.lang.OutOfMemoryError: Java heap space
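The cumulative figures above mask how nonlinear the slowdown is; differencing them gives the cost of each successive batch of 1000 pages. A quick sketch, with the numbers copied from the Oak-Default run above:

```java
public class SlowdownDelta {
    public static void main(String[] args) {
        // Cumulative elapsed seconds at each 1000-page mark,
        // taken from the Oak-Default import log above.
        int[] cumulative = {1, 2, 4, 6, 11, 28};
        StringBuilder out = new StringBuilder();
        int previous = 0;
        for (int seconds : cumulative) {
            // Seconds spent on this batch of 1000 pages alone
            out.append(seconds - previous).append(' ');
            previous = seconds;
        }
        System.out.println(out.toString().trim());
        // Prints: 1 1 2 2 5 17
    }
}
```

So the last 1000 pages before the OutOfMemoryError cost 17 seconds, versus 1 second for the first 1000: the per-batch cost grows much faster than linearly as the transient space fills up.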

The new MongoMK prototype fails even sooner:

$ java -DOAK-652=true -jar oak-run/target/oak-run-0.7-SNAPSHOT.jar \
      benchmark --wikipedia=fowiki-20130213-pages-articles.xml \
      WikipediaImport Oak-Mongo
Apache Jackrabbit Oak 0.7-SNAPSHOT
Wikipedia import (fowiki-20130213-pages-articles.xml)
Oak-Mongo: importing Wikipedia...
Imported 1000 pages in 1 seconds (1949us/page)
Imported 2000 pages in 6 seconds (3260us/page)
Imported 3000 pages in 13 seconds (4523us/page)
Imported 4000 pages in 30 seconds (7613us/page)
Exception in thread "main" java.lang.OutOfMemoryError: Java heap space

After my recent work on OAK-632, the SegmentMK does better, but it
too slows down over time:

$ java -DOAK-652=true -jar oak-run/target/oak-run-0.7-SNAPSHOT.jar \
      benchmark --wikipedia=fowiki-20130213-pages-articles.xml \
      WikipediaImport Oak-Segment
Apache Jackrabbit Oak 0.7-SNAPSHOT
Wikipedia import (fowiki-20130213-pages-articles.xml)
Oak-Segment: importing Wikipedia...
Imported 1000 pages in 1 seconds (1419us/page)
Imported 2000 pages in 2 seconds (1447us/page)
Imported 3000 pages in 4 seconds (1492us/page)
Imported 4000 pages in 6 seconds (1586us/page)
Imported 5000 pages in 8 seconds (1697us/page)
Imported 6000 pages in 10 seconds (1812us/page)
Imported 7000 pages in 13 seconds (1927us/page)
Imported 8000 pages in 16 seconds (2042us/page)
Imported 9000 pages in 19 seconds (2146us/page)
Imported 10000 pages in 22 seconds (2254us/page)
Imported 11000 pages in 25 seconds (2355us/page)
Imported 12000 pages in 29 seconds (2462us/page)
Imported 12148 pages in 41 seconds (3375us/page)

To summarize, all MKs still need some work on this. Once these initial
problems are solved, we can try the same benchmark with larger
Wikipedias.

PS. Note that I'm using the OAK-652 feature flag to speed things up
at the oak-jcr level.

BR,

Jukka Zitting
