jackrabbit-oak-dev mailing list archives

From Thomas Mueller <muel...@adobe.com>
Subject Re: Large flat commit problems
Date Tue, 26 Feb 2013 10:04:12 GMT
Hi,

Large transactions: I don't think we defined this as a strict requirement.
I'm not aware that we got into big trouble with Jackrabbit 2.x, where this
is not supported. For me, this is still a nice-to-have, but of course it's
something we should test and try to achieve (and resolve any problems we
find).

Flat hierarchies: Yes, this is important (we ran into this problem many
times).

I didn't analyze the results, but could the problem be orderable child
nodes? Currently, oak-core stores a property ":childOrder". If there are
many child nodes, this property gets larger and larger. This is a problem
because it consumes more and more disk space / network bandwidth / CPU,
on the order of n^2. It's the same problem as storing the list of
children in the node bundle. So I guess this needs to be solved in
oak-core (not in each MK separately)?

Regards,
Thomas

>I combined these two goals into a simple benchmark
>that tries to import the contents of a Wikipedia dump into an Oak
>repository using just a single save() call.
>
>Here are some initial numbers using the fairly small Faroese
>Wikipedia, with just some 12k pages.
>
>The default H2 MK starts to slow down after 5k transient nodes and
>fails after 6k:
>
>$ java -DOAK-652=true -jar oak-run/target/oak-run-0.7-SNAPSHOT.jar \
>      benchmark --wikipedia=fowiki-20130213-pages-articles.xml \
>      WikipediaImport Oak-Default
>Apache Jackrabbit Oak 0.7-SNAPSHOT
>Wikipedia import (fowiki-20130213-pages-articles.xml)
>Oak-Default: importing Wikipedia...
>Imported 1000 pages in 1 seconds (1271us/page)
>Imported 2000 pages in 2 seconds (1465us/page)
>Imported 3000 pages in 4 seconds (1475us/page)
>Imported 4000 pages in 6 seconds (1749us/page)
>Imported 5000 pages in 11 seconds (2219us/page)
>Imported 6000 pages in 28 seconds (4815us/page)
>Exception in thread "main" java.lang.OutOfMemoryError: Java heap space
>
>The new MongoMK prototype fails even sooner:
>
>$ java -DOAK-652=true -jar oak-run/target/oak-run-0.7-SNAPSHOT.jar \
>      benchmark --wikipedia=fowiki-20130213-pages-articles.xml \
>      WikipediaImport Oak-Mongo
>Apache Jackrabbit Oak 0.7-SNAPSHOT
>Wikipedia import (fowiki-20130213-pages-articles.xml)
>Oak-Mongo: importing Wikipedia...
>Imported 1000 pages in 1 seconds (1949us/page)
>Imported 2000 pages in 6 seconds (3260us/page)
>Imported 3000 pages in 13 seconds (4523us/page)
>Imported 4000 pages in 30 seconds (7613us/page)
>Exception in thread "main" java.lang.OutOfMemoryError: Java heap space
>
>After my recent work on OAK-632, the SegmentMK does better, but it also
>experiences some slowdown over time:
>
>$ java -DOAK-652=true -jar oak-run/target/oak-run-0.7-SNAPSHOT.jar \
>      benchmark --wikipedia=fowiki-20130213-pages-articles.xml \
>      WikipediaImport Oak-Segment
>Apache Jackrabbit Oak 0.7-SNAPSHOT
>Wikipedia import (fowiki-20130213-pages-articles.xml)
>Oak-Segment: importing Wikipedia...
>Imported 1000 pages in 1 seconds (1419us/page)
>Imported 2000 pages in 2 seconds (1447us/page)
>Imported 3000 pages in 4 seconds (1492us/page)
>Imported 4000 pages in 6 seconds (1586us/page)
>Imported 5000 pages in 8 seconds (1697us/page)
>Imported 6000 pages in 10 seconds (1812us/page)
>Imported 7000 pages in 13 seconds (1927us/page)
>Imported 8000 pages in 16 seconds (2042us/page)
>Imported 9000 pages in 19 seconds (2146us/page)
>Imported 10000 pages in 22 seconds (2254us/page)
>Imported 11000 pages in 25 seconds (2355us/page)
>Imported 12000 pages in 29 seconds (2462us/page)
>Imported 12148 pages in 41 seconds (3375us/page)
>
>To summarize, all MKs still need some work on this. Once these initial
>problems are solved, we can try the same benchmark with larger
>Wikipedias.
>
>PS. Note that I'm using the OAK-652 feature flag to speed things up at
>the oak-jcr level.
>
>BR,
>
>Jukka Zitting

