Subject: Re: Using Jackrabbit/JCR as IDE workspace data backend
From: Stefan Guggisberg
Date: Mon, 26 Sep 2011 20:42:01 +0200
To: "users@jackrabbit.apache.org"

On 26.09.2011, at 19:56, Marcel Bruch wrote:

> Hi Stefan,
>
> On 26.09.2011, at 18:13, Stefan Guggisberg wrote:
>
>>> I wrote a fairly ad-hoc dump of the 5900 data files into Jackrabbit.
>>> Storing ~240 MB took roughly 3 minutes. Is this the expected time such
>>> an operation takes? Is it possible to improve the performance somehow?
>>
>> the performance seems rather poor. it's hard to tell what's wrong
>> without having the test data. i noticed that you're storing the
>> content of the .json files as string properties. why aren't you
>> storing the json data as nodes & properties?
>
> I had no code available for serializing the data as JCR nodes. Is there any simple snippet available somewhere?
> However, I thought as a first baseline this would work.
>
>
>> anyway, i quickly ran an adapted ad hoc test on my machine
>> (macbook pro 2.66 ghz, standard harddisk). the test imports
>> an 'svn export' of jackrabbit/trunk.
>>
>> importing ~6500 files takes ~30s which is IMO decent.
>
> Thanks for writing your test against your local files!
>
> I ran your code and compared the execution times. Unfortunately, it's not performing faster :(
> The minute delta might be caused by some file traversing differences or by the additional nodes/properties created in your code.
>
> However, the overall performance is still a bit low (2:24-3:05 minutes in a clean repository). Any idea how the performance could be improved? Am I doing something conceptually wrong?

did you run my test with the same test data (local svn export of jackrabbit trunk)?

cheers
stefan

> I'm assuming that there is no big delta between creating hundreds of nodes and properties compared to dumping a file's content into Jackrabbit. Is this correct?
>
> Thanks,
> Marcel
>
> === Experiments performance results ===
>
>
> Jackrabbit First Hops code adapted:
>
> 0:00:08.522: 500 units persisted. data 17 MB
> 0:00:17.057: 1000 units persisted. data 33 MB
> 0:00:31.763: 1500 units persisted. data 53 MB
> 0:00:41.404: 2000 units persisted. data 72 MB
> 0:00:53.140: 2500 units persisted. data 97 MB
> 0:01:02.988: 3000 units persisted. data 113 MB
> 0:01:16.314: 3500 units persisted. data 133 MB
> 0:01:35.171: 4000 units persisted. data 143 MB
> 0:01:49.414: 4500 units persisted. data 173 MB
> 0:02:04.617: 5000 units persisted. data 204 MB
> 0:02:12.593: 5500 units persisted. data 221 MB
> Mon Sep 26 19:54:58 CEST 2011: 5927 units persisted
> Run took 0:02:24.505
>
>
> Mailing List proposal:
>
> 0:00:14.853: 500 units persisted. data 17 MB
> 0:00:26.353: 1000 units persisted. data 33 MB
> 0:00:36.114: 1500 units persisted. data 53 MB
> 0:00:53.274: 2000 units persisted. data 72 MB
> 0:01:06.643: 2500 units persisted. data 97 MB
> 0:01:18.230: 3000 units persisted. data 113 MB
> 0:01:36.765: 3500 units persisted. data 133 MB
> 0:01:44.245: 4000 units persisted. data 143 MB
> 0:02:04.026: 4500 units persisted. data 173 MB
> 0:02:37.533: 5000 units persisted. data 204 MB
> 0:02:48.089: 5500 units persisted. data 221 MB
> Run took 0:03:08.458
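
To make the two runs in the quoted logs easier to compare, the progress lines can be turned into cumulative throughput figures. The following is only a rough sketch: the helper names `parse_progress` and `throughput` are made up for illustration, and the line format is assumed to be exactly the one shown in the logs (elapsed time, unit count, megabytes).

```python
import re

# Matches progress lines such as "0:00:08.522: 500 units persisted. data 17 MB".
# The format is an assumption taken from the quoted logs.
PROGRESS = re.compile(
    r"(\d+):(\d{2}):(\d{2}\.\d+): (\d+) units persisted\. data (\d+) MB"
)


def parse_progress(line):
    """Return (elapsed_seconds, units, megabytes), or None if the line does not match."""
    match = PROGRESS.search(line)
    if match is None:
        return None
    hours, minutes, seconds, units, megabytes = match.groups()
    elapsed = int(hours) * 3600 + int(minutes) * 60 + float(seconds)
    return elapsed, int(units), int(megabytes)


def throughput(line):
    """Return cumulative (units_per_second, megabytes_per_second) for one progress line."""
    parsed = parse_progress(line)
    if parsed is None:
        return None
    elapsed, units, megabytes = parsed
    return units / elapsed, megabytes / elapsed


if __name__ == "__main__":
    # Last progress line of each run, copied from the logs.
    runs = [
        ("First Hops code adapted", "0:02:12.593: 5500 units persisted. data 221 MB"),
        ("Mailing List proposal", "0:02:48.089: 5500 units persisted. data 221 MB"),
    ]
    for label, line in runs:
        units_per_s, mb_per_s = throughput(line)
        print(f"{label}: {units_per_s:.1f} units/s, {mb_per_s:.2f} MB/s")
```

Applied to the last progress line of each run, this works out to roughly 1.7 MB/s for the adapted First Hops code and roughly 1.3 MB/s for the mailing list proposal, both measured at the 5500-unit mark.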