From: Tai Tran
To: users@jackrabbit.apache.org
Date: Thu, 22 Apr 2010 21:48:32 +0700
Subject: The fastest way to dump JackRabbit data?

Hi,

I'm very new to Jackrabbit, and I'm facing a performance-critical task in my project: dumping the entire Jackrabbit repository into a CSV file.

We're using the Jackrabbit standalone server 1.6.0 with MySQL 5.x to store a huge hierarchy of network device data. Each device can have up to 100 attributes and several thousand child nodes nested to an arbitrary depth:

device[1]
  rack
    subrack
      port
      ...
    ...
  ...
device[2]
...
device[5000]
  ...

We need to dump the whole tree-structured repository into a flat CSV file in which each row holds the data of one node. The output CSV is as large as the source data, up to 3.6 million lines, in the following format:

rack, attr1, attr2, ...
rack, attr1, attr2, ...
...
subrack, attr1, attr2, ...
...

To minimize calls through the RMI access layer, we tried iterating over each device in the repository, using Session.exportSystemView() to dump its data into an XML file on disk, and then parsing that file to generate the CSV output. However, this is very slow: dumping the whole repository took more than 5 hours on a very fast server, while our target is to complete it within 15 minutes (almost insane)!
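For illustration, the XML-to-CSV step described above could be done in one streaming pass, with a SAX handler that writes one CSV row per sv:node as the system view events arrive, so no intermediate XML file is needed. This is only a rough sketch under assumptions: the class name SystemViewCsvHandler and the helper toCsv are made up for the example, the column layout (node name followed by the property values in export order, one column per value) is a guess at the wanted format, and it relies on the system view listing a node's properties before its child nodes. It has not been measured against the 15-minute target.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.StringWriter;
import java.io.Writer;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

/** Hypothetical helper: turns JCR system-view SAX events into CSV rows. */
class SystemViewCsvHandler extends DefaultHandler {
    // Namespace URI bound to the sv: prefix in JCR system view exports.
    private static final String SV = "http://www.jcp.org/jcr/sv/1.0";

    private final Writer out;
    private final StringBuilder text = new StringBuilder();
    private StringBuilder row;   // row of the innermost open node; null once written
    private boolean inValue;

    SystemViewCsvHandler(Writer out) {
        this.out = out;
    }

    @Override
    public void startElement(String uri, String local, String qName, Attributes atts)
            throws SAXException {
        if (!SV.equals(uri)) return;
        if ("node".equals(local)) {
            flushRow();          // the parent's properties are complete at this point
            row = new StringBuilder(escape(atts.getValue(SV, "name")));
        } else if ("value".equals(local)) {
            inValue = true;
            text.setLength(0);
        }
    }

    @Override
    public void characters(char[] ch, int start, int length) {
        if (inValue) text.append(ch, start, length);
    }

    @Override
    public void endElement(String uri, String local, String qName) throws SAXException {
        if (!SV.equals(uri)) return;
        if ("value".equals(local)) {
            inValue = false;
            if (row != null) row.append(',').append(escape(text.toString()));
        } else if ("node".equals(local)) {
            flushRow();          // leaf node: nothing has written its row yet
        }
    }

    // Writes the pending row, if any, followed by a newline.
    private void flushRow() throws SAXException {
        if (row == null) return;
        try {
            out.write(row.append('\n').toString());
        } catch (IOException e) {
            throw new SAXException(e);
        }
        row = null;
    }

    // Quotes a field when it contains a comma, a quote, or a line break.
    static String escape(String s) {
        if (s == null) return "";
        if (s.indexOf(',') < 0 && s.indexOf('"') < 0
                && s.indexOf('\n') < 0 && s.indexOf('\r') < 0) {
            return s;
        }
        return '"' + s.replace("\"", "\"\"") + '"';
    }

    // Demo helper: parses a system-view XML string and returns the CSV text.
    static String toCsv(String xml) {
        try {
            SAXParserFactory factory = SAXParserFactory.newInstance();
            factory.setNamespaceAware(true);
            StringWriter csv = new StringWriter();
            factory.newSAXParser().parse(new InputSource(new StringReader(xml)),
                                         new SystemViewCsvHandler(csv));
            return csv.toString();
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        String xml =
            "<sv:node xmlns:sv=\"http://www.jcp.org/jcr/sv/1.0\" sv:name=\"rack\">"
          + "<sv:property sv:name=\"label\" sv:type=\"String\">"
          + "<sv:value>R1, north</sv:value></sv:property>"
          + "<sv:node sv:name=\"subrack\">"
          + "<sv:property sv:name=\"slots\" sv:type=\"Long\">"
          + "<sv:value>8</sv:value></sv:property>"
          + "</sv:node></sv:node>";
        System.out.print(toCsv(xml));
    }
}
```

Since Session.exportSystemView(String absPath, ContentHandler contentHandler, boolean skipBinary, boolean noRecurse) accepts a SAX ContentHandler, a handler like this could in principle be passed to it directly in place of an output stream. Whether that reduces the time spent in the RMI layer is an open question.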

Now we're planning to change the Jackrabbit source code to add our own customized version of exportSystemView, in the hope of solving this performance issue. Any suggestions are really appreciated!

Thanks a lot,
Tai Tran