From: "Marek Slama"
Reply-To: users@jackrabbit.apache.org
Subject: Re: Standalone JR takes long time to create indexes from Scratch
Date: Sat, 23 Nov 2013 10:33:56 +0100 (CET)
References: <57ECEA1D9C5CB14BA4B3A0E52D1D9CF78A13DE3F@ct1-mailbox-1-2.cybage.com>
Mime-Version: 1.0 (szn-mime-1.0.94)
Content-Type: multipart/alternative; boundary="=_449e95026ba9fa1c14206073=56d478a7-47e0-5124-9b0e-28fab7f2e4b4_="

--=_449e95026ba9fa1c14206073=56d478a7-47e0-5124-9b0e-28fab7f2e4b4_=
Content-Type: text/plain
Content-Transfer-Encoding: 7bit

It seems JR also indexes the content of the documents themselves, i.e. for full-text search. Do you need this?

For comparison, we have a repo of about 30 GB and reindexing takes about 2 hours.

If you really need to reindex the documents as well, you can investigate whether it is possible to improve I/O somehow. If the datastore is in a DB, you can try clustering the DB. If it is on the file system, then some RAID. (You can also try increasing the bundle cache, but I am not sure it will help, as I assume JR loads document content just once for indexing.)

Marek

"Hi All,

We have a client that is a UK-based legal firm. Their business requires managing a very large set of documents, and it keeps growing. We use Apache Jackrabbit to manage those documents.

As part of project maintenance and performance improvement, we recommended upgrading Jackrabbit from 2.2.7 to 2.6.0. The Jackrabbit upgrade was successful, but they still use the old indexes generated by version 2.2.7. The datastore size is approx. 700 GB.

The problem is that Jackrabbit takes 7-8 days to complete the indexing process, and we cannot use it until indexing has finished. We cannot afford to shut down production for that many days.

I have enabled debug-level logging in Jackrabbit and observed many occurrences of the following entries in the log file.

- DEBUG [WrapperSimpleAppMain] AbstractBundlePersistenceManager.java:765 Loading bundle e7abce77-578e-461a-8b9d-59ee4dfe5480
- DEBUG [WrapperSimpleAppMain] SessionState.java:213 Performing item.getPath()
- DEBUG [WrapperSimpleAppMain] SessionState.java:229 Performed item.getPath() in 63696us
- DEBUG [1562158685@qtp-875788435-2] SessionState.java:213 Performing node.getName()
- DEBUG [1562158685@qtp-875788435-2] SessionState.java:229 Performed node.getName() in 20114us
- DEBUG [1562158685@qtp-875788435-2] SessionState.java:213 Performing node.getProperty({internal}principalName)
- DEBUG [1562158685@qtp-875788435-2] SessionState.java:229 Performed node.getProperty({internal}principalName) in 39112us

Please advise: what are all these activities, and what is their purpose? What happens if we skip them, and how can we skip them?

Thanks and regards,
Nilay"

--=_449e95026ba9fa1c14206073=56d478a7-47e0-5124-9b0e-28fab7f2e4b4_=--