Message-ID: <76a6ebd00708070845k1c4f6c72l9b40d1f75f48882d@mail.gmail.com>
Date: Tue, 7 Aug 2007 11:45:47 -0400
From: "Mark Waschkowski"
To: users@jackrabbit.apache.org
Subject: Re: Jackrabbit = Kick Ass Tool (was: Jackrabbit = Big Trouble??)
In-Reply-To: <1b0d43d00707300214m341684c4m130737ce1c9f3430@mail.gmail.com>
MIME-Version: 1.0
Content-Type: multipart/alternative; boundary="----=_Part_123231_21944903.1186501547751"
References: <916A2A65AB16854B99689B6EC2C60A541B4B72@scooby2k3.corp.bspark.com> <1b0d43d00707300214m341684c4m130737ce1c9f3430@mail.gmail.com>

------=_Part_123231_21944903.1186501547751
Content-Type: text/plain; charset=ISO-8859-1
Content-Transfer-Encoding: quoted-printable
Content-Disposition: inline

Hi David,

I would like to update the wiki with the information below, as I think it's
quite valuable and would help new users without their having to scour the
mailing list. If you verify the following, I will update the wiki.

-----For wiki:

Using DBFileSystem as specified in the repository.xml:

and using the same database as the PersistenceManager entries, the
only things that need to be backed up are:
1) repository.xml
2) the database

Then, to restore from a backup, all that needs to be done is to use the
backed-up repository.xml and restore the database from the backup; the
indexes will rebuild themselves when the system restarts. This will properly
handle versioning as well.

Note: rebuilding the indexes may take a significant amount of time.

----end

If all that looks correct, I'll fill in an example FileSystem and update the
wiki.
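For reference, a FileSystem entry of the kind the wiki text refers to might look roughly like the sketch below. The class name is the one in the DbFileSystem Javadoc link quoted in this thread; the parameter names (driver, url, user, password, schema, schemaObjectPrefix) are assumed from that Javadoc, and every value (MySQL driver, JDBC URL, credentials, table prefix) is a placeholder, not something confirmed in this thread.

```xml
<!-- Sketch only: all values are placeholders, not taken from this thread. -->
<FileSystem class="org.apache.jackrabbit.core.fs.db.DbFileSystem">
  <!-- JDBC driver class and connection URL (MySQL assumed for illustration) -->
  <param name="driver" value="com.mysql.jdbc.Driver"/>
  <param name="url" value="jdbc:mysql://localhost:3306/jackrabbit"/>
  <param name="user" value="jcr_user"/>
  <param name="password" value="changeit"/>
  <!-- schema picks the DDL flavour; the prefix keeps these tables
       apart from the PersistenceManager's tables in the same database -->
  <param name="schema" value="mysql"/>
  <param name="schemaObjectPrefix" value="fs_"/>
</FileSystem>
```

If the PersistenceManager entries point at the same database, repository.xml plus one database dump should cover everything except the search indexes, which are rebuilt on startup.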
As well, any suggestions for the 'significant amount of time' part?

Thanks,

Mark

On 7/30/07, David Nuescheler wrote:
>
> Hi Bruce,
>
> thanks for your comment.
>
> > I am not fired by index problems. :-)
> > I just want everybody to realize that backing up your repository is a
> > very critical issue.
> > Currently, the solution is:
> > 1) Back up the DB data.
> > 2) Back up your file system; you can delete all the indexes from it.
> > However, it is still a bug that Jackrabbit v1.3 cannot rebuild
> > everything from the DB in case your hard drive dies along with your
> > whole repository file system.
> Shouldn't that be solved by the DBFileSystem?
> http://yukatan.fi/2007/1.4/org/apache/jackrabbit/core/fs/db/DbFileSystem.html
>
> This allows you to store everything that is necessary for a complete
> restore in the DB, which means your DB backup is the only thing (beyond
> the repository.xml) that you need to restore a complete JR instance.
>
> > My concerns are two:
> > 1) Performance of node navigation, which relates to cache manager
> > resizing
> I appreciate the performance issue. I am still not convinced that this
> is related to the cache manager resizing...
>
> > 2) Logical backup of the repository using the JCR export/import API.
> I agree that it would be desirable to have a built-in backup/restore
> mechanism on a higher level.
>
> The JCR export/import is probably not the right layer,
> since it only covers the content in a single workspace and has no
> means to address things like nodetypes, versions or the
> namespace registry.
> And I think your most pressing issue should be addressed
> by the DBFileSystem.
>
> regards,
> david
>
> > -----Original Message-----
> > From: bdelacretaz@gmail.com [mailto:bdelacretaz@gmail.com] On Behalf Of Bertrand Delacretaz
> > Sent: Friday, July 27, 2007 3:15 AM
> > To: users@jackrabbit.apache.org
> > Subject: Jackrabbit = Kick Ass Tool (was: Jackrabbit = Big Trouble??)
> >
> > Hi,
> >
> > I hate to play grumpy old man once again, but the recent trend towards
> > Loud Subjects That Catch People's Attention does not really help the
> > discussion, so let's rename this thread ;-)
> >
> > Bruce, if I read your message correctly, it looks like you have three
> > problems with Jackrabbit:
> >
> > 1) Cache Manager resizes seem to slow your app down
> > 2) You're going to be fired because you lost your index (or Jackrabbit did)
> > 3) You're not sure about which application pattern/content model to use
> >
> > So let's please tackle these one at a time, ideally in separate
> > threads so that people can contribute efficiently to the discussion.
> >
> > Sorry if I'm being a bit harsh, but IMHO you started it with the
> > choice of your message's subject ;-)
> >
> > -Bertrand
> >
> > On 7/27/07, Bruce Li <bli@tirawireless.com> wrote:
> > > I have been in this Jackrabbit community for a couple of months, since
> > > I joined a repository project two months ago.
> > >
> > > First, I respect and appreciate all the hard work contributed to the
> > > current Jackrabbit project, and I am sure a lot of developers benefit
> > > from this project. Some people contribute their Jackrabbit working
> > > experience, like David Nuescheler, who collected the "7 DR Rules",
> > > which are precious given the current lack of Jackrabbit documentation,
> > > and they are "real" working experiences.
> > >
> > > However, I also heard some negative voices from this community, like
> > > "JackRabbit is dead (for us)" from Frédéric Esnault. I have suffered
> > > some troubles with Jackrabbit, and they seem to be foundational
> > > problems. I would like to share all my experience with you, and any
> > > feedback or good suggestion is definitely what I want.
> > >
> > > Since these troubles are "big" troubles for enterprise use of
> > > Jackrabbit 1.3, let's discuss them from the beginning.
> > >
> > > Question 1:
> > >
> > > Why do you select Jackrabbit rather than a database as your repository
> > > solution?
> > >
> > > There are a lot of answers to this question, and it seems that
> > > everybody who joins this community already knows the answers (it may
> > > be a formal document which was approved by your CTO). However, in my
> > > opinion, this is the basic question that really needs to be discussed
> > > here.
> > >
> > > To answer this question, some technical keywords in support of
> > > Jackrabbit may be "JCR API", "Lucene Search Engine" and so on. However,
> > > as a user of Jackrabbit, I would like to list the two key reasons why I
> > > selected Jackrabbit as a repository solution from a product point of
> > > view:
> > >
> > > 1. Quick and effective data search/fetch from a high-volume content
> > >    repository
> > > 2. Built-in content version/revision control without extra code
> > >
> > > Now let me describe the big troubles I met in my use:
> > >
> > > 1. Quick and effective data search or fetch from a high-volume content
> > >    repository
> > >
> > > Experience: There is not much data in my repository, which contains
> > > hundreds of nodes of two major object types; each node (object)
> > > contains fewer than 20 properties (fields), including 5 other child
> > > nodes (nested small objects), and one of the two major node types has
> > > one binary data item (up to 1 megabyte). Unfortunately, the performance
> > > is not acceptable when I navigate the nodes of the major nodes. The
> > > main problem is that the built-in Cache Manager of Jackrabbit resizes,
> > > which takes an unpredictable amount of time and sometimes makes the
> > > operation very slow. It is not easy to read that code when debugging
> > > Jackrabbit for performance tuning, because there is no documentation
> > > about the logic behind the index resizing.
> > >
> > > 2. Content version/revision control
> > >
> > > Experience: This function works well on Jackrabbit v1.3.
> > > The main problem is that all revisions of a node (except the base
> > > revision) are lost when exporting/importing data from one repository
> > > to another. I am discussing this issue because it concerns repository
> > > backup.
> > >
> > > I just found that in Jackrabbit v1.3 there is no way to back up a
> > > repository that uses a DB as persistence manager. I mean that there is
> > > no way to re-index based on the data in the DB. The following is my
> > > case:
> > >
> > > In one repository server, the index (in the file system) is corrupt,
> > > which causes all searches to fail. However, all the data (in the DB) is
> > > still alive, and you can iterate over all of it. After cleaning the
> > > whole repository file system (most of it is index information),
> > > Jackrabbit cannot correctly rebuild the index based on the data in the
> > > DB. If it happens on a production repository, it means: "My God, I am
> > > going to be fired". As far as I know, Jackrabbit v1.1 can successfully
> > > re-index (creating a totally new repository index (file system) based
> > > on DB data).
> > >
> > > As an alternative way to back up the repository, I tried to
> > > export/import all nodes from one repository to another using the JCR
> > > Export API (exportSystemView). The good news is that Jackrabbit v1.3
> > > successfully builds the index (the whole file system) during the import
> > > process; the bad news is that it loses all revisions of all versioned
> > > nodes. Can you imagine how frustrated I was when I realized there is no
> > > way to back up a repository based on DB data?
> > >
> > > I just got the answer to the re-index issue for Jackrabbit v1.3: you
> > > CAN NOT delete the whole file system. Delete only the indexes but keep
> > > the other folders. Jackrabbit can then re-index successfully when it
> > > starts up.
> > >
> > > Question 2:
> > >
> > > How can developers correctly use Jackrabbit (JCR) as their repository
> > > solution?
> > >
> > > Jackrabbit experts may see that I use objects to describe nodes, and
> > > you may think that is not the pattern in which you use Jackrabbit. So
> > > the question becomes: "Which is the best practice (pattern) for using
> > > Jackrabbit (JCR) as a repository solution?"
> > >
> > > From this community, I see that a lot of developers use Jackrabbit by
> > > fetching content by path. It means that they do not need to treat a
> > > node as an object; instead, they put content into the repository as
> > > assets, which can be easily and effectively retrieved by a given path.
> > > This pattern exactly fits the truth that "simplicity is best".
> > >
> > > My use of Jackrabbit is based on a business requirement, which needs to
> > > navigate most of the nodes and reference nodes, and check child nodes
> > > and properties, to find the proper content according to a couple of
> > > business rules. I would like to say that all the performance issues are
> > > raised by the node iteration process. Moreover, I have created generic
> > > classes using the Java reflection package for bidirectional mapping
> > > between nodes and objects. For performance improvement, the mapping
> > > supports generic lazy loading of child nodes. However, it seems all
> > > this work does not solve the performance problem, although it sounds
> > > pretty "professional". You may ask me: if you have such a business
> > > requirement, why not go to a DB and build the full relationships for
> > > your business model? J2EE developers all know how powerful the Java-DB
> > > world is: mature ORM tools (e.g. Hibernate), transaction management,
> > > batch data fetching, performance tuning and so on. However, my question
> > > is: "Is there any good pattern in current Jackrabbit to effectively
> > > handle data fetching with weak relationships?"
> > >
> > > Now it is time to say to the Jackrabbit developers and contributors
> > > what I really want to say to the whole community:
> > >
> > > My pleas:
> > >
> > > Guides, documentation and sample code are king for any open source
> > > project. How frustrating it is for Jackrabbit developers to find that
> > > an incorrect pattern is applied by users in their projects. On the
> > > other hand, how frustrating it is for Jackrabbit users not to find a
> > > good pattern to follow, which could save them a lot of time. From a
> > > product point of view, whether search is by XPath, XQuery or SQL is not
> > > the foundational issue. The foundational issue is to have one effective
> > > means of search that covers most of the important real-world
> > > requirements, with documentation that can be found on the Jackrabbit
> > > web site.
> > >
> > > I do believe Jackrabbit is a quality project, and I really hope all its
> > > "best features" get documented, demoed and used by the whole community.
> > >
> > > Thanks
> > >
> > > Bruce
> > >

-- 
Best,

Mark Waschkowski

------=_Part_123231_21944903.1186501547751--