Mailing-List: contact users-help@jackrabbit.apache.org; run by ezmlm
Precedence: bulk
Reply-To: users@jackrabbit.apache.org
Received-SPF: pass (athena.apache.org: domain of mwaschkowski@gmail.com
 designates 64.233.166.178 as permitted sender)
DomainKey-Signature: a=rsa-sha1; c=nofws;
        d=gmail.com; s=beta;
        h=received:message-id:date:from:to:subject:in-reply-to:mime-version:content-type:references;
        b=Q6IdZ5dVUwHKdFRdovl9qdgNzmFQP9l5Jtz0TK72B1IFFWQax9qbQH7+PzZdG3jRKX4XSTS+CjiB5ZsivX2H+3/41WTO7xVF6aVgo+0XWoz4ZF5j/I9yGnEirngbDk9Wr9KFTCw+985bM40xfrwCqt0iEyX/wzeJ0Hw7TGQ8880=
Message-ID: <76a6ebd00708070845k1c4f6c72l9b40d1f75f48882d@mail.gmail.com>
Date: Tue, 7 Aug 2007 11:45:47 -0400
From: "Mark Waschkowski" <mwaschkowski@gmail.com>
To: users@jackrabbit.apache.org
Subject: Re: Jackrabbit = Kick Ass Tool (was: Jackrabbit = Big Trouble??)
In-Reply-To: <1b0d43d00707300214m341684c4m130737ce1c9f3430@mail.gmail.com>
MIME-Version: 1.0
Content-Type: multipart/alternative;
	boundary="----=_Part_123231_21944903.1186501547751"
References: <f767f0600707270014x3cd206bfye28d42afb86ec3ed@mail.gmail.com>
	 <916A2A65AB16854B99689B6EC2C60A541B4B72@scooby2k3.corp.bspark.com>
	 <1b0d43d00707300214m341684c4m130737ce1c9f3430@mail.gmail.com>

------=_Part_123231_21944903.1186501547751
Content-Type: text/plain; charset=ISO-8859-1
Content-Transfer-Encoding: quoted-printable
Content-Disposition: inline

Hi David,

I would like to update the wiki with the information below, as I think it's
quite valuable and would spare new users from having to scour the mailing
list. If you can verify the following, I will update the wiki.

-----For wiki:
Using DBFileSystem as specified in the repository.xml:
<Repository>
        <FileSystem ...>

and using the same database as the PersistenceManager entries, the only
things that need to be backed up are:
1) repository.xml
2) the database

Then, to restore from a backup, all that needs to be done is to use the
backed-up repository.xml and restore the database from the backup; the
indexes will rebuild themselves when the system restarts. This will properly
handle versioning as well.

Note: rebuilding of indexes may take a significant amount of time
----end

If all that looks correct, I'll fill in an example FileSystem and update the
wiki. Also, any suggestions for the 'significant amount of time' part?
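
For illustration, an example FileSystem entry for the wiki text above might
look like the sketch below. The class name and the parameter names are taken
from the DbFileSystem Javadoc; the driver, URL, user, password and table
prefix are placeholder assumptions for a MySQL setup and would need to be
replaced with the same connection settings that the PersistenceManager
entries use.

```xml
<Repository>
    <!-- Sketch only: stores the repository-level file system in the database.
         All values below are placeholders, not a tested configuration. -->
    <FileSystem class="org.apache.jackrabbit.core.fs.db.DbFileSystem">
        <param name="driver" value="com.mysql.jdbc.Driver"/>
        <param name="url" value="jdbc:mysql:///test?autoReconnect=true"/>
        <param name="user" value="jcr_user"/>             <!-- hypothetical -->
        <param name="password" value="jcr_password"/>     <!-- hypothetical -->
        <param name="schema" value="mysql"/>
        <param name="schemaObjectPrefix" value="rep_"/>   <!-- table name prefix -->
    </FileSystem>
    <!-- remaining repository.xml sections omitted -->
</Repository>
```

The Workspace and Versioning sections of repository.xml carry their own
FileSystem elements, so presumably those need the same treatment (each with
its own schemaObjectPrefix) for the database backup to be complete. That part
is my assumption and should be verified.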

Thanks,

Mark

On 7/30/07, David Nuescheler <david@day.com> wrote:
>
> Hi Bruce,
>
> thanks for your comment.
>
> > I am not fired by index problems. -)
> > I just want everybody to realize it is a very critical issue to back up
> > your repository.
> > Currently, the solution is:
> > 1) Backup DB data.
> > 2) Backup your file system and you can delete all indexes of them.
> > However, it is still a bug that Jackrabbit v1.3 cannot rebuild
> > everything from the DB in case your hard drive dies with all your
> > repository file system.
> Shouldn't that be solved by the DBFileSystem?
> http://yukatan.fi/2007/1.4/org/apache/jackrabbit/core/fs/db/DbFileSystem.html
>
>
> This allows you to store everything that is necessary for a complete
> restore
> in the DB, which means your DB backup is the only thing (beyond the
> repository.xml) that you need to restore a complete JR instance.
>
> > My concerns are two:
> > 1) Performance of navigation of Nodes which relates to cache manager
> > resizing
> I appreciate the performance issue. I am still not convinced that this
> is related to the cache manager resizing...
>
> > 2) Logic backup repository using JCR export/import API.
> I agree that it would be desirable to have a built-in backup/restore
> mechanism on a higher level.
>
> The JCR export/import is probably not the right layer,
> since it only covers the content in a single workspace and has no
> means to address things like nodetypes, versions or the
> namespace registry.
> And I think your most pressing issue should be addressed
> by the DBFileSystem.
>
> regards,
> david
>
> > -----Original Message-----
> > From: bdelacretaz@gmail.com [mailto:bdelacretaz@gmail.com] On Behalf Of
> > Bertrand Delacretaz
> > Sent: Friday, July 27, 2007 3:15 AM
> > To: users@jackrabbit.apache.org
> > Subject: Jackrabbit = Kick Ass Tool (was: Jackrabbit = Big Trouble??)
> >
> > Hi,
> >
> > I hate to play grumpy old man once again, but the recent trend towards
> > Loud Subjects That Catch Peoples Attention does not really help the
> > discussion, so let's rename this thread ;-)
> >
> > Bruce, if I read your message correctly, it looks like you have three
> > problems with Jackrabbit:
> >
> > 1) Cache Manager resizes seem to slow your app down
> > 2) You're going to be fired because you lost your index (or Jackrabbit
> > did)
> > 3) You're not sure about which application pattern/content model to use
> >
> > So let's please tackle these one at a time, ideally in separate
> > threads so that people can contribute efficiently to the discussion.
> >
> > Sorry if I'm being a bit harsh, but IMHO you started it with the
> > choice of your message's subject ;-)
> > -Bertrand
> >
> >
> > On 7/27/07, Bruce Li < bli@tirawireless.com> wrote:
> > > I have been in this Jackrabbit community for a couple of months, since
> > > I joined a repository project two months ago.
> > >
> > >
> > >
> > > First, I respect and appreciate all the hard work contributed to the
> > > current Jackrabbit project, and I am sure a lot of developers benefit
> > > from this project. Some people contribute their Jackrabbit working
> > > experience, like David Nuescheler, who collected the "7 DR Rules", which
> > > are precious given the current lack of Jackrabbit documentation, and
> > > they are "real" working experiences.
> > >
> > >
> > >
> > > However, I have also heard some negative voices in this community, like
> > > "JackRabbit is dead (for us)" from Frédéric Esnault. I have suffered some
> > > troubles with Jackrabbit, and they seem to be fundamental problems. I
> > > would like to share all my experience with you, and any feedback or
> > > good suggestion is definitely what I want.
> > >
> > >
> > >
> > > Since these troubles are "big" troubles for enterprise use of
> > > Jackrabbit 1.3, let's discuss them from the beginning.
> > >
> > >
> > >
> > > Question 1:
> > >
> > > Why do you select Jackrabbit rather than a database as your repository
> > > solution?
> > >
> > >
> > >
> > > There are a lot of answers to this question, and it seems that
> > > everybody who joins this community already knows the answers (it may
> > > be a formal document that was approved by your CTO). However, in my
> > > opinion, this is the basic question that really needs to be discussed
> > > here.
> > >
> > >
> > >
> > > To answer this question, some technical keywords in support of
> > > Jackrabbit may be "JCR API", "Lucene Search Engine" and so on. However,
> > > as a user of Jackrabbit, I would like to list the two key concerns
> > > that led me to select Jackrabbit as a repository solution from a
> > > product point of view:
> > >
> > >
> > >
> > > 1.      Quick and effective data search/fetch from a high-volume
> > >         content repository
> > > 2.      Built-in content version/revision control without extra code
> > >
> > >
> > >
> > > Now let me describe the big troubles I met in my use:
> > >
> > > 1.      Quick and effective data search or fetch from a high-volume
> > >         content repository
> > >
> > >
> > >
> > > Experience: There is not much data in my repository, which contains
> > > hundreds of nodes of two major object types. Each node (object)
> > > contains fewer than 20 properties (fields), including 5 child nodes
> > > (nested small objects), and one of the two major node types (objects)
> > > has one binary property (up to 1 megabyte). Unfortunately, the
> > > performance is not acceptable when I navigate the nodes of the major
> > > nodes. The main problem is that the built-in Cache Manager of
> > > Jackrabbit resizes, which takes an unpredictable amount of time and
> > > sometimes makes the operation very slow. It is not easy to read that
> > > code when debugging Jackrabbit for performance tuning, because there is
> > > no documentation about the logic behind the index resizing.
> > >
> > >
> > >
> > > 2.      Content version/revision control
> > >
> > > Experience: This function works well on Jackrabbit v1.3. The main
> > > problem is that all revisions of a node (except the base revision) are
> > > lost when exporting/importing data from one repository to another. I am
> > > discussing this issue because it concerns repository backup.
> > >
> > >
> > >
> > > I just found that in Jackrabbit v1.3 there is no way to back up a
> > > repository that uses a DB as persistence manager. I mean that there is
> > > no way to re-index based on the data in the DB. The following is my
> > > case:
> > >
> > >
> > >
> > > On one repository server, the index (in the file system) is corrupt,
> > > which causes all searches to fail. However, all data (in the DB) is
> > > still alive, and you can iterate over all of it. After cleaning the
> > > whole repository file system (most of it is index information),
> > > Jackrabbit cannot correctly rebuild the index based on the data in the
> > > DB. If it happens on a production repository, it means: "My God, I am
> > > going to be fired". As far as I know, Jackrabbit v1.1 can successfully
> > > re-index (creating a totally new repository index (file system) based
> > > on the DB data).
> > >
> > >
> > >
> > > As an alternative solution for backing up the repository, I tried to
> > > export/import all nodes from one repository to another using the JCR
> > > export API (exportSystemView). The good news is that Jackrabbit v1.3
> > > successfully builds the index (the whole file system) during the
> > > import process; the bad news is that it loses all revisions of all
> > > versioned nodes. Can you imagine how frustrated I was when I realized
> > > there is no way to back up a repository based on the DB data?
> > >
> > >
> > >
> > > I just got the answer to the re-index issue for Jackrabbit v1.3: you
> > > CANNOT delete the whole file system. Delete only the indexes but keep
> > > the other folders. Jackrabbit can then re-index successfully when it
> > > starts up.
> > >
> > >
> > >
> > > Question 2:
> > >
> > > How can developers correctly use Jackrabbit (JCR) as their repository
> > > solution?
> > >
> > >
> > >
> > > Jackrabbit experts may see that I use objects to describe nodes, and
> > > you may think that is not the pattern in which you use Jackrabbit. So
> > > the question becomes: "What are the best practices (patterns) for
> > > using Jackrabbit (JCR) as a repository solution?"
> > >
> > >
> > >
> > > In this community, I see a lot of developers use Jackrabbit by
> > > fetching content by path. It means that they do not need to treat a
> > > node as an object; instead, they put content in the repository as an
> > > asset, which can be easily and effectively retrieved by a given path.
> > > This pattern exactly fits the truth that "simplicity is best".
> > >
> > >
> > >
> > > My use of Jackrabbit is based on a business requirement, which needs
> > > to navigate most of the nodes and reference nodes, and check child
> > > nodes and properties, to find the proper content according to a couple
> > > of business rules. I would like to say that all the performance issues
> > > arise from the node iteration process. Furthermore, I have created
> > > generic classes using the Java reflection package for bidirectional
> > > mapping between nodes and objects. For performance improvement, the
> > > mapping supports generic lazy loading of child nodes. However, it
> > > seems all this work does not solve the performance problem, although
> > > it sounds pretty "professional". You may ask me: if you have such a
> > > business requirement, why not go to a DB and build the full
> > > relationships for your business model? J2EE developers all know how
> > > powerful the Java-DB world is: mature ORM tools (e.g. Hibernate),
> > > transaction management, batch data fetching, performance tuning and so
> > > on. However, my question is: "Is there any good pattern in current
> > > Jackrabbit to effectively handle data fetching with weak
> > > relationships?"
> > >
> > >
> > >
> > > Now it is time to say to the Jackrabbit developers and contributors
> > > what I really want to say to the whole community:
> > >
> > >
> > >
> > > My requests:
> > >
> > > Guides, documentation and sample code are king for any open source
> > > project. How frustrating it is for Jackrabbit developers to find that
> > > an incorrect pattern has been applied by users in their projects. On
> > > the other hand, how frustrating it is for Jackrabbit users not to find
> > > a good pattern to follow, which could save them a bunch of time. From
> > > a product point of view, search by XPath or XQuery or SQL is not the
> > > fundamental issue. The fundamental issue is to have one effective
> > > means of search that covers most of the important real-world
> > > requirements, with documentation that can be found on the Jackrabbit
> > > web site.
> > >
> > >
> > >
> > >
> > >
> > > I do believe Jackrabbit is a quality project, and I really hope all
> > > its "best features" are documented, demoed and used by the whole
> > > community.
> > >
> > >
> > >
> > > Thanks
> > >
> > >
> > >
> > > Bruce
> >
>


-- 
Best,

Mark Waschkowski

------=_Part_123231_21944903.1186501547751--