jackrabbit-users mailing list archives

From "Miro Walker" <miro.wal...@gmail.com>
Subject Re: Jackrabbit = Kick Ass Tool (was: Jackrabbit = Big Trouble??)
Date Tue, 07 Aug 2007 21:46:45 GMT
We've had the same issues in the past. In the end we developed some
scripts to edit the dbfilesystem directly, but it's not ideal.

See http://issues.apache.org/jira/browse/JCR-399 and
http://issues.apache.org/jira/browse/JCR-313 for more on this issue.
We tried to suggest allowing workspaces to inherit the config in the
repository.xml at runtime, but this was rejected. The alternative
suggested and implemented (at least partially) was to use JNDI for
connection lookups.  Might be worth a look. I'd be interested to see
if this meets your use case, as it sounds similar to what we need.

Cheers,

miro
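
A minimal Python sketch of the kind of script that can take the pain out of repointing a repository at another database. It assumes the connection string lives in `<param name="url" .../>` entries, as in the DbFileSystem configuration, and that the XML has already been read from wherever it is stored (repository.xml on disk, or a workspace.xml pulled out of the DB file system). The function name `set_jdbc_url`, the sample hosts and the trimmed-down sample config are hypothetical.

```python
import xml.etree.ElementTree as ET
from typing import Tuple

# Trimmed-down, illustrative config; a real repository.xml has more elements.
SAMPLE = """<Repository>
  <FileSystem class="org.apache.jackrabbit.core.fs.db.DbFileSystem">
    <param name="url" value="jdbc:mysql://dev-db/jcr"/>
    <param name="schemaObjectPrefix" value="rep_"/>
  </FileSystem>
  <Workspace name="default">
    <PersistenceManager>
      <param name="url" value="jdbc:mysql://dev-db/jcr"/>
    </PersistenceManager>
  </Workspace>
</Repository>"""


def set_jdbc_url(config_xml: str, new_url: str) -> Tuple[str, int]:
    """Return (updated XML, number of url params changed).

    Only params named "url" whose value starts with "jdbc:" are touched,
    so JNDI names or other URLs are left alone.
    Note: ElementTree drops the DOCTYPE declaration and comments, so
    re-add the DOCTYPE if your Jackrabbit version requires it.
    """
    root = ET.fromstring(config_xml)
    changed = 0
    for param in root.iter("param"):
        if param.get("name") == "url" and param.get("value", "").startswith("jdbc:"):
            param.set("value", new_url)
            changed += 1
    return ET.tostring(root, encoding="unicode"), changed


if __name__ == "__main__":
    updated, count = set_jdbc_url(SAMPLE, "jdbc:mysql://prod-db/jcr")
    print(count, "connection strings rewritten")
    print(updated)
```

Running the same function over repository.xml and every workspace.xml keeps all the connection strings in step, which is the part that is easy to get wrong by hand.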



On 8/7/07, Brian Thompson <elephantium@gmail.com> wrote:
> We did the same type of thing in our project: filesystem, workspace
> filesystem and persistence, and the versioning filesystem and persistence
> were all pointing to the same DB.
>
> We had three workspaces in our repository, and repository.xml files were
> generated based on the initial config in our jackrabbit.xml file.
>
> When we needed to change the DB that Jackrabbit was pointing to (to deploy
> the app on the production server, pointing at the production DB), it ended
> up being a lot of places that we had to adjust the connection string.  It
> would be nice to have an easier way to set these configuration options.
>
> -Brian
>
>
>
> On 8/7/07, Mark Waschkowski <mwaschkowski@gmail.com> wrote:
> >
> > OK, great, happy to help.
> >
> > Yes, a lot of people will want to do this, and I've tried unsuccessfully in
> > the past to get the details surrounding backup techniques, so when I got
> > back from vacation and saw your post, I didn't want to lose the info!
> >
> > Would you please confirm that I got the information correct, as I haven't
> > tried this yet? After you confirm, I will update my config, test it, and
> > then provide my new config as a blueprint and put it in the wiki.
> >
> > Best,
> >
> > Mark
> >
> > On 8/7/07, David Nuescheler <david@day.com> wrote:
> > >
> > > Hi Mark,
> > >
> > > > I think this is an excellent idea, thanks a lot for putting in the effort.
> > >
> > > I think the case that someone would like to store all their content
> > > within the same RDBMS is common enough that we even should
> > > have a blueprint example config in the documentation.
> > >
> > > thanks again,
> > > david
> > >
> > >
> > > On 8/7/07, Mark Waschkowski <mwaschkowski@gmail.com> wrote:
> > > > Hi David,
> > > >
> > > > > I would like to update the wiki with the information below, as I think
> > > > > it's quite valuable and would help new users without having to scour
> > > > > the mailing list. If you verify the following, I will update the wiki.
> > > >
> > > > -----For wiki:
> > > > Using DBFileSystem as specified in the repository.xml:
> > > > <Repository>
> > > >         <FileSystem ...>
> > > >
> > > > > and using the same database as the PersistenceManager entries, the
> > > > > only things that need to be backed up are:
> > > > 1) repository.xml
> > > > 2) the database
> > > >
> > > > > Then, to restore from a backup, all that would need to be done is to
> > > > > use the backed up repository.xml, restore the database using the
> > > > > backup, and the indexes will rebuild themselves when the system
> > > > > restarts. This will properly handle versioning as well.
> > > >
> > > > Note: rebuilding of indexes may take a significant amount of time
> > > > ----end
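
The precondition in the note above (every file system and persistence manager pointing at the same database) can be checked mechanically before trusting a DB-only backup. The following Python sketch is a heuristic under stated assumptions: it compares only the `url` params of `FileSystem` and `PersistenceManager` elements, and the function names and sample configs are hypothetical, trimmed-down illustrations rather than complete Jackrabbit configurations.

```python
import xml.etree.ElementTree as ET

STORAGE_TAGS = ("FileSystem", "PersistenceManager")

# Everything in one database (illustrative, not a complete repository.xml).
ALL_DB = """<Repository>
  <FileSystem class="org.apache.jackrabbit.core.fs.db.DbFileSystem">
    <param name="url" value="jdbc:mysql://db/jcr"/>
  </FileSystem>
  <Workspace name="default">
    <FileSystem><param name="url" value="jdbc:mysql://db/jcr"/></FileSystem>
    <PersistenceManager><param name="url" value="jdbc:mysql://db/jcr"/></PersistenceManager>
  </Workspace>
  <Versioning>
    <FileSystem><param name="url" value="jdbc:mysql://db/jcr"/></FileSystem>
    <PersistenceManager><param name="url" value="jdbc:mysql://db/jcr"/></PersistenceManager>
  </Versioning>
</Repository>"""

# Repository file system on local disk, persistence in the database.
MIXED = """<Repository>
  <FileSystem class="org.apache.jackrabbit.core.fs.local.LocalFileSystem">
    <param name="path" value="${rep.home}/repository"/>
  </FileSystem>
  <Workspace name="default">
    <PersistenceManager><param name="url" value="jdbc:mysql://db/jcr"/></PersistenceManager>
  </Workspace>
</Repository>"""


def storage_urls(config_xml):
    """Return [(tag, url_or_None)] for every storage element, in document order."""
    root = ET.fromstring(config_xml)
    result = []
    for elem in root.iter():
        if elem.tag in STORAGE_TAGS:
            url = None
            for param in elem.findall("param"):
                if param.get("name") == "url":
                    url = param.get("value")
            result.append((elem.tag, url))
    return result


def db_backup_is_sufficient(config_xml):
    """True if every storage element has a url and all the urls are identical."""
    urls = [url for _, url in storage_urls(config_xml)]
    return bool(urls) and None not in urls and len(set(urls)) == 1
```

An identical URL does not prove that the tables are covered by the same dump (schemas and prefixes can differ), so treat a positive result as a sanity check rather than a guarantee.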
> > > >
> > > > > If all that looks correct, I'll fill in an example FileSystem and
> > > > > update the wiki. As well, any suggestions for the 'significant amount
> > > > > of time part'?
> > > >
> > > > Thanks,
> > > >
> > > > Mark
> > > >
> > > > On 7/30/07, David Nuescheler <david@day.com> wrote:
> > > > >
> > > > > Hi Bruce,
> > > > >
> > > > > thanks for your comment.
> > > > >
> > > > > > > I am not fired by index problems. :-)
> > > > > > > I just want everybody to realize that backing up your repository
> > > > > > > is a very critical issue.
> > > > > > > Currently, the solution is:
> > > > > > > 1) Back up the DB data.
> > > > > > > 2) Back up your file system; you can delete all of its indexes.
> > > > > > > However, it is still a bug that Jackrabbit v1.3 cannot rebuild
> > > > > > > everything from the DB in case your hard drive dies along with
> > > > > > > your whole repository file system.
> > > > > > Shouldn't that be solved by the DBFileSystem?
> > > > >
> > > > > > http://yukatan.fi/2007/1.4/org/apache/jackrabbit/core/fs/db/DbFileSystem.html
> > > > >
> > > > >
> > > > > > This allows you to store everything that is necessary for a complete
> > > > > > restore in the DB, which means your DB backup is the only thing
> > > > > > (beyond the repository.xml) that you need to restore a complete JR
> > > > > > instance.
> > > > > >
> > > > > > > My concerns are two:
> > > > > > > 1) Performance of navigation of Nodes which relates cache manager
> > > > > > > resizing
> > > > > > I appreciate the performance issue. I am still not convinced that
> > > > > > this is related to the cache manager resizing...
> > > > >
> > > > > > > 2) Logical backup of the repository using the JCR export/import API.
> > > > > I agree that it would be desirable to have a built-in backup/restore
> > > > > mechanism on a higher level.
> > > > >
> > > > > The JCR export/import is probably not the right layer,
> > > > > since it only covers the content in a single workspace and has no
> > > > > means to address things like nodetypes, versions or the
> > > > > namespace registry.
> > > > > And I think your most pressing issue should be addressed
> > > > > by the DBFileSystem.
> > > > >
> > > > > regards,
> > > > > david
> > > > >
> > > > > > > -----Original Message-----
> > > > > > > From: bdelacretaz@gmail.com [mailto: bdelacretaz@gmail.com] On Behalf Of Bertrand Delacretaz
> > > > > > > Sent: Friday, July 27, 2007 3:15 AM
> > > > > > > To: users@jackrabbit.apache.org
> > > > > > > Subject: Jackrabbit = Kick Ass Tool (was: Jackrabbit = Big Trouble??)
> > > > > >
> > > > > > Hi,
> > > > > >
> > > > > > > I hate to play grumpy old man once again, but the recent trend
> > > > > > > towards Loud Subjects That Catch Peoples Attention does not really
> > > > > > > help the discussion, so let's rename this thread ;-)
> > > > > > >
> > > > > > > Bruce, if I read your message correctly, it looks like you have
> > > > > > > three problems with Jackrabbit:
> > > > > > >
> > > > > > > 1) Cache Manager resizes seem to slow your app down
> > > > > > > 2) You're going to be fired because you lost your index (or
> > > > > > >    Jackrabbit did)
> > > > > > > 3) You're not sure about which application pattern/content model
> > > > > > >    to use
> > > > > > >
> > > > > > > So let's please tackle these one at a time, ideally in separate
> > > > > > > threads so that people can contribute efficiently to the
> > > > > > > discussion.
> > > > > > >
> > > > > > > Sorry if I'm being a bit harsh, but IMHO you started it with the
> > > > > > > choice of your message's subject ;-)
> > > > > > > -Bertrand
> > > > > >
> > > > > >
> > > > > > On 7/27/07, Bruce Li < bli@tirawireless.com> wrote:
> > > > > > > > I have been in this Jackrabbit community for a couple of months,
> > > > > > > > since I joined a repository project two months ago.
> > > > > > >
> > > > > > >
> > > > > > >
> > > > > > > > First, I respect and appreciate all the hard work contributed to
> > > > > > > > the current Jackrabbit project, and I am sure a lot of developers
> > > > > > > > benefit from it. Some people contribute their Jackrabbit working
> > > > > > > > experience, like David Nuescheler, who collected the "7 DR Rules";
> > > > > > > > this is precious given the current lack of Jackrabbit
> > > > > > > > documentation, and these are "real" working experiences.
> > > > > > >
> > > > > > >
> > > > > > >
> > > > > > > > However, I have also heard some negative voices in this community,
> > > > > > > > like "JackRabbit is dead (for us)" from Frédéric Esnault. I have
> > > > > > > > had some trouble with Jackrabbit myself, and the problems seem
> > > > > > > > fundamental. I would like to share all my experience with you, and
> > > > > > > > any feedback or good suggestions are definitely what I want.
> > > > > > >
> > > > > > >
> > > > > > >
> > > > > > > > Since these troubles are "big" troubles for enterprise use of
> > > > > > > > Jackrabbit 1.3, let's discuss them from the beginning.
> > > > > > >
> > > > > > >
> > > > > > >
> > > > > > > Question 1:
> > > > > > >
> > > > > > > > Why do you select Jackrabbit rather than a database as your
> > > > > > > > repository solution?
> > > > > > >
> > > > > > >
> > > > > > >
> > > > > > > > There are a lot of answers to this question, and it seems that
> > > > > > > > everybody who joins this community already knows them (there may
> > > > > > > > even be a formal document approved by your CTO). However, in my
> > > > > > > > opinion, this is the basic question that really needs to be
> > > > > > > > discussed here.
> > > > > > >
> > > > > > >
> > > > > > >
> > > > > > > > To answer this question, some technical keywords in support of
> > > > > > > > Jackrabbit may be "JCR API", "Lucene Search Engine" and so on.
> > > > > > > > However, as a user of Jackrabbit, I would like to list the two key
> > > > > > > > reasons why I selected Jackrabbit as a repository solution from a
> > > > > > > > product point of view:
> > > > > > >
> > > > > > >
> > > > > > >
> > > > > > > 1.      Quick and effective data search/fetch from volume
> > content
> > > > > repository
> > > > > > > 2.      Build-in content version/revision control without
extra
> > > code
> > > > > > >
> > > > > > >
> > > > > > >
> > > > > > > Now let me describe the big troubles I met in my use:
> > > > > > >
> > > > > > > 1.      Quick and effective data search or fetch from volume
> > > content
> > > > > repository
> > > > > > >
> > > > > > >
> > > > > > >
> > > > > > > > Experience: There is not much data in my repository. It contains
> > > > > > > > hundreds of nodes of two major object types; each node (object)
> > > > > > > > has fewer than 20 properties (fields) plus 5 child nodes (small
> > > > > > > > nested objects), and one of the two major node types carries one
> > > > > > > > binary value (up to 1 megabyte). Unfortunately, the performance is
> > > > > > > > not acceptable when I navigate the nodes under the major nodes.
> > > > > > > > The main problem is that Jackrabbit's built-in Cache Manager
> > > > > > > > resizes itself, which takes an unpredictable amount of time and
> > > > > > > > sometimes makes the operation very slow. It is not easy to read
> > > > > > > > the code when debugging Jackrabbit for performance tuning, because
> > > > > > > > there is no documentation about the logic behind the index
> > > > > > > > resizing.
> > > > > > >
> > > > > > >
> > > > > > >
> > > > > > > 2.      Content version/revision control
> > > > > > >
> > > > > > > > Experience: This function works well in Jackrabbit v1.3. The main
> > > > > > > > problem is that all revisions of a node (except the base revision)
> > > > > > > > are lost when exporting/importing data from one repository to
> > > > > > > > another. I am raising this issue because it concerns repository
> > > > > > > > backup.
> > > > > > >
> > > > > > >
> > > > > > >
> > > > > > > > I just found that in Jackrabbit v1.3 there is no way to back up a
> > > > > > > > repository that uses a DB as persistence manager. I mean that
> > > > > > > > there is no way to re-index based on the data in the DB. The
> > > > > > > > following is my case:
> > > > > > >
> > > > > > >
> > > > > > >
> > > > > > > > On one repository server, the index (in the file system) is
> > > > > > > > corrupt, which causes all searches to fail. However, all the data
> > > > > > > > (in the DB) is still alive, and you can iterate over all of it.
> > > > > > > > After cleaning out the whole repository file system (most of it
> > > > > > > > is index information), Jackrabbit cannot correctly rebuild the
> > > > > > > > index based on the data in the DB. If that happens on a production
> > > > > > > > repository, it means: "My God, I am going to be fired". As far as
> > > > > > > > I know, Jackrabbit v1.1 can successfully re-index (creating a
> > > > > > > > totally new repository index (file system) based on the DB data).
> > > > > > >
> > > > > > >
> > > > > > >
> > > > > > > > As an alternative way to back up the repository, I tried to
> > > > > > > > export/import all nodes from one repository to another using the
> > > > > > > > JCR export API (exportSystemView). The good news is that
> > > > > > > > Jackrabbit v1.3 successfully builds the index (the whole file
> > > > > > > > system) during the import; the bad news is that it loses all
> > > > > > > > revisions of all versioned nodes. Can you imagine how frustrated
> > > > > > > > I was when I realized there is no way to back up a repository
> > > > > > > > based on the DB data?
> > > > > > >
> > > > > > >
> > > > > > >
> > > > > > > > I just got the answer to the re-index issue for Jackrabbit v1.3:
> > > > > > > > you CAN NOT delete the whole file system. Delete only the indexes
> > > > > > > > and keep the other folders. Jackrabbit can then re-index
> > > > > > > > successfully when it starts up.
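
The "delete only the indexes, keep everything else" rule can be scripted so that nothing else is removed by accident. This Python sketch assumes the search indexes live in directories literally named `index` under the repository home (for example `workspaces/<name>/index` and `repository/index`); the index path is configurable, so check your own configuration first. The function names are hypothetical, and the default is a dry run that only lists what would be deleted.

```python
import os
import shutil


def find_index_dirs(repo_home, index_name="index"):
    """Return every directory called `index_name` below repo_home, sorted."""
    matches = []
    for current, dirnames, _files in os.walk(repo_home):
        if index_name in dirnames:
            matches.append(os.path.join(current, index_name))
            # Do not descend into the index directory itself.
            dirnames.remove(index_name)
    return sorted(matches)


def remove_index_dirs(repo_home, dry_run=True):
    """List the index directories and, unless dry_run is set, delete them.

    All other files and folders under repo_home are left untouched.
    Run this only while the repository is shut down.
    """
    targets = find_index_dirs(repo_home)
    if not dry_run:
        for path in targets:
            shutil.rmtree(path)
    return targets
```

Call `remove_index_dirs(home)` first to review the list, then repeat with `dry_run=False`; the indexes should be rebuilt on the next startup, as described above.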
> > > > > > >
> > > > > > >
> > > > > > >
> > > > > > > Question 2:
> > > > > > >
> > > > > > > > How can developers correctly use Jackrabbit (JCR) as their
> > > > > > > > repository solution?
> > > > > > >
> > > > > > >
> > > > > > >
> > > > > > > > Jackrabbit experts may notice that I use "object" to describe a
> > > > > > > > node, and you may think that this is not the pattern in which you
> > > > > > > > use Jackrabbit. So the question becomes: "Which is the best
> > > > > > > > practice (pattern) for using Jackrabbit (JCR) as a repository
> > > > > > > > solution?"
> > > > > > >
> > > > > > >
> > > > > > >
> > > > > > > > In this community, I see a lot of developers use Jackrabbit by
> > > > > > > > fetching content by path. This means that they do not need to
> > > > > > > > treat a node as an object; instead, they put content into the
> > > > > > > > repository as assets, which can be easily and effectively
> > > > > > > > retrieved by a given path. This pattern exactly matches the truth
> > > > > > > > that "simplicity is best".
> > > > > > >
> > > > > > >
> > > > > > >
> > > > > > > > My use of Jackrabbit is driven by a business requirement: I need
> > > > > > > > to navigate most of the nodes and reference nodes, and check child
> > > > > > > > nodes and properties, to find the proper content according to a
> > > > > > > > couple of business rules. I would say that all the performance
> > > > > > > > issues arise from the node iteration process. What is more, I have
> > > > > > > > created generic classes using the Java reflection package for
> > > > > > > > bidirectional mapping between nodes and objects. To improve
> > > > > > > > performance, the mapping supports generic lazy loading of child
> > > > > > > > nodes. However, it seems none of this work solves the performance
> > > > > > > > problem, although it sounds pretty "professional". You may ask me:
> > > > > > > > if you have such a business requirement, why not go to a DB and
> > > > > > > > build the full relationships for your business model? J2EE
> > > > > > > > developers all know how powerful the Java-DB world is: mature ORM
> > > > > > > > tools (e.g. Hibernate), transaction management, batch data
> > > > > > > > fetching, performance tuning and so on. However, my question is:
> > > > > > > > "Is there any good pattern in current Jackrabbit to effectively
> > > > > > > > handle data fetching with weak relationships?"
> > > > > > >
> > > > > > >
> > > > > > >
> > > > > > > > Now it is time to tell the Jackrabbit developers and contributors
> > > > > > > > what I really want to say to the whole community:
> > > > > > >
> > > > > > >
> > > > > > >
> > > > > > > > My pleas:
> > > > > > > >
> > > > > > > > Guides, documentation and sample code are king for any open
> > > > > > > > source project. How frustrating it is for Jackrabbit developers to
> > > > > > > > find users applying the wrong pattern in their projects. On the
> > > > > > > > other hand, how frustrating it is for Jackrabbit users not to find
> > > > > > > > a good pattern to follow, which could save them a lot of time.
> > > > > > > > From a product point of view, whether to search by XPath, XQuery
> > > > > > > > or SQL is not the fundamental issue. The fundamental issue is to
> > > > > > > > have one effective means of search that covers most of the
> > > > > > > > important real-world requirements, with documentation that can be
> > > > > > > > found on the Jackrabbit web site.
> > > > > > >
> > > > > > >
> > > > > > >
> > > > > > >
> > > > > > >
> > > > > > > > I do believe Jackrabbit is a quality project, and I really hope
> > > > > > > > all its "best features" get documented, demoed and used by the
> > > > > > > > whole community.
> > > > > > >
> > > > > > >
> > > > > > >
> > > > > > > Thanks
> > > > > > >
> > > > > > >
> > > > > > >
> > > > > > > Bruce
> > > > > >
> > > > >
> > > >
> > > >
> > > >
> > > > --
> > > > Best,
> > > >
> > > > Mark Waschkowski
> > > >
> > >
> >
> >
> >
> > --
> > Best,
> >
> > Mark Waschkowski
> >
>
