Thanks, Alex, for the prompt reply! Your answers have really cleared up a few things. A few more questions inline:

Alexander Klimetschek <aklimets@day.com> wrote on 11/06/2009 02:40:38 PM:

> On Fri, Nov 6, 2009 at 08:39, Medha C Sutaria <msutaria@csc.com> wrote:
> > We are using Jackrabbit (version 1.4.1) with liferay (version 5.2.2). It
> > uses the following configuration -
> > JCRHook + PersistenceManager + File system
> >...
> > JCRHook vs FileSystemHook?
> > PersistenceManager vs Datastore?
> > FileSystem vs Database?
> > if filesystem, sharing the file system? or using SAN?
>
> You need to be more specific. Which persistence manager are you using?


Medha - we use BundleFsPersistenceManager.

>
> Quick notes:
> - bundle based persistence managers are best
> - local dbs (like derby or h2) have better performance than remote dbs


Medha - Any thoughts on MySQL? I saw some posts about table-locking and concurrent-access issues when retrieving/updating files.

> - datastore will only be used for large binaries; using filedatastore
> is a better choice than storing the binaries in a database (using a db
> pm)
> - FileSystem (element in repository.xml) is not important anymore,
> does not influence performance

Is this the element you are talking about? If so, isn't this what decides whether we store data in a DB or on LocalFileSystem?
<FileSystem class="org.apache.jackrabbit.core.fs.local.LocalFileSystem">
        <param name="path" value="${rep.home}/repository" />
</FileSystem>

> - JCRHook seems to be a proprietary liferay component, so we
> (jackrabbit devs) cannot give you any information on this
> - if you do clustering and use the datastore, you will need a shared
> file system, SAN is typically the best (but know your network
> performance)

Will adding this element to our repository.xml make the repository usable with a SAN? (Our repository.xml is attached.)
<DataStore class="org.apache.jackrabbit.core.data.FileDataStore">
        <param name="path" value="${rep.home}/repository/datastore"/>
        <param name="minRecordLength" value="1000"/>
</DataStore>
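
To make the question concrete: since the datastore has to live on a shared file system when clustering, I assume the path would need to point at the SAN mount instead of a directory under ${rep.home}. A rough sketch of what I mean; /mnt/san/jackrabbit is a made-up placeholder for the mount point, not our real path:

<DataStore class="org.apache.jackrabbit.core.data.FileDataStore">
        <!-- hypothetical shared mount; every cluster node would use the same path -->
        <param name="path" value="/mnt/san/jackrabbit/datastore"/>
        <!-- binaries smaller than this many bytes are stored inline, not in the datastore -->
        <param name="minRecordLength" value="1000"/>
</DataStore>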

>
> > 1. To select a different solution, migration of current documents to the
> > new solutions has to be done
>
> See here http://wiki.apache.org/jackrabbit/BackupAndMigration for
> some options.

I looked at these options in the past and learned that migrating versions is a problem: that part of the repository tree is secured and cannot be exported to an XML file. We needed to migrate data when we tried to use DbFileSystem instead of LocalFileSystem. I guess we don't need to migrate when changing other parts of the configuration, e.g. when adding a datastore?

>
> > 2. The repository is increasing exponentially. The file system size is
> > already 10 GB. Does jackrabbit support such large repositories? At what
> > point will the performance start degrading?
>
> Depends on what configuration you actually use.

Can you suggest the best configuration for clustering, in terms of performance and support for large repositories?
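
For instance, is something along these lines what each node's repository.xml would need? This is only my rough understanding of the Cluster element; the node id and the /mnt/san/jackrabbit/journal directory are placeholders I made up:

<Cluster id="node1">
        <Journal class="org.apache.jackrabbit.core.journal.FileJournal">
                <!-- local file where this node records the last journal revision it has seen -->
                <param name="revision" value="${rep.home}/revision.log"/>
                <!-- journal directory on the shared file system, the same for all nodes -->
                <param name="directory" value="/mnt/san/jackrabbit/journal"/>
        </Journal>
</Cluster>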

>
> > 3. What will it take to upgrade the jackrabbit version from 1.4 to 1.6?
> > Will it require any migration?
>
> AFAIK nothing would be required from 1.4 to 1.6. Minor version numbers
> are meant to be backwards compatible in Jackrabbit.

This sounds great!
>
> Regards,
> Alex
>
> --
> Alexander Klimetschek
> alexander.klimetschek@day.com