From: "Seidel. Robert"
To: "users@jackrabbit.apache.org"
Date: Wed, 17 Nov 2010 18:42:55 +0100
Subject: AW: Multiple instances of repository

Hi Nikhil,

why don't you just use one 24x7 server JVM hitting the Jackrabbit repository? If one of the hundred JVMs wants something from the repository, it has to make a web service call to your server instance, which gets the job done.

Kind regards, Robert

-----Original Message-----
From: nikhil.agrawal@emeter.com [mailto:nikhil.agrawal@emeter.com]
Sent: Wednesday, November 17, 2010 18:06
To: users@jackrabbit.apache.org
Subject: RE: Multiple instances of repository

I am really thankful for all the suggestions.
I am not an expert in architecting applications, and the answers are really helping me a lot.
Justin, as you suggested, I think there is a need to change the architecture.
Let's say I restructure my application, call it app1, so that it is a 24x7 type of application.
It will wait for a job, and some scheduler (Quartz, maybe) will provide it a job instance to run.
Now this application 'app1' can be run on two different machines (in a clustered environment), and in that case these two Jackrabbit repository instances should be configured as a cluster, right?
But I will also have a web application that will also hit the repository instance. Right now it just reads content from the repository, but in future it might write into the repository as well. This web application can also be run on machine 1 and machine 2.
So now on machine 1, I will have one web application and one other 24x7 application, and they both will be hitting the Jackrabbit repository.
So I will have to run a cluster configuration on machine 1, because I will have two independent JVMs hitting the same repository?
I really don't want to run cluster nodes on a single machine just so that different JVMs can access the repository. That doesn't look correct. I am sure there will be better ways to solve this issue as well.
Any ideas will be of great help.

-Nikhil

-----Original Message-----
From: justinedelson@gmail.com [mailto:justinedelson@gmail.com] On Behalf Of Justin Edelson
Sent: Wednesday, November 17, 2010 12:12 AM
To: users@jackrabbit.apache.org
Subject: Re: Multiple instances of repository

Nikhil-
I think you should rethink your architecture. It really doesn't make sense to be bringing repository instances up only for a 2-4 minute job. Instead, you should think about using the Command pattern and package your "applications" as executable jobs which can be run inside a long-running VM against a local repository instance (i.e. making in-process calls instead of RMI or DavEx). This is where something like OSGi and Apache Sling can be *very* helpful, but there are obviously other ways to add/remove jobs at runtime. See, for example, Sling's Scheduler support:
http://sling.apache.org/site/scheduler-service-commons-scheduler.html

Justin

On Tue, Nov 16, 2010 at 5:16 AM, wrote:
> Thanks for your inputs, they are really helpful.
>
> Well, so is my application not a good candidate for Jackrabbit?
>
> The other option I had was to use Jackrabbit in client-server mode. In this case I will be accessing the repository over RMI.
> But in the Jackrabbit documentation it is mentioned that RMI is not optimized for performance and that I should use an embedded repository instance in my application code for better performance.
>
> I can remove the search functionality from these clusters, because their life span will be very short. The application will take 2-4 minutes to do its job and I don't think we really need search for these clusters.
>
> But my question is, should I really use the clustering feature? I mean, cluster nodes should normally have a longer life span. But in this case the nodes will have a very short life span of 2-4 minutes.
> I am finding it hard to see these short-lived applications as cluster nodes.
>
> Thanks,
> Nikhil
>
> -----Original Message-----
> From: Seidel. Robert [mailto:Robert.Seidel@aeb.de]
> Sent: Tuesday, November 16, 2010 3:33 PM
> To: users@jackrabbit.apache.org
> Subject: AW: Multiple instances of repository
>
> Hi Nikhil,
>
> I don't know if it will work (setProperty), but you have another problem. The Lucene search index is always saved in the file system. And afaik, each repository home needs its own index directories (so you have the index files for each cluster node). If you add a new cluster node, you have to wait a long time until the index is built, depending on the data in your repository (if you have tons of data, you have to wait a week or longer).
>
> The tables of the FS and PM will be shared between all cluster nodes - that works.
>
> Kind regards, Robert
>
> -----Original Message-----
> From: nikhil.agrawal@emeter.com [mailto:nikhil.agrawal@emeter.com]
> Sent: Tuesday, November 16, 2010 10:54
> To: users@jackrabbit.apache.org
> Subject: RE: Multiple instances of repository
>
> Since there could be n instances, I can't decide the cluster id beforehand.
> Hence I have the following code that creates a cluster id at run time.
>
> System.setProperty("org.apache.jackrabbit.core.cluster.node_id", "cluster_id"+System.nanoTime());
>
> Similarly the repositoryHome path is generated at run time.
>
> But do I also need separate tables for the workspace file system? I have the following configuration for my workspace. Is it correct? Will the tables for the workspace FS and PersistenceManager be shared between all the nodes, or will these tables be different?
>
> PUBLIC "-//The Apache Software Foundation//DTD Jackrabbit 2.0//EN"
> "http://jackrabbit.apache.org/dtd/repository-2.0.dtd">
> [remaining repository.xml markup not preserved in the archive]
>
> Thanks,
> Nikhil
>
> -----Original Message-----
> From: Seidel. Robert [mailto:Robert.Seidel@aeb.de]
> Sent: Tuesday, November 16, 2010 2:42 PM
> To: users@jackrabbit.apache.org
> Subject: AW: Multiple instances of repository
>
> Hi Nikhil,
>
> you need clustering, because all of your instances should access the same repository.
>
> What you need is a separate repository home for each instance. In my use case I have an installation directory for each instance, so the repository home is located below this directory.
>
> You have to make sure that each instance also has its own repository.xml, because you need to define different cluster IDs.
>
> And you have to define a cluster section in the repository.xml where the journal is located, which is necessary for synchronization:
>
> [cluster section markup not preserved in the archive]
> ...
>
> Kind regards, Robert
>
> -----Original Message-----
> From: nikhil.agrawal@emeter.com [mailto:nikhil.agrawal@emeter.com]
> Sent: Tuesday, November 16, 2010 09:37
> To: users@jackrabbit.apache.org
> Subject: RE: Multiple instances of repository
>
> Thanks for replying. I will need a little more help to understand things completely.
> I will just elaborate a bit more on my usage scenario.
> I am also attaching my repository.xml file with this mail. Please let me know if you want to know more about my environment.
>
> In my case, I want to keep all the data in one database and I want to use Jackrabbit as the JCR layer over this database.
> I have Jackrabbit embedded in my application, so the repository starts up as part of the application.
> Now this application reads some files from the repository and also inserts some data into the repository.
> There could be two instances of the application: app1 running on machine1 and app2 running on machine2.
> So my application instances are different and I can create multiple repository homes to avoid the locking problem, but I still want to insert the data from these applications into the same database tables.
> So suppose all the application instances use the same repository configuration file and each specifies its own repository home.
> Will that work in my case? Will there be any consistency issues?
>
> When you say separate data stores and separate persistence managers, do you mean a separate repository configuration file, or separate database tables for data stores and persistence managers?
>
> My instances and the repositories operate separately from each other, but they still want to share the data. The data inserted by one application instance should be visible to the other instances. So they all should be inserting the data into the same tables; that's my understanding.
>
> Thanks,
> Nikhil
>
> -----Original Message-----
> From: Seidel. Robert [mailto:Robert.Seidel@aeb.de]
> Sent: Tuesday, November 16, 2010 1:22 PM
> To: users@jackrabbit.apache.org
> Subject: AW: Multiple instances of repository
>
> Hi Nikhil,
>
> if you want to use clustering, you have to define a repository home for each cluster node.
>
> Clustering is necessary if you want to have the same data/indexes at all cluster nodes - the key word is synchronization.
>
> If your instances and the repositories operate separately from each other, you don't need clustering. Separate repository homes, data stores and persistence managers will do the job.
>
> Kind regards, Robert
>
> -----Original Message-----
> From: nikhil.agrawal@emeter.com [mailto:nikhil.agrawal@emeter.com]
> Sent: Tuesday, November 16, 2010 08:33
> To: users@jackrabbit.apache.org
> Subject: Multiple instances of repository
>
> Hi,
>
> I am using Jackrabbit as the JCR implementation in my project. I am running Jackrabbit within my application in the same JVM.
> The application reads content from the repository and also writes some content to the repository.
> There could be multiple concurrent instances of my application running on the same or different machines.
> I have a configuration file for Jackrabbit and I have a single repository home for Jackrabbit.
> Now as soon as one instance of the application is up and running, I can't run another instance, as the first instance creates a lock file in the repository home.
> After some searching I came to know about running Jackrabbit in clustered mode.
> Now my question is: even in this case I will have to specify a different repository home for every run, right?
> That means I should form the repository home path at run time, because at compile time I am not sure how many instances will be run.
> This is a standalone Java application and theoretically n instances can be run.
> My question is: when I have to specify a different repository path for every run, will Jackrabbit work even without clustering?
> Because the .lock file will be different for different runs, as the repository home is different.
> I know I am missing something here, please help me.
> I am attaching my conf file with this mail.
>
> Thanks,
> Nikhil
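
For readers following the original question about forming the repository home path at run time: a minimal sketch of giving each run its own cluster node id and its own repository home could look like the following. It uses only the JDK. The class and method names (`InstanceIdentity`, `newNodeId`, `createRepositoryHome`) and the temporary base directory are hypothetical, chosen for illustration. The system property name is the one quoted in the thread. A UUID stands in for the `System.nanoTime()` suffix used in the thread, as an assumption that it is less likely to collide across machines. Starting the repository itself is left out.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.UUID;

/**
 * Sketch only: gives each application run its own cluster node id and its
 * own repository home. All names in this class are hypothetical.
 */
public class InstanceIdentity {

    // Property name as quoted in the thread.
    public static final String NODE_ID_PROPERTY =
            "org.apache.jackrabbit.core.cluster.node_id";

    /** Builds a node id that differs on every call. */
    public static String newNodeId() {
        return "node-" + UUID.randomUUID();
    }

    /** Creates a directory named after the node id below baseDir and returns it. */
    public static Path createRepositoryHome(Path baseDir, String nodeId)
            throws IOException {
        return Files.createDirectories(baseDir.resolve(nodeId));
    }

    public static void main(String[] args) throws IOException {
        String nodeId = newNodeId();
        System.setProperty(NODE_ID_PROPERTY, nodeId);

        // Assumption: a temp directory stands in for the real installation directory.
        Path baseDir = Files.createTempDirectory("jackrabbit-homes");
        Path home = createRepositoryHome(baseDir, nodeId);

        // The shared configuration file and this per-run home would then be
        // passed to the embedded repository when it is started (not shown).
        System.out.println("node id: " + System.getProperty(NODE_ID_PROPERTY));
        System.out.println("repository home: " + home);
    }
}
```

This only avoids the lock-file collision between runs. As the replies in the thread point out, every new repository home still has to build its own search index, which is why a long-running instance is suggested instead of short-lived cluster nodes.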