From: Karl Wright
To: lalit jangra
Cc: user@manifoldcf.apache.org
Date: Tue, 1 Jul 2014 10:19:21 -0400
Subject: Re: Zookeeper in Apache ManifoldCF

Hi Lalit,

I presumed in my recommendation that your "active" and "passive" ManifoldCF instances were using the same PostgreSQL server, but were using different database instances within it. That is the only way it could reasonably work.

Any time you have a Zookeeper cluster, they recommend you have three instances. Effectively you are setting up two ManifoldCF clusters: an "active" one, and a "passive" one. Each one has its own database instance within PostgreSQL, and each one (if it is multiprocess) should have 3 Zookeeper instances.
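
The recommendation of three instances follows from how ZooKeeper stays available: the ensemble keeps working only while a strict majority of its servers can reach each other. A minimal sketch of that arithmetic follows; the function names are illustrative and not part of ZooKeeper or ManifoldCF.

```python
def quorum_size(ensemble_size: int) -> int:
    """Smallest strict majority of an ensemble with the given number of servers."""
    return ensemble_size // 2 + 1


def tolerated_failures(ensemble_size: int) -> int:
    """Number of servers that can be lost while a majority is still reachable."""
    return ensemble_size - quorum_size(ensemble_size)


# Compare a few ensemble sizes: size, majority needed, failures tolerated.
for n in (1, 2, 3, 5):
    print(n, quorum_size(n), tolerated_failures(n))
```

Two servers tolerate no failures at all, which is why a two-machine ensemble gives no failover benefit over a single server, while three servers tolerate the loss of one.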

I hope this is clear.

Karl



On Tue, Jul 1, 2014 at 9:54 AM, lalit jangra <lalit.j.jangra@gmail.com> wrote:
Thanks Karl,

I have a slight variation here: both MCF nodes would run in Active/Active mode, pointing to the same DB. Is Zookeeper still required in that case?

Also, by "two sets of three zookeeper machines", do you mean I need to set up three Zookeepers on each node, so a total of 6 Zookeeper nodes working across both machines in the same ensemble?

Regards.


On Mon, Jun 30, 2014 at 6:50 PM, Karl Wright <daddywri@gmail.com> wrote:
Hi Lalit,

You can keep things really simple by having both active and passive MCF instances run each as a single process, either under Jetty or using the combined war under Tomcat. If that is not acceptable, you would need two sets of three Zookeeper machines, one set for each instance.

Karl

Sent from my Windows Phone

From: lalit jangra
Sent: 6/30/2014 12:19 PM
To: user@manifoldcf.apache.org
Subject: Re: Zookeeper in Apache ManifoldCF

Thanks Karl & Graeme,

Let me elaborate my scenario and what i am trying to achieve.

I have two servers, each running MCF 1.5.1 individually. Both of them are backed by the same PostgreSQL DB, so both MCF applications point to the same DB at any point in time, without having their own dedicated DBs. In addition, the primary/active DB instance is backed up periodically from the active to the passive instance.

Only one DB instance will be active at any time, with the other DB instance acting as an active standby. If the primary/active instance breaks down, the passive/secondary takes over and becomes the primary/active instance handling all DB transactions, making the old primary the new secondary DB instance.

Similarly, I have two Solr 4.6 instances which run in active/passive mode, with periodic backup from active/primary to passive/secondary, active standby, and failover.

So my intention with clustering is high availability of the system with failover, but I will not use both MCF instances in parallel or simultaneously.

Finally, I am limited to two instances only but, as mentioned earlier, we need at least three Zookeeper instances for proper Zookeeper clustering.

Is it still worthwhile to use Zookeeper, or can I do simple clustering where each MCF node is clustered using the same DB? Please suggest.

Thanks for help.

Regards.


On Fri, Jun 27, 2014 at 11:15 AM, Graeme Seaton <lists@graemes.com> wrote:
Hi Lalit,

For production use, you will want to spin up your own ZK cluster using the instructions on the zookeeper site (as pointed out earlier at least 3 is recommended)....

You then need to modify the properties.xml file in multiprocess-zk-example to point to the list of Zookeeper servers. You also need to modify properties-global.xml with the appropriate global settings, i.e. logging levels, PostgreSQL database, etc., and then run setglobalproperties.sh to register the settings in ZK.

To test that it is working, set up a crawl and then tail the manifoldcf.log file on each of your nodes to check that they are all crawling in parallel.
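
As a lower-level check before starting a crawl, each ZooKeeper server can be probed directly with ZooKeeper's four-letter `ruok` command, to which a healthy server answers `imok`. The sketch below assumes the default client port 2181 and that the command is enabled (it is by default in the 3.4.x line; newer releases require it to be whitelisted). The function name is illustrative.

```python
import socket


def zk_is_ok(host: str, port: int = 2181, timeout: float = 3.0) -> bool:
    """Send 'ruok' to a ZooKeeper server and report whether it replies 'imok'."""
    try:
        with socket.create_connection((host, port), timeout=timeout) as conn:
            conn.sendall(b"ruok")
            reply = conn.recv(16)
    except OSError:
        # Covers refused connections, timeouts, and unresolvable host names.
        return False
    return reply == b"imok"
```

Calling `zk_is_ok` once per server in the ensemble shows which servers are reachable; it does not by itself prove that ManifoldCF is using them, which is what the log check above is for.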

HTH,

Graeme


On 25/06/14 12:19, Karl Wright wrote:
Hi Lalit,

Zookeeper does not use a database; it keeps its stuff in the local file system. Each Zookeeper node has its own local data, and everything else is socket communication between them.
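
To make the "local data plus sockets" picture concrete, here is a rough sketch that renders a `zoo.cfg` for a replicated ensemble. The host names, the data directory, and the timing values are assumptions for illustration; 2181, 2888 and 3888 are the ports conventionally used in the ZooKeeper documentation for clients, peer traffic, and leader election.

```python
def render_zoo_cfg(hosts, data_dir="/var/lib/zookeeper", client_port=2181):
    """Build zoo.cfg text for an ensemble made up of the given hosts."""
    lines = [
        "tickTime=2000",
        "initLimit=10",
        "syncLimit=5",
        f"dataDir={data_dir}",         # each server keeps its own local data here
        f"clientPort={client_port}",   # port that ZooKeeper clients connect to
    ]
    # One entry per server: peer-communication port, then leader-election port.
    for server_id, host in enumerate(hosts, start=1):
        lines.append(f"server.{server_id}={host}:2888:3888")
    return "\n".join(lines) + "\n"


# Hypothetical host names for a three-server ensemble.
print(render_zoo_cfg(["zk1.example.com", "zk2.example.com", "zk3.example.com"]))
```

The same file can be used on every server; what differs per machine is a `myid` file inside the data directory containing that server's number from its `server.N` line.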

As for information: http://zookeeper.apache.org/

Karl



On Wed, Jun 25, 2014 at 6:56 AM, lalit jangra <lalit.j.jangra@gmail.com> wrote:
Thanks Karl,

Apologies, as I am not very familiar with Zookeeper and am still trying to figure it out.

Is there any more documentation, or are there any pointers, available? That would be very helpful.

Also, I have 2 Tomcat servers in a cluster, each with MCF 1.5.1 set up and configured to point to the same PostgreSQL DB, and the DB is backed up for failover. From your inputs, it seems that we need to configure a separate standalone Zookeeper server which will act as master, and both nodes in the cluster will need to work as slaves and talk to the standalone Zookeeper master.

Also, the Zookeeper server will have its own DB, so can we either host it separately or use the same Postgres DB?

Regards.



On Wed, Jun 25, 2014 at 11:33 AM, Karl Wright <daddywri@gmail.com> wrote:
Hi Lalit,

1. Zookeeper is already spun into MCF. In fact, you start a Zookeeper instance when you run the MCF Zookeeper example. They recommend, though, that for failover you have 3 instances, etc.
2. Looks like the documentation is out of date and something old is left in there.
3. Zookeeper is a client/server kind of arrangement. You need at least ONE Zookeeper server, and each cluster member includes a Zookeeper client, which is configured to talk with ALL the Zookeeper server instances you have.
4. There is ONE database instance; the instance may be supported by failover and redundant PostgreSQL, but it appears as one instance. To get failover from Postgres you need the Enterprise Edition, which costs money.
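
Regarding point 3: a ZooKeeper client is told about all the servers through a single comma-separated connect string of `host:port` pairs. A small sketch of building one is below; the host names are hypothetical, and the exact property in properties.xml that carries this string is not shown in this thread, so it is left out here.

```python
def zk_connect_string(hosts, port=2181):
    """Join ZooKeeper server hosts into the host:port,host:port form clients expect."""
    return ",".join(f"{host}:{port}" for host in hosts)


# Every cluster member would be given the same string, listing ALL servers.
print(zk_connect_string(["zk1.example.com", "zk2.example.com", "zk3.example.com"]))
```

Because every member lists every server, a client can fail over to another ZooKeeper server when the one it is connected to goes away.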

Karl




On Wed, Jun 25, 2014 at 4:47 AM, lalit jangra <lalit.j.jangra@gmail.com> wrote:
Thanks Karl,

That was helpful.

I am setting up a clustered setup on Tomcat, following the instructions at http://manifoldcf.apache.org/release/trunk/en_US/how-to-build-and-deploy.html#Simplified+multi-process+model+using+ZooKeeper-based+synchronization, and I need some suggestions here.

1. Do we need to download Zookeeper and put it in the multiprocess-zk-example folder, or is it already spun into MCF and we are good to go?
2. It says all jars under processes should be put into the classpath, but I cannot see any processes folder under MCF.
3. Do we need to set up Zookeeper on both nodes or only on one node? I assume we need to do it on both nodes.
4. Do we also need to set up databases separately on both nodes? Also, can we set up the Zookeeper DB using the same PostgreSQL, or will it use its own HSQL DB?

Finally, how can I test that my Zookeeper is set up and ready to roll?

Thanks for your help.

Regards.


On Tue, Jun 24, 2014 at 1:56 PM, Karl Wright <daddywri@gmail.com> wrote:
Hi Lalit,
ZooKeeper is standard for cluster deployments these days. See the multiprocess-zookeeper example for ideas about how to deploy it. It's also important to read the how-to-build-and-deploy page to understand the example.

Thanks,
Karl



On Tue, Jun 24, 2014 at 8:04 AM, lalit jangra <lalit.j.jangra@gmail.com> wrote:
Hi,

I am planning to use MCF in cluster mode, and I would like to know whether Zookeeper is of any help here.

If yes, how can it be leveraged in distributed MCF servers?

Regards,
Lalit Jangra.



