curator-user mailing list archives

From Check Peck <comptechge...@gmail.com>
Subject Re: How to use PathChildrenCache properly for keeping a watch on three znodes on Zookeeper?
Date Wed, 23 Jul 2014 22:05:12 GMT
Thanks a lot, Cameron. That makes a lot of sense. So to do it
properly, I should declare my executor like this at the top of my
class -

    private static ExecutorService service = Executors.newFixedThreadPool(15);

and then pass that "ExecutorService" when adding the listener, like below -

    cache.getListenable().addListener(listener, service);

Have I got everything right with the above code?
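
To make the difference concrete, here is a minimal sketch that uses only java.util.concurrent and no Curator at all. The class and method names are made up for illustration, and the "restart" is simulated by a task that waits until every other restart has started too, so it only shows how the executor's thread count affects blocking callbacks, not how Curator dispatches events internally.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicBoolean;

// Hypothetical demo class; not part of Curator.
final class ListenerExecutorDemo {

    /**
     * Runs one simulated blocking restart per down host on an executor with the
     * given number of threads. Returns true only if every restart was in
     * progress at the same time.
     */
    static boolean restartsOverlap(int listenerThreads, int downHosts) {
        ExecutorService listenerExecutor = Executors.newFixedThreadPool(listenerThreads);
        CountDownLatch allStarted = new CountDownLatch(downHosts);
        CountDownLatch allFinished = new CountDownLatch(downHosts);
        AtomicBoolean overlapped = new AtomicBoolean(true);
        try {
            for (int i = 0; i < downHosts; i++) {
                listenerExecutor.execute(() -> {
                    allStarted.countDown();
                    try {
                        // Simulated restart: blocks until all other restarts have
                        // started as well, or gives up after 500 ms.
                        if (!allStarted.await(500, TimeUnit.MILLISECONDS)) {
                            overlapped.set(false);
                        }
                    } catch (InterruptedException e) {
                        Thread.currentThread().interrupt();
                        overlapped.set(false);
                    } finally {
                        allFinished.countDown();
                    }
                });
            }
            allFinished.await();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        } finally {
            listenerExecutor.shutdownNow();
        }
        return overlapped.get();
    }

    public static void main(String[] args) {
        System.out.println("1 thread, restarts overlap:   " + restartsOverlap(1, 3));
        System.out.println("15 threads, restarts overlap: " + restartsOverlap(15, 3));
    }
}
```

With a single thread the restarts cannot overlap, which matches the one-by-one behaviour I was seeing; with a pool at least as large as the number of down hosts they can. The flip side is that childEvent may then run concurrently, so anything it touches has to be thread-safe.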



On Wed, Jul 23, 2014 at 2:49 PM, Cameron McKenzie <mckenzie.cam@gmail.com>
wrote:

> When you call cache.getListenable().addListener(listener), if you don't
> provide an executor service, then it will use the default Curator one,
> which will only have a single thread. This means that your listener will
> only be called by a single thread, so if you do anything in that listener
> that blocks (e.g. restarting the server) then you're not going to get any
> other events until after your blocking event has finished.
>
> You could either provide an executor that allows more than a single thread
> (and thus your childEvent method could get called concurrently for
> different events), or you could use your own executor to do the restarts,
> which would allow the curator event thread to keep processing stuff.
>
>
> On Thu, Jul 24, 2014 at 6:33 AM, Check Peck <comptechgeeky@gmail.com>
> wrote:
>
>> I am using the Curator library for ZooKeeper. I am using ZooKeeper to monitor
>> whether my app servers are up or not. If they are down or have been shut
>> down, I bring them back up. I need to keep a watch on three of my znodes:
>> "/test/proc/phx/server", "/test/proc/slc/server" and
>> "/test/proc/lvs/server". I have a znode structure like below.
>>
>>     /test/proc
>>             /phx
>>                 /server
>>                     /h1
>>                     /h2
>>                     /h3
>>                     /h4
>>                     /h5
>>             /slc
>>                 /server
>>                     /h1
>>                     /h2
>>                     /h3
>>                     /h4
>>                     /h5
>>             /lvs
>>                 /server
>>                     /h1
>>                     /h2
>>                     /h3
>>                     /h4
>>                     /h5
>>
>> As you can see above, "/test/proc/phx/server" has 5 hosts
>> starting with "h", and similarly for slc and lvs. All those hosts
>> starting with "h" are ephemeral znodes. As soon as any server dies,
>> say the h4 machine in PHX goes down, the "h4" ephemeral znode
>> gets deleted from "/test/proc/phx/server", and I then try to
>> restart the h4 machine in the PHX datacenter. Similarly with SLC and LVS.
>>
>> Below is the code with which I keep a watch and restart the
>> servers if any machine goes down in any datacenter. With this code,
>> what I am seeing is that if three machines go down in the same datacenter,
>> it restarts those three one by one. Say h1, h3 and h5 go
>> down in the PHX datacenter: first it restarts h1, and as soon as h1 is
>> done it restarts h3, and then h5. So it always waits for one
>> restart to finish before starting the next. I am not sure why. Shouldn't
>> all three be restarted right away, since it's a background thread?
>>
>> Also, sometimes what I am seeing is that if all the hosts go down at once,
>> it doesn't restart anything. Maybe the thread is getting stuck? Does my
>> code below look right for keeping a watch on the three
>> datacenters PHX ("/test/proc/phx/server"), SLC ("/test/proc/slc/server") and
>> LVS ("/test/proc/lvs/server")?
>>
>>     List<String> datacenters = Arrays.asList("PHX", "SLC", "LVS");
>>     for (String dc : datacenters) {
>>         // In this example we will cache data. Notice that this is optional.
>>         // znode paths are case-sensitive, so lower-case the datacenter name
>>         // to match the tree shown above.
>>         PathChildrenCache cache = new PathChildrenCache(zookClient.getClient(),
>>                 "/test/proc/" + dc.toLowerCase() + "/server", true);
>>         cache.start();
>>
>>         addListener(cache);
>>     }
>>
>>     private static void addListener(PathChildrenCache cache) {
>>
>>         PathChildrenCacheListener listener = new PathChildrenCacheListener() {
>>             public void childEvent(CuratorFramework client, PathChildrenCacheEvent event)
>>                     throws Exception {
>>                 switch (event.getType()) {
>>                 case CHILD_ADDED: {
>>                     if (zookClient.isLeader()) {
>>                         String path = ZKPaths.getPathAndNode(event.getData().getPath()).getPath();
>>                         String node = ZKPaths.getNodeFromPath(event.getData().getPath());
>>                         String datacenter = path.split("/")[3];
>>
>>                         System.out.println("Node added: Path= " + path
>>                                 + ", Actual Node= " + node + ", Datacenter= " + datacenter);
>>                     }
>>                     // break outside the if, so a non-leader does not fall through
>>                     break;
>>                 }
>>
>>                 case CHILD_UPDATED: {
>>                     if (zookClient.isLeader()) {
>>                         String path = ZKPaths.getPathAndNode(event.getData().getPath()).getPath();
>>                         String node = ZKPaths.getNodeFromPath(event.getData().getPath());
>>                         String datacenter = path.split("/")[3];
>>
>>                         System.out.println("Node updated: Path= " + path
>>                                 + ", Actual Node= " + node + ", Datacenter= " + datacenter);
>>                     }
>>                     // break outside the if, so a non-leader does not fall through
>>                     break;
>>                 }
>>
>>                 case CHILD_REMOVED: {
>>                     if (zookClient.isLeader()) {
>>                         String path = ZKPaths.getPathAndNode(event.getData().getPath()).getPath();
>>                         String node = ZKPaths.getNodeFromPath(event.getData().getPath());
>>                         String datacenter = path.split("/")[3];
>>
>>                         System.out.println("Node removed: Path= " + path
>>                                 + ", Actual Node= " + node + ", Datacenter= " + datacenter);
>>
>>                         // restart the machine which went down
>>                         // I am assuming that as soon as any machine goes down, the
>>                         // call will come here instantly without waiting for anything?
>>                     }
>>                     // break outside the if, so a non-leader does not fall through
>>                     break;
>>                 }
>>                 default:
>>                     break;
>>
>>                 }
>>             }
>>         };
>>         cache.getListenable().addListener(listener);
>>     }
>>
>
>
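
Cameron's second suggestion quoted above, keeping the listener on its single thread and pushing the slow restart onto a separate executor, could look roughly like the sketch below. It is plain java.util.concurrent with no Curator; the single-thread executor only stands in for the default event thread, and all names (RestartHandOffSketch, handOffRestarts, restartPool) are hypothetical. Each simulated restart can only complete once every removal event has been delivered, so the method returning all hosts shows the event thread was never blocked behind a restart.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

// Hypothetical sketch; not part of Curator.
final class RestartHandOffSketch {

    /** Returns the hosts whose simulated restart completed. */
    static List<String> handOffRestarts(List<String> removedHosts) {
        int n = removedHosts.size();
        // Stand-in for the single thread that delivers listener events.
        ExecutorService eventThread = Executors.newSingleThreadExecutor();
        // Separate pool that does the slow work.
        ExecutorService restartPool = Executors.newFixedThreadPool(Math.max(1, n));
        CountDownLatch allEventsDelivered = new CountDownLatch(n);
        CountDownLatch allRestartsDone = new CountDownLatch(n);
        List<String> restarted = Collections.synchronizedList(new ArrayList<>());
        try {
            for (String host : removedHosts) {
                eventThread.execute(() -> {
                    // Listener body: hand the restart off and return immediately.
                    restartPool.execute(() -> {
                        try {
                            // Simulated slow restart: it finishes only after every
                            // event has been delivered by the event thread.
                            if (allEventsDelivered.await(2, TimeUnit.SECONDS)) {
                                restarted.add(host);
                            }
                        } catch (InterruptedException e) {
                            Thread.currentThread().interrupt();
                        } finally {
                            allRestartsDone.countDown();
                        }
                    });
                    allEventsDelivered.countDown();
                });
            }
            allRestartsDone.await(5, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        } finally {
            eventThread.shutdownNow();
            restartPool.shutdownNow();
        }
        return new ArrayList<>(restarted);
    }
}
```

The trade-off against passing a bigger executor to addListener is that childEvent itself stays single-threaded and ordered, and only the restart work runs in parallel.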
