hadoop-common-commits mailing list archives

From Apache Wiki <wikidi...@apache.org>
Subject [Hadoop Wiki] Update of "HedWig/TopicManagement" by ErwinTam
Date Tue, 11 Jan 2011 19:15:46 GMT
Dear Wiki user,

You have subscribed to a wiki page or wiki category on "Hadoop Wiki" for change notification.

The "HedWig/TopicManagement" page has been changed by ErwinTam.
http://wiki.apache.org/hadoop/HedWig/TopicManagement?action=diff&rev1=2&rev2=3

--------------------------------------------------

     * The topic name (string) 
     * A flag (boolean) indicating whether the client has been redirected. 
  
- When a client C subscribes to a topic T, it will contact one of the hubs (say, H1) and send
a <I>subscribe(C,T,False)</I> message. When a client receives a <i>redirect</I>
message from a hub, it will retry its subscription to the hub listed in the message (e.g.
H2). It will do this by sending a <I>subscribe(C,T,true)</I> message to the hub
H2. The flow is similar to the "false" case, except that the hub H2 knows that it should try
to become the owner of the topic, instead of choosing a random hub.
+ When a client C subscribes to a topic T, it will contact one of the hubs (say, H1) and send
a ''subscribe(C,T,false)'' message. When a client receives a ''redirect'' message from a hub,
it will retry its subscription to the hub listed in the message (e.g. H2). It will do this
by sending a ''subscribe(C,T,true)'' message to the hub H2. The flow is similar to the "false"
case, except that the hub H2 knows that it should try to become the owner of the topic, instead
of choosing a random hub.
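
The client side of the exchange described above can be sketched as a small retry loop. This is only an illustration under stated assumptions, not Hedwig's client code: the `send` callback, the `("ok", None)` / `("redirect", hub)` reply shape, and the `max_hops` limit are all hypothetical names introduced here.

```python
from typing import Callable, Optional, Tuple

# Hypothetical reply shape: ("ok", None) or ("redirect", "<hub name>").
Reply = Tuple[str, Optional[str]]


def subscribe_with_redirects(
    send: Callable[[str, str, str, bool], Reply],
    client: str,
    topic: str,
    first_hub: str,
    max_hops: int = 5,  # assumption: the source does not define a retry limit
) -> str:
    """Subscribe, following redirect replies; return the hub that accepted."""
    hub, redirected = first_hub, False
    for _ in range(max_hops):
        status, target = send(hub, client, topic, redirected)
        if status == "ok":
            return hub
        # On a redirect, retry at the named hub with the flag set to true.
        hub, redirected = target, True
    raise RuntimeError("too many redirects")
```

The first attempt carries the flag as false; every retry after a redirect carries it as true, which is what tells the receiving hub to try to claim the topic itself.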
  
  
- Upon receiving a <i>subscribe</i> message for topic T, the hub H1 will follow
these steps:
+ Upon receiving a ''subscribe'' message for topic T, the hub H1 will follow these steps:
  
-    * The hub H1 will check in !ZooKeeper to see if the topic T exists as a child of <b>Topics</B>.
If the topic T does not exist:
+    * The hub H1 will check in ZooKeeper to see if the topic T exists as a child of '''Topics'''.
If the topic T does not exist:
        * H1 will create the node '''Topics.T''', and the node '''Topics.T.Subscribers'''.
     * The hub H1 will read '''T.Hub''', the current hub assigned to the topic (say, H2).
        * If a hub exists, and is the same hub the client contacted (e.g. H1==H2), then H1
will add C to the list of subscribers (under '''T.Subscribers'''), and set up its internal
bookkeeping to begin delivering messages to C.
-       * If a hub exists, and is a different hub than the one the client contacted (e.g.
H1!=H2), then H1 will return to the client a <I>redirect</I> message, requesting
that the client retry its subscription at H2.
+       * If a hub exists, and is a different hub than the one the client contacted (e.g.
H1!=H2), then H1 will return to the client a ''redirect'' message, requesting that the client
retry its subscription at H2.
-       * If no hub exists, H1 will check the flag of the <i>subscribe</I> call
to see if the client was redirected.
+       * If no hub exists, H1 will check the flag of the ''subscribe'' call to see if the
client was redirected.
           * If false (the client has not yet been redirected) H1 will choose a random hub
H3 (possibly itself) to manage the topic. 
-             * If H1 chooses itself (H1==H3), then H1 will try to create an ephemeral node
under <B>T</B> called <B>Hub</B> with its own hostname (e.g. H1) as
the content). This creation should be done using test-and-set, so that if a <B>Hub</B>
node already exists, the creation fails.
+             * If H1 chooses itself (H1==H3), then H1 will try to create an ephemeral node
under '''T''' called '''Hub''' with its own hostname (e.g. H1) as the content. This creation
should be done using test-and-set, so that if a '''Hub''' node already exists, the creation
fails.
                 * If the ephemeral node creation succeeds, then H1 will set up its internal
bookkeeping to begin delivering messages to C.
-                * Otherwise (ephemeral node creation fails) then H1 will read the hostname
(e.g. H4) of the hub assigned to the topic from the <B>T.Hub</B> node, and return
to the client a <I>redirect</I> message, requesting that the client retry its
subscription at H4.
+                * Otherwise (ephemeral node creation fails) then H1 will read the hostname
(e.g. H4) of the hub assigned to the topic from the '''T.Hub''' node, and return to the client
a ''redirect'' message, requesting that the client retry its subscription at H4.
-             * Otherwise (H1!=H3), then H1 will return to the client a <I>redirect</I>
message, requesting that the client retry its subscription at H3.
+             * Otherwise (H1!=H3), then H1 will return to the client a ''redirect'' message,
requesting that the client retry its subscription at H3.
           * If true (the client has been redirected), then H1 will try to become the owner
of the topic.
-             * H1 will try to create an ephemeral node under <B>T</B> called
<B>Hub</B> with its own hostname (e.g. H1) as the content. This creation should
be done using test-and-set, so that if a <B>Hub</B> node already exists, the creation
fails.
+             * H1 will try to create an ephemeral node under '''T''' called '''Hub''' with
its own hostname (e.g. H1) as the content. This creation should be done using test-and-set,
so that if a '''Hub''' node already exists, the creation fails.
                 * If the ephemeral node creation succeeds, then H1 will set up its internal
bookkeeping to begin delivering messages to C.
-                * Otherwise (ephemeral node creation fails) then H1 will read the hostname
(e.g. H3) of the hub assigned to the topic from the <B>T.Hub</B> node, and return
to the client a <I>redirect</I> message, requesting that the client retry its
subscription at H3.
+                * Otherwise (ephemeral node creation fails) then H1 will read the hostname
(e.g. H3) of the hub assigned to the topic from the '''T.Hub''' node, and return to the client
a ''redirect'' message, requesting that the client retry its subscription at H3.
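
The hub-side steps above can be sketched as a single decision function. This is a minimal sketch under assumptions: `FakeStore` is an in-memory stand-in for the ZooKeeper tree (it is not the ZooKeeper API), and `handle_subscribe`, `create_hub_if_absent`, and the reply tuples are hypothetical names. It also assumes that a hub which wins the '''T.Hub''' creation records C as a subscriber, which the steps imply but do not state.

```python
import random


class FakeStore:
    """In-memory stand-in for the ZooKeeper tree (hypothetical, not the ZooKeeper API)."""

    def __init__(self, hubs):
        self.hubs = list(hubs)  # stands in for the alive hubs under Hedwig.Hubs
        self.topics = {}        # topic -> {"hub": owner or None, "subscribers": set}

    def ensure_topic(self, topic):
        # Stands in for creating Topics.T and Topics.T.Subscribers if missing.
        self.topics.setdefault(topic, {"hub": None, "subscribers": set()})

    def read_hub(self, topic):
        return self.topics[topic]["hub"]

    def create_hub_if_absent(self, topic, hub):
        """Test-and-set: succeed only if no Hub node exists for the topic."""
        node = self.topics[topic]
        if node["hub"] is None:
            node["hub"] = hub
            return True
        return False


def handle_subscribe(store, me, client, topic, redirected, choose=random.choice):
    """Return ("ok", None) or ("redirect", hub) for a subscribe call received by hub `me`."""
    store.ensure_topic(topic)
    owner = store.read_hub(topic)
    if owner == me:
        store.topics[topic]["subscribers"].add(client)
        return ("ok", None)
    if owner is not None:
        return ("redirect", owner)
    # No owner yet: a redirected client means this hub should try to own the topic;
    # otherwise pick a random hub (possibly this one).
    candidate = me if redirected else choose(store.hubs)
    if candidate != me:
        return ("redirect", candidate)
    if store.create_hub_if_absent(topic, me):
        # Assumption: winning the race also registers the subscriber.
        store.topics[topic]["subscribers"].add(client)
        return ("ok", None)
    # Lost the race: point the client at whichever hub created the node first.
    return ("redirect", store.read_hub(topic))
```

A real hub would perform the test-and-set as an ephemeral node creation in ZooKeeper, so that ownership disappears if the hub's session ends; the dictionary here only models the "create fails if it already exists" behaviour.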
  
  
  Notes:
     * Because we want subscribers to be directly connected to the hub responsible for the
topic, we will redirect the client to that hub.
-    * Because the <B>T.Hub</B> node is ephemeral, it must be created by the hub
that owns the topic, not by any other hub.
+    * Because the '''T.Hub''' node is ephemeral, it must be created by the hub that owns
the topic, not by any other hub.
-    * To decide which hub to assign a topic to, the deciding hub should use the current list
of alive nodes from <B>Hedwig.Hubs</B> in !ZooKeeper.
+    * To decide which hub to assign a topic to, the deciding hub should use the current list
of alive nodes from '''Hedwig.Hubs''' in ZooKeeper.
-    * When choosing a random hub to assign a topic to, we can either do it uniformly randomly
or by weighting the random choice based on the hub's current load. To do the latter, each
hub must record in its <B>Hubs.Si.Alive</B> node its current load, and then hubs
doing topic assignment use these values to make decisions.
+    * When choosing a random hub to assign a topic to, we can either choose uniformly at random
or weight the random choice by each hub's current load. To do the latter, each
hub must record its current load in its '''Hubs.Si.Alive''' node, and hubs doing topic
assignment then use these values to make decisions.
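
As an illustration of the load-weighted option in the last note, the sketch below picks a hub with probability inversely related to its recorded load. The weighting formula `1 / (1 + load)`, the function name `choose_hub`, and the representation of load as a non-negative number are assumptions made for this example; the page does not specify how load is measured or weighted.

```python
import random


def choose_hub(loads, rng=random):
    """Pick a hub name from `loads` (hub name -> current load), favouring lighter hubs.

    Assumption: load is a non-negative number, and weight = 1 / (1 + load).
    """
    hubs = list(loads)
    weights = [1.0 / (1.0 + loads[h]) for h in hubs]
    return rng.choices(hubs, weights=weights, k=1)[0]
```

Passing equal loads for every hub reduces this to the uniform random choice mentioned in the same note.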
  
- ---+++ Re-subscription process
+ === Re-subscription process ===
  
  A client may become disconnected from a hub, for many reasons including:
  
@@ -67, +67 @@

  
  When this happens, the client can just resubscribe to the topic. Using the same subscription
process as above, Hedwig will direct the client to the appropriate hub, either the (old) hub
which still owns the topic, or a (new) hub which has taken over the topic.
  
- ---+++ Publish process
+ === Publish process ===
  
  When a client C publishes to a topic T, C contacts a hub (say, H1) and tries to publish.
The publish call takes four parameters: 
     * The publisher ID (string)
@@ -75, +75 @@

     * The message M
     * A flag (boolean) which indicates whether the client has been redirected. 
  
- When the client C sends a <i>publish</I> call to H1 to publish a message on
topic T, H1 follows these steps:
+ When the client C sends a ''publish'' call to H1 to publish a message on topic T, H1 follows
these steps:
  
-    * The hub H1 will check in !ZooKeeper to see if the topic T exists as a child of <b>Topics</B>.
If the topic T does not exist:
+    * The hub H1 will check in ZooKeeper to see if the topic T exists as a child of '''Topics'''.
If the topic T does not exist:
-       * H1 will create the node <B>Topics.T</B>, and the node <B>Topics.T.Subscribers</B>.
+       * H1 will create the node '''Topics.T''', and the node '''Topics.T.Subscribers'''.
-    * The hub H1 will read <b>T.Hub</b>, the current hub assigned to the topic
(say, H2).
+    * The hub H1 will read '''T.Hub''', the current hub assigned to the topic (say, H2).
-       * If a hub exists, and is the same hub the client contacted (e.g. H1==H2), then the
hub accepts the publish and writes it into the !BookKeeper log.
+       * If a hub exists, and is the same hub the client contacted (e.g. H1==H2), then the
hub accepts the publish and writes it into the BookKeeper log.
-       * If a hub exists, but is a different hub than the one the client contacted (e.g.
H1!=H2), then the hub returns a <i>redirect</i> message, requesting that the client
retry its publish at H2.
+       * If a hub exists, but is a different hub than the one the client contacted (e.g.
H1!=H2), then the hub returns a ''redirect'' message, requesting that the client retry its
publish at H2.
-       * If no hub exists, H1 will check the flag of the <i>publish</I> call
to see if the client was redirected.
+       * If no hub exists, H1 will check the flag of the ''publish'' call to see if the client
was redirected.
           * If false (the client has not yet been redirected) H1 will choose a random hub
H3 (possibly itself) to manage the topic. 
              * If H1 chooses itself (H1==H3), then H1 will try to create an ephemeral node
under <B>T</B> called <B>Hub</B> with its own hostname (e.g. H1) as
the content). This creation should be done using test-and-set, so that if a <B>Hub</B>
node already exists, the creation fails.
                 * If the ephemeral node creation succeeds, then H1 will set up its internal
bookkeeping to begin publishing messages on T, and will accept and publish C's message.
