qpid-commits mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Gordon Sim (Confluence)" <conflue...@apache.org>
Subject [CONF] Apache Qpid > DesignNotesOnReliableDistributedTopic
Date Wed, 16 Oct 2013 17:45:01 GMT
    <base href="https://cwiki.apache.org/confluence">
            <link rel="stylesheet" href="/confluence/s/en/2176/1/21/_/styles/combined.css?spaceKey=qpid&amp;forWysiwyg=true"
<body style="background: white;" bgcolor="white" class="email-body">
<div id="pageContent">
<div id="notificationFormat">
<div class="wiki-content">
<div class="email">
    <h2><a href="https://cwiki.apache.org/confluence/display/qpid/DesignNotesOnReliableDistributedTopic">DesignNotesOnReliableDistributedTopic</a></h2>
    <h4>Page  <b>added</b> by             <a href="https://cwiki.apache.org/confluence/display/~gsim@redhat.com">Gordon
    <div class="notificationGreySide">
         <p>This is very much a work in progress!</p>

<p>Plan of action:</p>

<p>First explore design for exactly-once pub-sub using AMQP throughout.</p>

<p>Then consider how this design would be affected by (a) relaxing the<br/>
delivery guarantee to at-least-once and, orthogonally, (b) using MQTT<br/>
at the edges.</p>



	<li>support for exactly-once delivery in a pub-sub distribution pattern<br/>
  where publishers and subscribers are connected through a network of<br/>
  dispatch router instances</li>

	<li>delivery guarantee holds even in the face of individual router<br/>

<p>Initial Assumptions:</p>

	<li>using AMQP at the edges</li>

	<li>single, non-hierarchical topic</li>

	<li>any filters on receiving links are applied only at the edge</li>

<p>Design principles:</p>

	<li>router network never assumes ultimate responsibility for messages<br/>
  (i.e. no store-and-forward, only accept message from publishers when<br/>
  it has already been accepted by subscribers)</li>

	<li>only rely on weak consistency between router instances in the<br/>
  network; no synchronously replicated state between routers</li>

<p>Basic pattern:</p>

<p>When a receiving link (aka a subscriber) for the topic attaches to a<br/>
router, it will communicate that fact to all other routers, including<br/>
the container-id and link name of the link and an indication of<br/>
whether there was any unsettled state associated with the link at the<br/>
time of attachement (i.e. loosely whether the link can be considered<br/>
'new' link attaching or an 'old' link resuming).</p>

<p>Any router receiving messages for the topic on a sending link<br/>
(i.e. from a publisher), will forward that message to all other<br/>
routers in the network that have subscribers attached to them (along<br/>
the best path available).</p>

<p>Each router needs to track the acceptance of a given message from its<br/>
local subscribers. It will signal its own acceptance when all its<br/>
local subscribers have accepted.</p>

<p>The router to which the publishing link is attached will additionally<br/>
track the acceptance from all the other router instances to which it<br/>
forwarded the message and will only communicate its acceptance to that<br/>
publisher on receipt of acceptance both from all its local subscribers<br/>
and all those other router instances.</p>

<p>The publisher, on receiving acceptance, will then settle the<br/>
message. The settlement will be relayed to each local subscriber and<br/>
each router that sent an acceptance. All the other routers likewise<br/>
settle the delivery for their respective local subscribers.</p>


<p>Delivery Ids for outgoing messages:</p>

<p>Scenario: There are two connected routers, A and B, serving a given<br/>
topic. A publisher and a subscriber for that topic are both connected to<br/>
router A. The publisher sends a message which the router forwards to<br/>
the subscriber. Before the subscriber accepts this, router A<br/>
fails. Both the publisher and the subscriber failover to router B and<br/>
attempt to recover their links.</p>

<p>It must be possible to match a delivery to a subscriber with the<br/>
corresponding delivery from a publisher. However since there may be<br/>
multiple publisher streams being combined for delivery to a subscriber,<br/>
the publisher's delivery tag is not on its own sufficiently unique.</p>

<p>Ideally, the delivery tag assigned by the router network would include<br/>
the publishers container id and link name as well as the original<br/>
delivery tag.</p>

<p>However the delivery-tag is limited in size (it can contain 32 bytes<br/>
at most).</p>

<p>Since we don't want to have to maintain a per-message mapping, we<br/>
either need to rely on the publishers tag being less than 32 bytes or<br/>
we need to use a delivery tag larger than that for subscribers in<br/>
order to be able to retain the original while ensuring uniqueness.</p>

<p>Routers also need to be able to identify the original publisher of any<br/>
message they have received. That can be done by adding an annotation<br/>
to the message on the first receving router instance.</p>


<p>Inter-router communication:</p>

<p>Each router will forward messages to all other routers that have<br/>
subscribers for the topic. This needs to be 'reliable' in the sense<br/>
that messages can't go missing. However we can resend and it doesn't<br/>
need to use the AMQP defined exactly-once procedure between<br/>
routers. To start with I'll assume it doesn't for simplicity, this can<br/>
be revisited later.</p>

<p>(Question: If routers C and D are both reached by A through an intermediary<br/>
router B, and all of B, C and D have subscribers, should one message<br/>
be sent by A to all of them, or would three messages each with an<br/>
explicit target router be sent?)</p>

<p>So, delivery record at a given node tracks:</p>

<p>(a) local subscribers and the state of the delivery to each of them,<br/>

<p>(b) the other router instances with subscribers for the topic</p>

<p>It will signal its own acceptance when all its local subscribers have<br/>
accepted. The router to which the publishing link is attached will<br/>
send its acceptance to that publisher only when both all its local<br/>
subscribers have accepted, and all the router nodes it is expecting to<br/>
have accepted it.</p>

<p>Each router tracks the set of subscribers connected at each other<br/>
router. On the failure of a router instance, any other router that was<br/>
waiting for an acceptance from that failed router will track the<br/>
reconnection of the subscribers on that failed router to (an)other<br/>
router(s). There will also be a timeout such that if they don't<br/>
retattach, any state associated with the link is discarded. (This<br/>
would be the value for the timeout field of a terminus that the router<br/>
would accept).</p>

<p>When a receiving link attaches to a router, it will send out a message<br/>
describing that to all other routers in the network. Likewise it will<br/>
notify the network if/when that link is closed.</p>

<p>Routers track the subscriber identifiers (container-id and link name)<br/>
attached to all other nodes in the network. (TODO: This is not ideal<br/>
in terms of scalability, and there may be some ways to relax the<br/>
requirement a little).</p>

<p>On failure of a router, the rest of the network can then determine<br/>
when all the receivers that were attached to that failed node have<br/>
reattached, and can update their records of any deliveries for which a<br/>
response was expected from the failed node.</p>



<p>Once settled, a router no longer needs to hold on to the message<br/>
itself, but we do need to track the delivery until we are confident<br/>
that every receiver knows it has been settled. This allows us to<br/>
assume that when resuming a receiver link, any unsettled deliveries<br/>
declared by the receiver that the router is unware of, have yet to<br/>
make it to that router.</p>

<p>With AMQP 1.0 there is no indication sent back by receivers to<br/>
indicate that they know a given delivery has been settled (since it is<br/>
assumed that the sender has already discarded any record of the<br/>
delivery). We could however <b>infer</b> that they know it has been settled<br/>
when they have responded to something sent after the disposition<br/>
notifying them of that settlement.</p>

<p>The router to which the publisher of a message is attached will relay<br/>
the settlement and will also subsquently communicate to all nodes that<br/>
the delivery record can be deleted when a subsequent delivery from a<br/>
publisher who is also attached to this router is accepted.</p>

<p>[TODO: Ordering assumption here needs to be scrutinised in the face of<br/>


<p>Resuming a publisher:</p>

<p>On having a publishing link attach with unsettled state, the router to<br/>
which it attaches will examine its delivery records to see which if<br/>
any of the unsettled deliveries it has any record of.</p>

<p>For those deliveries for which it already has a record, it will start<br/>
tracking acceptance from all other interested routers, to which these<br/>
deliveries will now be resent.</p>

<p>It will request that the publisher resend any deliveries it was not<br/>
aware of, which will be then added to its records and forwarded under<br/>
a disambiguated delivery tag to its local subscribers if any exist, as<br/>
well as all interested routers in the network.</p>

<p>The router will not indicate acceptance of any of these deliveries<br/>
until all interested routers and local subscribers have accepted them.</p>

<p>Resuming a subscriber:</p>

<p>On having a receiving link attach with unsettled state, the router<br/>
will compare the unsettled delivery states as presented by the<br/>
receiver with its own records. It can respond with its own view of<br/>
delivery state for any delivery it already has a record of. Those<br/>
deliveries it does not have a record of could have been settled or not<br/>
yet received.</p>

<p>For deliveries the receiver considers unsettled and the router has a<br/>
record of, if the record indicates the delivery was settled, then the<br/>
router can omit the delivery from its own unsettled map, otherwise the<br/>
router will start tracking this receiver against that record and<br/>
include these in the unsettled map we send back. If receiver indicates<br/>
it has accepted, we can mark it as accepted by them.</p>

<p>For deliveries the receiver considers unsettled but the router has no<br/>
record of, the router assumes it has yet to receive those messages.<br/>
It creates a placeholder record for them and starts tracking this<br/>
receiver against that. It shouldn't deliver any messages to that<br/>
receiver until all preceding placeholders have been 'fulfilled',<br/>
i.e. until it has received a message matching that delivery, updated<br/>
the placeholder record accordingly and forwarded that message to this<br/>
receiver (and any others that may have also resumed including that<br/>
delivery in their unsettled state).</p>

<p>Any deliveries receiver doesn't report in its unsettled map either<br/>
have been settled (and are no longer relevant) or have yet to be<br/>
delivered to that receiver. Any unsettled records for the topic that<br/>
the router has that are not in the receivers unsettled map, it should<br/>
resend to that receiver and track the receivers acceptance of it.</p>


<p>Delivery records kept for topic:</p>

	<li>the message</li>

	<li>the publishers identity</li>

	<li>the publishers original delivery tag</li>

	<li>the outgoing delivery tag (includes the publisher tag, adds<br/>
  disambiguating pre- or post- fix)</li>

	<li>a map of local subscribers and their respective delivery statuses<br/>
  (i.e. in-doubt, accepted, settled)</li>

	<li>a map of other interested routers and their respective delivery<br/>
  statuses (i.e. in-doubt, accepted, settled) Note that this will only<br/>
  be tracked by the router to which the publisher of the message is<br/>


    <div id="commentsSection" class="wiki-content pageSection">
       <div style="float: right;" class="grey">
                        <a href="https://cwiki.apache.org/confluence/users/removespacenotification.action?spaceKey=qpid">Stop
watching space</a>
            <span style="padding: 0px 5px;">|</span>
                <a href="https://cwiki.apache.org/confluence/users/editmyemailsettings.action">Change
email notification preferences</a>
       <a href="https://cwiki.apache.org/confluence/display/qpid/DesignNotesOnReliableDistributedTopic">View

To unsubscribe, e-mail: commits-unsubscribe@qpid.apache.org
For additional commands, e-mail: commits-help@qpid.apache.org

View raw message