qpid-commits mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Gordon Sim (Confluence)" <conflue...@apache.org>
Subject [CONF] Apache Qpid > DesignNotesOnReliableDistributedTopic
Date Wed, 16 Oct 2013 17:53:00 GMT
<html>
<head>
    <base href="https://cwiki.apache.org/confluence">
            <link rel="stylesheet" href="/confluence/s/en/2176/1/21/_/styles/combined.css?spaceKey=qpid&amp;forWysiwyg=true"
type="text/css">
    </head>
<body style="background: white;" bgcolor="white" class="email-body">
<div id="pageContent">
<div id="notificationFormat">
<div class="wiki-content">
<div class="email">
    <h2><a href="https://cwiki.apache.org/confluence/display/qpid/DesignNotesOnReliableDistributedTopic">DesignNotesOnReliableDistributedTopic</a></h2>
    <h4>Page <b>edited</b> by             <a href="https://cwiki.apache.org/confluence/display/~gsim@redhat.com">Gordon
Sim</a>
    </h4>
        <br/>
                         <h4>Changes (26)</h4>
                                 
    
<div id="page-diffs">
                    <table class="diff" cellpadding="0" cellspacing="0">
    
            <tr><td class="diff-changed-lines" ><span class="diff-added-words"style="background-color:
#dfd;">h1.</span> This is very much a work in progress! <br></td></tr>
            <tr><td class="diff-unchanged" > <br> <br></td></tr>
            <tr><td class="diff-deleted-lines" style="color:#999;background-color:#fdd;text-decoration:line-through;">Plan
of action: <br></td></tr>
            <tr><td class="diff-added-lines" style="background-color: #dfd;">h2.
Context <br></td></tr>
            <tr><td class="diff-unchanged" > <br></td></tr>
            <tr><td class="diff-added-lines" style="background-color: #dfd;">This
is a small step in the direction of clarifying what the &#39;dispatch router&#39;
might be in more detail. It is looking into one potential requirement to understand if/how
that could be delivered in keeping with what I understand to be the design principles of that
component. This isn&#39;t necessarily advocating that this is the most important feature
however. <br> <br>h3. Plan of action: <br> <br></td></tr>
            <tr><td class="diff-unchanged" >First explore design for exactly-once
pub-sub using AMQP throughout. <br> <br></td></tr>
            <tr><td class="diff-snipped" >...<br></td></tr>
            <tr><td class="diff-unchanged" >at the edges. <br> <br></td></tr>
            <tr><td class="diff-deleted-lines" style="color:#999;background-color:#fdd;text-decoration:line-through;">-------------------------------------------------------------------------
<br></td></tr>
            <tr><td class="diff-added-lines" style="background-color: #dfd;">h2.
Overview <br></td></tr>
            <tr><td class="diff-unchanged" > <br></td></tr>
            <tr><td class="diff-deleted-lines" style="color:#999;background-color:#fdd;text-decoration:line-through;">Requirement:
<br></td></tr>
            <tr><td class="diff-added-lines" style="background-color: #dfd;">h3.
Requirement: <br></td></tr>
            <tr><td class="diff-unchanged" > <br>* support for exactly-once
delivery in a pub-sub distribution pattern <br></td></tr>
            <tr><td class="diff-snipped" >...<br></td></tr>
            <tr><td class="diff-unchanged" >  failure <br> <br></td></tr>
            <tr><td class="diff-changed-lines" ><span class="diff-added-words"style="background-color:
#dfd;">h3.</span> Initial Assumptions: <br></td></tr>
            <tr><td class="diff-unchanged" > <br>* using AMQP at the edges
<br></td></tr>
            <tr><td class="diff-snipped" >...<br></td></tr>
            <tr><td class="diff-unchanged" >* any filters on receiving links are
applied only at the edge <br> <br></td></tr>
            <tr><td class="diff-changed-lines" ><span class="diff-added-words"style="background-color:
#dfd;">h3.</span> Design principles: <br></td></tr>
            <tr><td class="diff-unchanged" > <br>* router network never
assumes ultimate responsibility for messages <br></td></tr>
            <tr><td class="diff-snipped" >...<br></td></tr>
            <tr><td class="diff-unchanged" >  network; no synchronously replicated
state between routers <br> <br></td></tr>
            <tr><td class="diff-changed-lines" ><span class="diff-added-words"style="background-color:
#dfd;">h3.</span> Basic pattern: <br></td></tr>
            <tr><td class="diff-unchanged" > <br>When a receiving link (aka
a subscriber) for the topic attaches to a <br></td></tr>
            <tr><td class="diff-snipped" >...<br></td></tr>
            <tr><td class="diff-unchanged" >the container-id and link name of
the link and an indication of <br>whether there was any unsettled state associated with
the link at the <br></td></tr>
            <tr><td class="diff-changed-lines" >time of <span class="diff-changed-words">attach<span
class="diff-deleted-chars"style="color:#999;background-color:#fdd;text-decoration:line-through;">e</span>ment</span>
(i.e. loosely whether the link can be considered <br></td></tr>
            <tr><td class="diff-unchanged" >&#39;new&#39; link attaching
or an &#39;old&#39; link resuming). <br> <br></td></tr>
            <tr><td class="diff-snipped" >...<br></td></tr>
            <tr><td class="diff-unchanged" >settle the delivery for their respective
local subscribers. <br> <br></td></tr>
            <tr><td class="diff-changed-lines" ><span class="diff-changed-words">----<span
class="diff-deleted-chars"style="color:#999;background-color:#fdd;text-decoration:line-through;">-------------------------------------------------------------------</span></span>
<br></td></tr>
            <tr><td class="diff-unchanged" > <br></td></tr>
            <tr><td class="diff-changed-lines" ><span class="diff-added-words"style="background-color:
#dfd;">h2.</span> Delivery Ids for outgoing messages: <br></td></tr>
            <tr><td class="diff-unchanged" > <br>Scenario: There are two
connected routers, A and B, serving a given <br></td></tr>
            <tr><td class="diff-snipped" >...<br></td></tr>
            <tr><td class="diff-unchanged" >to the message on the first receving
router instance. <br> <br></td></tr>
            <tr><td class="diff-changed-lines" ><span class="diff-changed-words">----<span
class="diff-deleted-chars"style="color:#999;background-color:#fdd;text-decoration:line-through;">------------------------------------------------------------------</span></span>
<br></td></tr>
            <tr><td class="diff-unchanged" > <br></td></tr>
            <tr><td class="diff-changed-lines" ><span class="diff-added-words"style="background-color:
#dfd;">h2.</span> Inter-router communication: <br></td></tr>
            <tr><td class="diff-unchanged" > <br>Each router will forward
messages to all other routers that have <br></td></tr>
            <tr><td class="diff-snipped" >...<br></td></tr>
            <tr><td class="diff-unchanged" >response was expected from the failed
node. <br> <br></td></tr>
            <tr><td class="diff-changed-lines" ><span class="diff-changed-words">----<span
class="diff-deleted-chars"style="color:#999;background-color:#fdd;text-decoration:line-through;">-------------------------------------------------------------------</span></span>
<br></td></tr>
            <tr><td class="diff-unchanged" > <br></td></tr>
            <tr><td class="diff-deleted-lines" style="color:#999;background-color:#fdd;text-decoration:line-through;">Settlement:
<br></td></tr>
            <tr><td class="diff-added-lines" style="background-color: #dfd;">h2.
Settlement: <br></td></tr>
            <tr><td class="diff-unchanged" > <br>Once settled, a router
no longer needs to hold on to the message <br></td></tr>
            <tr><td class="diff-snipped" >...<br></td></tr>
            <tr><td class="diff-unchanged" >failover....] <br> <br></td></tr>
            <tr><td class="diff-changed-lines" ><span class="diff-changed-words">----<span
class="diff-deleted-chars"style="color:#999;background-color:#fdd;text-decoration:line-through;">-------------------------------------------------------------------</span></span>
<br></td></tr>
            <tr><td class="diff-unchanged" > <br></td></tr>
            <tr><td class="diff-deleted-lines" style="color:#999;background-color:#fdd;text-decoration:line-through;">Resuming
a publisher: <br></td></tr>
            <tr><td class="diff-added-lines" style="background-color: #dfd;">h2.
resuming links on failover <br></td></tr>
            <tr><td class="diff-unchanged" > <br></td></tr>
            <tr><td class="diff-added-lines" style="background-color: #dfd;">h3.
Resuming a publisher: <br> <br></td></tr>
            <tr><td class="diff-unchanged" >On having a publishing link attach
with unsettled state, the router to <br>which it attaches will examine its delivery
records to see which if <br></td></tr>
            <tr><td class="diff-snipped" >...<br></td></tr>
            <tr><td class="diff-unchanged" >until all interested routers and local
subscribers have accepted them. <br> <br></td></tr>
            <tr><td class="diff-changed-lines" ><span class="diff-added-words"style="background-color:
#dfd;">h3.</span> Resuming a subscriber: <br></td></tr>
            <tr><td class="diff-unchanged" > <br>On having a receiving link
attach with unsettled state, the router <br></td></tr>
            <tr><td class="diff-snipped" >...<br></td></tr>
            <tr><td class="diff-unchanged" >resend to that receiver and track
the receivers acceptance of it. <br> <br></td></tr>
            <tr><td class="diff-changed-lines" ><span class="diff-changed-words">----<span
class="diff-deleted-chars"style="color:#999;background-color:#fdd;text-decoration:line-through;">-------------------------------------------------------------------</span></span>
<br></td></tr>
            <tr><td class="diff-unchanged" > <br>Delivery records kept for
topic: <br></td></tr>
            <tr><td class="diff-snipped" >...<br></td></tr>
            <tr><td class="diff-unchanged" >  be tracked by the router to which
the publisher of the message is <br>  attached. <br></td></tr>
            <tr><td class="diff-deleted-lines" style="color:#999;background-color:#fdd;text-decoration:line-through;">
<br>----------------------------------------------------------------------- <br></td></tr>
    
            </table>
    </div>                            <h4>Full Content</h4>
                    <div class="notificationGreySide">
        <h1><a name="DesignNotesOnReliableDistributedTopic-Thisisverymuchaworkinprogress%21"></a>This
is very much a work in progress!</h1>


<h2><a name="DesignNotesOnReliableDistributedTopic-Context"></a>Context</h2>

<p>This is a small step in the direction of clarifying what the 'dispatch router' might
be in more detail. It is looking into one potential requirement to understand if/how that
could be delivered in keeping with what I understand to be the design principles of that component.
This isn't necessarily advocating that this is the most important feature however.</p>

<h3><a name="DesignNotesOnReliableDistributedTopic-Planofaction%3A"></a>Plan
of action:</h3>

<p>First explore design for exactly-once pub-sub using AMQP throughout.</p>

<p>Then consider how this design would be affected by (a) relaxing the<br/>
delivery guarantee to at-least-once and, orthogonally, (b) using MQTT<br/>
at the edges.</p>

<h2><a name="DesignNotesOnReliableDistributedTopic-Overview"></a>Overview</h2>

<h3><a name="DesignNotesOnReliableDistributedTopic-Requirement%3A"></a>Requirement:</h3>

<ul>
	<li>support for exactly-once delivery in a pub-sub distribution pattern<br/>
  where publishers and subscribers are connected through a network of<br/>
  dispatch router instances</li>
</ul>


<ul>
	<li>delivery guarantee holds even in the face of individual router<br/>
  failure</li>
</ul>


<h3><a name="DesignNotesOnReliableDistributedTopic-InitialAssumptions%3A"></a>Initial
Assumptions:</h3>

<ul>
	<li>using AMQP at the edges</li>
</ul>


<ul>
	<li>single, non-hierarchical topic</li>
</ul>


<ul>
	<li>any filters on receiving links are applied only at the edge</li>
</ul>


<h3><a name="DesignNotesOnReliableDistributedTopic-Designprinciples%3A"></a>Design
principles:</h3>

<ul>
	<li>router network never assumes ultimate responsibility for messages<br/>
  (i.e. no store-and-forward, only accept message from publishers when<br/>
  it has already been accepted by subscribers)</li>
</ul>


<ul>
	<li>only rely on weak consistency between router instances in the<br/>
  network; no synchronously replicated state between routers</li>
</ul>


<h3><a name="DesignNotesOnReliableDistributedTopic-Basicpattern%3A"></a>Basic
pattern:</h3>

<p>When a receiving link (aka a subscriber) for the topic attaches to a<br/>
router, it will communicate that fact to all other routers, including<br/>
the container-id and link name of the link and an indication of<br/>
whether there was any unsettled state associated with the link at the<br/>
time of attachment (i.e. loosely whether the link can be considered<br/>
'new' link attaching or an 'old' link resuming).</p>

<p>Any router receiving messages for the topic on a sending link<br/>
(i.e. from a publisher), will forward that message to all other<br/>
routers in the network that have subscribers attached to them (along<br/>
the best path available).</p>

<p>Each router needs to track the acceptance of a given message from its<br/>
local subscribers. It will signal its own acceptance when all its<br/>
local subscribers have accepted.</p>

<p>The router to which the publishing link is attached will additionally<br/>
track the acceptance from all the other router instances to which it<br/>
forwarded the message and will only communicate its acceptance to that<br/>
publisher on receipt of acceptance both from all its local subscribers<br/>
and all those other router instances.</p>

<p>The publisher, on receiving acceptance, will then settle the<br/>
message. The settlement will be relayed to each local subscriber and<br/>
each router that sent an acceptance. All the other routers likewise<br/>
settle the delivery for their respective local subscribers.</p>

<hr />

<h2><a name="DesignNotesOnReliableDistributedTopic-DeliveryIdsforoutgoingmessages%3A"></a>Delivery
Ids for outgoing messages:</h2>

<p>Scenario: There are two connected routers, A and B, serving a given<br/>
topic. A publisher and a subscriber for that topic are both connected to<br/>
router A. The publisher sends a message which the router forwards to<br/>
the subscriber. Before the subscriber accepts this, router A<br/>
fails. Both the publisher and the subscriber failover to router B and<br/>
attempt to recover their links.</p>

<p>It must be possible to match a delivery to a subscriber with the<br/>
corresponding delivery from a publisher. However since there may be<br/>
multiple publisher streams being combined for delivery to a subscriber,<br/>
the publisher's delivery tag is not on its own sufficiently unique.</p>

<p>Ideally, the delivery tag assigned by the router network would include<br/>
the publishers container id and link name as well as the original<br/>
delivery tag.</p>

<p>However the delivery-tag is limited in size (it can contain 32 bytes<br/>
at most).</p>

<p>Since we don't want to have to maintain a per-message mapping, we<br/>
either need to rely on the publishers tag being less than 32 bytes or<br/>
we need to use a delivery tag larger than that for subscribers in<br/>
order to be able to retain the original while ensuring uniqueness.</p>

<p>Routers also need to be able to identify the original publisher of any<br/>
message they have received. That can be done by adding an annotation<br/>
to the message on the first receving router instance.</p>

<hr />

<h2><a name="DesignNotesOnReliableDistributedTopic-Interroutercommunication%3A"></a>Inter-router
communication:</h2>

<p>Each router will forward messages to all other routers that have<br/>
subscribers for the topic. This needs to be 'reliable' in the sense<br/>
that messages can't go missing. However we can resend and it doesn't<br/>
need to use the AMQP defined exactly-once procedure between<br/>
routers. To start with I'll assume it doesn't for simplicity, this can<br/>
be revisited later.</p>

<p>(Question: If routers C and D are both reached by A through an intermediary<br/>
router B, and all of B, C and D have subscribers, should one message<br/>
be sent by A to all of them, or would three messages each with an<br/>
explicit target router be sent?)</p>

<p>So, delivery record at a given node tracks:</p>

<p>(a) local subscribers and the state of the delivery to each of them,<br/>
and</p>

<p>(b) the other router instances with subscribers for the topic</p>

<p>It will signal its own acceptance when all its local subscribers have<br/>
accepted. The router to which the publishing link is attached will<br/>
send its acceptance to that publisher only when both all its local<br/>
subscribers have accepted, and all the router nodes it is expecting to<br/>
have accepted it.</p>

<p>Each router tracks the set of subscribers connected at each other<br/>
router. On the failure of a router instance, any other router that was<br/>
waiting for an acceptance from that failed router will track the<br/>
reconnection of the subscribers on that failed router to (an)other<br/>
router(s). There will also be a timeout such that if they don't<br/>
retattach, any state associated with the link is discarded. (This<br/>
would be the value for the timeout field of a terminus that the router<br/>
would accept).</p>

<p>When a receiving link attaches to a router, it will send out a message<br/>
describing that to all other routers in the network. Likewise it will<br/>
notify the network if/when that link is closed.</p>

<p>Routers track the subscriber identifiers (container-id and link name)<br/>
attached to all other nodes in the network. (TODO: This is not ideal<br/>
in terms of scalability, and there may be some ways to relax the<br/>
requirement a little).</p>

<p>On failure of a router, the rest of the network can then determine<br/>
when all the receivers that were attached to that failed node have<br/>
reattached, and can update their records of any deliveries for which a<br/>
response was expected from the failed node.</p>

<hr />

<h2><a name="DesignNotesOnReliableDistributedTopic-Settlement%3A"></a>Settlement:</h2>

<p>Once settled, a router no longer needs to hold on to the message<br/>
itself, but we do need to track the delivery until we are confident<br/>
that every receiver knows it has been settled. This allows us to<br/>
assume that when resuming a receiver link, any unsettled deliveries<br/>
declared by the receiver that the router is unware of, have yet to<br/>
make it to that router.</p>

<p>With AMQP 1.0 there is no indication sent back by receivers to<br/>
indicate that they know a given delivery has been settled (since it is<br/>
assumed that the sender has already discarded any record of the<br/>
delivery). We could however <b>infer</b> that they know it has been settled<br/>
when they have responded to something sent after the disposition<br/>
notifying them of that settlement.</p>

<p>The router to which the publisher of a message is attached will relay<br/>
the settlement and will also subsquently communicate to all nodes that<br/>
the delivery record can be deleted when a subsequent delivery from a<br/>
publisher who is also attached to this router is accepted.</p>

<p>[TODO: Ordering assumption here needs to be scrutinised in the face of<br/>
failover....]</p>

<hr />

<h2><a name="DesignNotesOnReliableDistributedTopic-resuminglinksonfailover"></a>resuming
links on failover</h2>

<h3><a name="DesignNotesOnReliableDistributedTopic-Resumingapublisher%3A"></a>Resuming
a publisher:</h3>

<p>On having a publishing link attach with unsettled state, the router to<br/>
which it attaches will examine its delivery records to see which if<br/>
any of the unsettled deliveries it has any record of.</p>

<p>For those deliveries for which it already has a record, it will start<br/>
tracking acceptance from all other interested routers, to which these<br/>
deliveries will now be resent.</p>

<p>It will request that the publisher resend any deliveries it was not<br/>
aware of, which will be then added to its records and forwarded under<br/>
a disambiguated delivery tag to its local subscribers if any exist, as<br/>
well as all interested routers in the network.</p>

<p>The router will not indicate acceptance of any of these deliveries<br/>
until all interested routers and local subscribers have accepted them.</p>

<h3><a name="DesignNotesOnReliableDistributedTopic-Resumingasubscriber%3A"></a>Resuming
a subscriber:</h3>

<p>On having a receiving link attach with unsettled state, the router<br/>
will compare the unsettled delivery states as presented by the<br/>
receiver with its own records. It can respond with its own view of<br/>
delivery state for any delivery it already has a record of. Those<br/>
deliveries it does not have a record of could have been settled or not<br/>
yet received.</p>

<p>For deliveries the receiver considers unsettled and the router has a<br/>
record of, if the record indicates the delivery was settled, then the<br/>
router can omit the delivery from its own unsettled map, otherwise the<br/>
router will start tracking this receiver against that record and<br/>
include these in the unsettled map we send back. If receiver indicates<br/>
it has accepted, we can mark it as accepted by them.</p>

<p>For deliveries the receiver considers unsettled but the router has no<br/>
record of, the router assumes it has yet to receive those messages.<br/>
It creates a placeholder record for them and starts tracking this<br/>
receiver against that. It shouldn't deliver any messages to that<br/>
receiver until all preceding placeholders have been 'fulfilled',<br/>
i.e. until it has received a message matching that delivery, updated<br/>
the placeholder record accordingly and forwarded that message to this<br/>
receiver (and any others that may have also resumed including that<br/>
delivery in their unsettled state).</p>

<p>Any deliveries receiver doesn't report in its unsettled map either<br/>
have been settled (and are no longer relevant) or have yet to be<br/>
delivered to that receiver. Any unsettled records for the topic that<br/>
the router has that are not in the receivers unsettled map, it should<br/>
resend to that receiver and track the receivers acceptance of it.</p>

<hr />

<p>Delivery records kept for topic:</p>

<ul>
	<li>the message</li>
</ul>


<ul>
	<li>the publishers identity</li>
</ul>


<ul>
	<li>the publishers original delivery tag</li>
</ul>


<ul>
	<li>the outgoing delivery tag (includes the publisher tag, adds<br/>
  disambiguating pre- or post- fix)</li>
</ul>


<ul>
	<li>a map of local subscribers and their respective delivery statuses<br/>
  (i.e. in-doubt, accepted, settled)</li>
</ul>


<ul>
	<li>a map of other interested routers and their respective delivery<br/>
  statuses (i.e. in-doubt, accepted, settled) Note that this will only<br/>
  be tracked by the router to which the publisher of the message is<br/>
  attached.</li>
</ul>

    </div>
        <div id="commentsSection" class="wiki-content pageSection">
        <div style="float: right;" class="grey">
                        <a href="https://cwiki.apache.org/confluence/users/removespacenotification.action?spaceKey=qpid">Stop
watching space</a>
            <span style="padding: 0px 5px;">|</span>
                <a href="https://cwiki.apache.org/confluence/users/editmyemailsettings.action">Change
email notification preferences</a>
</div>
        <a href="https://cwiki.apache.org/confluence/display/qpid/DesignNotesOnReliableDistributedTopic">View
Online</a>
        |
        <a href="https://cwiki.apache.org/confluence/pages/diffpagesbyversion.action?pageId=34836513&revisedVersion=2&originalVersion=1">View
Changes</a>
            </div>
</div>
</div>
</div>
</div>
</body>
</html>

---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@qpid.apache.org
For additional commands, e-mail: commits-help@qpid.apache.org


Mime
View raw message