qpid-commits mailing list archives

From conflue...@apache.org
Subject [CONF] Apache Qpid: Cluster Design Note (page edited)
Date Mon, 02 Apr 2007 13:33:00 GMT
<html>
<head>
    <base href="http://cwiki.apache.org/confluence" />
    <style type="text/css">
    <!--
    body, p, td, table, tr, .bodytext, .stepfield {
	font-family: Verdana, arial, sans-serif;
	font-size: 11px;
	line-height: 16px;
	color: #000000;
	font-weight: normal;
}
#PageContent {
	text-align: left;
	background-color: #fff;
	padding: 0px;
	margin: 0px;
    padding-bottom:20px;
}
/*
** when this stylesheet is used for the Tiny MCE Wysiwyg editor's edit area, we can't
** use an id=PageContent or class=wiki-content, so we must
** set the body style to that used for PageContent, and p to that used for wiki-content.
*/

body {
	margin: 0px;
	padding: 0px;
	text-align: center;
    background-color: #f0f0f0;
}

@media print {

body {
    background-color: #fff;
}

}

.monospaceInput {
    font:12px monospace
}

.wiki-content p, .commentblock p {
    margin: 16px 0px 16px 0px;
    padding: 0px;
}

.wiki-content-preview {
    padding: 5px;
    border-left: 1px solid #3c78b5;
    border-right: 1px solid #3c78b5;
}

ul, ol {
    margin-top: 2px;
    margin-bottom: 2px;
    padding-top: 0px;
    padding-bottom: 0px;
}

pre {
    padding: 0px;
    margin-top: 5px;
    margin-left: 15px;
    margin-bottom: 5px;
    margin-right: 5px;
    text-align: left;
}

.helpheading {
    font-weight: bold;
    background-color: #D0D9BD;
        border-bottom: 1px solid #3c78b5;
        padding: 4px 4px 4px 4px;
        margin: 0px;
        margin-top: 10px;
}
.helpcontent {
        padding: 4px 4px 20px 4px;
    background-color: #f5f7f1;
}

.code {
 	border: 1px dashed #3c78b5;
    font-size: 11px;
	font-family: Courier;
    margin: 10px;
	line-height: 13px;
}

.focusedComment {
    background: #ffffce;
}

.commentBox, .focusedComment {
    padding: 10px;
    margin: 5px 0 5px 0;
    border: 1px #bbb solid;
}

.codeHeader {
    background-color: #f0f0f0;
 	border-bottom: 1px dashed #3c78b5;
    padding: 3px;
	text-align: center;
}

.codeContent {
    text-align: left;
    background-color: #f0f0f0;
    padding: 3px;
}

.preformatted {
 	border: 1px dashed #3c78b5;
    font-size: 11px;
	font-family: Courier;
    margin: 10px;
	line-height: 13px;
}

.preformattedHeader {
    background-color: #f0f0f0;
 	border-bottom: 1px dashed #3c78b5;
    padding: 3px;
	text-align: center;
}

.preformattedContent {
    background-color: #f0f0f0;
    padding: 3px;
}

.panel {
 	border: 1px dashed #3c78b5;
    margin: 10px;
    margin-top: 0px;
}

.panelHeader {
    background-color: #f0f0f0;
 	border-bottom: 1px dashed #3c78b5;
    padding: 3px;
	text-align: center;
}

.panelContent {
    background-color: #f0f0f0;
    padding: 5px;
}

.anonymousAlert {
    background-color: #f0f0f0;
 	border: 1px dashed red;
    font-size: 11px;
    padding: 10px 5px 10px 5px;
    margin: 4px;
	line-height: 13px;
}

.lockAlert {
    background-color: #f0f0f0;
    width: 50%;
 	border: 1px dashed red;
    font-size: 11px;
    padding: 10px 5px 10px 5px;
    margin: 4px;
	line-height: 13px;
}


.code-keyword {
  color: #000091;
  background-color: inherit;
}

.code-object {
  color: #910091;
  background-color: inherit;
}

.code-quote {
  color: #009100;
  background-color: inherit;
}

.code-comment {
  color: #808080;
  background-color: inherit;
}


.code-xml .code-keyword {
  color: inherit;
  font-weight: bold;
}

.code-tag {
  color: #000091;
  background-color: inherit;
}

.breadcrumbs {
    background-color: #f0f0f0;
 	border-color: #3c78b5;
	border-width: 1px 0px 1px 0px;
	border-style: solid;
    font-size: 11px;
    padding: 3px 0px 3px 0px;
}

.navmenu {
    border: 1px solid #ccc;
}

.menuheading {
    font-weight: bold;
    background-color: #f0f0f0;
 	border-bottom: 1px solid #3c78b5;
	padding: 4px 4px 2px 4px;
}

.menuitems {
	padding: 4px 4px 20px 4px;
}

.rightpanel {
    border-left: 1px solid #ccc;
    border-bottom: 1px solid #ccc;
}

#helpheading {
    text-align: left;
    font-weight: bold;
    background-color: #D0D9BD;
 	border-bottom: 1px solid #3c78b5;
	padding: 4px 4px 4px 4px;
	margin: 0px;
}
#helpcontent {
	padding: 4px 4px 4px 4px;
    background-color: #f5f7f1;
}
.helptab-unselected {
    font-weight: bold;
	padding: 5px;
    background-color: #f5f7f1;
}
.helptab-selected {
    font-weight: bold;
    background-color: #D0D9BD;
	padding: 5px;
}
.helptabs {
    margin: 0px;
    background-color: #f5f7f1;
	padding: 5px;
}
.infopanel-heading {
    font-weight: bold;
	padding: 4px 0px 2px 0px;
}

.pagebody {
}

.pageheader {
	padding: 5px 5px 5px 0px;
 	border-bottom: 1px solid #3c78b5;
}

.pagetitle {
	font-size: 22px;
	font-weight: bold;
	font-family: Arial, sans-serif;
	color: #003366;
}

.newpagetitle {
    color: #ccc !important;
}

.steptitle {
	font-size: 18px;
	font-weight: bold;
	font-family: Arial, sans-serif;
	color: #003366;
	margin-bottom: 7px;
}

.substeptitle {
    font-size: 12px;
    font-weight: bold;
    font-family: Arial, sans-serif;
    color: #003366;
    margin: 2px 4px 4px 4px;
    padding: 2px 4px 1px 4px;
}

.stepdesc {
    font-family: Verdana, arial, sans-serif;
	font-size: 11px;
	line-height: 16px;
	font-weight: normal;
    color: #666666;
    margin-top: 7px;
    margin-bottom: 7px;
}

.steplabel {
    font-weight: bold;
    margin-right: 4px;
    color: black;
    float: left;
    width: 15%;
    text-align: right;
}

.stepfield {
    background: #f0f0f0;
    padding: 5px;
}

.submitButtons{
    margin-top:5px;
    text-align:right;
}

.formtitle {
	font-size: 12px;
	font-weight: bold;
	font-family: Arial, sans-serif;
	color: #003366;
}

.sectionbottom {
    border-bottom: 1px solid #3c78b5;
}

.topRow {
    border-top: 2px solid #3c78b5;
}

.tabletitle {
	font-size: 14px;
	font-weight: bold;
	font-family: Arial, sans-serif;
    padding: 3px 0px 2px 0px;
    margin: 8px 4px 2px 0px;
	color: #003366;
	border-bottom: 2px solid #3c78b5;
}
.pagesubheading {
    color: #666666;
    font-size: 10px;
    padding: 0px 0px 5px 0px;
}

HR {
	color: #3c78b5;
	height: 1px;
}

A:link, A:visited, A:active, A:hover {
	color: #003366;
}

h1 A:link, h1 A:visited, h1 A:active {
	text-decoration: none;
}

h1 A:hover {
    border-bottom: 1px dotted #003366;
}

.wiki-content > :first-child, .commentblock > :first-child {
    margin-top: 3px;
}

.logocell {
    padding: 10px;
}

input {
	font-family: verdana, geneva, arial, sans-serif;
	font-size: 11px;
	color: #000000;
}

textarea, textarea.editor {
	font-family: verdana, geneva, arial, sans-serif;
	font-size: 11px;
	color: #333333;
}

/* use logoSpaceLink instead.
.spacenametitle {
	font: 21px/31px Impact, Arial, Helvetica;
    font-weight: 100;
    color: #999999;
	margin: 0px;
}
.spacenametitle img {
  margin: 0 0 -4px 0;
}
.spacenametitle a {
    text-decoration: none;
    color: #999999;
}
.spacenametitle a:visited {
    text-decoration: none;
    color: #999999;
}*/

.spacenametitle-printable {
	font: 20px/25px Impact, Arial, Helvetica;
    font-weight: 100;
    color: #999999;
	margin: 0px;
}
.spacenametitle-printable a {
    text-decoration: none;
    color: #999999;
}
.spacenametitle-printable a:visited {
    text-decoration: none;
    color: #999999;
}

.blogDate {
	font-weight: bold;
	text-decoration: none;
	color: black;
}

.blogSurtitle {
    background: #f0f0f0;
 	border: 1px solid #ddd;
	padding: 3px;
	margin: 1px 1px 10px 1px;
}

.blogHeading {
    font-size: 20px;
    line-height: normal;
    font-weight: bold;
    padding: 0px;
    margin: 0px;
}

.blogHeading a {
   text-decoration: none;
   color: black;
}

.endsection {
	text-align: right;
	color: #666666;
	margin-top: 10px;
}
.endsectionleftnav {
	text-align: right;
	color: #666666;
	margin-top: 10px;
}

h1 {
	font-size: 24px;
	line-height: normal;
	font-weight: bold;
	background-color: #f0f0f0;
	color: #003366;
 	border-bottom: 1px solid #3c78b5;
	padding: 2px;
	margin: 36px 0px 4px 0px;
}

h2 {
	font-size: 18px;
	line-height: normal;
	font-weight: bold;
	background-color: #f0f0f0;
 	border-bottom: 1px solid #3c78b5;
	padding: 2px;
	margin: 27px 0px 4px 0px;
}

h3 {
	font-size: 14px;
	line-height: normal;
	font-weight: bold;
	background-color: #f0f0f0;
	padding: 2px;
	margin: 21px 0px 4px 0px;
}

h4 {
	font-size: 12px;
	line-height: normal;
	font-weight: bold;
	background-color: #f0f0f0;
	padding: 2px;
	margin: 18px 0px 4px 0px;
}

h4.search {
	font-size: 12px;
	line-height: normal;
	font-weight: normal;
	background-color: #f0f0f0;
	padding: 4px;
	margin: 18px 0px 4px 0px;
}

h5 {
	font-size: 10px;
	line-height: normal;
	font-weight: bold;
	background-color: #f0f0f0;
	padding: 2px;
	margin: 14px 0px 4px 0px;
}

h6 {
	font-size: 8px;
	line-height: normal;
	font-weight: bold;
	background-color: #f0f0f0;
	padding: 2px;
	margin: 14px 0px 4px 0px;
}

.smallfont {
    font-size: 10px;
}
.descfont {
    font-size: 10px;
    color: #666666;
}
.smallerfont {
    font-size: 9px;
}
.smalltext {
    color: #666666;
    font-size: 10px;
}
.smalltext a {
    color: #666666;
}
.smalltext-blue {
    color: #3c78b5;
    font-size: 10px;
}
.surtitle {
    margin-left: 1px;
    margin-bottom: 5px;
    font-size: 14px;
    color: #666666;
}

/* css hack found here:  http://www.fo3nix.pwp.blueyonder.co.uk/tutorials/css/hacks/ */
.navItemOver { font-size: 10px; font-weight: bold; color: #ffffff; background-color: #003366; cursor: hand; voice-family: '\'}\''; voice-family:inherit; cursor: pointer;}
.navItemOver a { color: #ffffff; background-color:#003366; text-decoration: none; }
.navItemOver a:visited { color: #ffffff; background-color:#003366; text-decoration: none; }
.navItemOver a:hover { color: #ffffff; background-color:#003366; text-decoration: none; }
.navItem { font-size: 10px; font-weight: bold; color: #ffffff; background-color: #3c78b5; }
.navItem a { color: #ffffff; text-decoration: none; }
.navItem a:hover { color: #ffffff; text-decoration: none; }
.navItem a:visited { color: #ffffff; text-decoration: none; }

div.padded { padding: 4px; }
div.thickPadded { padding: 10px; }
h3.macrolibrariestitle {
    margin: 0px 0px 0px 0px;
}

div.centered { text-align: center; margin: 10px; }
div.centered table {margin: 0px auto; text-align: left; }

.tableview table {
    margin: 0;
}

.tableview th {
    text-align: left;
    color: #003366;
    font-size: 12px;
    padding: 5px 0px 0px 5px;
    border-bottom: 2px solid #3c78b5;
}
.tableview td {
    text-align: left;
    border-color: #ccc;
    border-width: 0px 0px 1px 0px;
    border-style: solid;
    margin: 0;
    padding: 4px 10px 4px 5px;
}

.grid {
    margin: 2px 0px 5px 0px;
    border-collapse: collapse;
}
.grid th  {
    border: 1px solid #ccc;
    padding: 2px 4px 2px 4px;
    background: #f0f0f0;
    text-align: center;
}
.grid td  {
    border: 1px solid #ccc;
    padding: 3px 4px 3px 4px;
}
.gridHover {
	background-color: #f9f9f9;
}

td.infocell {
    background-color: #f0f0f0;
}
.label {
	font-weight: bold;
	color: #003366;
}

label {
	font-weight: bold;
	color: #003366;
}

.error {
	background-color: #fcc;
}

.errorBox {
	background-color: #fcc;
    border: 1px solid #c00;
    padding: 5px;
    margin: 5px;
}

.errorMessage {
	color: #c00;
}

.success {
	background-color: #dfd;
}

.successBox {
	background-color: #dfd;
    border: 1px solid #090;
    padding: 5px;
    margin-top:5px;
    margin-bottom:5px;
}

blockquote {
	padding-left: 10px;
	padding-right: 10px;
	margin-left: 5px;
	margin-right: 0px;
	border-left: 1px solid #3c78b5;
}

table.confluenceTable
{
    margin: 5px;
    border-collapse: collapse;
}

/* Added as a temporary fix for CONF-4223. The table elements appear to be inheriting the border: none attribute from the sectionMacro class */
table.confluenceTable td.confluenceTd
{
    border-width: 1px;
    border-style: solid;
    border-color: #ccc;
    padding: 3px 4px 3px 4px;
}

/* Added as a temporary fix for CONF-4223. The table elements appear to be inheriting the border: none attribute from the sectionMacro class */
table.confluenceTable th.confluenceTh
{
    border-width: 1px;
    border-style: solid;
    border-color: #ccc;
    padding: 3px 4px 3px 4px;
    background-color: #f0f0f0;
    text-align: center;
}

td.confluenceTd
{
    border-width: 1px;
    border-style: solid;
    border-color: #ccc;
    padding: 3px 4px 3px 4px;
}

th.confluenceTh
{
    border-width: 1px;
    border-style: solid;
    border-color: #ccc;
    padding: 3px 4px 3px 4px;
    background-color: #f0f0f0;
    text-align: center;
}

DIV.small {
	font-size: 9px;
}

H1.pagename {
	margin-top: 0px;
}

IMG.inline  {}

.loginform {
    margin: 5px;
    border: 1px solid #ccc;
}

/* How the text of the "This is a preview" comment should be shown. */
.previewnote { text-align: center;
                font-size: 11px;
                    color: red; }

/* How the preview content should be shown */
.previewcontent { background: #E0E0E0; }

/* How the system messages should be shown (DisplayMessage.jsp) */
.messagecontent { background: #E0E0E0; }

/* How the "This page has been modified..." -comment should be shown. */
.conflictnote { }

.createlink {
    color: maroon;
}
a.createlink {
    color: maroon;
}
.templateparameter {
    font-size: 9px;
    color: darkblue;
}

.diffadded {
    background: #ddffdd;
    padding: 1px 1px 1px 4px;
	border-left: 4px solid darkgreen;
}
.diffdeleted {
    color: #999;
    background: #ffdddd;
    padding: 1px 1px 1px 4px;
	border-left: 4px solid darkred;
}
.diffnochange {
    padding: 1px 1px 1px 4px;
	border-left: 4px solid lightgrey;
}
.differror {
    background: brown;
}
.diff {
    font-family: lucida console, courier new, fixed-width;
	font-size: 12px;
	line-height: 14px;
}
.diffaddedchars {
    background-color:#99ff99;
    font-weight:bolder;
}
.diffremovedchars {
    background-color:#ff9999;
    text-decoration: line-through;
    font-weight:bolder;
}

.greybackground {
    background: #f0f0f0
}

.greybox {
 	border: 1px solid #ddd;
	padding: 3px;
	margin: 1px 1px 10px 1px;
}

.borderedGreyBox {
    border: 1px solid #cccccc;
    background-color: #f0f0f0;
    padding: 10px;
}

.greyboxfilled {
 	border: 1px solid #ddd;
    background: #f0f0f0;
    padding: 3px;
	margin: 1px 1px 10px 1px;
}

.navBackgroundBox {
    padding: 5px 5px 5px 5px;
    font-size: 22px;
	font-weight: bold;
	font-family: Arial, sans-serif;
	color: white;
    background: #3c78b5;
    text-decoration: none;
}

.previewBoxTop {
	background-color: #f0f0f0;
    border-width: 1px 1px 0px 1px;
    border-style: solid;
    border-color: #3c78b5;
    padding: 5px;
    margin: 5px 0px 0px 0px;
    text-align: center;
}
.previewContent {
    background-color: #fff;
 	border-color: #3c78b5;
	border-width: 0px 1px 0px 1px;
	border-style: solid;
	padding: 10px;
	margin: 0px;
}
.previewBoxBottom {
	background-color: #f0f0f0;
    border-width: 0px 1px 1px 1px;
    border-style: solid;
    border-color: #3c78b5;
    padding: 5px;
    margin: 0px 0px 5px 0px;
    text-align: center;
}

.functionbox {
    background-color: #f0f0f0;
 	border: 1px solid #3c78b5;
	padding: 3px;
	margin: 1px 1px 10px 1px;
}

.functionbox-greyborder {
    background-color: #f0f0f0;
 	border: 1px solid #ddd;
	padding: 3px;
	margin: 1px 1px 10px 1px;
}

.search-highlight {
    background-color: #ffffcc;
}

/* normal (white) background */
.rowNormal {
    background-color: #ffffff;
 }

/* alternate (light grey) background */
.rowAlternate {
    background-color: #f7f7f7;
}

/* used in the list attachments table */
.rowAlternateNoBottomColor {
    background-color: #f7f7f7;
}

.rowAlternateNoBottomNoColor {
}

.rowAlternateNoBottomColor td {
    border-bottom: 0px;
}

.rowAlternateNoBottomNoColor td {
    border-bottom: 0px;
}

/* row highlight (grey) background */
.rowHighlight {
    background-color: #f0f0f0;

}

TD.greenbar {FONT-SIZE: 2px; BACKGROUND: #00df00; BORDER: 1px solid #9c9c9c; PADDING: 0px; }
TD.redbar {FONT-SIZE: 2px; BACKGROUND: #df0000; BORDER: 1px solid #9c9c9c; PADDING: 0px; }
TD.darkredbar {FONT-SIZE: 2px; BACKGROUND: #af0000; BORDER: 1px solid #9c9c9c; PADDING: 0px; }

TR.testpassed {FONT-SIZE: 2px; BACKGROUND: #ddffdd; PADDING: 0px; }
TR.testfailed {FONT-SIZE: 2px; BACKGROUND: #ffdddd; PADDING: 0px; }

.toolbar  {
    margin: 0px;
    border-collapse: collapse;
}

.toolbar td  {
    border: 1px solid #ccc;
    padding: 2px 2px 2px 2px;
    color: #ccc;
}

td.noformatting {
    border-width: 0px;
    border-style: none;
    text-align: center;
	padding: 0px;
}

.commentblock {
    margin: 12px 0 12px 0;
}

/*
 * Divs displaying the license information, if necessary.
 */
.license-eval, .license-none, .license-nonprofit {
    border-top: 1px solid #bbbbbb;
    text-align: center;
    font-size: 10px;
    font-family: Verdana, Arial, Helvetica, sans-serif;
}

.license-eval, .license-none {
    background-color: #ffcccc;
}

.license-eval b, .license-none b {
    color: #990000
}

.license-nonprofit {
    background-color: #ffffff;
}

/*
 * The shadow at the bottom of the page between the main content and the
 * "powered by" section.
 */
.bottomshadow {
    height: 12px;
    background-image: url("$req.contextPath/images/border/border_bottom.gif");
    background-repeat: repeat-x;
}

/*
 * Styling of the operations box
 */
.navmenu .operations li, .navmenu .operations ul {
    list-style: none;
    margin-left: 0;
    padding-left: 0;
}

.navmenu .operations ul {
    margin-bottom: 9px;
}

.navmenu .label {
    font-weight: inherit;
}

/*
 * Styling of ops as a toolbar
 */
.toolbar div {
    display: none;
}

.toolbar .label {
    display: none;
}

.toolbar .operations {
    display: block;
}

.toolbar .operations ul {
    display: inline;
    list-style: none;
    margin-left: 10px;
    padding-left: 0;
}

.toolbar .operations li {
    list-style: none;
    display: inline;
}

/* list page navigational tabs */
#foldertab {
padding: 3px 0px 3px 8px;
margin-left: 0;
border-bottom: 1px solid #3c78b5;
font: bold 11px Verdana, sans-serif;
}

#foldertab li {
list-style: none;
margin: 0;
display: inline;
}

#foldertab li a {
padding: 3px 0.5em;
margin-left: 3px;
border: 1px solid #3c78b5;
border-bottom: none;
background: #3c78b5;
text-decoration: none;
}

#foldertab li a:link { color: #ffffff; }
#foldertab li a:visited { color: #ffffff; }

#foldertab li a:hover {
color: #ffffff;
background: #003366;
border-color: #003366;
}

#foldertab li a.current {
background: white;
border-bottom: 1px solid white;
color: black;
}

#foldertab li a.current:link { color: black; }
#foldertab li a.current:visited { color: black; }
#foldertab li a.current:hover {
background: white;
border-bottom: 1px solid white;
color: black;
}

/* alphabet list */
ul#squaretab {
margin-left: 0;
padding-left: 0;
white-space: nowrap;
font: bold 8px Verdana, sans-serif;
}

#squaretab li {
display: inline;
list-style-type: none;
}

#squaretab a {
padding: 2px 6px;
border: 1px solid #3c78b5;
}

#squaretab a:link, #squaretab a:visited {
color: #fff;
background-color: #3c78b5;
text-decoration: none;
}

#squaretab a:hover {
color: #ffffff;
background-color: #003366;
border-color: #003366;
text-decoration: none;
}

#squaretab li a#current {
background: white;
color: black;
}

.blogcalendar * {
    font-family:verdana, arial, sans-serif;
    font-size:x-small;
    font-weight:normal;
    line-height:140%;
    padding:2px;
}


table.blogcalendar {
    border: 1px solid #3c78b5;
}

.blogcalendar th.calendarhead, a.calendarhead {
    font-size:x-small;
    font-weight:bold;
    padding:2px;
    text-transform:uppercase;
    background-color: #3c78b5;
    color: #ffffff;
    letter-spacing: .3em;
}

.calendarhead:visited {color: white;}
.calendarhead:active {color: white;}
.calendarhead:hover {color: white;}

.blogcalendar th {
    font-size:x-small;
    font-weight:bold;
    padding:2px;
    background-color:#f0f0f0;
}

.blogcalendar td {
    font-size:x-small;
    font-weight:normal;
}

.searchGroup { padding: 0 0 10px 0; background: #f0f0f0; }
.searchGroupHeading { font-size: 10px; font-weight: bold; color: #ffffff; background-color: #3c78b5; padding: 2px 4px 1px 4px; }
.searchItem { padding: 1px 4px 1px 4px; }
.searchItemSelected { padding: 1px 4px 1px 4px; font-weight: bold; background: #ddd; }

/* permissions page styles */
.permissionHeading {
    border-bottom: #bbb; border-width: 0 0 1px 0; border-style: solid; font-size: 16px; text-align: left;
}
.permissionTab {
    border-width: 0 0 0 1px; border-style: solid; background: #3c78b5; color: #ffffff; font-size: 10px;
}
.permissionSuperTab {
    border-width: 0 0 0 1px; border-style: solid; background: #003366; color: #ffffff;
}
.permissionCell {
    border-left: #bbb; border-width: 0 0 0 1px; border-style: solid;
}

/* warning panel */
.warningPanel { background: #FFFFCE; border:#F0C000 1px solid; padding: 8px; margin: 10px; }
/* alert panel */
.alertPanel { background: #FFCCCC; border:#C00 1px solid; padding: 8px; margin: 10px; }
/* info panel */
.infoPanel { background: #D8E4F1; border:#3c78b5 1px solid; padding: 8px; margin: 10px; }

/* side menu highlighting (e.g. space content screen) */
.optionPadded { padding: 2px; }
.optionSelected { background-color: #ffffcc; padding: 2px; border: 1px solid #ddd; margin: -1px; }
.optionSelected a { font-weight: bold; text-decoration: none; color: black; }

/* information macros */
.noteMacro { border-style: solid; border-width: 1px; border-color: #F0C000; background-color: #FFFFCE; text-align:left; margin-top: 5px; margin-bottom: 5px}
.warningMacro { border-style: solid; border-width: 1px; border-color: #c00; background-color: #fcc; text-align:left; margin-top: 5px; margin-bottom: 5px}
.infoMacro { border-style: solid; border-width: 1px; border-color: #3c78b5; background-color: #D8E4F1; text-align:left; margin-top: 5px; margin-bottom: 5px}
.tipMacro { border-style: solid; border-width: 1px; border-color: #090; background-color: #dfd; text-align:left; margin-top: 5px; margin-bottom: 5px}
.informationMacroPadding { padding: 5px 0 0 5px; }

table.infoMacro td, table.warningMacro td, table.tipMacro td, table.noteMacro td, table.sectionMacro td {
    border: none;
}

table.sectionMacroWithBorder td.columnMacro { border-style: dashed; border-width: 1px; border-color: #cccccc;}

.pagecontent
{
    padding: 10px;
    text-align: left;
}

/* styles for links in the top bar */
.topBarDiv a:link {color: #ffffff;}
.topBarDiv a:visited {color: #ffffff;}
.topBarDiv a:active {color: #ffffff;}
.topBarDiv a:hover {color: #ffffff;}
.topBarDiv {color: #ffffff;}

.topBar {
    background-color: #003366;
}


/* styles for extended operations */
.greyLinks a:link {color: #666666; text-decoration:underline;}
.greyLinks a:visited {color: #666666; text-decoration:underline;}
.greyLinks a:active {color: #666666; text-decoration:underline;}
.greyLinks a:hover {color: #666666; text-decoration:underline;}
.greyLinks {color: #666666; display:block; padding: 10px}

.logoSpaceLink {color: #999999; text-decoration: none}
.logoSpaceLink a:link {color: #999999; text-decoration: none}
.logoSpaceLink a:visited {color: #999999; text-decoration: none}
.logoSpaceLink a:active {color: #999999; text-decoration: none}
.logoSpaceLink a:hover {color: #003366; text-decoration: none}

/* basic panel (basicpanel.vmd) style */
.basicPanelContainer {border: 1px solid #3c78b5; margin-top: 2px; margin-bottom: 8px; width: 100%}
.basicPanelTitle {padding: 5px; margin: 0px; background-color: #f0f0f0; color: black; font-weight: bold;}
.basicPanelBody {padding: 5px; margin: 0px}

.separatorLinks a:link {color: white}
.separatorLinks a:visited {color: white}
.separatorLinks a:active {color: white}

.greynavbar {background-color: #f0f0f0; border-top: 1px solid #3c78b5; margin-top: 2px}

div.headerField {
    float: left;
    width: auto;
    height: 100%;
}

.headerFloat {
    margin-left: auto;
    width: 50%;
}

.headerFloatLeft {
    float: left;
    margin-right: 20px;
    margin-bottom: 10px;
}

#headerRow {
    padding: 10px;
}

div.license-personal {
   background-color: #003366;
   color: #ffffff;
}

div.license-personal a {
   color: #ffffff;
}

.greyFormBox {
    border: 1px solid #cccccc;
    padding: 5px;
}

/* IE automatically adds a margin before and after form tags. Use this style to remove that */
.marginlessForm {
    margin: 0px;
}

.openPageHighlight {
    background-color: #ffffcc;
    padding: 2px;
    border: 1px solid #ddd;
}

.editPageInsertLinks, .editPageInsertLinks a
{
    color: #666666;
    font-weight: bold;
    font-size: 10px;
}

/* Style for label heatmap. */
.top10 a {
    font-weight: bold;
    font-size: 2em;
    color: #003366;
}
.top25 a {
    font-weight: bold;
    font-size: 1.6em;
    color: #003366;
}
.top50 a {
    font-size: 1.4em;
    color: #003366;
}
.top100 a {
    font-size: 1.2em;
    color: #003366;
}

.heatmap {
    list-style:none;
    width: 95%;
    margin: 0px auto;
}

.heatmap a {
    text-decoration:none;
}

.heatmap a:hover {
    text-decoration:underline;
}

.heatmap li {
    display: inline;
}

.minitab {
padding: 3px 0px 3px 8px;
margin-left: 0;
margin-top: 1px;
margin-bottom: 0px;
border-bottom: 1px solid #3c78b5;
font: bold 9px Verdana, sans-serif;
text-decoration: none;
float:none;
}
.selectedminitab {
padding: 3px 0.5em;
margin-left: 3px;
margin-top: 1px;
border: 1px solid #3c78b5;
background: white;
border-bottom: 1px solid white;
color: #000000;
text-decoration: none;
}
.unselectedminitab {
padding: 3px 0.5em;
margin-left: 3px;
margin-top: 1px;
border: 1px solid #3c78b5;
border-bottom: none;
background: #3c78b5;
color: #ffffff;
text-decoration: none;
}

a.unselectedminitab:hover {
color: #ffffff;
background: #003366;
border-color: #003366;
}

a.unselectedminitab:link { color: white; }
a.unselectedminitab:visited { color: white; }

a.selectedminitab:link { color: black; }
a.selectedminitab:visited { color: black; }

.linkerror { background-color: #fcc;}

a.labelOperationLink:link {text-decoration: underline}
a.labelOperationLink:active {text-decoration: underline}
a.labelOperationLink:visited {text-decoration: underline}
a.labelOperationLink:hover {text-decoration: underline}

a.newLabel:link {background-color: #ddffdd}
a.newLabel:active {background-color: #ddffdd}
a.newLabel:visited {background-color: #ddffdd}
a.newLabel:hover {background-color: #ddffdd}

ul.square {list-style-type: square}

.inline-control-link a:link {text-decoration: none}
.inline-control-link a:active {text-decoration: none}
.inline-control-link a:visited {text-decoration: none}
.inline-control-link a:hover {text-decoration: none}

.inline-control-link {
    background: #ffc;
    font-size: 9px;
    color: #666;
    padding: 2px;
    text-transform: uppercase;
    text-decoration: none;
    cursor: pointer;
}

div.auto_complete {
    width: 350px;
    background: #fff;
}
div.auto_complete ul {
    border: 1px solid #888;
    margin: 0;
    padding: 0;
    width: 100%;
    list-style-type: none;
}
div.auto_complete ul li {
    margin: 0;
    padding: 3px;
}
div.auto_complete ul li.selected {
    background-color: #ffb;
}
div.auto_complete ul strong.highlight {
    color: #800;
    margin: 0;
    padding: 0;
}

/******* Edit Page Styles *******/
.toogleFormDiv{
    border:1px solid #A7A6AA;
    background-color:white;
    padding:5px;
    margin-top: 5px;
}

.toogleInfoDiv{
    border:1px solid #A7A6AA;
    background-color:white;
    display:none;
    padding:5px;
    margin-top: 10px;
}

.inputSection{
    margin-bottom:20px;
}

#editBox{
   border:1px solid lightgray;
   background-color:#F0F0F0;
}

/******* Left Navigation Theme Styles ********/
.leftnav li a {
    text-decoration:none;
    color:white;
    margin:0px;
    display:block;
    padding:2px;
    padding-left:5px;
    background-color: #3c78b5;
    border-top:1px solid #3c78b5;
}

.leftnav li a:active {color:white;}
.leftnav li a:visited {color:white;}
.leftnav li a:hover {background-color: #003366; color:white;}

/* Added by Shaun during i18n */
.replaced
{
    background-color: #33CC66;
}

.topPadding
{
    margin-top: 20px;
}

/* new form style */
.form-block {
    padding: 6px;
}
.form-error-block {
    background: #fcc;
    border-top: #f0f0f0 1px solid;
    border-bottom: #f0f0f0 1px solid;
    margin-bottom: 6px;
    padding: 0 12px 0 12px;
}
.form-element-large {
    font-size: 16px;
    font-weight: bold;
    font-family: Arial, sans-serif;
    color: #003366;
}

.form-element-small {
    font-size: 12px;
    font-weight: bold;
    font-family: Arial, sans-serif;
    color: #003366;
}

.form-header {
    background: lightyellow;
    border-top: #f0f0f0 1px solid;
    border-bottom: #f0f0f0 1px solid;
    margin-bottom: 6px;
    padding: 0 12px 0 12px;
}
.form-header p, .form-block p, .form-error-block p {
    line-height: normal;
    margin: 12px 0 12px 0;
}
.form-example {
    color: #888;
    font-size: 11px;
}
.form-divider {
    border-bottom: #ccc 1px solid;
    margin-bottom: 6px;
}
.form-buttons {
    margin-top: 6px;
    border-top: #ccc 1px solid;
    border-bottom: #ccc 1px solid;
    background: #f0f0f0;
    padding: 10px;
    text-align: center;
}
.form-buttons input {
    width: 100px;
}
.form-block .error {
    padding: 6px;
    margin-bottom: 6px;
}
    -->
    </style>
</head>
<body>

<div id="PageContent">
<table class="pagecontent" border="0" cellpadding="0" cellspacing="0" width="100%"><tr>
<td valign="top" class="pagebody">

    <div class="pageheader">
        <span class="pagetitle">
            Page Edited :
            <a href="http://cwiki.apache.org/confluence/display/qpid">qpid</a> :
            <a href="http://cwiki.apache.org/confluence/display/qpid/Cluster+Design+Note">Cluster Design Note</a>
        </span>
    </div>

     <p>
        <a href="http://cwiki.apache.org/confluence/display/qpid/Cluster+Design+Note">Cluster Design Note</a>
        has been edited by             <a href="http://cwiki.apache.org/confluence/display/~aconway">Alan Conway</a>
            <span class="smallfont">(Apr 02, 2007)</span>.
     </p>
    
     <p>
                 <a href="http://cwiki.apache.org/confluence/pages/diffpagesbyversion.action?pageId=31318&amp;originalVersion=14&amp;revisedVersion=15">(View changes)</a>
     </p>

    <span class="label">Content:</span><br/>
    <div class="greybox wiki-content"><p><b>Cluster Design Note</b></p>

<hr />

<div>
<ul>
  <li><a href='#ClusterDesignNote-Overview'>Overview</a></li>
  <li><a href='#ClusterDesignNote-Requirements'>Requirements</a></li>
  <li><a href='#ClusterDesignNote-AbstractModelandTerms'>Abstract Model and Terms</a></li>
  <li><a href='#ClusterDesignNote-ClusterStateandReplication'>Cluster State and Replication</a>
<ul>
  <li><a href='#ClusterDesignNote-Replicationmechanisms'>Replication mechanisms</a></li>
  <li><a href='#ClusterDesignNote-TypesofstateWehavetoconsiderseveralkindsofstate%3A'>Types of state</a></li>
  <li><a href='#ClusterDesignNote-TheClusterMap%3AMembershipandWiring'>The Cluster Map: Membership and Wiring</a></li>
  <li><a href='#ClusterDesignNote-QueueContent'>Queue Content</a>
<ul>
  <li><a href='#ClusterDesignNote-FragmentedSharedQueues'>Fragmented Shared Queues</a></li>
</ul></li>
</ul></li>
  <li><a href='#ClusterDesignNote-SessionState'>Session State</a>
<ul>
  <li><a href='#ClusterDesignNote-Inflightcommands'>In-flight commands</a></li>
  <li><a href='#ClusterDesignNote-Resumingachannel'>Resuming a channel</a></li>
  <li><a href='#ClusterDesignNote-Replicatingsessionstate.'>Replicating session state</a></li>
</ul></li>
  <li><a href='#ClusterDesignNote-MappingofAMQPcommandstoreplicationmechanisms'>Mapping of AMQP commands to replication mechanisms</a>
<ul>
  <li><a href='#ClusterDesignNote-queue.declare%2Fbind%2Fdelete%2Cexchange.declare%2Fdelete'>queue.declare/bind/delete, exchange.declare/delete</a></li>
  <li><a href='#ClusterDesignNote-message.transfer%2Fbasic.publish%28clienttobroker%29'>message.transfer/basic.publish (client to broker)</a></li>
  <li><a href='#ClusterDesignNote-message.transfer%28brokertoclient%29%2Cmessage.deliver'>message.transfer(broker to client), message.deliver</a></li>
  <li><a href='#ClusterDesignNote-message.consume%2Fbasic.consume'>message.consume/basic.consume</a></li>
  <li><a href='#ClusterDesignNote-basic.ack%2Fmessage.ok%28fromclient%29'>basic.ack/message.ok(from client)</a></li>
  <li><a href='#ClusterDesignNote-basic.ack%2Fmessage.ok%28frombroker%29'>basic.ack/message.ok(from broker)</a></li>
  <li><a href='#ClusterDesignNote-basic.reject%2Fmessage.reject'>basic.reject / message.reject</a></li>
  <li><a href='#ClusterDesignNote-reference.open%2Fapppend%2Fclose%28clienttobroker%29'>reference.open/append/close (client to broker)</a></li>
  <li><a href='#ClusterDesignNote-reference.open%2Fapppend%2Fclose%28brokertoclient%29'>reference.open/append/close (broker to client) **</a></li>
  <li><a href='#ClusterDesignNote-Allcommands'>All commands</a></li>
</ul></li>
  <li><a href='#ClusterDesignNote-ClientBrokerProtocol'>Client-Broker Protocol</a></li>
  <li><a href='#ClusterDesignNote-BrokerBrokerProtocol'>Broker-Broker Protocol</a></li>
  <li><a href='#ClusterDesignNote-PersistenceandRecovery'>Persistence and Recovery</a>
<ul>
  <li><a href='#ClusterDesignNote-Competingfailuremodes%3A'>Competing failure modes:</a></li>
  <li><a href='#ClusterDesignNote-Persistenceoverview'>Persistence overview</a></li>
</ul></li>
  <li><a href='#ClusterDesignNote-Journals'>Journals</a>
<ul>
  <li><a href='#ClusterDesignNote-Overview'>Overview</a></li>
  <li><a href='#ClusterDesignNote-Useofjournals'>Use of journals</a></li>
  <li><a href='#ClusterDesignNote-Whataboutdisklessreliability%3F'>What about diskless reliability?</a></li>
</ul></li>
  <li><a href='#ClusterDesignNote-Virtualsynchrony'>Virtual synchrony</a></li>
  <li><a href='#ClusterDesignNote-Configuration'>Configuration</a>
<ul>
  <li><a href='#ClusterDesignNote-SimplifyingpatternsPossiblewaystoconfigureacluster%3A'>Simplifying patterns Possible ways to configure a cluster:</a></li>
  <li><a href='#ClusterDesignNote-Dynamicclusterconfiguration'>Dynamic cluster configuration</a></li>
</ul></li>
  <li><a href='#ClusterDesignNote-Transactions'>Transactions</a>
<ul>
  <li><a href='#ClusterDesignNote-Localtransactions'>Local transactions</a></li>
  <li><a href='#ClusterDesignNote-DistributedTransactions'>Distributed Transactions</a></li>
</ul></li>
  <li><a href='#ClusterDesignNote-OpenQuestions'>Open Questions</a></li>
  <li><a href='#ClusterDesignNote-Implementationbreakdown.'>Implementation breakdown.</a></li>
</ul></div>

<hr />

<h1><a name="ClusterDesignNote-Overview"></a>Overview</h1>

<p>A Qpid <em>cluster</em> is a group of brokers co-operating to provide the illusion of a single "virtual broker" with some extra qualities:</p>
<ul>
	<li>The cluster continues to provide service as long as some members survive. The exact guarantee depends on configuration.</li>
	<li>If a client is disconnected unexpectedly it can fail over to another cluster member, giving the impression of uninterrupted service.</li>
</ul>


<p>This design discusses clustering at the AMQP protocol layer, i.e. members of a cluster have distinct AMQP addresses and AMQP protocol commands are exchanged to negotiate reconnection and failover. "Transparent" failover in this context means transparent to the application; the AMQP client will be aware of disconnects and must take action to fail over.</p>

<p>Ultimately we will propose extensions to the AMQP spec, but for the initial implementation we can use existing extension points:</p>

<ul>
	<li>Field table parameters to various AMQP methods (declare() arguments etc.)</li>
	<li>Field table in message headers.</li>
	<li>System exchanges and queues.</li>
</ul>



<h1><a name="ClusterDesignNote-Requirements"></a>Requirements</h1>


<p><b>Clients use standard AMQP</b> to talk to a cluster during normal operation. They only need to use some extensions to get replica information and during fail-over.</p>

<p><b>Transparent failover</b>: In the event of failover, the sequence of protocol commands sent and received by the client <em>excluding failover-related commands</em> is identical to the sequence that would have been sent/received if no failure had occurred. Thus an AMQP client library can hide failover-related commands from the application.</p>

<p><b>Transactional failover</b>: In the event of a failover, any incomplete transactions are rolled back. Any unacknowledged non-transactional commands may need to be retransmitted.</p>

<p><em><b>TODO</b></em>: Do we need to offer both? Transactional failover is a weaker guarantee and only interesting if it offers better performance. It allows persistence/replication to be deferred till prepare/commit time. On the other hand, in the normal successful case a similar amount of data has to be written/replicated either way.</p>

<p><em><b>TODO</b></em>: Define levels of reliability we want to provide - survive one node failure, survive multiple node failures, survive total failure, network partitions etc. Does durable/non-durable message distinction mean anything in a reliable cluster? I.e. can we lose non-durable messages on a node failure? Can we lose them on orderly shutdown or total failure?</p>

<p><em><b>TODO</b></em>: The requirements triangle. Concrete performance data.</p>



<h1><a name="ClusterDesignNote-AbstractModelandTerms"></a>Abstract Model and Terms</h1>

<p>A quick re-cap of AMQP terminology and introduction to some new terms:</p>

<p>A <em>broker</em> is a container for 3 types of <em>broker components</em>: <em>queues</em>, <em>exchanges</em> and <em>bindings</em>. Broker components represent resources available to multiple clients, and are not affected by clients connecting and disconnecting<style type='text/css'>
.FootnoteMarker, .FootnoteNum a {
  background: transparent url(/confluence/download/resources/com.adaptavist.confluence.footnoteMacros:footnote/gfx/footnote.png) no-repeat top right;
  padding: 1px 2px 0px 1px;
  border-left: 1px solid #8898B8;
  border-bottom: 1px solid #6B7C9B;
  margin: 1px;
  text-decoration: none;
}
.FootnoteNum a {
  margin-top: 2px;
  margin-right: 0px;
}
.FootnoteNum {
  font-size: x-small;
  text-align: right;
  padding-bottom: 4px;
}
.footnote-th1 {
  text-align: right;
}
.Footnote {
  padding-left: 7px;
  margin-bottom: 4px;
  border: 1px none #DDDDDD;
  writingMode: tb-rl;
}
.accessibility {
     display: none;
     visibility: hidden;
}
@media aural,braille,embossed {
        .FootnoteMarker, .FootnoteNum a {
         border: 1px solid #000000;
         background: #ffffff none;
    }
    .accessibility {
         display: run-in;
         visibility: visible;
    }
}
</style>
<script type='text/javascript' language='JavaScript'>
//<!--\n
var effectInProgress = {};
var despamEffect = function (id,effectType,duration) {
  if ((effectInProgress[id]) || (typeof(Effect)=="undefined") || (typeof(Effect[effectType])=="undefined")) return;
  new Effect[effectType](id);
  effectInProgress[id]=true;
  setTimeout('effectInProgress[\"'+id+'\"]=false;',duration*1000);
};
var oldFootnoteId = '';
var footnoteHighlight = function(id,pulsateNum) {
  if (oldFootnoteId!='') document.getElementById('Footnote'+oldFootnoteId).style['borderStyle'] = 'none';
  oldFootnoteId = id;
  document.getElementById('Footnote'+id).style['borderStyle'] = 'solid';
  despamEffect('Footnote'+id,'Highlight',1)
  if (pulsateNum) despamEffect('FootnoteNum'+id,'Pulsate',3)
}
var footnoteMarkerHighlight = function(id) {
  if (oldFootnoteId!='') document.getElementById('Footnote'+oldFootnoteId).style['borderStyle'] = 'none';
  oldFootnoteId = '';
  despamEffect('FootnoteMarker'+id,'Pulsate',3)
}
//-->
</script>
<sup id='FootnoteMarker1'>
    <a name='FootnoteMarker1'
        href='#Footnote1'
        onClick='footnoteHighlight("1",true);'
        alt='Footnote: Click here to display the footnote'
        title='Footnote: Click here to display the footnote'
        class='FootnoteMarker'>
            1
    </a>
</sup>. <em>Persistent</em> broker components are unaffected by shut-down and re-start of a broker.</p>

<p>A <em>client</em> uses the components contained in a broker via the AMQP protocol. The <em>client components</em> are <em>connection</em>, <em>channel</em>, <em>consumer</em> and <em>session</em>
<sup id='FootnoteMarker2'>
    <a name='FootnoteMarker2'
        href='#Footnote2'
        onClick='footnoteHighlight("2",true);'
        alt='Footnote: Click here to display the footnote'
        title='Footnote: Click here to display the footnote'
        class='FootnoteMarker'>
            2
    </a>
</sup>. Client components represent the relationship between a client and a broker.</p>

<p>A client's interaction with an unclustered <em>individual broker</em> 
<sup id='FootnoteMarker3'>
    <a name='FootnoteMarker3'
        href='#Footnote3'
        onClick='footnoteHighlight("3",true);'
        alt='Footnote: Click here to display the footnote'
        title='Footnote: Click here to display the footnote'
        class='FootnoteMarker'>
            3
    </a>
</sup> is defined by AMQP 0-8/0-9: create a connection to the broker's <em>address</em>, create channels, exchange AMQP commands (which may create consumers), disconnect. After a disconnect the client can reconnect to the same broker address. Broker components created by the previous connection are preserved but client components are not. In the event of a disorderly disconnect the outcome of commands in flight can only be determined by their effects on broker components.</p>

<p>A <em>broker cluster</em> (or just <em>cluster</em>) is a "virtual" broker implemented by several <em>member brokers</em> (or just <em>members</em>.) A cluster has many AMQP addresses - the addresses of all its members - all semantically equivalent for client connection 
<sup id='FootnoteMarker4'>
    <a name='FootnoteMarker4'
        href='#Footnote4'
        onClick='footnoteHighlight("4",true);'
        alt='Footnote: Click here to display the footnote'
        title='Footnote: Click here to display the footnote'
        class='FootnoteMarker'>
            4
    </a>
</sup>. The cluster members co-operate to ensure:</p>
<ul>
	<li>all broker components are available via any cluster member.</li>
	<li>all broker components remain available if a single member fails. Service may degrade in multiple failures depending on configuration.</li>
	<li>clients disconnected by a member failure or network failure can reconnect to another member and resume their <em>session</em>.</li>
</ul>


<p>A <em>session</em> identifies a client-broker relationship that can outlive a single connection. AMQP 0-9 provides some support for sessions in the <code>resume</code> command.</p>

<p>Orderly closure of a connection by either peer ends the session.  If there is an unexpected disconnect, the session remains viable for some (possibly long) timeout period and the client can reconnect to the cluster and resume.</p>

<p>If a connection is in a session, events in the AMQP 0-8 spec that are triggered by closing the connection (e.g. deleting auto-delete queues) are instead triggered by the close (or timeout) of the session.</p>

<p>Note the session concept could also be used in the absence of clustering to allow a client to disconnect and resume a long-running session with the same broker. This is outside the scope of this design.</p>



<h1><a name="ClusterDesignNote-ClusterStateandReplication"></a>Cluster State and Replication</h1>

<h2><a name="ClusterDesignNote-Replicationmechanisms"></a>Replication mechanisms</h2>

<p><b>Virtual Synchrony</b> protocols such as AIS or JGroups use multicast and allow a cluster to maintain a consistent view across all members. We will use such a protocol to replicate low-volume <em>membership</em> and <em>wiring</em> changes.</p>

<p><b>Primary/backup</b>: one primary owner of a given object replicates point-to-point to one or more backups. On failure of the primary, one backup takes over. Appropriate for commodity hardware where each node has an independent store.</p>


<p><b>Shared store</b>, such as GFS: the primary updates the shared store and backups recover from the store. For <em>hot backup</em> the primary can also forward point-to-point to backups. Appropriate for a high-speed storage network.</p>

<p><em><b>TODO</b></em>: Is GFS also appropriate for commodity hardware? Can we push more reliability work down into the GFS layer?</p>

<p><b>Proxy</b>: forwards traffic to the primary. Allows all objects to be visible on all nodes. (Note backup-as-proxy can be optimized as a special case to reduce traffic.)</p>

<p>For virtual synchrony we will use a specialized multicast protocol such as Totem or JGroups. For point-to-point communication we will use AMQP. As far as possible we will use ordinary AMQP operations on special system queues and exchanges rather than requiring protocol extensions. Ultimately we will propose extensions for interoperability, but we should be able to prove the implementation without them.</p>



<h2><a name="ClusterDesignNote-TypesofstateWehavetoconsiderseveralkindsofstate%3A"></a>Types of state</h2>

<p>We have to consider several kinds of state:</p>

<ul>
	<li><em>Cluster Membership</em>: Active cluster members (nodes) and data about them.</li>
	<li><em>AMQP Wiring</em>: Names and properties of queues, exchanges and bindings.</li>
	<li><em>AMQP Content</em>: Data in messages on queues.</li>
	<li><em>Session</em>: Conversation with a single client, including references.</li>
</ul>


<p>Data must be replicated and stored such that:</p>
<ul>
	<li>A client knows which node(s) can be used for failover.</li>
	<li>After a failover, the client can continue its session uninterrupted.</li>
	<li>No acknowledged messages or commands are lost.</li>
	<li>No messages or commands are applied twice.</li>
</ul>



<p>Cluster membership, wiring and session identities are low volume, and will be replicated using virtual synchrony so the entire cluster has a consistent picture.</p>

<p>Queue content is high volume so it is replicated point-to-point using primary-backup to avoid flooding the network.</p>

<p>Session state is potentially high volume and only relevant to a single client, so it is also replicated point-to-point.</p>

<p>How to choose the number and location of backup nodes for a given queue or session is an open question. Note that the choice is independent for every queue and session in principle, but in practice they will probably be grouped in some way.</p>


<h2><a name="ClusterDesignNote-TheClusterMap%3AMembershipandWiring"></a>The Cluster Map: Membership and  Wiring</h2>

<p>Membership, wiring and session changes are low volume. They are replicated to the entire cluster symmetrically using a virtual synchrony protocol such as openAIS or JGroups.</p>

<p>Wiring includes:</p>
<ul>
	<li>exchange names and properties</li>
	<li>queue names, properties and bindings.</li>
</ul>


<p>Membership data includes:</p>
<ul>
	<li>address of each node</li>
	<li>state of health of each node</li>
	<li>primary/backup for each queue/exchange</li>
	<li>session names, primary/backup for each session.</li>
</ul>
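
<p>As an illustration only, the cluster map described above could be modelled roughly as follows. This is a minimal Python sketch; the names <code>ClusterMap</code>, <code>NodeInfo</code>, <code>Placement</code> and the node addresses are hypothetical and not part of the design.</p>

```python
from dataclasses import dataclass, field


@dataclass
class NodeInfo:
    address: str          # AMQP address of the member
    healthy: bool = True  # state of health of the member


@dataclass
class Placement:
    primary: str                                 # address of the primary node
    backups: list = field(default_factory=list)  # addresses of backup nodes


@dataclass
class ClusterMap:
    nodes: dict = field(default_factory=dict)     # address -> NodeInfo
    queues: dict = field(default_factory=dict)    # queue name -> Placement
    sessions: dict = field(default_factory=dict)  # session name -> Placement

    def failover_candidates(self, session_name):
        """Healthy backups a client of this session could fail over to."""
        placement = self.sessions[session_name]
        return [a for a in placement.backups
                if a in self.nodes and self.nodes[a].healthy]


# Hypothetical three-node cluster with one session.
cluster = ClusterMap()
for addr in ("node-a:5672", "node-b:5672", "node-c:5672"):
    cluster.nodes[addr] = NodeInfo(addr)
cluster.sessions["s1"] = Placement("node-a:5672", ["node-b:5672", "node-c:5672"])
cluster.nodes["node-b:5672"].healthy = False
candidates = cluster.failover_candidates("s1")
```

<p>Because the map is small and changes rarely, every member can hold a full copy of a structure like this and apply updates in the order delivered by the virtual synchrony layer.</p>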



<h2><a name="ClusterDesignNote-QueueContent"></a>Queue Content</h2>

<p>Message content is too high volume to replicate to the entire cluster, so each queue has a primary node and one or more backup nodes. Other nodes can act as proxies. The client is unaware of the distinction: it sees an identical picture regardless of which broker it connects to.</p>

<p>Note a single cluster node may contain a mix of primary, backup and proxy queues.</p>

<p><b>TODO</b>: Ordering issues with proxies and put-back messages (reject, transaction rollback) or selectors.</p>

<h3><a name="ClusterDesignNote-FragmentedSharedQueues"></a>Fragmented Shared Queues</h3>

<p>A shared queue has reduced ordering requirements and increased distribution requirements. <em>Fragmenting</em> a shared queue is a special type of replication. The queue is broken into a set of disjoint sub-queues each on a separate node to distribute load.</p>

<p>Each fragment (sub-queue) content is replicated to backups just like a normal queue, independently of the other fragments.</p>

<p>The fragments collaborate to create the appearance of a single queue. Fragments store incoming messages in the local queue, and serve local consumers from the local queue whenever possible. When a fragment does not have messages to satisfy its consumers it consumes messages from other fragments in the group. Proxies to a fragmented queue will consume from the "nearest" fragment if possible.</p>

<p><b>TODO</b>: Proxies can play a more active role. Ordering guarantees: we can provide "same producer to same consumer preserves order" since messages from the same producer always go on the same fragment queue. This may break down in the presence of failover unless we remember which fragment received messages from the client and proxy to the same one on the failover replica.</p>
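
<p>The local-first behaviour of fragments can be sketched in a few lines of Python. This is an in-memory illustration under simplifying assumptions (no replication, no network, peers tried in a fixed order); the <code>Fragment</code> class and its methods are hypothetical names.</p>

```python
from collections import deque


class Fragment:
    """One sub-queue of a fragmented shared queue (hypothetical sketch)."""

    def __init__(self, name):
        self.name = name
        self.messages = deque()
        self.peers = []  # the other fragments of the same shared queue

    def publish(self, message):
        # Incoming messages are stored in the local fragment.
        self.messages.append(message)

    def consume(self):
        # Serve local consumers from the local fragment whenever possible.
        if self.messages:
            return self.messages.popleft()
        # Otherwise consume from another fragment in the group.
        for peer in self.peers:
            if peer.messages:
                return peer.messages.popleft()
        return None  # every fragment of the shared queue is empty


a, b = Fragment("a"), Fragment("b")
a.peers, b.peers = [b], [a]
a.publish("m1")
a.publish("m2")
b.publish("m3")
got = [b.consume(), b.consume(), b.consume(), b.consume()]
```

<p>A consumer on fragment <code>b</code> first drains the local message and only then pulls from <code>a</code>, which is why ordering is preserved per fragment but not across the whole shared queue.</p>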




<h1><a name="ClusterDesignNote-SessionState"></a>Session State</h1>

<p>Session state includes:</p>
<ul>
	<li>open channels, channel attributes (qos, transactions etc.).</li>
	<li>active consumers.</li>
	<li>open references.</li>
	<li>completed command history.</li>
	<li>commands in flight.</li>
	<li>open transactions</li>
	<li>exclusive/private queues.</li>
</ul>


<p>The broker a client is connected to is the session primary; one or more other brokers are session backups. On failure of the primary the client fails over to a backup as described below.</p>

<p><em>TODO: We could allow resume on non-session-backup node, by letting it download session state from a session backup.</em></p>

<p>The primary-backup protocol must guarantee that the backup has sufficient data to resume at all times without becoming a synchronous bottleneck.</p>

<h2><a name="ClusterDesignNote-Inflightcommands"></a>In-flight commands</h2>

<p>Both peers must store sent commands for possible resend and received commands to detect possible duplicates in a failover.</p>

<p>To keep session size finite a peer can:</p>
<ul>
	<li>forget sent commands when we know the other peer has received them.</li>
	<li>forget received commands when we know the other peer will not resend them.</li>
</ul>


<p>An algorithm to achieve this:</p>

<p>self_received(r):</p>
<blockquote><pre>
if r.is_response:
    peer_received(sent[r.responds_to_id])
for s in sent[0..r.process_mark]:
    peer_received(s)
</pre></blockquote>

<p>peer_received(s):</p>
<blockquote><pre>
sent.erase(s)   # forget s but also...
# Peer will never resend commands &lt;= s.process_mark.
for r in received[0..s.process_mark]:
    received.erase(r)
</pre></blockquote>

<p>The weakest rules for interop between peers A and B are:</p>

<ul>
	<li>A MAY forget a sent command when A knows B received it.</li>
	<li>A MUST NOT re-send a command after <em>B could know that</em> A knows B received it.</li>
	<li>A MUST remember received commands till A knows that B knows A received it.</li>
</ul>


<p>Or in protocol terms:</p>

<ul>
	<li>A MAY forget sent command N when it receives a response to N.</li>
	<li>A MUST NOT resend N after sending a response to a response to N.</li>
	<li>A MUST remember received command N until it has both sent M responding to N <em>and</em> received a response to M.</li>
</ul>
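
<p>The rules above can be exercised with a small runnable Python sketch. It assumes that each command carries the sender's <code>process_mark</code> (the highest peer command id the sender had processed when it sent the command), that each peer numbers its own commands independently, and that commands are processed in order. The <code>Command</code> and <code>Peer</code> classes are hypothetical names used for illustration.</p>

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Command:
    id: int
    process_mark: int                  # highest peer command id processed by the sender
    responds_to: Optional[int] = None  # id of the command this responds to, if any


class Peer:
    def __init__(self):
        self.sent = {}       # own id -> Command not yet known to be received
        self.received = {}   # peer id -> Command kept for duplicate detection
        self.next_id = 1
        self.process_mark = 0

    def send(self, responds_to=None):
        cmd = Command(self.next_id, self.process_mark, responds_to)
        self.next_id += 1
        self.sent[cmd.id] = cmd
        return cmd

    def self_received(self, r):
        if r.id in self.received:
            return False  # duplicate of a command we still remember
        self.received[r.id] = r
        self.process_mark = max(self.process_mark, r.id)
        if r.responds_to is not None and r.responds_to in self.sent:
            self.peer_received(self.sent[r.responds_to])
        for s in [s for s in self.sent.values() if s.id <= r.process_mark]:
            self.peer_received(s)
        return True

    def peer_received(self, s):
        self.sent.pop(s.id, None)  # forget s, but also...
        # The peer will never resend commands <= s.process_mark.
        for rid in [rid for rid in self.received if rid <= s.process_mark]:
            del self.received[rid]


a, b = Peer(), Peer()
c1 = a.send()                # A sends command 1
b.self_received(c1)
r1 = b.send(responds_to=1)   # B responds to it
a.self_received(r1)          # A may now forget c1
c2 = a.send()                # carries process_mark=1, so B learns A received r1
b.self_received(c2)          # B may now forget r1 and c1
duplicate_accepted = b.self_received(c2)
```

<p>After this exchange A has no unconfirmed sent commands, and B remembers only <code>c2</code>, which it has not yet answered.</p>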



<h2><a name="ClusterDesignNote-Resumingachannel"></a>Resuming a channel</h2>

<p>When a channel is first opened, the broker provides a session-id. If there is a failure, the client can connect to the session backup broker and resume the channel as follows (the synchronous code is just for illustration).</p>

<p><em>TODO does it matter if the new channel number is different from the old?</em></p>

<p>Client:</p>
<blockquote><pre>
client_resume:
    send(command=channel_resume, command_id=0, session_id=resume_id,
         process_mark=pre_crash_process_mark)
    ok = receive(command=channel_ok)
    self_received(ok)   # Clean up to peer's process mark.
    resend()
    continue_session_as_normal()
</pre></blockquote>


<p>Both sides:</p>
<blockquote><pre>
resend():
    # Resend in-flight messages.
    for s in sent:
        # Careful not to respond to a command we haven't received yet.
        if s.is_response:
            until(received.contains(s.responds_to_id)):
                self_received(receive())
        send(s)   # Original command ids and process_mark
</pre></blockquote>


<p>Broker:</p>
<blockquote><pre>
broker_received_channel_resume(r):
    session = sessions[r.session_id]
    self_received(r)   # Up to date with peer's process mark.
    send(command=channel_ok, command_id=0, process_mark=session.process_mark)
    resend()
    continue_session_as_normal()
</pre></blockquote>



<h2><a name="ClusterDesignNote-Replicatingsessionstate."></a>Replicating session state.</h2>

<p><em>TODO: Need to minimize the primary synchronously waiting on the backup, while ensuring that the primary always knows that the backup is in a state that satisfies the client's expectations for failover. See recent email thread between me &amp; Gordon.</em></p>





<h1><a name="ClusterDesignNote-MappingofAMQPcommandstoreplicationmechanisms"></a>Mapping of AMQP commands  to replication mechanisms</h1>

<h2><a name="ClusterDesignNote-queue.declare%2Fbind%2Fdelete%2Cexchange.declare%2Fdelete"></a>queue.declare/bind/delete, exchange.declare/delete</h2>

<p>Update cluster map.  Local broker creates the initial queue as primary and establishes a backup.</p>

<p><em>Private queue</em>: backed up on the <em>session backup</em>.</p>

<p><em>Shared queue</em>: the local primary queue is the first <em>primary fragment</em>. Other brokers that receive publishes for the queue can proxy to this fragment or create their own local fragment (<em>TODO: How do we decide?</em>). Consumes are always served from the local fragment if possible, otherwise proxied to another fragment <em>(TODO: load balancing algorithms to choose the appropriate fragment)</em>.</p>


<h2><a name="ClusterDesignNote-message.transfer%2Fbasic.publish%28clienttobroker%29"></a>message.transfer/basic.publish (client to broker)</h2>

<p>Local broker evaluates the binding to determine which queue(s) receive the message.</p>
<ul>
	<li>primary queues: update local queue, replicate to backup.</li>
	<li>proxy queues: forward to primary<br/>
(When the proxy is also a backup we can optimize out the replication step.)</li>
</ul>


<p>If the message is delivered to more than one proxy queue on the same node, we just relay the message once. Brokers must be able to differentiate between normal message transfer and proxy/replication transfer so that when they evaluate the binding they only apply the message to local primary/backup queues respectively, and don't attempt to re-forward messages.</p>
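
<p>A minimal Python sketch of the publish path described above. It assumes the first option listed in the TODO that follows, where the relay carries an explicit list of target queues; the function <code>route_publish</code>, its parameters and the node names are hypothetical.</p>

```python
from collections import defaultdict


def route_publish(local_node, bound_queues, primary_of):
    """Plan what the local broker does with one published message.

    bound_queues: queue names selected by evaluating the bindings.
    primary_of:   queue name -> node holding that queue's primary.
    """
    local_updates = []            # primary queues held on this node
    forwards = defaultdict(list)  # remote node -> queues proxied there
    for queue in bound_queues:
        node = primary_of[queue]
        if node == local_node:
            # Primary queue: update the local queue, then replicate to backup.
            local_updates.append(queue)
        else:
            # Proxy queue: forward to the node holding the primary.
            forwards[node].append(queue)
    # One relay per remote node, carrying the explicit list of target queues.
    return local_updates, dict(forwards)


primary_of = {"q1": "node-a", "q2": "node-b", "q3": "node-b"}
local, relays = route_publish("node-a", ["q1", "q2", "q3"], primary_of)
```

<p>Here the message is applied locally to <code>q1</code> and relayed once to <code>node-b</code> for both <code>q2</code> and <code>q3</code>.</p>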

<p><em>TODO: there are a few options</em>:</p>
<ul>
	<li>Use custom backup/proxy exchanges and pass an explicit list of queues to receive the message in the header table.</li>
	<li>Use normal AMQP commands over a marked connection/channel.</li>
	<li>Introduce new cluster commands.</li>
</ul>



<h2><a name="ClusterDesignNote-message.transfer%28brokertoclient%29%2Cmessage.deliver"></a>message.transfer(broker to client), message.deliver</h2>

<ul>
	<li>primary: replicate the deliver to backup(s), then deliver to client.</li>
	<li>proxy: pass through to client.</li>
</ul>


<p>Before sending a message to a client, the primary must be sure that the session backup 'knows' about the delivery; i.e. in the event of primary failure the backup knows about unacked messages and will be able to handle an ack or reject for it, resend or requeue it.</p>

<p>If we can define a clear and deterministic algorithm for message dispatch, and if we replicate all 'inputs' in order then that should be sufficient.</p>

<p>Selectors slightly complicate the picture, as do multiple consumers and flow control particularly for shared queues where the consumers could be from different sessions.</p>

<p>In the case of an exclusive or private queue all the inputs come from a single session. If all session requests are handled serially on both primary and backup then dispatch should be deterministic; if separate threads were used to process separate queues, determinism would be lost, as the allocation of delivery tags would depend on the interleaving of those threads.</p>

<p>One way of avoiding the need for deterministic dispatch would be for the primary to send a message to the backup(s) to indicate an allocation before the deliver is sent to the client. This could inform the backup of the queue in question, the message id and the delivery tag/request id. The big drawback is that it requires a round-trip to the backup before each deliver and would really affect throughput.</p>
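
<p>To make the allocation idea concrete, here is a small Python sketch of the bookkeeping a backup might keep if the primary announced each allocation before delivering. It is an assumption-laden illustration: the <code>BackupQueue</code> class and its methods are hypothetical, and the round-trip cost discussed above is not modelled.</p>

```python
class BackupQueue:
    """Backup-side view of one queue (hypothetical sketch)."""

    def __init__(self, messages):
        self.messages = dict(messages)  # message id -> body
        self.allocated = {}             # delivery tag -> message id

    def allocate(self, message_id, delivery_tag):
        # Primary announces: this message is about to be delivered with this tag.
        self.allocated[delivery_tag] = message_id

    def ack(self, delivery_tag):
        # Client acknowledged: the message is gone for good.
        self.messages.pop(self.allocated.pop(delivery_tag))

    def reject(self, delivery_tag):
        # Cancel the allocation; the message becomes available again.
        self.allocated.pop(delivery_tag)

    def available(self):
        taken = set(self.allocated.values())
        return [m for m in self.messages if m not in taken]


backup = BackupQueue({"m1": b"a", "m2": b"b"})
backup.allocate("m1", delivery_tag=1)
backup.allocate("m2", delivery_tag=2)
backup.ack(1)
backup.reject(2)
remaining = backup.available()
```

<p>With this record the backup can handle an ack or reject for any unacknowledged delivery after a primary failure, without needing to reproduce the primary's dispatch decisions.</p>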

<p>This looks like an area that needs some specific focus. Can we convince ourselves of a clear and deterministic dispatch algorithm, or are there other solutions that would avoid requiring this without too much synchronicity?</p>



<h2><a name="ClusterDesignNote-message.consume%2Fbasic.consume"></a>message.consume/basic.consume</h2>
<ul>
	<li>proxy: forward consume. No replication, client will re-establish consumers.</li>
	<li>primary: register consumer.</li>
</ul>



<h2><a name="ClusterDesignNote-basic.ack%2Fmessage.ok%28fromclient%29"></a>basic.ack/message.ok(from client)</h2>
<ul>
	<li>proxy: forward</li>
	<li>primary: mark message processed, replicate to backups.</li>
</ul>



<h2><a name="ClusterDesignNote-basic.ack%2Fmessage.ok%28frombroker%29"></a>basic.ack/message.ok(from broker)</h2>
<ul>
	<li>proxy: forward to client</li>
	<li>client: mark message processed.</li>
</ul>



<h2><a name="ClusterDesignNote-basic.reject%2Fmessage.reject"></a>basic.reject / message.reject</h2>

<p>Similar to the processing of basic.ack. However here the message might be requeued or might be moved to a dead letter queue. Ignoring the dead letter queue in the first instance, the backup would merely cancel the effect of the basic.allocate on receiving the basic.reject.</p>


<h2><a name="ClusterDesignNote-reference.open%2Fapppend%2Fclose%28clienttobroker%29"></a>reference.open/append/close (client to broker)</h2>
<ul>
	<li>proxy: replicate to session backup, forward to primary.</li>
	<li>primary: process.</li>
</ul>



<h2><a name="ClusterDesignNote-reference.open%2Fapppend%2Fclose%28brokertoclient%29"></a>reference.open/append/close (broker to client) **</h2>
<ul>
	<li>primary: send open/append/close.</li>
	<li>proxy: replicate to session backup, forward to client.</li>
</ul>



<h2><a name="ClusterDesignNote-Allcommands"></a>All commands</h2>
<ul>
	<li>proxy replicates required command history to session backup.</li>
</ul>




<h1><a name="ClusterDesignNote-ClientBrokerProtocol"></a>Client-Broker Protocol</h1>

<p>Normal AMQP with the following extensions.</p>

<p>Initial connection:</p>
<ul>
	<li>Pass session name as 0-9 connection identifier or via arguments table.</li>
	<li>Broker provides list of failover replicas in arguments table.</li>
</ul>


<p>During connection:</p>
<ul>
	<li>Client can subscribe to a special "cluster exchange" for messages carrying updates to failover candidates.</li>
</ul>


<p>On failure:</p>
<ul>
	<li>client chooses failover node randomly from most recent list.</li>
	<li>cluster list may identify "preferred" failover candidates.</li>
</ul>
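
<p>The selection rule above might look like this on the client side. This is a sketch only: the function <code>choose_failover</code> and its parameters are hypothetical, and how "preferred" candidates are marked in the cluster list is an assumption.</p>

```python
import random


def choose_failover(candidates, preferred=(), rng=random):
    """Pick a failover node from the most recent candidate list.

    Preferred candidates are used when any are present; otherwise the
    choice is random over the whole list.
    """
    preferred_live = [c for c in candidates if c in preferred]
    pool = preferred_live or list(candidates)
    if not pool:
        raise LookupError("no failover candidates known")
    return rng.choice(pool)


only_preferred = choose_failover(["a", "b", "c"], preferred={"b"})
any_node = choose_failover(["a", "b"])
```

<p>Choosing randomly spreads reconnecting clients across the surviving members instead of sending them all to the same node after a failure.</p>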


<p>On re-connect:</p>
<ul>
	<li>0-9 resume command identifies session.</li>
	<li>Client rebuilds conversational state.</li>
	<li>opens channels</li>
	<li>creates consumers</li>
	<li>establishes</li>
	<li>replays unacknowledged commands and continues session.</li>
</ul>


<p>Note: the client sends conversational state data in messages to a special system exchange. We can't simply use standard AMQP to rebuild channel state, as we would end up with channels with a different command numbering from the interrupted session. For transparency we also want to distinguish reconnection from resumed "normal" operation.</p>

<p>At this point the session can continue.</p>


<h1><a name="ClusterDesignNote-BrokerBrokerProtocol"></a>Broker-Broker Protocol</h1>

<p>Broker-broker communication uses extended AMQP over specially identified connections and channels (identified in the connection negotiation argument table.)</p>

<p><b>Proxying</b>: acting as a proxy, a broker forwards commands from client to primary and vice versa. The proxy is as transparent and stateless as possible. A proxy must renumber channels and commands since a single incoming connection may be proxied to more than one outbound connection, so it does need to keep some state. This state is part of the session state replicated to the session backup.</p>

<p><b>Queue/fragment replication</b>: Depends on whether AMQP or GFS is used to replicate content.</p>

<p><b>AMQP</b>: For enqueue, use the AMQP transfer command to transfer content to backup(s). For dequeue, use the AMQP get command to indicate that a message was removed; no data is transferred for get over a replication channel.</p>

<p><em>TODO</em>: this use of get is strained, it starts to look like we may need a separate replication class of commands.</p>

<p><b>GFS</b>: Queue state is updated in journal files. On failover, the backup reconstructs queue state from the journal.</p>

<p><b>Session replication</b>: The broker must replicate a command (and get confirmation it was replicated) before responding. For async clients this can be done in a pair of asynchronous streams, i.e. we don't have to wait for a response to command A before we forward command B.</p>
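
<p>The pipelining described above can be illustrated with a small Python sketch: commands are forwarded to the backup as they arrive, and a response is released only once the backup has confirmed that command. The <code>ReplicatingSession</code> class is a hypothetical name, and the sketch assumes confirmations arrive in the order commands were forwarded.</p>

```python
from collections import deque


class ReplicatingSession:
    """Primary-side pipeline of commands awaiting backup confirmation."""

    def __init__(self):
        self.awaiting_backup = deque()  # forwarded, not yet confirmed
        self.responses = []             # responses released to the client

    def on_command(self, command_id):
        # Forward to the backup immediately; do not wait for earlier confirmations.
        self.awaiting_backup.append(command_id)

    def on_backup_confirm(self, command_id):
        # Assumes confirmations arrive in order on the replication stream.
        if not self.awaiting_backup or self.awaiting_backup[0] != command_id:
            raise ValueError("unexpected confirmation: %r" % (command_id,))
        self.awaiting_backup.popleft()
        self.responses.append(command_id)  # now safe to respond to the client


session = ReplicatingSession()
session.on_command("A")
session.on_command("B")  # forwarded before A is confirmed
responses_before_confirm = list(session.responses)
session.on_backup_confirm("A")
session.on_backup_confirm("B")
```

<p>No response reaches the client before its command is replicated, yet command B never waits for the confirmation of command A before being forwarded.</p>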

<p>Session data is replicated via AMQP on special connections. Primary forwards all outgoing requests and incoming responses to the session backup. Backup can track the primary request/response tables and retransmit messages.</p>

<p><em><b>TODO</b></em>: 0-9 references force us to have heavy session backup, because message data on a reference is not associated with any queue and therefore can't be backed up in a queue backup. If references are removed in 0-10, revisit the need for session backups; we may be able to compress session data enough to store it in the cluster map.</p>


<h1><a name="ClusterDesignNote-PersistenceandRecovery"></a>Persistence and Recovery</h1>

<h2><a name="ClusterDesignNote-Competingfailuremodes%3A"></a>Competing failure modes:</h2>

<p><b>Tibco</b>: fast when running clean, but performance over time has GC "spikes". Single journal for all queues. "Holes" in the log have to be garbage collected to re-use the log. One slow consumer affects everyone because it causes fragmentation of the log.</p>

<p><b>MQ</b>: write to journal, write journal to DB, read from DB. Consistent &amp; reliable but slow.</p>

<p><b>Street homegrown solutions</b>: transient MQ with home grown persistence. Can we get more design details for these solutions?</p>



<h2><a name="ClusterDesignNote-Persistenceoverview"></a>Persistence overview</h2>

<p>There are 3 reasons to persist a message:</p>

<p><b>Durable messages</b>: must be stored to disk across broker shutdowns or failures.</p>
<ul>
	<li>stored when received.</li>
	<li>read during start-up.</li>
	<li>must be removed after delivery.</li>
</ul>


<p><b>Reliability</b>: recover after a crash.</p>
<ul>
	<li>stored when received.</li>
	<li>read during crash recovery.</li>
	<li>must be removed after delivery.</li>
</ul>


<p><b>Flow-to-disk</b>: to reduce memory use for full queues.</p>
<ul>
	<li>stored when memory gets tight.</li>
	<li>read when delivered.</li>
	<li>must be removed after delivery.</li>
</ul>



<p>Durable and reliable cases are very similar: storage time is performance-critical (blocks response to sender) but reading is not and cleanup can be done by an async thread or process.</p>

<p>For flow-to-disk, when queues are full, both storing and reading are performance-critical.</p>

<p>So it looks like the same solution will work for durable and reliable.</p>

<p>Flow-to-disk has different requirements, but it would be desirable to re-use some or all of the durable/reliable solution. In particular, if flow-to-disk is combined with durable/reliable it would be wasteful to write the message to disk a second time. Instead it would seem better to keep an in-memory index that allows messages to be read quickly from the reliable/durable store.</p>

<p>We also need to persist <b>wiring</b> (Queues/Exchanges/Bindings), but this is much less performance critical. The entire wiring model is held in memory so wiring is only read at startup, and updates are low volume and not performance-critical. A simple database should suffice.</p>
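
<p>As an illustration of how small the wiring store can be, here is a sketch using SQLite through Python's standard <code>sqlite3</code> module. The table layout and the function names are assumptions made for this example; the design does not prescribe a schema or a particular database.</p>

```python
import sqlite3

# Hypothetical schema: one table each for queues, exchanges and bindings.
SCHEMA = """
CREATE TABLE IF NOT EXISTS queues    (name TEXT PRIMARY KEY, durable INTEGER NOT NULL);
CREATE TABLE IF NOT EXISTS exchanges (name TEXT PRIMARY KEY, type TEXT NOT NULL);
CREATE TABLE IF NOT EXISTS bindings  (
    exchange TEXT NOT NULL,
    queue TEXT NOT NULL,
    routing_key TEXT NOT NULL,
    PRIMARY KEY (exchange, queue, routing_key)
);
"""


def open_wiring_store(path=":memory:"):
    db = sqlite3.connect(path)
    db.executescript(SCHEMA)
    return db


def declare_queue(db, name, durable=True):
    with db:  # commits on success, rolls back on error
        db.execute("INSERT OR REPLACE INTO queues VALUES (?, ?)", (name, int(durable)))


def declare_exchange(db, name, type_):
    with db:
        db.execute("INSERT OR REPLACE INTO exchanges VALUES (?, ?)", (name, type_))


def bind(db, exchange, queue, routing_key):
    with db:
        db.execute("INSERT OR REPLACE INTO bindings VALUES (?, ?, ?)",
                   (exchange, queue, routing_key))


def load_wiring(db):
    """Read the whole wiring model into memory, as a broker would at start-up."""
    return {
        "queues": db.execute(
            "SELECT name, durable FROM queues ORDER BY name").fetchall(),
        "exchanges": db.execute(
            "SELECT name, type FROM exchanges ORDER BY name").fetchall(),
        "bindings": db.execute(
            "SELECT exchange, queue, routing_key FROM bindings "
            "ORDER BY exchange, queue, routing_key").fetchall(),
    }
```

<p>Each update is a single small transaction, and the only bulk read is <code>load_wiring</code> at start-up, which matches the low-volume access pattern described above.</p>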



<h1><a name="ClusterDesignNote-Journals"></a>Journals</h1>

<h2><a name="ClusterDesignNote-Overview"></a>Overview</h2>

<p>A journal is a sequential record of actions taken (e.g. messages enqueued, responses sent) that is sufficient to reconstruct the state of the journalled entity (e.g. a queue) in the case of failure and recovery.</p>


<p><b>TODO</b>: <em>Journal indexing, async journal (thruput vs. latency), journal as common API for backups and disk store?</em></p>

<p><em><b>TODO</b></em>: <em>Windows for error in journalling - how to make disk update and network ack atomic?  How do other technologies handle it?</em></p>

<p><em><b>TODO</b></em>: <em>References strike again: where do they go in a journal-per-queue?</em></p>

<p><em><b>TODO</b></em>: <em>Journal per broker pros/cons</em></p>


<h2><a name="ClusterDesignNote-Useofjournals"></a>Use of journals</h2>

<p>For reliability and durability we will use</p>
<ul>
	<li>Queue journal (per queue): records enqueues, dequeues and acknowledgements.</li>
	<li>Session journal (per session): records references in progress.</li>
</ul>


<p>The broker writes enqueue and dequeue records to the end of the active journal file. When the file reaches a fixed size it starts a new one.</p>

<p>A cleanup agent (thread or process) removes, recycles or compacts journal files that have no live (undelivered) messages. (References complicate the book-keeping a little but don't alter the conceptual model.)</p>

<p>Recovery or restart reconstructs the queue contents from the enqueue/dequeue records in the journal.</p>
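
<p>The journal life cycle described above (append, roll over to a new file, clean up, recover) can be modelled in a few lines. This sketch makes several assumptions that are not in the design: journal "files" are in-memory lists, the file size limit is a record count, message ids are unique, and the hypothetical <code>cleanup</code> only drops the oldest files, which is a conservative simplification of "removes, recycles or compacts".</p>

```python
class QueueJournal:
    """Hypothetical per-queue journal; each 'file' is a list of records."""

    def __init__(self, records_per_file=4):
        self.records_per_file = records_per_file
        self.files = [[]]  # the last entry is the active journal file

    def _append(self, record):
        if len(self.files[-1]) >= self.records_per_file:
            self.files.append([])  # active file is full: start a new one
        self.files[-1].append(record)

    def enqueue(self, msg_id, body):
        self._append(("enq", msg_id, body))

    def dequeue(self, msg_id):
        self._append(("deq", msg_id, None))

    def recover(self):
        """Replay all records in order to rebuild the queue contents."""
        live = {}  # dicts keep insertion order, so queue order is preserved
        for journal_file in self.files:
            for kind, msg_id, body in journal_file:
                if kind == "enq":
                    live[msg_id] = body
                else:
                    live.pop(msg_id, None)  # the enqueue may already be cleaned up
        return list(live.items())

    def cleanup(self):
        """Drop the oldest closed files while they hold no live messages.

        Only a prefix of the journal is dropped, so any dequeue record that is
        discarded refers to an enqueue record that is discarded as well.
        """
        live_ids = {msg_id for msg_id, _ in self.recover()}
        while len(self.files) > 1 and not any(
                kind == "enq" and msg_id in live_ids
                for kind, msg_id, _ in self.files[0]):
            self.files.pop(0)
```

<p>Recovery gives the same queue contents before and after <code>cleanup</code>, which is the property the cleanup agent has to preserve whatever strategy it uses.</p>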

<p>Flow-to-disk can re-use the journal framework, with a simple extension: the broker keeps an in-memory index of live messages in the journal.</p>

<p>If flow-to-disk is combined with reliability then messages are automatically journalled on arrival, so flow-to-disk can simply delete them from memory and use the in-memory index to read them for delivery.</p>

<p>Without reliability flow-to-disk is similar except that messages are only journalled if memory gets tight.</p>
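
<p>A sketch of the flow-to-disk case without reliability follows: messages are journalled only when memory is tight, and an in-memory index of offsets is used to read them back for delivery. The names <code>JournalIndex</code> and <code>FlowToDiskQueue</code> are hypothetical, the memory limit is a simple message count, and an <code>io.BytesIO</code> buffer stands in for the journal file.</p>

```python
import io
from collections import deque


class JournalIndex:
    """Hypothetical in-memory index of live messages in a journal file."""

    def __init__(self):
        self.journal = io.BytesIO()  # stand-in for an on-disk journal file
        self.index = {}              # msg_id -> (offset, length)

    def write(self, msg_id, body):
        self.journal.seek(0, io.SEEK_END)
        offset = self.journal.tell()
        self.journal.write(body)
        self.index[msg_id] = (offset, len(body))

    def read(self, msg_id):
        offset, length = self.index[msg_id]
        self.journal.seek(offset)
        return self.journal.read(length)

    def remove(self, msg_id):
        del self.index[msg_id]  # the space is reclaimed later by the cleanup agent


class FlowToDiskQueue:
    def __init__(self, max_in_memory):
        self.max_in_memory = max_in_memory
        self.order = deque()   # message ids in delivery order
        self.in_memory = {}    # msg_id -> body for messages still held in memory
        self.store = JournalIndex()

    def enqueue(self, msg_id, body):
        self.order.append(msg_id)
        if len(self.in_memory) < self.max_in_memory:
            self.in_memory[msg_id] = body
        else:
            self.store.write(msg_id, body)  # memory is tight: journal instead

    def deliver(self):
        msg_id = self.order.popleft()
        if msg_id in self.in_memory:
            return self.in_memory.pop(msg_id)
        body = self.store.read(msg_id)
        self.store.remove(msg_id)
        return body
```

<p>With reliability enabled, <code>enqueue</code> would always call <code>write</code>, and running out of memory would only mean dropping the body from <code>in_memory</code>, since the index already points at the journalled copy.</p>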

<p><b>Disk thrashing</b>: Why do we think skipping disk heads around between multiple journals will be better than seeking up and down a single journal? Are we assuming that we only need to optimize the case where long sequences of traffic tend to be for the same queue?</p>

<p><b>No write on fast consume</b>: Optimization - if we can deliver (and get ack) faster than we write then no need to write. How does this interact with HA?</p>

<p><b>Async journalling</b>: writing to client, writing to journal, acks from client, acks from journal are separate async streams? So if we get the client ack before the journalling stream has written the journal, we cancel the write? But what kind of ack info do we need? Need a diagram of interactions, failure points and responses at each point. Start simple and optimize, but don't rule out optimizations.</p>
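
<p>One possible reading of the "cancel the write" idea is sketched below. It is deliberately naive: the class <code>AsyncJournal</code> is hypothetical, the journal is a list, and the sketch ignores the open questions about HA and about what the sender is told while a write is still pending.</p>

```python
class AsyncJournal:
    """Hypothetical journal writer that skips writes for already-acked messages."""

    def __init__(self):
        self.pending = {}  # msg_id -> body, queued for writing but not yet written
        self.written = []  # records that have reached the journal

    def submit(self, msg_id, body):
        """Queue an enqueue record for the journalling stream."""
        self.pending[msg_id] = body

    def on_client_ack(self, msg_id):
        """The consumer acknowledged the message."""
        if msg_id in self.pending:
            del self.pending[msg_id]  # never reached disk: cancel the write
            return "cancelled"
        self.written.append(("deq", msg_id, None))  # on disk: record the dequeue
        return "dequeued"

    def flush(self):
        """Stand-in for the journalling stream catching up."""
        for msg_id, body in self.pending.items():
            self.written.append(("enq", msg_id, body))
        self.pending.clear()
```

<p>A message that is consumed before the journalling stream reaches it leaves no trace in the journal, while a message that was already written needs a matching dequeue record.</p>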


<h2><a name="ClusterDesignNote-Whataboutdisklessreliability%3F"></a>What about diskless reliability?</h2>

<p>Is memory+network replication with no disk a viable option for high-speed transient message flow? May be faster, but can't support durable messages/persistent queues. We will lose messages in total failure or multiple failures where all backups fail, but we can survive single failures and will run a lot faster than diskful.</p>




<h1><a name="ClusterDesignNote-Virtualsynchrony"></a>Virtual synchrony</h1>

<p><b>TODO</b>: Wiring &amp; membership via virtual synchrony</p>

<p><b>TODO</b>: journaling, speed. Will file-per-q really help with disk burnout?</p>


<h1><a name="ClusterDesignNote-Configuration"></a>Configuration</h1>

<h2><a name="ClusterDesignNote-SimplifyingpatternsPossiblewaystoconfigureacluster%3A"></a>Simplifying patterns: possible ways to configure a cluster</h2>
<ul>
	<li>Virtual hosts as units of replication.</li>
	<li>Backup rings: all primary components in a broker use the same backup broker and vice-versa. Backups form rings.</li>
	<li>Broker component rings: all the components <em>except sessions</em> have the same backup broker. Session backups are chosen at random so a broker's load will be distributed rather than all falling on its backup.</li>
	<li>Disk management issues?</li>
	<li>Shared storage issues?</li>
</ul>
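
<p>The difference between the two ring patterns can be illustrated with a small sketch. The function names are hypothetical, and "load" is reduced to a count of sessions per backup broker.</p>

```python
import random
from collections import Counter


def ring_backups(brokers):
    """Backup ring: each broker is backed up by the next broker in the list."""
    return {b: brokers[(i + 1) % len(brokers)] for i, b in enumerate(brokers)}


def random_session_backups(sessions, primary, brokers, rng):
    """Pick a random backup for each session, never the primary itself."""
    others = [b for b in brokers if b != primary]
    return {s: rng.choice(others) for s in sessions}


def failover_load(session_backups):
    """Sessions each backup would take over if the primary failed."""
    return Counter(session_backups.values())
```

<p>With a plain backup ring, every session of a failed broker lands on the single broker returned by <code>ring_backups</code>. With <code>random_session_backups</code>, <code>failover_load</code> shows the same sessions spread over the remaining brokers.</p>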



<h2><a name="ClusterDesignNote-Dynamicclusterconfiguration"></a>Dynamic cluster configuration</h2>
<ul>
	<li>Failover: the primary use case.</li>
	<li>Add node: backup, proxy, primary case?</li>
	<li>Redirect clients from loaded broker (pretend failure)</li>
	<li>Move queue primary from loaded broker/closer to consumers?</li>
	<li>Re-start after failover.</li>
</ul>


<p><b>Issue:</b> unit of failover/redirect is connection/channel but "working set" of queues and exchanges is unrelated. Use virtual host as unit for failover/relocation? It's also a queue namespace...</p>

<p>If a queue moves we have to redirect its <em>consumers</em>, can't redirect entire channels! Channels in the same session may move between connections. Or rather we depend on broker to proxy?</p>

<p>Backups: chained backups rather than multi-backup? Ring backup? What about split brain, elections, quorums etc.</p>

<p>Should new backups acquire state from primary, from disk or possibly both? Depends on GFS/SAN vs. commodity hw?</p>



<h1><a name="ClusterDesignNote-Transactions"></a>Transactions</h1>

<h2><a name="ClusterDesignNote-Localtransactions"></a>Local transactions</h2>

<p>AMQP offers local and distributed transactions. In a cluster, however, a local transaction could involve queues that are distributed across several nodes.</p>

<p><em><b>TODO</b></em>: This complicates the model of a proxy as a simple forwarder. You cannot simply forward a local transaction involving queues on two separate primary brokers; the proxy has to be aware of the transaction.</p>

<p><em><b>TODO</b></em>: Can we use point-to-point local transactions or do we have to turn this into a dtx? If dtx, who co-ordinates? Is every broker potentially a transaction co-ordinator?</p>

<p><em><b>TODO</b></em>: For distributed transactions, will the primary broker and its backups act as a single clustered resource manager for the resource set, or will a failure of one broker abort the transaction?</p>


<h2><a name="ClusterDesignNote-DistributedTransactions"></a>Distributed Transactions</h2>

<p>The prepare needs to be replicated so that, if one node fails before completion, another node can honour the guarantee to be able to commit or abort. It is also possible that the work of a transaction is distributed across more than one node anyway.</p>

<p>I think broadcasting all dtx commands over the group communication protocol is the most likely way to handle this.</p>

<p>The session in which the commands are initiated needs to be replicated also to allow clean resumption on failover.</p>
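
<p>Why replicating the prepare matters can be shown with a toy model. Everything here is an assumption made for illustration: <code>DtxMember</code> and <code>broadcast</code> are hypothetical names, the group communication protocol is replaced by a loop that delivers each command to every member in the same order, and one-phase commit is ignored.</p>

```python
class DtxMember:
    """Hypothetical cluster member tracking distributed transaction state."""

    def __init__(self, name):
        self.name = name
        self.transactions = {}  # xid -> "prepared" | "committed" | "aborted"

    def deliver(self, command, xid):
        if command == "prepare":
            self.transactions[xid] = "prepared"
        elif command == "commit":
            if self.transactions.get(xid) != "prepared":
                raise ValueError(f"{self.name}: cannot commit unprepared {xid}")
            self.transactions[xid] = "committed"
        elif command == "rollback":
            self.transactions[xid] = "aborted"
        else:
            raise ValueError(f"unknown dtx command: {command}")


def broadcast(members, command, xid):
    """Stand-in for ordered group delivery: every member sees every command."""
    for member in members:
        member.deliver(command, xid)
```

<p>If the node that received the prepare fails, any surviving member already knows the transaction is prepared and can accept the commit. A member that never saw the prepare would have to refuse it.</p>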




<h1><a name="ClusterDesignNote-OpenQuestions"></a>Open Questions</h1>

<p>Issues: double failure in backup ring: A -&gt; B -&gt; C. Simultaneous failure of A and B. C doesn't have the replica data to take over for A.</p>
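
<p>The double-failure case can be checked mechanically. In this sketch the helper names are hypothetical, and a broker's state is considered lost when both the broker and its ring backup have failed.</p>

```python
def ring_backups(brokers):
    """Backup ring: each broker is backed up by the next broker in the list."""
    return {b: brokers[(i + 1) % len(brokers)] for i, b in enumerate(brokers)}


def unrecoverable(brokers, failed):
    """Failed brokers whose backup has failed too, so no replica survives."""
    backup = ring_backups(brokers)
    return sorted(b for b in failed if backup[b] in failed)
```

<p>For the ring A, B, C, the call <code>unrecoverable(["A", "B", "C"], {"A", "B"})</code> reports A: B can be taken over by C, but the only replica of A was on B.</p>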

<p>Java/C++ interworking - is there a requirement? Fail over from C++ to Java? Common persistence formats?</p>



<h1><a name="ClusterDesignNote-Implementationbreakdown."></a>Implementation breakdown.</h1>

<p>The following are independently useful units of work that combine to give the full story:</p>

<p><b>Proxy Queues</b>: Useful in federation. Pure-AMQP proxies for exchanges might also be useful but are not needed for current purpose as we will use virtual synchrony to replicate wiring.</p>

<p><b>Fragmented queues</b>: Over pure AMQP (no VS) useful by itself for unreliable high volume shared queue federation.</p>

<p><b>Virtual Synchrony Cluster</b>: Multicast membership and total ordering protocol for brokers. Not useful alone, but useful with proxies and/or fragments for dynamic federations.</p>

<p><b>Primary-backup replication</b>: Over AMQP, no persistence. Still offers some level of reliability in a simple primary-backup pair.</p>

<p><b>Persistence</b>: Useful on its own for flow-to-disk and durable messages. Must meet the performance requirements of reliable journalling.</p>

<hr />

<p><table class='Footnotes' style='width: 100%; border:none;' cellspacing='0' cellpadding='0' summary='This table contains one or more notes for references made elsewhere on the page.'>
  <caption class='accessibility'>Footnotes</caption>
  <thead class='accessibility'>
    <tr class='accessibility'>
      <th class='accessibility' id='footnote-th1'>Reference</th>
      <th class='accessibility' id='footnote-th2'>Notes</th>
    </tr>
  </thead>
  <tbody>
    <tr name='Footnote1'>
      <td valign='top' class='FootnoteNum' headings='footnote-th1'>
        <a href='#FootnoteMarker1'
          onClick='footnoteMarkerHighlight("1");'
          onMouseOver='footnoteHighlight("1",false);'
          alt='Footnote: Click to return to reference in text'
          title='Footnote: Click to return to reference in text'
          id='FootnoteNum1'>
            1
        </a>
      </td>
      <td id='Footnote1'
        valign='top'
        width='100%'
        class='Footnote'
        headings='footnote-th2'>
          Exclusive or auto-delete queues are deleted on disconnect; we'll return to this point.
      </td>
    </tr>
    <tr name='Footnote2'>
      <td valign='top' class='FootnoteNum' headings='footnote-th1'>
        <a href='#FootnoteMarker2'
          onClick='footnoteMarkerHighlight("2");'
          onMouseOver='footnoteHighlight("2",false);'
          alt='Footnote: Click to return to reference in text'
          title='Footnote: Click to return to reference in text'
          id='FootnoteNum2'>
            2
        </a>
      </td>
      <td id='Footnote2'
        valign='top'
        width='100%'
        class='Footnote'
        headings='footnote-th2'>
          The "session" concept is not fully defined in AMQP 0-8 or 0-9 but is under discussion. This design note will define a session that may be proposed to AMQP.
      </td>
    </tr>
    <tr name='Footnote3'>
      <td valign='top' class='FootnoteNum' headings='footnote-th1'>
        <a href='#FootnoteMarker3'
          onClick='footnoteMarkerHighlight("3");'
          onMouseOver='footnoteHighlight("3",false);'
          alt='Footnote: Click to return to reference in text'
          title='Footnote: Click to return to reference in text'
          id='FootnoteNum3'>
            3
        </a>
      </td>
      <td id='Footnote3'
        valign='top'
        width='100%'
        class='Footnote'
        headings='footnote-th2'>
          An individual broker by this definition is really a broker behind a single AMQP address. Such a broker might in fact be a cluster using technologies like TCP failover/load balancing. This is outside the scope of this design, which focusses on clustering at the AMQP protocol layer, where cluster members have separate AMQP addresses.
      </td>
    </tr>
    <tr name='Footnote4'>
      <td valign='top' class='FootnoteNum' headings='footnote-th1'>
        <a href='#FootnoteMarker4'
          onClick='footnoteMarkerHighlight("4");'
          onMouseOver='footnoteHighlight("4",false);'
          alt='Footnote: Click to return to reference in text'
          title='Footnote: Click to return to reference in text'
          id='FootnoteNum4'>
            4
        </a>
      </td>
      <td id='Footnote4'
        valign='top'
        width='100%'
        class='Footnote'
        headings='footnote-th2'>
          They may not be equivalent on other grounds, e.g. network distance from client, load etc.
      </td>
    </tr>
  </tbody>
</table></p></div>


</td></tr></table></div>

</body>
</html>

