qpid-users mailing list archives

From Robbie Gemmell <robbie.gemm...@gmail.com>
Subject Re: Questions on Java Broker BDB HA failover (was: [jira] [Commented] (QPID-4910) Python, Ruby, and C++ clients automatically connect to replica server when master fails)
Date Thu, 06 Jun 2013 20:01:19 GMT
Hi James,

It is probably worth clearing out any bdbha store files you have managed
to create in your previous efforts and starting fresh. Once created, the
bdb files themselves contain all the group information for the nodes, so
it is possible the existing files on disk have got into a bit of a state
with the steps taken thus far.
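
If it helps, the clean-out could be scripted along the lines below. This
is only a sketch: reset_store is a name made up for illustration, the path
is whatever you have in <environment-path>, and it assumes the broker for
that node is stopped and nothing else is kept in that directory.

```python
import shutil
from pathlib import Path


def reset_store(environment_path):
    """Delete a node's BDB HA store directory so the node starts fresh.

    Hypothetical helper, not part of Qpid. Assumes the broker using this
    path has been stopped. Returns True if a directory was removed.
    """
    store = Path(environment_path)
    if not store.is_dir():
        return False  # nothing on disk yet, so nothing to clear
    shutil.rmtree(store)  # removes the old files, including stored group info
    return True


if __name__ == "__main__":
    # Placeholder path taken from the example config; adjust per node.
    print(reset_store("/path/to/bdbhastore/node1"))
```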

Here is some example config that I used to start up a two node cluster
locally on the same machine, using port 5002 for the second node simply
because the two nodes obviously couldn't use the same port.

First node (master at this point):

<virtualhosts>
    <virtualhost>
        <name>localhost</name>
        <localhost>
            <store>
                <class>org.apache.qpid.server.store.berkeleydb.BDBHAMessageStore</class>
                <environment-path>/path/to/bdbhastore/node1</environment-path>
                <highAvailability>
                    <groupName>ReplicationGroup</groupName>
                    <nodeName>node1</nodeName>
                    <nodeHostPort>localhost:5001</nodeHostPort>
                    <helperHostPort>localhost:5001</helperHostPort>
                    <durability>NO_SYNC\,NO_SYNC\,SIMPLE_MAJORITY</durability>
                    <coalescingSync>true</coalescingSync>
                    <designatedPrimary>true</designatedPrimary>
                </highAvailability>
            </store>
        </localhost>
    </virtualhost>
</virtualhosts>


Second node (replica at this point):

<virtualhosts>
    <virtualhost>
        <name>localhost</name>
        <localhost>
            <store>
                <class>org.apache.qpid.server.store.berkeleydb.BDBHAMessageStore</class>
                <environment-path>/path/to/bdbhastore/node2</environment-path>
                <highAvailability>
                    <groupName>ReplicationGroup</groupName>
                    <nodeName>node2</nodeName>
                    <nodeHostPort>localhost:5002</nodeHostPort>
                    <helperHostPort>localhost:5001</helperHostPort>
                    <durability>NO_SYNC\,NO_SYNC\,SIMPLE_MAJORITY</durability>
                    <coalescingSync>true</coalescingSync>
                    <designatedPrimary>false</designatedPrimary>
                </highAvailability>
            </store>
        </localhost>
    </virtualhost>
</virtualhosts>

All you should have to do to bring up the nodes on different machines is
change the hostname from localhost to whatever your nodes' actual hostnames
are (making sure each name is resolvable from the other host) and change
the second node to use port 5001 if you want them both on 5001 on their
respective machines, i.e.:

<nodeHostPort>node1hostname:5001</nodeHostPort>
<helperHostPort>node1hostname:5001</helperHostPort>

and

<nodeHostPort>node2hostname:5001</nodeHostPort>
<helperHostPort>node1hostname:5001</helperHostPort>
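
If you want to sanity-check the two files before starting the brokers,
something like the rough sketch below would do. It is not a Qpid tool: the
function names are invented for illustration, and it only encodes the rule
discussed in this thread (every node's helperHostPort should equal the
first node's nodeHostPort), plus a basic name-resolution check to run on
each host.

```python
import socket
import xml.etree.ElementTree as ET


def ha_settings(store_xml):
    """Return (nodeName, nodeHostPort, helperHostPort) from a <store> fragment."""
    ha = ET.fromstring(store_xml).find("highAvailability")
    return tuple(ha.findtext(tag).strip()
                 for tag in ("nodeName", "nodeHostPort", "helperHostPort"))


def check_helpers(store_fragments):
    """Names of nodes whose helperHostPort is not the first node's nodeHostPort."""
    settings = [ha_settings(fragment) for fragment in store_fragments]
    first_address = settings[0][1]
    return [name for name, _, helper in settings if helper != first_address]


def resolvable(host_port):
    """True if the host part of 'host:port' resolves from this machine."""
    host, port = host_port.rsplit(":", 1)
    try:
        socket.getaddrinfo(host, int(port))
        return True
    except socket.gaierror:
        return False


NODE1 = """<store><highAvailability>
  <nodeName>node1</nodeName>
  <nodeHostPort>node1hostname:5001</nodeHostPort>
  <helperHostPort>node1hostname:5001</helperHostPort>
</highAvailability></store>"""

NODE2 = """<store><highAvailability>
  <nodeName>node2</nodeName>
  <nodeHostPort>node2hostname:5001</nodeHostPort>
  <helperHostPort>node1hostname:5001</helperHostPort>
</highAvailability></store>"""

if __name__ == "__main__":
    print(check_helpers([NODE1, NODE2]))  # [] means both helpers point at node1
```

Pass the fragments in start-up order, first node first. A non-empty list
names the nodes whose helper details still need correcting.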


(In hindsight I shouldn't have called the virtualhost 'localhost'; it would
possibly have been clearer if I had picked something else. The vhost name
has no relationship to the bdbha node names or hostnames.)

Robbie

====================================================

On 6 June 2013 18:16, James Belch <jamesbelch@verizon.net> wrote:
I made the following changes:

master:

<nodeHostPort>host1:5001</nodeHostPort>
<helperHostPort>host1:5001</helperHostPort>

replica:

<nodeHostPort>host2:5001</nodeHostPort>
<helperHostPort>host1:5001</helperHostPort>

I still get the same error.  Any ideas?

Thanks,
James




On 6 June 2013 18:03, Robbie Gemmell <robbie.gemmell@gmail.com> wrote:

> Hi James,
>
> There may be other issues, but the most obvious one is that the helper
> node configuration looks like it needs to be updated:
>
>
> "<nodeName>host1</nodeName>
> <nodeHostPort>host1:5001</nodeHostPort>
> <helperHostPort>host1:5002</helperHostPort>"
>
> The first node created/started should have its helper details set to its
> own node details, i.e. <host1address>:5001, allowing it to create the group
> and become master.
>
> "<nodeName>host2</nodeName>
> <nodeHostPort>host2:5001</nodeHostPort>
> <helperHostPort>host1:5002</helperHostPort>"
>
> When starting the subsequent node to become the replica, it should also
> have its helper details set to the address of the first node, i.e.
> <host1address>:5001 again, allowing it to join the existing group.
>
> Robbie
>
> On 6 June 2013 17:38, Rob Godfrey <rob.j.godfrey@gmail.com> wrote:
>
>> Resending from my gmail rather than apache account, as my apache account
>> doesn't seem to be able to post to users :-)
>>
>> On 6 June 2013 18:29, Robert Godfrey <rgodfrey@apache.org> wrote:
>>
>> > Forwarding to the Qpid Users mail group - which is probably a better bet
>> > to get answers
>> >
>> > ---------- Forwarded message ----------
>> > From: James Belch (JIRA) <jira@apache.org>
>> > Date: 6 June 2013 17:35
>> > Subject: [jira] [Commented] (QPID-4910) Python, Ruby, and C++ clients
>> > automatically connect to replica server when master fails
>> > To: rgodfrey@apache.org
>> >
>> >
>> >
>> >     [
>> >
>> https://issues.apache.org/jira/browse/QPID-4910?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13677143#comment-13677143
>> ]
>> >
>> > James Belch commented on QPID-4910:
>> > -----------------------------------
>> >
>> > Thanks for the quick response.  I think we will need to configure the
>> > clients with a list of brokers for the non Java clients.  Could you guys
>> > answer another question for me regarding failover?  I am using Berkeley DB
>> > to implement our High Availability solution.  I have the master configured
>> > as follows:
>> >
>> > <name>localhost</name>
>> >   <localhost>
>> >     <store>
>> >
>> > <class>org.apache.qpid.server.store.berkeleydb.BDBHAMessageStore</class>
>> >       <environment-path>${work}/bdbhastore/host1</environment-path>
>> >       <highAvailability>
>> >         <groupName>ReplicationGroup</groupName>
>> >         <nodeName>host1</nodeName>
>> >         <nodeHostPort>host1:5001</nodeHostPort>
>> >         <helperHostPort>host1:5002</helperHostPort>
>> >         <durability>NO_SYNC\,NO_SYNC\,SIMPLE_MAJORITY</durability>
>> >         <coalescingSync>true</coalescingSync>
>> >         <designatedPrimary>true</designatedPrimary>
>> >       </highAvailability>
>> >     </store>
>> >     ...
>> >  </localhost>
>> >
>> > I have the replica configured as follows:
>> > <name>localhost</name>
>> >   <localhost>
>> >     <store>
>> >
>> > <class>org.apache.qpid.server.store.berkeleydb.BDBHAMessageStore</class>
>> >       <environment-path>${work}/bdbhastore/host2</environment-path>
>> >       <highAvailability>
>> >         <groupName>ReplicationGroup</groupName>
>> >         <nodeName>host2</nodeName>
>> >         <nodeHostPort>host2:5001</nodeHostPort>
>> >         <helperHostPort>host1:5002</helperHostPort>
>> >         <durability>NO_SYNC\,NO_SYNC\,SIMPLE_MAJORITY</durability>
>> >         <coalescingSync>true</coalescingSync>
>> >         <designatedPrimary>false</designatedPrimary>
>> >       </highAvailability>
>> >     </store>
>> >     ...
>> >  </localhost>
>> >
>> >
>> > When I start the replica server, I get the following error: "New node
>> > host2(-1) unknown to rep group".
>> > If I do a netstat, I see the connections attempting to be made, but the
>> > sockets go to TIME_WAIT state and timeout after a minute.  Any ideas?
>> >
>> > > Python, Ruby, and C++ clients automatically connect to replica server
>> > when master fails
>> > >
>> >
>> ---------------------------------------------------------------------------------------
>> > >
>> > >                 Key: QPID-4910
>> > >                 URL: https://issues.apache.org/jira/browse/QPID-4910
>> > >             Project: Qpid
>> > >          Issue Type: Improvement
>> > >          Components: Java Broker
>> > >    Affects Versions: 0.20
>> > >         Environment: C++, Ruby, Python, and Java clients connecting
>> to a
>> > Java Broker running on Redhat 6.3
>> > >            Reporter: James Belch
>> > >             Fix For: 0.23
>> > >
>> > >
>> > > I am currently in the process of designing a high availability solution
>> > > for our software.  We are using the Java broker, and we have Java, Ruby,
>> > > C++, and Python clients.  I was reading your High Availability document at
>> > > http://qpid.apache.org/books/0.18/AMQP-Messaging-Broker-Java-Book/html/High-Availability.html
>> > > and saw a footnote at the bottom stating "[1] The automatic failover
>> > > feature is available only for AMQP connections from the Java client.
>> > > Management connections (JMX) do not current offer this feature."  Is this
>> > > still the case or was this fixed in .20?  If this is still the case, will
>> > > it be fixed in a future release?
>> >
>> > --
>> > This message is automatically generated by JIRA.
>> > If you think it was sent incorrectly, please contact your JIRA
>> > administrators
>> > For more information on JIRA, see:
>> http://www.atlassian.com/software/jira
>> >
>> >
>>
>
>
