activemq-users mailing list archives

From "austin.mills" <austin.mi...@involver.com>
Subject Problems with Store-and-Forward network and Failover transport
Date Tue, 07 Dec 2010 21:51:15 GMT

We are encountering some problems with our ActiveMQ store-and-forward setup
where clients are connecting to machines they shouldn't, and my guess is
that it's because of undocumented/unexpected behavior in the Failover
transport.  Our JMS clients on our background workers should all be talking
to a single central broker to get jobs, but instead seem to be connecting
almost randomly to other brokers which should only be doing
store-and-forward to the central broker.  Can somebody help me figure out
why the clients are connecting to the wrong brokers? 

More details follow:

We have a three-tiered setup, with multiple app servers, a single central
broker, and multiple workers. The app servers generate jobs, which are sent
as JMS messages. To allow the app servers to keep creating jobs through
occasional network outages, each app server runs its own broker locally, and
these local brokers store-and-forward messages to the central broker. Here
are the respective configurations (all queues are persistent; clients and
brokers run 5.4.2, the JVM is 1.6.0_21, and the OS is Debian):

App server:
    <broker xmlns="http://activemq.apache.org/schema/core"
            brokerName="apphostname" dataDirectory="${activemq.base}/data">
        <networkConnectors>
            <networkConnector uri="static:(nio://brokerhostname:61616)"
                userName="username" password="pass"/>
        </networkConnectors>
        <persistenceAdapter>
            <kahaDB directory="${activemq.base}/data/kahadb"
                enableIndexWriteAsync="true"/>
        </persistenceAdapter>
        <plugins>
            <simpleAuthenticationPlugin>
                <users>
                    <authenticationUser username="username" password="pass"
                        groups="users,admins"/>
                </users>
            </simpleAuthenticationPlugin>
        </plugins>
        <transportConnectors>
            <transportConnector name="nio" uri="nio://0.0.0.0:61616"/>
        </transportConnectors>
    </broker>

Broker:
    <broker xmlns="http://activemq.apache.org/schema/core"
            brokerName="broker" dataDirectory="${activemq.base}/data">
        <persistenceAdapter>
            <kahaDB directory="${activemq.base}/data/kahadb"
                enableIndexWriteAsync="true"/>
        </persistenceAdapter>
        <plugins>
            <simpleAuthenticationPlugin>
                <users>
                    <authenticationUser username="username" password="pass"
                        groups="users,admins"/>
                </users>
            </simpleAuthenticationPlugin>
        </plugins>
        <systemUsage>
            <systemUsage>
                <memoryUsage>
                    <memoryUsage limit="975 mb"/>
                </memoryUsage>
            </systemUsage>
        </systemUsage>
        <transportConnectors>
            <transportConnector name="nio" uri="nio://0.0.0.0:61616"/>
        </transportConnectors>
    </broker>

The JMS clients on the workers connect to the URI
"failover:(nio://brokerhostname:61616)". According to
http://activemq.apache.org/how-can-i-support-auto-reconnection.html, a URI of
this style with a single endpoint should give us automatic reconnection, even
with only one broker.
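
For concreteness, a stripped-down worker consumer looks roughly like the
sketch below. This is not our exact code: the queue name "jobs.queue" is a
placeholder, and the credentials simply match the simpleAuthenticationPlugin
entries above.

Worker (sketch):
    import javax.jms.Connection;
    import javax.jms.Message;
    import javax.jms.MessageConsumer;
    import javax.jms.Session;

    import org.apache.activemq.ActiveMQConnectionFactory;

    public class Worker {
        public static void main(String[] args) throws Exception {
            // Failover URI with a single endpoint; the only reason we use
            // failover: here is to get automatic reconnection to the central
            // broker after a network blip.
            ActiveMQConnectionFactory factory = new ActiveMQConnectionFactory(
                    "username", "pass", "failover:(nio://brokerhostname:61616)");

            Connection connection = factory.createConnection();
            connection.start();

            Session session = connection.createSession(false, Session.AUTO_ACKNOWLEDGE);
            // "jobs.queue" is a placeholder for our actual job queue name.
            MessageConsumer consumer =
                    session.createConsumer(session.createQueue("jobs.queue"));

            // Block until a job arrives, then hand it to the worker logic.
            Message job = consumer.receive();
            // ... process job ...

            connection.close();
        }
    }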

In practice, here is what happens: We start up the app server brokers, and
the central broker, and everything looks good. The app servers successfully
connect to the central broker. We can start up the application, and messages
begin flowing to the app server brokers. Then we start up the workers, which
should connect to the central broker. This is where things get a little
screwy. Looking at netstat, I can see that some of the workers connect to
the central broker. Others, however, seem to be connected directly to the
app server brokers. This is a problem, because we might have thousands of
job messages in the central broker, but the app server broker that the
worker client is connected to might have none. 

As far as I can tell, the clients are connecting to the central broker,
getting a list of all of the brokers from the central broker, and then
connecting to one of them at random (possibly the central broker). If we
remove the failover transport and use a plain client URI like
"nio://brokerhostname:61616", everything works as expected. From a casual
reading of the ActiveMQ source, I suspect the failover transport is trying
to find multiple brokers to talk to, whereas we only want it so that clients
reconnect to the central broker after a disconnect. This is not at all what
I expected from the description of the Failover Transport, so my main
questions are:

1. Is my supposition correct, that this behavior is due to the use of the
Failover transport and the list of brokers being provided by the central
broker to clients?
2. Is this the intended behavior?
3. If that's the intended behavior, how would I achieve our desired
configuration?
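
For what it's worth, one client-side knob I noticed while skimming the
failover transport reference is an option that appears to make the client
ignore any broker-supplied URI updates. I have not tested it and may be
misreading the docs, so take this as a hypothetical experiment rather than a
known fix:

    // Untested guess: same factory as in the worker sketch above, but asking
    // the failover transport not to adopt a broker-provided URI list.
    // "updateURIsSupported" is my reading of the failover transport
    // reference, not something we have tried yet.
    ActiveMQConnectionFactory factory = new ActiveMQConnectionFactory(
            "username", "pass",
            "failover:(nio://brokerhostname:61616)?updateURIsSupported=false");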

Thanks for any assistance,
--Austin
