geode-issues mailing list archives

From "ASF subversion and git services (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (GEODE-1393) locator returns incorrect server information when starting up
Date Mon, 16 May 2016 15:09:13 GMT

    [ https://issues.apache.org/jira/browse/GEODE-1393?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15284701#comment-15284701 ]

ASF subversion and git services commented on GEODE-1393:
--------------------------------------------------------

Commit 6523c97c92f607746d80b11c7cb5315b1137f5a2 in incubator-geode's branch refs/heads/develop
from [~bschuchardt]
[ https://git-wip-us.apache.org/repos/asf?p=incubator-geode.git;h=6523c97 ]

GEODE-1393 locator returns incorrect server information when starting up

When a locator auto-reconnects, its ServerLocator needs to initialize its
ControllerAdvisor so that it has server information to give to clients.
The ServerLocator was creating a new ControllerAdvisor but wasn't asking
it to perform a handshake to fill in its profiles.

ReconnectDUnitTest had an existing testReconnectWithQuorum test that
wasn't doing what it was supposed to.  I've removed the TODO from that
test and modified it to force-disconnect the test's Locator.  The
locator must restart its TcpServer component before it can start
a DistributedSystem, so this exercises the path in
InternalLocator.attemptReconnect() that boots the TcpServer prior to
connecting the DistributedSystem.  After the DistributedSystem
finishes reconnecting, the ServerLocator's distribution advisor
should have been initialized by performing the handshake.
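
The difference between creating an advisor and creating one that has also
performed its handshake can be illustrated with a minimal sketch. Everything
here is hypothetical: SketchAdvisor, SketchServerLocator, onReconnect and the
performHandshake flag are stand-ins invented for illustration and are not
Geode's ControllerAdvisor, ServerLocator or their methods. The handshake is
modeled as simply copying a list of live server names.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical stand-in for an advisor that tracks server profiles.
// This is NOT Geode's ControllerAdvisor.
final class SketchAdvisor {
    private final List<String> liveServers;            // members that could be asked for profiles
    private final List<String> profiles = new ArrayList<>();

    SketchAdvisor(List<String> liveServers) {
        this.liveServers = liveServers;
    }

    // Models the profile exchange: until this runs, the advisor knows no servers.
    void handshake() {
        profiles.clear();
        profiles.addAll(liveServers);
    }

    List<String> getProfiles() {
        return Collections.unmodifiableList(new ArrayList<>(profiles));
    }
}

// Hypothetical stand-in for the locator component that answers client requests.
// This is NOT Geode's ServerLocator.
final class SketchServerLocator {
    private SketchAdvisor advisor;

    // Called after a reconnect. performHandshake=false models the behavior
    // described in the commit message; true models the corrected behavior.
    void onReconnect(List<String> liveServers, boolean performHandshake) {
        advisor = new SketchAdvisor(liveServers);
        if (performHandshake) {
            advisor.handshake();
        }
    }

    // What the locator could tell a client right now.
    List<String> knownServers() {
        if (advisor == null) {
            return Collections.emptyList();
        }
        return advisor.getProfiles();
    }
}
```

In this sketch, calling onReconnect without the handshake leaves knownServers()
empty even though servers exist, which is the kind of state in which a client
could be told that no servers are available.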


> locator returns incorrect server information when starting up
> -------------------------------------------------------------
>
>                 Key: GEODE-1393
>                 URL: https://issues.apache.org/jira/browse/GEODE-1393
>             Project: Geode
>          Issue Type: Bug
>          Components: locator
>            Reporter: Bruce Schuchardt
>            Assignee: Bruce Schuchardt
>
> When starting up, a locator has no knowledge of cache servers that might be in the distributed
> system, but it will process server-location requests from clients and return incorrect
> information to them until it receives load info from the servers.
> In one test I saw a locator get ejected from the distributed system.  When it auto-reconnected,
> some cache clients asked it for server locations and, though there were 6 cache servers available,
> the clients got this exception:
> {noformat}
> com.gemstone.gemfire.cache.client.NoAvailableServersException
>         at com.gemstone.gemfire.cache.client.internal.pooling.ConnectionManagerImpl.borrowConnection(ConnectionManagerImpl.java:257)
>         at com.gemstone.gemfire.cache.client.internal.OpExecutorImpl.getNextOpServerLocation(OpExecutorImpl.java:318)
>         at com.gemstone.gemfire.cache.client.internal.OpExecutorImpl.execute(OpExecutorImpl.java:130)
>         at com.gemstone.gemfire.cache.client.internal.OpExecutorImpl.execute(OpExecutorImpl.java:123)
>         at com.gemstone.gemfire.cache.client.internal.PoolImpl.execute(PoolImpl.java:714)
>         at com.gemstone.gemfire.cache.client.internal.GetOp.execute(GetOp.java:97)
>         at com.gemstone.gemfire.cache.client.internal.ServerRegionProxy.get(ServerRegionProxy.java:112)
>         at com.gemstone.gemfire.internal.cache.tx.ClientTXRegionStub.findObject(ClientTXRegionStub.java:72)
>         at com.gemstone.gemfire.internal.cache.TXStateStub.findObject(TXStateStub.java:379)
>         at com.gemstone.gemfire.internal.cache.TXStateProxyImpl.findObject(TXStateProxyImpl.java:607)
>         at com.gemstone.gemfire.internal.cache.LocalRegion.get(LocalRegion.java:1460)
>         at com.gemstone.gemfire.internal.cache.LocalRegion.get(LocalRegion.java:1398)
>         at com.gemstone.gemfire.internal.cache.LocalRegion.get(LocalRegion.java:1385)
>         at com.gemstone.gemfire.internal.cache.AbstractRegion.get(AbstractRegion.java:336)
> {noformat}
> ServerLocator has a readiness check, but it only tests whether its DistributedSystem
> instance variable has been initialized.  It ought to wait until it has received a server load
> update.
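
The stricter readiness check suggested in the description could look roughly
like the sketch below. It is only an illustration of the idea of waiting for a
first load update, not the committed fix (which initializes the advisor via a
handshake) and not Geode code: LoadAwareReadiness and all of its methods are
hypothetical names, and the distributed system is represented by a plain
Object.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;

// Hypothetical readiness gate; the names here are assumptions, not Geode API.
final class LoadAwareReadiness {
    // Released the first time any server reports its load.
    private final CountDownLatch firstLoadUpdate = new CountDownLatch(1);
    // Stand-in for the DistributedSystem instance variable.
    private volatile Object distributedSystem;

    void setDistributedSystem(Object ds) {
        distributedSystem = ds;
    }

    // To be called whenever a server load update arrives.
    void onServerLoadUpdate() {
        firstLoadUpdate.countDown();
    }

    // The weaker check described in the issue: only looks at the field.
    boolean isConnected() {
        return distributedSystem != null;
    }

    // The stricter check: connected AND at least one load update seen,
    // waiting up to the given timeout for that first update.
    boolean awaitReady(long timeout, TimeUnit unit) {
        if (distributedSystem == null) {
            return false;
        }
        try {
            return firstLoadUpdate.await(timeout, unit);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // preserve the interrupt for callers
            return false;
        }
    }
}
```

A request handler built on this sketch would call awaitReady with a short
timeout before answering a server-location request, and decline or defer the
request when it returns false.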



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
