zookeeper-user mailing list archives

From s influxdb <elastic....@gmail.com>
Subject Re: node 2 not rejoining cluster
Date Tue, 12 Apr 2016 02:02:10 GMT
I created a parallel, independent ZooKeeper cluster on the same set of
machines with different ports, and that worked. This indicates the port was
the issue.
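
For anyone who wants to script the telnet-style check mentioned further down the thread, here is a minimal sketch in Python. The function names (`can_connect`, `probe_peers`) and the hostnames in the usage comment are hypothetical; only the ports 2888 and 3888 come from this thread. Note that in this thread telnet succeeded while node 2 still failed to rejoin, so a successful connect here does not rule out a port-level problem.

```python
import socket


def can_connect(host, port, timeout=3.0):
    """Return True if a TCP connection to (host, port) opens within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers connection refused, timeouts, and name resolution failures.
        return False


def probe_peers(peers, ports=(2888, 3888), timeout=3.0):
    """Check every peer on every port; returns {(host, port): bool}."""
    return {
        (host, port): can_connect(host, port, timeout)
        for host in peers
        for port in ports
    }


# Example usage (hostnames are placeholders, not from the thread):
#   for (host, port), ok in probe_peers(["zk1", "zk3"]).items():
#       print(host, port, "open" if ok else "unreachable")
```

This only tests that the TCP handshake completes. It says nothing about whether data sent afterwards is acknowledged, which is what the tcpdump output below was looking at.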

On Mon, Apr 11, 2016 at 1:35 PM, s influxdb <elastic.l.k@gmail.com> wrote:

> reboot of the server didn't help
>
> On Thu, Apr 7, 2016 at 6:50 PM, s influxdb <elastic.l.k@gmail.com> wrote:
>
>> I ran tcpdump on all the three nodes.
>> It looks like for every [PSH, ACK] there is a missing [ACK] from
>> the other nodes to this 2nd node on port 3888.
>>
>>
>> On Thu, Apr 7, 2016 at 1:29 PM, s influxdb <elastic.l.k@gmail.com> wrote:
>>
>>> Thanks Flavio for your quick replies.
>>> The zookeeper version is 3.4.6
>>>
>>>
>>>
>>> On Thu, Apr 7, 2016 at 1:23 PM, Flavio P JUNQUEIRA <fpj@apache.org>
>>> wrote:
>>>
>>>> You need to determine why it is not receiving notification messages. From
>>>> the information you've given, it doesn't look like a zookeeper code issue.
>>>>
>>>> BTW, which version are you using?
>>>>
>>>> -Flavio
>>>> On 7 Apr 2016 21:20, "s influxdb" <elastic.l.k@gmail.com> wrote:
>>>>
>>>> > Nothing on the iptables firewall.
>>>> >
>>>> > What options do I have to reconnect this node to the cluster?
>>>> >
>>>> >
>>>> > On Thu, Apr 7, 2016 at 10:14 AM, s influxdb <elastic.l.k@gmail.com> wrote:
>>>> >
>>>> > > telnet works on 2888 and 3888 to the other nodes. Now I see
>>>> > > java.net.SocketTimeoutException: connect timed out messages in the logs
>>>> > > for node 2
>>>> > >
>>>> > > On Thu, Apr 7, 2016 at 3:05 AM, Flavio Junqueira <fpj@apache.org> wrote:
>>>> > >
>>>> > >> I only see notifications from the node to itself. It says that it is
>>>> > >> connected to 1, but it doesn't seem to be receiving the notification from
>>>> > >> 1. It also doesn't seem to be receiving the connection request from 3.
>>>> > >>
>>>> > >> Last time I've seen something like this was due to iptables rules, but if
>>>> > >> it was working before and no configuration has changed, then I don't know
>>>> > >> what it could be.
>>>> > >>
>>>> > >> -Flavio
>>>> > >>
>>>> > >> > On 07 Apr 2016, at 05:43, s influxdb <elastic.l.k@gmail.com> wrote:
>>>> > >> >
>>>> > >> > this is the pastie
>>>> > >> > http://pastie.org/10788301
>>>> > >> >
>>>> > >> > On Wed, Apr 6, 2016 at 9:41 PM, s influxdb <elastic.l.k@gmail.com> wrote:
>>>> > >> >
>>>> > >> >> One of our nodes hit an OOM (java.lang.OutOfMemoryError: unable to
>>>> > >> >> create new native thread) and then became unresponsive.
>>>> > >> >>
>>>> > >> >> We tried to add the node back to the cluster, but with no luck.
>>>> > >> >>
>>>> > >> >> It doesn't seem to "Receive any notification" messages from the other
>>>> > >> >> nodes.
>>>> > >> >> It keeps "Sending notifications" in a loop.
>>>> > >> >>
>>>> > >> >> Please see attached the logs of the node that is out of rotation.
>>>> > >> >>
>>>> > >> >> Any inputs appreciated.
>>>> > >> >>
>>>> > >> >> Thanks
>>>> > >> >>
>>>> > >>
>>>> > >>
>>>> > >
>>>> >
>>>>
>>>
>>>
>>
>
