cassandra-user mailing list archives

From David McNelis <dmcne...@agentisenergy.com>
Subject Re: Restarting cluster
Date Fri, 24 Jun 2011 14:50:26 GMT
It was port 7000 that was my issue.  I had assumed all traffic went over
9160, and hadn't made sure that port 7000 was open between the nodes.
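
For anyone who finds this thread with the same symptom, the check Jonathan
suggested with netcat can also be sketched in Python. This is only an
illustration, not something from the original exchange: it assumes the
default Cassandra 0.8 ports (storage_port 7000 for inter-node traffic
including gossip, rpc_port 9160 for Thrift clients such as the cli), and
the names port_open and check_node are made up for this example.

```python
import socket

# Assumed defaults from cassandra.yaml; adjust if your config differs.
STORAGE_PORT = 7000  # storage_port: inter-node traffic, including gossip
RPC_PORT = 9160      # rpc_port: Thrift clients such as cassandra-cli


def port_open(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name-resolution failures.
        return False


def check_node(host, ports=(STORAGE_PORT, RPC_PORT), timeout=3.0):
    """Map each port to whether it is reachable on host from this machine."""
    return {port: port_open(host, port, timeout) for port in ports}
```

Run from machine Y against machine X's address, a result like
{7000: False, 9160: True} from check_node would match what happened here:
the cli can connect, but the nodes cannot gossip, so each node reports the
other as Down.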

Thanks Sasha and Jonathan.

On Fri, Jun 24, 2011 at 8:42 AM, Jonathan Ellis <jbellis@gmail.com> wrote:

> Did you try netcat to verify that you can get to the internal port on
> machine X from machine Y?
>
> On Fri, Jun 24, 2011 at 8:20 AM, David McNelis
> <dmcnelis@agentisenergy.com> wrote:
> > Running on CentOS.
> > We had a massive power failure and our UPS wasn't up to 48 hours without
> > power...
> > In this situation the IP addresses have all stayed the same.  I can still
> > connect to the "other" node from cli, so I don't think it's an issue where
> > the iptables settings weren't saved and started blocking traffic.
> > In terms of the log files, the only related line from the log files is
> > saying:
> >  INFO [main] 2011-06-24 07:48:44,750 StorageService.java (line 382) Loading persisted ring state
> >  INFO [main] 2011-06-24 07:48:44,757 StorageService.java (line 418) Starting up server gossip
> > When I turn on debugging and restart the non-seed node I get this line:
> > DEBUG [WRITE-/192.168.80.XXX] 2011-06-24 08:04:48,798 OutboundTcpConnection.java (line 161) attempting to connect to /192.168.80.XXX
> > But no errors after it.
> >
> > On Fri, Jun 24, 2011 at 7:58 AM, Sasha Dolgy <sdolgy@gmail.com> wrote:
> >>
> >> Normally, no.  What you've done is fine.  What is the environment?
> >>
> >> On Amazon EC2 for example, the instance could have crashed, a new one
> >> is brought online and has a different internal IP ...
> >>
> >> In cassandra/logs/system.log, are there any messages on the 2nd
> >> node about how it relates to the seed node?
> >>
> >> On Fri, Jun 24, 2011 at 2:49 PM, David McNelis
> >> <dmcnelis@agentisenergy.com> wrote:
> >> > I am running 0.8.0 on CentOS.  I have 2 nodes in my cluster: one is a
> >> > seed, the other is autobootstrapped.
> >> > After an unexpected shutdown of both of the physical machines I am
> >> > trying to restart the cluster.  I first started the seed node; it went
> >> > through the normal startup process and finished without error.  Once
> >> > that was complete I started the second node, again with no errors in
> >> > the log as it was starting; it started the gossip server, etc.
> >> > However, when I look at the ring using nodetool, both machines show
> >> > their own status as Up, then show the other machine as Down with a
> >> > state of Normal and a load of ?.  I have tried restarting the
> >> > individual nodes in different orders and waiting a while after
> >> > restarting a node, but the 'other' node always has a status of "Down".
> >> > nodetool repair [keyspace] did not make any difference either, and
> >> > nodetool join just told me that the nodes were already a part of the
> >> > ring.
> >> > I can't imagine this is how it *should* be behaving... is there a
> >> > piece I'm missing in terms of getting one node to recognize the other
> >> > as being Up?
> >
> >
> >
> > --
> > David McNelis
> > Lead Software Engineer
> > Agentis Energy
> > www.agentisenergy.com
> > o: 630.359.6395
> > c: 219.384.5143
> > A Smart Grid technology company focused on helping consumers of energy
> > control an often under-managed resource.
> >
> >
>
>
>
> --
> Jonathan Ellis
> Project Chair, Apache Cassandra
> co-founder of DataStax, the source for professional Cassandra support
> http://www.datastax.com
>



-- 
*David McNelis*
Lead Software Engineer
Agentis Energy
www.agentisenergy.com
o: 630.359.6395
c: 219.384.5143

*A Smart Grid technology company focused on helping consumers of energy
control an often under-managed resource.*
