incubator-cassandra-user mailing list archives

From Ran Tavory <ran...@gmail.com>
Subject Re: nodetool cleanup isn't cleaning up?
Date Tue, 01 Jun 2010 18:56:48 GMT
I'm using RackAwareStrategy. But it still doesn't make sense, I think...
let's see what I missed...
According to http://wiki.apache.org/cassandra/Operations


   - RackAwareStrategy: replica 2 is placed in the first node along the ring
     that belongs in *another* data center than the first; the remaining N-2
     replicas, if any, are placed on the first nodes along the ring in the
     *same* rack as the first
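
To make sure I read that rule right, here's a rough sketch of how I picture
the placement (made-up function and DC names, obviously not the actual
Cassandra code, and I'm treating rack and DC as the same thing here):

# A rough model of the RackAwareStrategy rule quoted above, as I understand
# it. Only a sketch for my own reasoning, not Cassandra's implementation.
def rack_aware_replicas(ring, primary_index, rf):
    """ring: list of (token, node, dc) tuples sorted by token."""
    n = len(ring)
    primary = ring[primary_index]
    replicas = [primary]
    # replica 2: the first node along the ring that is in *another* DC
    for step in range(1, n):
        cand = ring[(primary_index + step) % n]
        if cand[2] != primary[2]:
            replicas.append(cand)
            break
    # remaining N-2 replicas: first nodes along the ring in the *same*
    # DC as the first (using DC as a stand-in for rack)
    for step in range(1, n):
        if len(replicas) >= rf:
            break
        cand = ring[(primary_index + step) % n]
        if cand[2] == primary[2] and cand not in replicas:
            replicas.append(cand)
    return replicas[:rf]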



192.168.252.124  Up   803.33 MB   56713727820156410577229101238628035242    |<--|
192.168.252.99   Up   352.85 MB   56713727820156410577229101238628035243    |   ^
192.168.252.125  Up   134.24 MB   85070591730234615865843651857942052863    v   |
192.168.254.57   Up   676.41 MB   113427455640312821154458202477256070485   |   ^
192.168.254.58   Up    99.74 MB   141784319550391026443072753096570088106   v   |
192.168.254.59   Up    99.94 MB   170141183460469231731687303715884105727   |-->|

Alright, so I made a mistake and didn't use the alternate-datacenter
suggestion on the page, so the first node of every DC is overloaded with
replicas. However, the current situation still doesn't make sense to me.
.252.124 will be overloaded b/c it has the first token in the .252 DC.
.254.57 will also be overloaded since it has the first token in the .254 DC.
But for which node does .252.99 hold replicas? It's not the first in
the DC, and its token is just one single token more than its predecessor's
(which is in the same DC).
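
If I plug the ring above into that picture (treating the .252.* boxes as one
DC and the .254.* boxes as the other -- made-up DC names), with RF=2 the extra
copy of every .252 range should land on .254.57 and of every .254 range on
.252.124, and .99 never shows up as a replica target:

# Same toy model as the sketch above: for RF=2, replica 2 is simply the
# first node along the ring in *another* DC. Not Cassandra's code.
ring = [
    (56713727820156410577229101238628035242,  "192.168.252.124", "dc252"),
    (56713727820156410577229101238628035243,  "192.168.252.99",  "dc252"),
    (85070591730234615865843651857942052863,  "192.168.252.125", "dc252"),
    (113427455640312821154458202477256070485, "192.168.254.57",  "dc254"),
    (141784319550391026443072753096570088106, "192.168.254.58",  "dc254"),
    (170141183460469231731687303715884105727, "192.168.254.59",  "dc254"),
]
for i, (token, node, dc) in enumerate(ring):
    for step in range(1, len(ring)):
        cand = ring[(i + step) % len(ring)]
        if cand[2] != dc:                       # first node in another DC
            print(node, "-> extra copy on", cand[1])
            break
# .254.57 comes out for every .252 node and .252.124 for every .254 node,
# which matches "first node of every DC is overloaded" -- but not .99's 352 MB.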

On Tue, Jun 1, 2010 at 4:00 PM, Jonathan Ellis <jbellis@gmail.com> wrote:

> I'm saying that .99 is getting a copy of all the data for which .124
> is the primary.  (If you are using RackUnawarePartitioner.  If you are
> using RackAware it is some other node.)
>
> On Tue, Jun 1, 2010 at 1:25 AM, Ran Tavory <rantav@gmail.com> wrote:
> > ok, let me try and translate your answer ;)
> > Are you saying that the data that was left on the node is
> > non-primary-replicas of rows from the time before the move?
> > So this implies that when a node moves in the ring, it will affect
> > distribution of:
> > - new keys
> > - old keys primary node
> > -- but will not affect distribution of old keys non-primary replicas.
> > If so, still I don't understand something... I would expect even the
> > non-primary replicas of keys to be moved since if they don't, how would
> > they be found? I mean upon reads the serving node should not care about
> > whether the row is new or old, it should have a consistent and global
> > mapping of tokens. So I guess this ruins my theory...
> > What did you mean then? Is this deletions of non-primary replicated data?
> > How does the replication factor affect the load on the moved host then?
> >
> > On Tue, Jun 1, 2010 at 1:19 AM, Jonathan Ellis <jbellis@gmail.com>
> wrote:
> >>
> >> well, there you are then.
> >>
> >> On Mon, May 31, 2010 at 2:34 PM, Ran Tavory <rantav@gmail.com> wrote:
> >> > yes, replication factor = 2
> >> >
> >> > On Mon, May 31, 2010 at 10:07 PM, Jonathan Ellis <jbellis@gmail.com>
> >> > wrote:
> >> >>
> >> >> you have replication factor > 1 ?
> >> >>
> >> >> On Mon, May 31, 2010 at 7:23 AM, Ran Tavory <rantav@gmail.com>
> wrote:
> >> >> > I hope I understand nodetool cleanup correctly - it should clean up
> >> >> > all data that does not (currently) belong to this node. If so, I
> >> >> > think it might not be working correctly.
> >> >> > Look at nodes 192.168.252.124 and 192.168.252.99 below
> >> >> > 192.168.252.99   Up   279.35 MB   3544607988759775661076818827414252202     |<--|
> >> >> > 192.168.252.124  Up   167.23 MB   56713727820156410577229101238628035242    |   ^
> >> >> > 192.168.252.125  Up    82.91 MB   85070591730234615865843651857942052863    v   |
> >> >> > 192.168.254.57   Up   366.6 MB    113427455640312821154458202477256070485   |   ^
> >> >> > 192.168.254.58   Up    88.44 MB   141784319550391026443072753096570088106   v   |
> >> >> > 192.168.254.59   Up    88.45 MB   170141183460469231731687303715884105727   |-->|
> >> >> > I wanted 124 to take all the load from 99. So I issued a move
> >> >> > command.
> >> >> > $ nodetool -h cass99 -p 9004 move 56713727820156410577229101238628035243
> >> >> >
> >> >> > This command tells 99 to take the space b/w
> >> >> > (56713727820156410577229101238628035242, 56713727820156410577229101238628035243]
> >> >> > which is basically just one item in the token space, almost
> >> >> > nothing... I wanted it to be very slim (just playing around).
> >> >> > So, next I get this:
> >> >> > 192.168.252.124  Up   803.33 MB   56713727820156410577229101238628035242    |<--|
> >> >> > 192.168.252.99   Up   352.85 MB   56713727820156410577229101238628035243    |   ^
> >> >> > 192.168.252.125  Up   134.24 MB   85070591730234615865843651857942052863    v   |
> >> >> > 192.168.254.57   Up   676.41 MB   113427455640312821154458202477256070485   |   ^
> >> >> > 192.168.254.58   Up    99.74 MB   141784319550391026443072753096570088106   v   |
> >> >> > 192.168.254.59   Up    99.94 MB   170141183460469231731687303715884105727   |-->|
> >> >> > The tokens are correct, but it seems that 99 still has a lot of data.
> >> >> > Why?
> >> >> > OK, that might be b/c it didn't delete its moved data.
> >> >> > So next I issued a nodetool cleanup, which should have taken care of
> >> >> > that.
> >> >> > Only that it didn't, the node 99 still has 352 MB of data. Why?
> >> >> > So, you know what, I waited for 1h. Still no good, data wasn't
> >> >> > cleaned up.
> >> >> > I restarted the server. Still, data wasn't cleaned up... I issued a
> >> >> > cleanup again... still no good... what's up with this node?
> >> >> >
> >> >> >
> >> >>
> >> >>
> >> >>
> >> >> --
> >> >> Jonathan Ellis
> >> >> Project Chair, Apache Cassandra
> >> >> co-founder of Riptano, the source for professional Cassandra support
> >> >> http://riptano.com
> >> >
> >> >
> >>
> >>
> >>
> >> --
> >> Jonathan Ellis
> >> Project Chair, Apache Cassandra
> >> co-founder of Riptano, the source for professional Cassandra support
> >> http://riptano.com
> >
> >
>
>
>
> --
> Jonathan Ellis
> Project Chair, Apache Cassandra
> co-founder of Riptano, the source for professional Cassandra support
> http://riptano.com
>
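
PS: to double-check my own token math, here's the half-open (start, end]
range test I have in mind when I say above that .99's new range is "basically
just one item in the token space" -- again just my mental model, not
Cassandra's actual implementation:

# Half-open (start, end] membership test on the ring, with wrap-around.
def in_range(token, start, end):
    """True if token falls in the half-open ring range (start, end]."""
    if start < end:
        return start < token <= end
    return token > start or token <= end   # the range wraps past the top of the ring

START = 56713727820156410577229101238628035242   # .124's token
END   = 56713727820156410577229101238628035243   # .99's token after the move
print(in_range(END, START, END))    # True  -- the one token .99 now owns
print(in_range(START, START, END))  # False -- still belongs to .124
print(END - START)                  # 1 -- the range is a single token wide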
