incubator-cassandra-user mailing list archives

From aaron morton <aa...@thelastpickle.com>
Subject Re: Lost data after expanding cluster c* 1.2.3-1
Date Tue, 16 Apr 2013 21:42:42 GMT
Sorry, can you repost the details of that issue, including the CL (consistency level) you are using?

Aaron

-----------------
Aaron Morton
Freelance Cassandra Consultant
New Zealand

@aaronmorton
http://www.thelastpickle.com

On 17/04/2013, at 12:57 AM, Kais Ahmed <kais@neteck-fr.com> wrote:

> Thanks aaron,
> 
> I feel that rebuilding the indexes went well, but the result of my query (SELECT * FROM userdata WHERE login='kais';) is still empty.
> 
> INFO [Creating index: userdata.userdata_login_idx] 2013-03-30 01:16:33,110 SecondaryIndex.java (line 175) Submitting index build of userdata.userdata_login_idx
> INFO [Creating index: userdata.userdata_login_idx] 2013-03-30 01:34:11,667 SecondaryIndex.java (line 202) Index build of userdata.userdata_login_idx complete
> 
> Thanks,
> 
> 
> 2013/4/9 aaron morton <aaron@thelastpickle.com>
> Look in the logs for messages from the SecondaryIndexManager 
> 
> starts with "Submitting index build of"
> ends with "Index build of"
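
The two log markers above can be checked mechanically. What follows is only a minimal sketch: it assumes the log lines have the same layout as the two SecondaryIndex.java lines posted elsewhere in this thread, and the function name index_build_status is hypothetical, not part of Cassandra.

```python
import re

# Marker phrases quoted in this thread. The rest of the line layout is an
# assumption based on the two sample log lines posted in the thread.
START = re.compile(r"Submitting index build of (\S+)")
END = re.compile(r"Index build of (\S+) complete")


def index_build_status(lines):
    """Map each index name to 'running' or 'complete' based on log lines."""
    status = {}
    for line in lines:
        started = START.search(line)
        if started:
            status[started.group(1)] = "running"
            continue
        finished = END.search(line)
        if finished:
            status[finished.group(1)] = "complete"
    return status


sample = [
    "INFO [Creating index: userdata.userdata_login_idx] 2013-03-30 01:16:33,110 "
    "SecondaryIndex.java (line 175) Submitting index build of userdata.userdata_login_idx",
    "INFO [Creating index: userdata.userdata_login_idx] 2013-03-30 01:34:11,667 "
    "SecondaryIndex.java (line 202) Index build of userdata.userdata_login_idx complete",
]
print(index_build_status(sample))
```

An index that shows as 'running' with no later 'complete' line would suggest the build is still in progress or was interrupted.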
> 
> Cheers
> 
> -----------------
> Aaron Morton
> Freelance Cassandra Consultant
> New Zealand
> 
> @aaronmorton
> http://www.thelastpickle.com
> 
> On 7/04/2013, at 12:55 AM, Kais Ahmed <kais@neteck-fr.com> wrote:
> 
>> hi aaron,
>> 
>> nodetool compactionstats on all nodes returns 1 pending task:
>> 
>> ubuntu@app:~$ nodetool compactionstats host
>> pending tasks: 1
>> Active compaction remaining time :        n/a
>> 
>> The command nodetool rebuild_index was launched several days ago.
>> 
>> 2013/4/5 aaron morton <aaron@thelastpickle.com>
>>> but nothing's happening, how can i monitor the progress? and how can i know when it's finished?
>> 
>> check nodetool compactionstats
>> 
>> Cheers
>> 
>> -----------------
>> Aaron Morton
>> Freelance Cassandra Consultant
>> New Zealand
>> 
>> @aaronmorton
>> http://www.thelastpickle.com
>> 
>> On 4/04/2013, at 2:51 PM, Kais Ahmed <kais@neteck-fr.com> wrote:
>> 
>>> Hi aaron,
>>> 
>>> I ran the command "nodetool rebuild_index host keyspace cf" on all the nodes. In the log I see:
>>> 
>>> INFO [RMI TCP Connection(5422)-10.34.139.xxx] 2013-04-04 08:31:53,641 ColumnFamilyStore.java (line 558) User Requested secondary index re-build for ...
>>> 
>>> but nothing's happening, how can i monitor the progress? and how can i know when it's finished?
>>> 
>>> Thanks,
>>>  
>>> 
>>> 2013/4/2 aaron morton <aaron@thelastpickle.com>
>>>> The problem came from my not setting auto_bootstrap to true for the new nodes; it is not mentioned in this documentation (http://www.datastax.com/docs/1.2/install/expand_ami)
>>> auto_bootstrap defaults to True if not specified in the yaml. 
>>> 
>>>> Can I do that at any time, or only when the cluster is not under load?
>>> Not sure what the question is. 
>>> Both of those are online operations; you can run them while the node is processing requests.
>>>  
>>> Cheers
>>> 
>>> -----------------
>>> Aaron Morton
>>> Freelance Cassandra Consultant
>>> New Zealand
>>> 
>>> @aaronmorton
>>> http://www.thelastpickle.com
>>> 
>>> On 1/04/2013, at 9:26 PM, Kais Ahmed <kais@neteck-fr.com> wrote:
>>> 
>>>> > At that point the errors started: we saw that members and other data were gone. nodetool status returned the following (the 3 new nodes were in red)
>>>> > What errors?
>>>> The errors were on my side, in the application; they were not Cassandra errors.
>>>> 
>>>> > For each of them I set seeds = A's IP, and started them at two-minute intervals.
>>>> > When I'm making changes I tend to change a single node first, confirm everything is OK and then do a bulk change.
>>>> Thank you for that advice.
>>>> 
>>>> >I'm not sure what or why it went wrong, but that should get you to a stable place. If you have any problems keep an eye on the logs for errors or warnings.
>>>> The problem came from my not setting auto_bootstrap to true for the new nodes; it is not mentioned in this documentation (http://www.datastax.com/docs/1.2/install/expand_ami)
>>>> 
>>>> >If you are using secondary indexes, use nodetool rebuild_index to rebuild those.
>>>> Can I do that at any time, or only when the cluster is not under load?
>>>> 
>>>> Thanks aaron,
>>>> 
>>>> 2013/4/1 aaron morton <aaron@thelastpickle.com>
>>>> Please do not rely on colour in your emails; the best way to get your emails accepted by the Apache mail servers is to use plain text.
>>>> 
>>>> > At that point the errors started: we saw that members and other data were gone. nodetool status returned the following (the 3 new nodes were in red)
>>>> What errors?
>>>> 
>>>> > For each of them I set seeds = A's IP, and started them at two-minute intervals.
>>>> When I'm making changes I tend to change a single node first, confirm everything is OK and then do a bulk change.
>>>> 
>>>> > Now the cluster seems to work normally, but I can't use the secondary index for the moment; the query answers are random.
>>>> Run nodetool repair -pr on each node, and let it finish before starting the next one.
>>>> If you are using secondary indexes, use nodetool rebuild_index to rebuild those.
>>>> Add one new node to the cluster and confirm everything is OK, then add the remaining ones.
>>>> 
>>>> >I'm not sure what or why it went wrong, but that should get you to a stable place. If you have any problems keep an eye on the logs for errors or warnings.
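
The repair advice above (one node at a time, waiting for each to finish before starting the next) could be scripted roughly as follows. This is a minimal sketch under stated assumptions: nodetool is on the PATH, `nodetool -h <host> repair -pr` blocks until the repair has finished, and the list of hosts is supplied by the operator. The function name repair_nodes_in_order and its run parameter are hypothetical.

```python
import subprocess


def repair_nodes_in_order(hosts, run=subprocess.run):
    """Run `nodetool repair -pr` against each host, strictly one at a time.

    Stops at the first failure so that the next node is not started while
    an earlier repair is in an unknown state.
    """
    completed = []
    for host in hosts:
        # -h selects the node to contact; -pr repairs only its primary range.
        cmd = ["nodetool", "-h", host, "repair", "-pr"]
        result = run(cmd)  # assumed to block until nodetool exits
        if result.returncode != 0:
            raise RuntimeError(f"repair failed on {host}; stopping")
        completed.append(host)
    return completed
```

It would be called with the addresses of every node in the ring, for example repair_nodes_in_order(["10.0.0.1", "10.0.0.2"]), where the addresses are placeholders. The run parameter exists only so the loop can be exercised without a live cluster.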
>>>> 
>>>> Cheers
>>>> 
>>>> -----------------
>>>> Aaron Morton
>>>> Freelance Cassandra Consultant
>>>> New Zealand
>>>> 
>>>> @aaronmorton
>>>> http://www.thelastpickle.com
>>>> 
>>>> On 31/03/2013, at 10:01 PM, Kais Ahmed <kais@neteck-fr.com> wrote:
>>>> 
>>>> > Hi aaron,
>>>> >
>>>> > Thanks for the reply, I will try to explain exactly what happened.
>>>> >
>>>> > I had a 4-node C* cluster [A,B,C,D] (version 1.2.3-1), started with the EC2 AMI (https://aws.amazon.com/amis/datastax-auto-clustering-ami-2-2) with
>>>> > this config: --clustername myDSCcluster --totalnodes 4 --version community
>>>> >
>>>> > Two days after this cluster went into production, I saw that it was overloaded, and I wanted to extend it by adding 3 more nodes.
>>>> >
>>>> > I created a new cluster with 3 C* nodes [D,E,F] (https://aws.amazon.com/amis/datastax-auto-clustering-ami-2-2)
>>>> >
>>>> > and followed the documentation (http://www.datastax.com/docs/1.2/install/expand_ami) for adding them to the ring.
>>>> > For each of them I set seeds = A's IP, and started them at two-minute intervals.
>>>> >
>>>> > At that point the errors started: we saw that members and other data were gone. nodetool status returned the following (the 3 new nodes were in red)
>>>> >
>>>> > Datacenter: eu-west
>>>> > ===================
>>>> > Status=Up/Down
>>>> > |/ State=Normal/Leaving/Joining/Moving
>>>> > --  Address        Load      Tokens  Owns   Host ID                               Rack
>>>> > UN  10.34.142.xxx  10.79 GB  256     15.4%  4e2e26b8-aa38-428c-a8f5-e86c13eb4442  1b
>>>> > UN  10.32.49.xxx   1.48 MB   256     13.7%  e86f67b6-d7cb-4b47-b090-3824a5887145  1b
>>>> > UN  10.33.206.xxx  2.19 MB   256     11.9%  92af17c3-954a-4511-bc90-29a9657623e4  1b
>>>> > UN  10.32.27.xxx   1.95 MB   256     14.9%  862e6b39-b380-40b4-9d61-d83cb8dacf9e  1b
>>>> > UN  10.34.139.xxx  11.67 GB  256     15.5%  0324e394-b65f-46c8-acb4-1e1f87600a2c  1b
>>>> > UN  10.34.147.xxx  11.18 GB  256     13.9%  cfc09822-5446-4565-a5f0-d25c917e2ce8  1b
>>>> > UN  10.33.193.xxx  10.83 GB  256     14.7%  59f440db-cd2d-4041-aab4-fc8e9518c954  1b
>>>> >
>>>> > I saw that the 3 nodes had joined the ring but had no data. I put the website in maintenance and launched a nodetool repair on
>>>> > the 3 new nodes; over 5 hours I watched in OpsCenter as the data streamed to the new nodes (very nice :))
>>>> >
>>>> > During this time, I wrote a script to check whether all members are present (relative to a copy of the members in MySQL).
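
The script itself was not posted. A check of this kind might look like the sketch below, which assumes the logins have already been fetched from both stores into plain Python lists; the function name missing_members and the sample values are hypothetical.

```python
def missing_members(reference_logins, cassandra_logins):
    """Return logins present in the reference copy but absent from Cassandra."""
    return sorted(set(reference_logins) - set(cassandra_logins))


reference = ["alice", "bob", "kais"]  # hypothetical rows from the MySQL copy
found = ["alice", "kais"]             # hypothetical rows read back from Cassandra
print(missing_members(reference, found))  # prints ['bob']
```

An empty result would mean every member in the reference copy was also found in Cassandra.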
>>>> >
>>>> > After that, the data streaming seemed to be finished, but I'm not sure, because nodetool compactionstats showed a pending task while nodetool netstats seemed to be OK.
>>>> >
>>>> > I ran my script to check the data, but members were still missing.
>>>> >
>>>> > I decided to roll back by running nodetool decommission on nodes D, E and F.
>>>> >
>>>> > I re-ran my script and all seemed to be OK, but the secondary index has strange behavior:
>>>> > sometimes the row is returned, sometimes there is no result.
>>>> >
>>>> > The user kais can be retrieved using his key with cassandra-cli, but if I use cqlsh:
>>>> >
>>>> > cqlsh:database> SELECT login FROM userdata where login='kais' ;
>>>> >
>>>> >  login
>>>> > ----------------
>>>> >  kais
>>>> >
>>>> > cqlsh:database> SELECT login FROM userdata where login='kais' ; //empty
>>>> > cqlsh:database> SELECT login FROM userdata where login='kais' ;
>>>> >
>>>> >  login
>>>> > ----------------
>>>> >  kais
>>>> >
>>>> > cqlsh:database> SELECT login FROM userdata where login='kais' ;
>>>> >
>>>> >  login
>>>> > ----------------
>>>> >  kais
>>>> >
>>>> > cqlsh:database> SELECT login FROM userdata where login='kais' ; //empty
>>>> > cqlsh:database> SELECT login FROM userdata where login='kais' ;
>>>> >
>>>> >  login
>>>> > ----------------
>>>> >  kais
>>>> >
>>>> > cqlsh:mydatabase> Tracing on;
>>>> > When tracing is activated I get this error, but not every time:
>>>> > cqlsh:mydatabase> SELECT * FROM userdata where login='kais' ;
>>>> > unsupported operand type(s) for /: 'NoneType' and 'float'
>>>> >
>>>> >
>>>> > NOTE: When the cluster contained 7 nodes, I saw that my table userdata (RF 3) on node D was replicated on E and F, which seems strange because those 3 nodes were not correctly filled.
>>>> >
>>>> > Now the cluster seems to work normally, but I can't use the secondary index for the moment; the query answers are random.
>>>> >
>>>> > Thanks a lot for any help,
>>>> > Kais
>>>> >
>>>> >
>>>> >
>>>> >
>>>> >
>>>> > 2013/3/31 aaron morton <aaron@thelastpickle.com>
>>>> > First thought is the new nodes were marked as seeds.
>>>> > Next thought is check the logs for errors.
>>>> >
>>>> > You can always run a nodetool repair if you are concerned data is not where you think it should be.
>>>> >
>>>> > Cheers
>>>> >
>>>> >
>>>> > -----------------
>>>> > Aaron Morton
>>>> > Freelance Cassandra Consultant
>>>> > New Zealand
>>>> >
>>>> > @aaronmorton
>>>> > http://www.thelastpickle.com
>>>> >
>>>> > On 29/03/2013, at 8:01 PM, Kais Ahmed <kais@neteck-fr.com> wrote:
>>>> >
>>>> >> Hi all,
>>>> >>
>>>> >> I followed this tutorial to expand a 4-node C* cluster (production) and added 3 new nodes.
>>>> >>
>>>> >> Datacenter: eu-west
>>>> >> ===================
>>>> >> Status=Up/Down
>>>> >> |/ State=Normal/Leaving/Joining/Moving
>>>> >> --  Address        Load      Tokens  Owns   Host ID                               Rack
>>>> >> UN  10.34.142.xxx  10.79 GB  256     15.4%  4e2e26b8-aa38-428c-a8f5-e86c13eb4442  1b
>>>> >> UN  10.32.49.xxx   1.48 MB   256     13.7%  e86f67b6-d7cb-4b47-b090-3824a5887145  1b
>>>> >> UN  10.33.206.xxx  2.19 MB   256     11.9%  92af17c3-954a-4511-bc90-29a9657623e4  1b
>>>> >> UN  10.32.27.xxx   1.95 MB   256     14.9%  862e6b39-b380-40b4-9d61-d83cb8dacf9e  1b
>>>> >> UN  10.34.139.xxx  11.67 GB  256     15.5%  0324e394-b65f-46c8-acb4-1e1f87600a2c  1b
>>>> >> UN  10.34.147.xxx  11.18 GB  256     13.9%  cfc09822-5446-4565-a5f0-d25c917e2ce8  1b
>>>> >> UN  10.33.193.xxx  10.83 GB  256     14.7%  59f440db-cd2d-4041-aab4-fc8e9518c954  1b
>>>> >>
>>>> >> The data are not streamed.
>>>> >>
>>>> >> Can anyone help me? Our web site is down.
>>>> >>
>>>> >> Thanks a lot,
>>>> >>
>>>> >>
>>>> >
>>>> >
>>>> 
>>>> 
>>> 
>>> 
>> 
>> 
> 
> 

