cassandra-user mailing list archives

From Rahul Reddy <rahulreddy1...@gmail.com>
Subject Re: Cassandra copy command
Date Wed, 21 Aug 2019 13:57:11 GMT
Thanks Jean,

I have dc1 and dc2 existing, and I added dc3 from dc1 and dc4 from dc2. If I
want to run repair on one node in dc3 against dc1 only, is that possible?
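
A minimal sketch of what such a restricted repair could look like, assuming
`nodetool repair` accepts the `-dc` (`--in-dc`) option to limit the repair to
the named datacenters, and that the datacenter of the node you run it on has
to be listed as well. The keyspace name `my_keyspace` and the helper
`build_repair_command` are hypothetical; please check `nodetool help repair`
on your version before relying on these flags.

```python
import subprocess


def build_repair_command(keyspace, datacenters, full=True):
    """Assemble a nodetool repair invocation limited to the given datacenters."""
    cmd = ["nodetool", "repair"]
    if full:
        cmd.append("-full")  # request a full rather than incremental repair
    for dc in datacenters:
        cmd += ["-dc", dc]  # one -dc flag per datacenter (assumed option)
    cmd.append(keyspace)  # hypothetical keyspace name supplied by the caller
    return cmd


def run_repair(keyspace, datacenters):
    """Run the repair on the local node; needs nodetool on the PATH."""
    return subprocess.run(build_repair_command(keyspace, datacenters), check=True)


if __name__ == "__main__":
    # Dry run: print the command instead of executing it.
    print(" ".join(build_repair_command("my_keyspace", ["dc1", "dc3"])))
```

Run on the dc3 node, the dry run only prints the command line, so it can be
reviewed before calling `run_repair`.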

On Wed, Aug 21, 2019, 8:11 AM Jean Carlo <jean.jeancarl48@gmail.com> wrote:

> Hello Rahul,
>
> To ensure the consistency among the DCs, it is enough to run a repair
> command.
>
> You can do it using http://cassandra-reaper.io/
> or by running the command *nodetool repair* with the respective options on
> every node.
>
> You do not need to count the rows in every DC to ensure Cassandra is in sync
> among the DCs after you have run the repair. But if you still want to do it,
> use Spark for it.
>
> Jean Carlo
>
> "The best way to predict the future is to invent it" Alan Kay
>
>
> On Wed, Aug 21, 2019 at 1:51 PM Rahul Reddy <rahulreddy1234@gmail.com>
> wrote:
>
>> Yep, I did run rebuild on each new node.
>>
>> On Wed, Aug 21, 2019, 7:25 AM Stefan Miklosovic <
>> stefan.miklosovic@instaclustr.com> wrote:
>>
>>> Hi Rahul,
>>>
>>> How did you add dc3 to the cluster? The rule of thumb here is to run
>>> rebuild on each new node, for example as described here:
>>>
>>> https://docs.datastax.com/en/archived/cassandra/3.0/cassandra/operations/opsAddDCToCluster.html
>>>
>>> On Wed, 21 Aug 2019 at 12:57, Rahul Reddy <rahulreddy1234@gmail.com>
>>> wrote:
>>> >
>>> > Hi Stefan,
>>> >
>>> > I'm adding a new DC3 to an existing cluster and see discrepancies of a
>>> > couple of million in nodetool cfstats in the new DC.
>>> >
>>> > My table size is 50 GB.
>>> > I'm trying to copy the entire table:
>>> >
>>> > COPY table TO 'full_tablr.csv' WITH DELIMITER = ',';
>>> >
>>> > If I run the above command from dc3, does it get the data only from dc3?
>>> >
>>> >
>>> >
>>> > On Wed, Aug 21, 2019, 6:46 AM Stefan Miklosovic <
>>> stefan.miklosovic@instaclustr.com> wrote:
>>> >>
>>> >> Hi Rahul,
>>> >>
>>> >> What is your motivation behind this? Why do you want to make sure the
>>> >> count is the same? What is the purpose of that? All you should care
>>> >> about is that Cassandra returns the right results. It was designed from
>>> >> the ground up to do that for you, so you should not worry too much
>>> >> about such discrepancies; in general they will always be there.
>>> >> The important fact is that once queried, you can rest assured the data
>>> >> is returned (and consequently repaired if it does not match) as it
>>> >> should be.
>>> >>
>>> >> Which copy command are you talking about precisely, and why can't you
>>> >> just use repair?
>>> >>
>>> >> On Wed, 21 Aug 2019 at 12:14, Rahul Reddy <rahulreddy1234@gmail.com>
>>> wrote:
>>> >> >
>>> >> > Hello,
>>> >> >
>>> >> > I have 3 datacenters and want to make sure the record count is the
>>> >> > same in all DCs. If I run the copy command on node1 in dc1, does it
>>> >> > get the data from dc1 only? In nodetool cfstats I'm seeing
>>> >> > discrepancies in the partition count; is that because we didn't run
>>> >> > cleanup after adding a few nodes and removing them? To rule out any
>>> >> > discrepancies I want to run the copy command from all 3 DCs and
>>> >> > compare. Please let me know whether the copy command extracts data
>>> >> > only from the DC I ran it from.
>>> >>
>>> >> ---------------------------------------------------------------------
>>> >> To unsubscribe, e-mail: user-unsubscribe@cassandra.apache.org
>>> >> For additional commands, e-mail: user-help@cassandra.apache.org
>>> >>
>>>
>>>
>>>
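
The comparison proposed in the original question (export the table from each
DC with COPY, then compare the files) could be sketched roughly as below. This
only compares two CSV files that already exist; it says nothing about which
replicas COPY actually read from, which is the open question in the thread.
The file names, the helper names, and the assumption that the first
`key_columns` fields of each row hold the primary key are all hypothetical.

```python
import csv


def load_rows(path, key_columns=1, delimiter=","):
    """Map each row's key (its first key_columns fields) to the full row."""
    rows = {}
    with open(path, newline="") as fh:
        for row in csv.reader(fh, delimiter=delimiter):
            if row:  # skip empty lines
                rows[tuple(row[:key_columns])] = tuple(row)
    return rows


def compare_exports(path_a, path_b, key_columns=1):
    """Return keys only in A, keys only in B, and keys whose rows differ."""
    a = load_rows(path_a, key_columns)
    b = load_rows(path_b, key_columns)
    only_a = sorted(a.keys() - b.keys())
    only_b = sorted(b.keys() - a.keys())
    differing = sorted(k for k in a.keys() & b.keys() if a[k] != b[k])
    return only_a, only_b, differing
```

A call such as `compare_exports("dc1.csv", "dc3.csv")` returns three lists of
keys. Both files are held in memory, so for a 50 GB table this is only
practical on a sample or after splitting the export.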
