cassandra-user mailing list archives

From Ngoc Minh VO <ngocminh...@bnpparibas.com>
Subject RE: list<T> data value multiplied x2 in multi-datacenter environment
Date Wed, 25 Nov 2015 16:43:36 GMT
No. We do not use update.
All inserts are idempotent and there is no read-before-write query.

On the corrupted data row, we have verified that the data was written only once.

Thanks for your answer!

From: Laing, Michael [mailto:michael.laing@nytimes.com]
Sent: Wednesday, 25 November 2015 15:39
To: user@cassandra.apache.org
Subject: Re: list<T> data value multiplied x2 in multi-datacenter environment

You don't have any syntax in your application anywhere such as:

UPDATE data SET field5 = field5 + [ 1,2,3 ] WHERE field1=<x>...;

Just a quick idempotency check :)
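
A minimal sketch of why this check matters, assuming a toy in-memory model of one list column rather than real driver code. The helper names `overwrite`, `append` and `apply_twice` are hypothetical. Replaying an overwrite leaves the list unchanged, while replaying an append such as `field5 = field5 + [ 1,2,3 ]` doubles it:

```python
# Toy model of a single list<double> column. This is not Cassandra or
# driver code; it only illustrates the intended semantics of the two writes.

def overwrite(column, values):
    """Models a full-list write, e.g. SET field5 = [1, 2, 3]."""
    return list(values)


def append(column, values):
    """Models an append, e.g. SET field5 = field5 + [1, 2, 3]."""
    return list(column) + list(values)


def apply_twice(write, values):
    """Apply the same write twice, as a retry or replay would."""
    column = []
    for _ in range(2):
        column = write(column, values)
    return column


overwritten = apply_twice(overwrite, [1.0, 2.0, 3.0])
appended = apply_twice(append, [1.0, 2.0, 3.0])

print(overwritten)  # [1.0, 2.0, 3.0]
print(appended)     # [1.0, 2.0, 3.0, 1.0, 2.0, 3.0]
```

The model only shows what the statements are meant to do; it says nothing about what the cluster actually did with them.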

On Wed, Nov 25, 2015 at 9:16 AM, Jack Krupansky <jack.krupansky@gmail.com> wrote:
Is the data corrupted exactly the same way on all three nodes and in both data centers, or
just on one or two nodes or in only one data center?

Are both columns doubled in the same row, or only one of them in a particular row?

Does sound like a bug though, worthy of a Jira ticket.

-- Jack Krupansky

On Wed, Nov 25, 2015 at 4:05 AM, Ngoc Minh VO <ngocminh.vo@bnpparibas.com> wrote:
Hello all,

We have an issue in our Production environment that we cannot reproduce in our Test environment:
a list<T> value (T = double or text) is randomly “multiplied” by 2 (i.e. value sent
to C* = [a, b, c], value stored in C* = [a, b, c, a, b, c]).
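
One possible way to look for affected rows is to test whether a stored list is exactly its first half repeated. The sketch below assumes the list values have already been read into Python lists. `is_doubled` is a hypothetical helper, not part of Cassandra or the DataStax driver:

```python
def is_doubled(values):
    """Return True if `values` is some non-empty list repeated exactly twice."""
    n = len(values)
    # An empty or odd-length list cannot be two copies of the same list.
    if n == 0 or n % 2 != 0:
        return False
    half = n // 2
    return values[:half] == values[half:]


print(is_doubled(["a", "b", "c", "a", "b", "c"]))  # True
print(is_doubled(["a", "b", "c"]))                 # False
```

A legitimate list such as [1, 1] also matches this pattern, so a hit is only a candidate to compare against the value that was originally sent.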

I know it sounds weird, but we just want to know whether this is a known issue (we found nothing
with Google). We are working on a small dataset to narrow down the issue with log data, and
may create a ticket for the DataStax Java Driver or Cassandra teams.

Cassandra v2.0.14
DataStax Java Driver v2.1.7.1
OS RHEL6
Prod Cluster topology = 16 nodes over 2 datacenters (RF = 3 per DC)
UAT Cluster topology = 6 nodes on 1 datacenter (RF = 3)

The only difference between the Prod and UAT clusters is the multi-datacenter mode on Prod.
We never insert the same data twice into the same column of any given row. All inserts/updates
are idempotent!

Data table:
CREATE TABLE data (
    field1 text,
    field2 int,
    field3 text,
    field4 double,
    field5 list<double>, -- randomly corrupted: contains [1, 2, 3, 1, 2, 3] instead of [1, 2, 3]
    field6 text,
    field7 list<text>,   -- randomly corrupted: contains [a, b, c, a, b, c] instead of [a, b, c]
    PRIMARY KEY ((field1, field2), field3)
) WITH compaction = { 'class' : 'LeveledCompactionStrategy' };

Thanks in advance for your help.
Best regards,
Minh


