cassandra-user mailing list archives

From Jean Tremblay
Subject Re: Missing data
Date Mon, 15 Jun 2015 16:50:20 GMT
Dear all,

I identified a bit more closely the root cause of my missing data.

The problem is occurring when I use


on my client against Cassandra 2.1.6.

I did not have the problem when I was using the driver 2.1.4 with C* 2.1.4.
Interestingly enough I don’t have the problem with the driver 2.1.4 with C* 2.1.6.  !!!!!!

So as far as I can locate the problem, I would say that the version 2.1.6 of the driver is
not working properly and is losing some of my records!!!


As far as my tombstones are concerned I don’t understand their origin.
I removed all locations in my code where I delete items, and I do not use TTL anywhere (I
don’t need this feature in my project).

And yet I have many tombstones building up.

Is there another origin for tombstones besides TTL and deleting items? Could the compaction
of LeveledCompactionStrategy be the origin of them?

@Carlos thanks for your guidance.

Kind regards


On 15 Jun 2015, at 11:17, Carlos Rolo wrote:

Hi Jean,

The problem of that Warning is that you are reading too many tombstones per request.

If you do have tombstones without doing DELETE, it is because you probably TTL'ed the data when
inserting (By mistake? Or did you set default_time_to_live in your table?). You can use nodetool
cfstats to see how many tombstones per read slice you have. This is, probably, also the cause
of your missing data. Data was tombstoned, so it is not available.


Carlos Juzarte Rolo
Cassandra Consultant

Pythian - Love your data

rolo@pythian | Twitter: cjrolo
Mobile: +31 6 159 61 814 | Tel: +1 613 565 8696 x1649

On Mon, Jun 15, 2015 at 10:54 AM, Jean Tremblay wrote:

I have reloaded the data in my cluster of 3 nodes RF: 2.
I have loaded about 2 billion rows in one table.
I use LeveledCompactionStrategy on my table.
I use version 2.1.6.
I use the default cassandra.yaml; only the IP address for seeds and the throughput have been changed.

I loaded my data with simple insert statements. This took a bit more than one day to load
the data… and one more day to compact the data on all nodes.
For me this is quite acceptable since I should not be doing this again.
I have done this with previous versions like 2.1.3 and others and I basically had absolutely
no problems.

Now I have read the log files on the client side; there I see no warnings and no errors.
On the node side I see many WARNINGs, all related to tombstones, but there are no errors.

My problem is that I see *many missing records* in the DB, and I have never observed
this with previous versions.

1) Is this a known problem?
2) Do you have any idea how I could track down this problem?
3) What is the meaning of this WARNING (the only type of ERROR | WARN  I could find)?

WARN  [SharedPool-Worker-2] 2015-06-15 10:12:00,866 - Read 2990
live and 16016 tombstone cells in gttdata.alltrades_co_rep_pcode for key: D:07 (see tombstone_warn_threshold).
5000 columns were requested, slices=[388:201001-388:201412:!]

4) Is it possible to have Tombstone when we make no DELETE statements?
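
For anyone who wants to tally warnings like the one quoted in question 3 across a node's log, here is a minimal Python sketch that pulls the live and tombstone cell counts out of such a message. The function name parse_tombstone_warning and the regular expression are illustrative assumptions inferred only from the single sample line above; other Cassandra versions may word the message differently, so the pattern may need adjusting.

```python
import re

# Pattern for the tombstone warning text quoted above. The field layout is
# inferred from that one sample message, so treat it as an assumption.
TOMBSTONE_RE = re.compile(
    r"Read (?P<live>\d+)\s+live and (?P<tombstones>\d+) tombstone cells "
    r"in (?P<table>\S+) for key: (?P<key>\S+)"
)


def parse_tombstone_warning(text):
    """Return counts from one warning message, or None if it does not match."""
    # The archived message is wrapped over several lines, so collapse
    # all runs of whitespace before matching.
    flat = " ".join(text.split())
    match = TOMBSTONE_RE.search(flat)
    if match is None:
        return None
    live = int(match.group("live"))
    dead = int(match.group("tombstones"))
    return {
        "table": match.group("table"),
        "key": match.group("key"),
        "live": live,
        "tombstones": dead,
        # Share of the cells read that were tombstones.
        "tombstone_ratio": dead / (live + dead),
    }


sample = """WARN  [SharedPool-Worker-2] 2015-06-15 10:12:00,866 - Read 2990
live and 16016 tombstone cells in gttdata.alltrades_co_rep_pcode for key: D:07 (see tombstone_warn_threshold).
5000 columns were requested, slices=[388:201001-388:201412:!]"""

info = parse_tombstone_warning(sample)
print(info)
```

Run against the sample, this reports that well over half of the cells scanned for that key were tombstones, which is the condition the warning is flagging.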

I’m lost…

Thanks for your help.

