incubator-cassandra-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Jon Haddad <...@jonhaddad.com>
Subject Re: DELETE does not delete :)
Date Mon, 07 Oct 2013 15:15:30 GMT
I haven't used VMWare but it seems odd that it would lock up the ntp port.  try "ps aux | grep
ntp" to see if ntpd it's already running.

On Oct 7, 2013, at 12:23 AM, Alexander Shutyaev <shutyaev@gmail.com> wrote:

> Hi MichaƂ,
> 
> I didn't notice your message at first.. Well this seems like a real cause candidate..
I'll add an explicit consistency level QUORUM and see if that helps. Thanks
> 
> 
> 2013/10/7 Alexander Shutyaev <shutyaev@gmail.com>
> Hi Nick,
> 
> Thanks for the note! We have our cassanra instances installed on virtual hosts in VMWare
and the clock synchronization is handled by the latter, so I can't use ntpdate (says that
NTP socket is in use). Is there any way to check if the clocks are really synchronized? My
best attempt was using three shell windows with commands already typed thus requiring only
clicking on the window and hitting enter. The results varied by 100-200 msec which I guess
is just about the time I need to click and press enter :)
> 
> Thanks in advance,
> Alexander
> 
> 
> 2013/10/7 Nikolay Mihaylov <nmmm@nmmm.nu>
> Hi
> 
> my two cents - before doing anything else, make sure clocks are synchronized to the millisecond.
> ntp will do so.
> 
> Nick.
> 
> 
> On Mon, Oct 7, 2013 at 9:02 AM, Alexander Shutyaev <shutyaev@gmail.com> wrote:
> Hi all,
> 
> We have encountered the following problem with cassandra.
> 
> * We use cassandra v2.0.0 from Datastax community repo.
> 
> * We have 3 nodes in a cluster, all of them are seed providers.
> 
> * We have a single keyspace with replication factor = 3:
> 
> CREATE KEYSPACE bof WITH replication = {
>   'class': 'SimpleStrategy',
>   'replication_factor': '3'
> };
> 
> * We use Datastax Java CQL Driver v1.0.3 in our application.
> 
> * We have not modified any consistency settings in our app, so I assume we have the default
QUORUM (2 out of 3 in our case) consistency for reads and writes.
> 
> * We have 400+ tables which can be divided in two groups (main and uids). All tables
in a group have the same definition, they vary only by name. The sample definitions are:
> 
> CREATE TABLE bookingfile (
>   key text,
>   entity_created timestamp,
>   entity_createdby text,
>   entity_entitytype text,
>   entity_modified timestamp,
>   entity_modifiedby text,
>   entity_status text,
>   entity_uid text,
>   entity_updatepolicy text,
>   version_created timestamp,
>   version_createdby text,
>   version_data blob,
>   version_dataformat text,
>   version_datasource text,
>   version_modified timestamp,
>   version_modifiedby text,
>   version_uid text,
>   version_versionnotes text,
>   version_versionnumber int,
>   versionscount int,
>   PRIMARY KEY (key)
> ) WITH
>   bloom_filter_fp_chance=0.010000 AND
>   caching='KEYS_ONLY' AND
>   comment='' AND
>   dclocal_read_repair_chance=0.000000 AND
>   gc_grace_seconds=864000 AND
>   index_interval=128 AND
>   read_repair_chance=0.100000 AND
>   replicate_on_write='true' AND
>   populate_io_cache_on_flush='false' AND
>   default_time_to_live=0 AND
>   speculative_retry='NONE' AND
>   memtable_flush_period_in_ms=0 AND
>   compaction={'class': 'SizeTieredCompactionStrategy'} AND
>   compression={'sstable_compression': 'LZ4Compressor'};
> 
> CREATE TABLE bookingfile_uids (
>   date text,
>   timeanduid text,
>   deleted boolean,
>   PRIMARY KEY (date, timeanduid)
> ) WITH
>   bloom_filter_fp_chance=0.010000 AND
>   caching='KEYS_ONLY' AND
>   comment='' AND
>   dclocal_read_repair_chance=0.000000 AND
>   gc_grace_seconds=864000 AND
>   index_interval=128 AND
>   read_repair_chance=0.100000 AND
>   replicate_on_write='true' AND
>   populate_io_cache_on_flush='false' AND
>   default_time_to_live=0 AND
>   speculative_retry='NONE' AND
>   memtable_flush_period_in_ms=0 AND
>   compaction={'class': 'SizeTieredCompactionStrategy'} AND
>   compression={'sstable_compression': 'LZ4Compressor'};
> 
> CREATE INDEX BookingFile_uids_deleted ON bookingfile_uids (deleted);
> 
> * We don't have any problems with the tables from the main group.
> 
> * As for the tables from the uids group we have noticed that sometimes deletes from these
tables do not do their job. They don't fail, they just do nothing. We have confirmed this
by adding a select query after deletes. Most times everything is ok and select returns 0 records.
But sometimes (~5 out of 100,000) it returns the supposedly deleted row.
> 
> * We have logged the ExecutionInfo objects with query tracing that are returned by Datastax's
driver. Here are the details
> 
> DELETE FROM bookingfile_uids WHERE date=C20131006 AND timeAndUid=195248590_4762ce41-d2d2-448d-be8c-c7fcb6b7394e
> 
> ExecutionInfo: [
> triedHosts=/10.10.30.23;
> queriedHost=/10.10.30.23;
> achievedConsistencyLevel=null;
> queryTrace=
> 	Message received from /10.10.30.23 on /10.10.30.19[Thread-56] at Sun Oct 06 19:55:57
MSK 2013
> 	Acquiring switchLock read lock on /10.10.30.19[MutationStage:49] at Sun Oct 06 19:55:57
MSK 2013
> 	Appending to commitlog on /10.10.30.19[MutationStage:49] at Sun Oct 06 19:55:57 MSK
2013
> 	Adding to bookingfile_uids memtable on /10.10.30.19[MutationStage:49] at Sun Oct 06
19:55:57 MSK 2013
> 	Enqueuing response to /10.10.30.23 on /10.10.30.19[MutationStage:49] at Sun Oct 06 19:55:57
MSK 2013
> 	Sending message to /10.10.30.23 on /10.10.30.19[WRITE-/10.10.30.23] at Sun Oct 06 19:55:57
MSK 2013
> 	Message received from /10.10.30.23 on /10.10.30.20[Thread-34] at Sun Oct 06 19:55:57
MSK 2013
> 	Acquiring switchLock read lock on /10.10.30.20[MutationStage:43] at Sun Oct 06 19:55:57
MSK 2013
> 	Appending to commitlog on /10.10.30.20[MutationStage:43] at Sun Oct 06 19:55:57 MSK
2013
> 	Adding to bookingfile_uids memtable on /10.10.30.20[MutationStage:43] at Sun Oct 06
19:55:57 MSK 2013
> 	Enqueuing response to /10.10.30.23 on /10.10.30.20[MutationStage:43] at Sun Oct 06 19:55:57
MSK 2013
> 	Sending message to /10.10.30.23 on /10.10.30.20[WRITE-/10.10.30.23] at Sun Oct 06 19:55:57
MSK 2013
> 	Determining replicas for mutation on /10.10.30.23[Native-Transport-Requests:1387368]
at Sun Oct 06 19:55:57 MSK 2013
> 	Sending message to /10.10.30.19 on /10.10.30.23[WRITE-/10.10.30.19] at Sun Oct 06 19:55:57
MSK 2013
> 	Acquiring switchLock read lock on /10.10.30.23[MutationStage:46] at Sun Oct 06 19:55:57
MSK 2013
> 	Sending message to /10.10.30.20 on /10.10.30.23[WRITE-/10.10.30.20] at Sun Oct 06 19:55:57
MSK 2013
> 	Message received from /10.10.30.20 on /10.10.30.23[Thread-5] at Sun Oct 06 19:55:57
MSK 2013
> 	Processing response from /10.10.30.20 on /10.10.30.23[RequestResponseStage:4] at Sun
Oct 06 19:55:57 MSK 2013
> 	Message received from /10.10.30.19 on /10.10.30.23[Thread-7] at Sun Oct 06 19:55:57
MSK 2013
> 	Processing response from /10.10.30.19 on /10.10.30.23[RequestResponseStage:4] at Sun
Oct 06 19:55:57 MSK 2013;
> ]
> 
> SELECT * FROM bookingfile_uids WHERE date=C20131006 AND timeAndUid=195248590_4762ce41-d2d2-448d-be8c-c7fcb6b7394e
returned 1 record
> 
> the same query 1 second later:
> 
> DELETE FROM bookingfile_uids WHERE date=C20131006 AND timeAndUid=195248590_4762ce41-d2d2-448d-be8c-c7fcb6b7394e
> 
> ExecutionInfo: [
> triedHosts=/10.10.30.20;
> queriedHost=/10.10.30.20;
> achievedConsistencyLevel=null;
> queryTrace=
> 	Message received from /10.10.30.20 on /10.10.30.19[Thread-57] at Sun Oct 06 19:55:58
MSK 2013
> 	Determining replicas for mutation on /10.10.30.20[Native-Transport-Requests:1387705]
at Sun Oct 06 19:55:58 MSK 2013
> 	Acquiring switchLock read lock on /10.10.30.20[MutationStage:43] at Sun Oct 06 19:55:58
MSK 2013
> 	Appending to commitlog on /10.10.30.20[MutationStage:43] at Sun Oct 06 19:55:58 MSK
2013
> 	Adding to bookingfile_uids memtable on /10.10.30.20[MutationStage:43] at Sun Oct 06
19:55:58 MSK 2013
> 	Sending message to /10.10.30.19 on /10.10.30.20[WRITE-/10.10.30.19] at Sun Oct 06 19:55:58
MSK 2013
> 	Sending message to /10.10.30.23 on /10.10.30.20[WRITE-/10.10.30.23] at Sun Oct 06 19:55:58
MSK 2013
> 	Message received from /10.10.30.19 on /10.10.30.20[Thread-4] at Sun Oct 06 19:55:58
MSK 2013
> 	Processing response from /10.10.30.19 on /10.10.30.20[RequestResponseStage:6] at Sun
Oct 06 19:55:58 MSK 2013
> 	Message received from /10.10.30.20 on /10.10.30.23[Thread-18] at Sun Oct 06 19:55:58
MSK 2013
> ]
> 
> SELECT * FROM bookingfile_uids WHERE date=C20131006 AND timeAndUid=195248590_4762ce41-d2d2-448d-be8c-c7fcb6b7394e
returned 0 records.
> 
> * Cassandra's system.log on all 3 nodes lists nothing special during these queries -
just some compaction related INFO entries.
> 
> Can anyone help with this? What is our next step?
> 
> Thanks in advance,
> Alexander
> 
> 
> 


Mime
View raw message