incubator-cassandra-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Michael Theroux <mthero...@yahoo.com>
Subject Re: DELETE does not delete :)
Date Thu, 17 Oct 2013 16:30:56 GMT
A couple questions:

1) How did you determine that the record is deleted on only one node? Are you looking for
tombstones, or the original entry that was inserted? Note that when an item is deleted, the
original entry can still be in an SSTABLE somewhere, and the tombstone can be in another
SSTABLE until those tables are compacted together.

2) When you did the global daily check, are you sure you are not getting range ghosts? I
assume they are still possible on 2.0 
(http://www.datastax.com/docs/0.7/getting_started/using_cli, search for range ghosts).

Thanks,
-Mike




On Thursday, October 17, 2013 6:36 AM, Alexander Shutyaev <shutyaev@gmail.com> wrote:
 
Hi Daniel, Nate.

Thanks for your answers. We have gc_grace_seconds=864000 (which is the default, I believe).
We've also checked the clocks - they are synchronized.



2013/10/16 Nate McCall <nate@thelastpickle.com>

This is almost a guaranteed sign that the clocks are off in your cluster. If you run the select
query a couple of times in a row right after deletion, do you see the data appear again?
>
>
>
>On Wed, Oct 16, 2013 at 12:12 AM, Alexander Shutyaev <shutyaev@gmail.com> wrote:
>
>Hi all,
>>
>>
>>Unfortunately, we still have a problem. I've modified my code, so that it explicitly
sets the consistency level to QUORUM for each query. However, we found out a few cases when
the record is deleted on only 1 node of 3. In this cases the delete query executed ok, and
the select query that we do right after delete returned 0 rows. Later when we ran a global
daily check select returned 1 row. How can that be? What can we be missing?
>>
>>
>>
>>2013/10/7 Jon Haddad <jon@jonhaddad.com>
>>
>>I haven't used VMWare but it seems odd that it would lock up the ntp port.  try "ps
aux | grep ntp" to see if ntpd it's already running.
>>>
>>>
>>>On Oct 7, 2013, at 12:23 AM, Alexander Shutyaev <shutyaev@gmail.com> wrote:
>>>
>>>Hi Michał,
>>>>
>>>>
>>>>I didn't notice your message at first.. Well this seems like a real cause
candidate.. I'll add an explicit consistency level QUORUM and see if that helps. Thanks
>>>>
>>>>
>>>>
>>>>2013/10/7 Alexander Shutyaev <shutyaev@gmail.com>
>>>>
>>>>Hi Nick,
>>>>>
>>>>>Thanks for the note! We have our cassanra instances installed on virtual
hosts in VMWare and the clock synchronization is handled by the latter, so I can't use ntpdate
(says that NTP socket is in use). Is there any way to check if the clocks are really synchronized?
My best attempt was using three shell windows with commands already typed thus requiring only
clicking on the window and hitting enter. The results varied by 100-200 msec which I guess
is just about the time I need to click and press enter :)
>>>>>
>>>>>
>>>>>Thanks in advance,
>>>>>Alexander
>>>>>
>>>>>
>>>>>
>>>>>2013/10/7 Nikolay Mihaylov <nmmm@nmmm.nu>
>>>>>
>>>>>Hi
>>>>>>
>>>>>>
>>>>>>my two cents - before doing anything else, make sure clocks are synchronized
to the millisecond.
>>>>>>ntp will do so.
>>>>>>
>>>>>>
>>>>>>Nick.
>>>>>>
>>>>>>
>>>>>>
>>>>>>On Mon, Oct 7, 2013 at 9:02 AM, Alexander Shutyaev <shutyaev@gmail.com>
wrote:
>>>>>>
>>>>>>Hi all,
>>>>>>>
>>>>>>>
>>>>>>>We have encountered the following problem with cassandra.
>>>>>>>
>>>>>>>
>>>>>>>* We use cassandra v2.0.0 from Datastax community repo.
>>>>>>>
>>>>>>>
>>>>>>>* We have 3 nodes in a cluster, all of them are seed providers.
>>>>>>>
>>>>>>>
>>>>>>>* We have a single keyspace with replication factor = 3:
>>>>>>>
>>>>>>>
>>>>>>>CREATE KEYSPACE bof WITH replication = {
>>>>>>>  'class': 'SimpleStrategy',
>>>>>>>  'replication_factor': '3'
>>>>>>>};
>>>>>>>
>>>>>>>
>>>>>>>* We use Datastax Java CQL Driver v1.0.3 in our application.
>>>>>>>
>>>>>>>
>>>>>>>* We have not modified any consistency settings in our app, so
I assume we have the default QUORUM (2 out of 3 in our case) consistency for reads and writes.
>>>>>>>
>>>>>>>
>>>>>>>* We have 400+ tables which can be divided in two groups (main
and uids). All tables in a group have the same definition, they vary only by name. The sample
definitions are:
>>>>>>>
>>>>>>>
>>>>>>>CREATE TABLE bookingfile (
>>>>>>>  key text,
>>>>>>>  entity_created timestamp,
>>>>>>>  entity_createdby text,
>>>>>>>  entity_entitytype text,
>>>>>>>  entity_modified timestamp,
>>>>>>>  entity_modifiedby text,
>>>>>>>  entity_status text,
>>>>>>>  entity_uid text,
>>>>>>>  entity_updatepolicy text,
>>>>>>>  version_created timestamp,
>>>>>>>  version_createdby text,
>>>>>>>  version_data blob,
>>>>>>>  version_dataformat text,
>>>>>>>  version_datasource text,
>>>>>>>  version_modified timestamp,
>>>>>>>  version_modifiedby text,
>>>>>>>  version_uid text,
>>>>>>>  version_versionnotes text,
>>>>>>>  version_versionnumber int,
>>>>>>>  versionscount int,
>>>>>>>  PRIMARY KEY (key)
>>>>>>>) WITH
>>>>>>>  bloom_filter_fp_chance=0.010000 AND
>>>>>>>  caching='KEYS_ONLY' AND
>>>>>>>  comment='' AND
>>>>>>>  dclocal_read_repair_chance=0.000000 AND
>>>>>>>  gc_grace_seconds=864000 AND
>>>>>>>  index_interval=128 AND
>>>>>>>  read_repair_chance=0.100000 AND
>>>>>>>  replicate_on_write='true' AND
>>>>>>>  populate_io_cache_on_flush='false' AND
>>>>>>>  default_time_to_live=0 AND
>>>>>>>  speculative_retry='NONE' AND
>>>>>>>  memtable_flush_period_in_ms=0 AND
>>>>>>>  compaction={'class': 'SizeTieredCompactionStrategy'} AND
>>>>>>>  compression={'sstable_compression': 'LZ4Compressor'};
>>>>>>>
>>>>>>>
>>>>>>>CREATE TABLE bookingfile_uids (
>>>>>>>  date text,
>>>>>>>  timeanduid text,
>>>>>>>  deleted boolean,
>>>>>>>  PRIMARY KEY (date, timeanduid)
>>>>>>>) WITH
>>>>>>>  bloom_filter_fp_chance=0.010000 AND
>>>>>>>  caching='KEYS_ONLY' AND
>>>>>>>  comment='' AND
>>>>>>>  dclocal_read_repair_chance=0.000000 AND
>>>>>>>  gc_grace_seconds=864000 AND
>>>>>>>  index_interval=128 AND
>>>>>>>  read_repair_chance=0.100000 AND
>>>>>>>  replicate_on_write='true' AND
>>>>>>>  populate_io_cache_on_flush='false' AND
>>>>>>>  default_time_to_live=0 AND
>>>>>>>  speculative_retry='NONE' AND
>>>>>>>  memtable_flush_period_in_ms=0 AND
>>>>>>>  compaction={'class': 'SizeTieredCompactionStrategy'} AND
>>>>>>>  compression={'sstable_compression': 'LZ4Compressor'};
>>>>>>>
>>>>>>>
>>>>>>>CREATE INDEX BookingFile_uids_deleted ON bookingfile_uids (deleted);
>>>>>>>
>>>>>>>
>>>>>>>* We don't have any problems with the tables from the main group.
>>>>>>>
>>>>>>>
>>>>>>>* As for the tables from the uids group we have noticed that
sometimes deletes from these tables do not do their job. They don't fail, they just do nothing.
We have confirmed this by adding a select query after deletes. Most times everything is ok
and select returns 0 records. But sometimes (~5 out of 100,000) it returns the supposedly
deleted row.
>>>>>>>
>>>>>>>
>>>>>>>* We have logged the ExecutionInfo objects with query tracing
that are returned by Datastax's driver. Here are the details
>>>>>>>
>>>>>>>
>>>>>>>DELETE FROM bookingfile_uids WHERE date=C20131006 AND timeAndUid=195248590_4762ce41-d2d2-448d-be8c-c7fcb6b7394e
>>>>>>>
>>>>>>>
>>>>>>>ExecutionInfo: [
>>>>>>>triedHosts=/10.10.30.23;
>>>>>>>queriedHost=/10.10.30.23;
>>>>>>>achievedConsistencyLevel=null;
>>>>>>>queryTrace=
>>>>>>>Message received from /10.10.30.23 on /10.10.30.19[Thread-56]
at Sun Oct 06 19:55:57 MSK 2013
>>>>>>>Acquiring switchLock read lock on /10.10.30.19[MutationStage:49]
at Sun Oct 06 19:55:57 MSK 2013
>>>>>>>Appending to commitlog on /10.10.30.19[MutationStage:49] at Sun
Oct 06 19:55:57 MSK 2013
>>>>>>>Adding to bookingfile_uids memtable on /10.10.30.19[MutationStage:49]
at Sun Oct 06 19:55:57 MSK 2013
>>>>>>>Enqueuing response to /10.10.30.23 on /10.10.30.19[MutationStage:49]
at Sun Oct 06 19:55:57 MSK 2013
>>>>>>>Sending message to /10.10.30.23 on /10.10.30.19[WRITE-/10.10.30.23]
at Sun Oct 06 19:55:57 MSK 2013
>>>>>>>Message received from /10.10.30.23 on /10.10.30.20[Thread-34]
at Sun Oct 06 19:55:57 MSK 2013
>>>>>>>Acquiring switchLock read lock on /10.10.30.20[MutationStage:43]
at Sun Oct 06 19:55:57 MSK 2013
>>>>>>>Appending to commitlog on /10.10.30.20[MutationStage:43] at Sun
Oct 06 19:55:57 MSK 2013
>>>>>>>Adding to bookingfile_uids memtable on /10.10.30.20[MutationStage:43]
at Sun Oct 06 19:55:57 MSK 2013
>>>>>>>Enqueuing response to /10.10.30.23 on /10.10.30.20[MutationStage:43]
at Sun Oct 06 19:55:57 MSK 2013
>>>>>>>Sending message to /10.10.30.23 on /10.10.30.20[WRITE-/10.10.30.23]
at Sun Oct 06 19:55:57 MSK 2013
>>>>>>>Determining replicas for mutation on /10.10.30.23[Native-Transport-Requests:1387368]
at Sun Oct 06 19:55:57 MSK 2013
>>>>>>>Sending message to /10.10.30.19 on /10.10.30.23[WRITE-/10.10.30.19]
at Sun Oct 06 19:55:57 MSK 2013
>>>>>>>Acquiring switchLock read lock on /10.10.30.23[MutationStage:46]
at Sun Oct 06 19:55:57 MSK 2013
>>>>>>>Sending message to /10.10.30.20 on /10.10.30.23[WRITE-/10.10.30.20]
at Sun Oct 06 19:55:57 MSK 2013
>>>>>>>Message received from /10.10.30.20 on /10.10.30.23[Thread-5] at
Sun Oct 06 19:55:57 MSK 2013
>>>>>>>Processing response from /10.10.30.20 on /10.10.30.23[RequestResponseStage:4]
at Sun Oct 06 19:55:57 MSK 2013
>>>>>>>Message received from /10.10.30.19 on /10.10.30.23[Thread-7] at
Sun Oct 06 19:55:57 MSK 2013
>>>>>>>Processing response from /10.10.30.19 on /10.10.30.23[RequestResponseStage:4]
at Sun Oct 06 19:55:57 MSK 2013;
>>>>>>>]
>>>>>>>
>>>>>>>
>>>>>>>SELECT * FROM bookingfile_uids WHERE date=C20131006 AND timeAndUid=195248590_4762ce41-d2d2-448d-be8c-c7fcb6b7394e
returned 1 record
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>the same query 1 second later:
>>>>>>>
>>>>>>>
>>>>>>>DELETE FROM bookingfile_uids WHERE date=C20131006 AND timeAndUid=195248590_4762ce41-d2d2-448d-be8c-c7fcb6b7394e
>>>>>>>
>>>>>>>
>>>>>>>ExecutionInfo: [
>>>>>>>triedHosts=/10.10.30.20;
>>>>>>>queriedHost=/10.10.30.20;
>>>>>>>achievedConsistencyLevel=null;
>>>>>>>queryTrace=
>>>>>>>Message received from /10.10.30.20 on /10.10.30.19[Thread-57]
at Sun Oct 06 19:55:58 MSK 2013
>>>>>>>Determining replicas for mutation on /10.10.30.20[Native-Transport-Requests:1387705]
at Sun Oct 06 19:55:58 MSK 2013
>>>>>>>Acquiring switchLock read lock on /10.10.30.20[MutationStage:43]
at Sun Oct 06 19:55:58 MSK 2013
>>>>>>>Appending to commitlog on /10.10.30.20[MutationStage:43] at Sun
Oct 06 19:55:58 MSK 2013
>>>>>>>Adding to bookingfile_uids memtable on /10.10.30.20[MutationStage:43]
at Sun Oct 06 19:55:58 MSK 2013
>>>>>>>Sending message to /10.10.30.19 on /10.10.30.20[WRITE-/10.10.30.19]
at Sun Oct 06 19:55:58 MSK 2013
>>>>>>>Sending message to /10.10.30.23 on /10.10.30.20[WRITE-/10.10.30.23]
at Sun Oct 06 19:55:58 MSK 2013
>>>>>>>Message received from /10.10.30.19 on /10.10.30.20[Thread-4] at
Sun Oct 06 19:55:58 MSK 2013
>>>>>>>Processing response from /10.10.30.19 on /10.10.30.20[RequestResponseStage:6]
at Sun Oct 06 19:55:58 MSK 2013
>>>>>>>Message received from /10.10.30.20 on /10.10.30.23[Thread-18]
at Sun Oct 06 19:55:58 MSK 2013
>>>>>>>]
>>>>>>>
>>>>>>>
>>>>>>>SELECT * FROM bookingfile_uids WHERE date=C20131006 AND timeAndUid=195248590_4762ce41-d2d2-448d-be8c-c7fcb6b7394e
returned 0 records.
>>>>>>>
>>>>>>>
>>>>>>>* Cassandra's system.log on all 3 nodes lists nothing special
during these queries - just some compaction related INFO entries.
>>>>>>>
>>>>>>>
>>>>>>>Can anyone help with this? What is our next step?
>>>>>>>
>>>>>>>
>>>>>>>Thanks in advance,
>>>>>>>Alexander
>>>>>>
>>>>>
>>>>
>>>
>>
>
>
>
>-- 
>
>-----------------
>Nate McCall
>Austin, TX
>@zznate
>
>Co-Founder & Sr. Technical Consultant
>Apache Cassandra Consulting
>http://www.thelastpickle.com
Mime
View raw message