From: Michael Theroux
To: "user@cassandra.apache.org"
Subject: Re: DELETE does not delete :)
Date: Thu, 17 Oct 2013 09:30:56 -0700 (PDT)
Message-ID: <1382027456.2836.YahooMailNeo@web164603.mail.gq1.yahoo.com>
References: <64F4809D-A694-4DF4-A72A-E37BE980740A@jonhaddad.com>
Mailing-List: user@cassandra.apache.org

A couple of questions:

1) How did you determine that the record is deleted on only one node? Are you looking for tombstones, or for the original entry that was inserted? Note that when an item is deleted, the original entry can still be in an SSTable somewhere, and the tombstone can be in another SSTable until those tables are compacted together.

2) When you did the global daily check, are you sure you are not getting range ghosts? I assume they are still possible on 2.0
(http://www.datastax.com/docs/0.7/getting_started/using_cli, search for range ghosts).

Thanks,
-Mike


On Thursday, October 17, 2013 6:36 AM, Alexander Shutyaev wrote:

Hi Daniel, Nate.

Thanks for your answers. We have gc_grace_seconds=864000 (which is the default, I believe). We've also checked the clocks - they are synchronized.


2013/10/16 Nate McCall

> This is almost a guaranteed sign that the clocks are off in your cluster. If you run the select query a couple of times in a row right after deletion, do you see the data appear again?
>
> On Wed, Oct 16, 2013 at 12:12 AM, Alexander Shutyaev wrote:
>
>> Hi all,
>>
>> Unfortunately, we still have a problem. I've modified my code so that it explicitly sets the consistency level to QUORUM for each query. However, we found a few cases where the record is deleted on only 1 node of 3. In these cases the delete query executed ok, and the select query that we do right after the delete returned 0 rows. Later, when we ran a global daily check, select returned 1 row. How can that be? What can we be missing?
>>
>> 2013/10/7 Jon Haddad
>>
>>> I haven't used VMware but it seems odd that it would lock up the ntp port. Try "ps aux | grep ntp" to see if ntpd is already running.
>>>
>>> On Oct 7, 2013, at 12:23 AM, Alexander Shutyaev wrote:
>>>
>>>> Hi Michał,
>>>>
>>>> I didn't notice your message at first. Well, this seems like a real cause candidate. I'll add an explicit consistency level QUORUM and see if that helps. Thanks
>>>>
>>>> 2013/10/7 Alexander Shutyaev
>>>>
>>>>> Hi Nick,
>>>>>
>>>>> Thanks for the note! We have our Cassandra instances installed on virtual hosts in VMware and the clock synchronization is handled by the latter, so I can't use ntpdate (it says that the NTP socket is in use). Is there any way to check if the clocks are really synchronized? My best attempt was using three shell windows with commands already typed, thus requiring only clicking on the window and hitting enter. The results varied by 100-200 msec, which I guess is just about the time I need to click and press enter :)
>>>>>
>>>>> Thanks in advance,
>>>>> Alexander
>>>>>
>>>>> 2013/10/7 Nikolay Mihaylov
>>>>>
>>>>>> Hi
>>>>>>
>>>>>> my two cents - before doing anything else, make sure clocks are synchronized to the millisecond.
>>>>>> ntp will do so.
>>>>>>
>>>>>> Nick.
>>>>>>
>>>>>> On Mon, Oct 7, 2013 at 9:02 AM, Alexander Shutyaev wrote:
>>>>>>
>>>>>>> Hi all,
>>>>>>>
>>>>>>> We have encountered the following problem with Cassandra.
>>>>>>>
>>>>>>> * We use Cassandra v2.0.0 from the Datastax community repo.
>>>>>>>
>>>>>>> * We have 3 nodes in a cluster, all of them are seed providers.
>>>>>>>
>>>>>>> * We have a single keyspace with replication factor = 3:
>>>>>>>
>>>>>>> CREATE KEYSPACE bof WITH replication = {
>>>>>>>   'class': 'SimpleStrategy',
>>>>>>>   'replication_factor': '3'
>>>>>>> };
>>>>>>>
>>>>>>> * We use Datastax Java CQL Driver v1.0.3 in our application.
>>>>>>>
>>>>>>> * We have not modified any consistency settings in our app, so I assume we have the default QUORUM (2 out of 3 in our case) consistency for reads and writes.
>>>>>>>
>>>>>>> * We have 400+ tables which can be divided into two groups (main and uids). All tables in a group have the same definition; they vary only by name. The sample definitions are:
>>>>>>>
>>>>>>> CREATE TABLE bookingfile (
>>>>>>>   key text,
>>>>>>>   entity_created timestamp,
>>>>>>>   entity_createdby text,
>>>>>>>   entity_entitytype text,
>>>>>>>   entity_modified timestamp,
>>>>>>>   entity_modifiedby text,
>>>>>>>   entity_status text,
>>>>>>>   entity_uid text,
>>>>>>>   entity_updatepolicy text,
>>>>>>>   version_created timestamp,
>>>>>>>   version_createdby text,
>>>>>>>   version_data blob,
>>>>>>>   version_dataformat text,
>>>>>>>   version_datasource text,
>>>>>>>   version_modified timestamp,
>>>>>>>   version_modifiedby text,
>>>>>>>   version_uid text,
>>>>>>>   version_versionnotes text,
>>>>>>>   version_versionnumber int,
>>>>>>>   versionscount int,
>>>>>>>   PRIMARY KEY (key)
>>>>>>> ) WITH
>>>>>>>   bloom_filter_fp_chance=0.010000 AND
>>>>>>>   caching='KEYS_ONLY' AND
>>>>>>>   comment='' AND
>>>>>>>   dclocal_read_repair_chance=0.000000 AND
>>>>>>>   gc_grace_seconds=864000 AND
>>>>>>>   index_interval=128 AND
>>>>>>>   read_repair_chance=0.100000 AND
>>>>>>>   replicate_on_write='true' AND
>>>>>>>   populate_io_cache_on_flush='false' AND
>>>>>>>   default_time_to_live=0 AND
>>>>>>>   speculative_retry='NONE' AND
>>>>>>>   memtable_flush_period_in_ms=0 AND
>>>>>>>   compaction={'class': 'SizeTieredCompactionStrategy'} AND
>>>>>>>   compression={'sstable_compression': 'LZ4Compressor'};
>>>>>>>
>>>>>>> CREATE TABLE bookingfile_uids (
>>>>>>>   date text,
>>>>>>>   timeanduid text,
>>>>>>>   deleted boolean,
>>>>>>>   PRIMARY KEY (date, timeanduid)
>>>>>>> ) WITH
>>>>>>>   bloom_filter_fp_chance=0.010000 AND
>>>>>>>   caching='KEYS_ONLY' AND
>>>>>>>   comment='' AND
>>>>>>>   dclocal_read_repair_chance=0.000000 AND
>>>>>>>   gc_grace_seconds=864000 AND
>>>>>>>   index_interval=128 AND
>>>>>>>   read_repair_chance=0.100000 AND
>>>>>>>   replicate_on_write='true' AND
>>>>>>>   populate_io_cache_on_flush='false' AND
>>>>>>>   default_time_to_live=0 AND
>>>>>>>   speculative_retry='NONE' AND
>>>>>>>   memtable_flush_period_in_ms=0 AND
>>>>>>>   compaction={'class': 'SizeTieredCompactionStrategy'} AND
>>>>>>>   compression={'sstable_compression': 'LZ4Compressor'};
>>>>>>>
>>>>>>> CREATE INDEX BookingFile_uids_deleted ON bookingfile_uids (deleted);
>>>>>>>
>>>>>>> * We don't have any problems with the tables from the main group.
>>>>>>>
>>>>>>> * As for the tables from the uids group, we have noticed that sometimes deletes from these tables do not do their job. They don't fail, they just do nothing. We have confirmed this by adding a select query after deletes. Most times everything is ok and select returns 0 records. But sometimes (~5 out of 100,000) it returns the supposedly deleted row.
>>>>>>>
>>>>>>> * We have logged the ExecutionInfo objects, with query tracing, that are returned by Datastax's driver. Here are the details:
>>>>>>>
>>>>>>> DELETE FROM bookingfile_uids WHERE date=C20131006 AND timeAndUid=195248590_4762ce41-d2d2-448d-be8c-c7fcb6b7394e
>>>>>>>
>>>>>>> ExecutionInfo: [
>>>>>>> triedHosts=/10.10.30.23;
>>>>>>> queriedHost=/10.10.30.23;
>>>>>>> achievedConsistencyLevel=null;
>>>>>>> queryTrace=
>>>>>>> Message received from /10.10.30.23 on /10.10.30.19[Thread-56] at Sun Oct 06 19:55:57 MSK 2013
>>>>>>> Acquiring switchLock read lock on /10.10.30.19[MutationStage:49] at Sun Oct 06 19:55:57 MSK 2013
>>>>>>> Appending to commitlog on /10.10.30.19[MutationStage:49] at Sun Oct 06 19:55:57 MSK 2013
>>>>>>> Adding to bookingfile_uids memtable on /10.10.30.19[MutationStage:49] at Sun Oct 06 19:55:57 MSK 2013
>>>>>>> Enqueuing response to /10.10.30.23 on /10.10.30.19[MutationStage:49] at Sun Oct 06 19:55:57 MSK 2013
>>>>>>> Sending message to /10.10.30.23 on /10.10.30.19[WRITE-/10.10.30.23] at Sun Oct 06 19:55:57 MSK 2013
>>>>>>> Message received from /10.10.30.23 on /10.10.30.20[Thread-34] at Sun Oct 06 19:55:57 MSK 2013
>>>>>>> Acquiring switchLock read lock on /10.10.30.20[MutationStage:43] at Sun Oct 06 19:55:57 MSK 2013
>>>>>>> Appending to commitlog on /10.10.30.20[MutationStage:43] at Sun Oct 06 19:55:57 MSK 2013
>>>>>>> Adding to bookingfile_uids memtable on /10.10.30.20[MutationStage:43] at Sun Oct 06 19:55:57 MSK 2013
>>>>>>> Enqueuing response to /10.10.30.23 on /10.10.30.20[MutationStage:43] at Sun Oct 06 19:55:57 MSK 2013
>>>>>>> Sending message to /10.10.30.23 on /10.10.30.20[WRITE-/10.10.30.23] at Sun Oct 06 19:55:57 MSK 2013
>>>>>>> Determining replicas for mutation on /10.10.30.23[Native-Transport-Requests:1387368] at Sun Oct 06 19:55:57 MSK 2013
>>>>>>> Sending message to /10.10.30.19 on /10.10.30.23[WRITE-/10.10.30.19] at Sun Oct 06 19:55:57 MSK 2013
>>>>>>> Acquiring switchLock read lock on /10.10.30.23[MutationStage:46] at Sun Oct 06 19:55:57 MSK 2013
>>>>>>> Sending message to /10.10.30.20 on /10.10.30.23[WRITE-/10.10.30.20] at Sun Oct 06 19:55:57 MSK 2013
>>>>>>> Message received from /10.10.30.20 on /10.10.30.23[Thread-5] at Sun Oct 06 19:55:57 MSK 2013
>>>>>>> Processing response from /10.10.30.20 on /10.10.30.23[RequestResponseStage:4] at Sun Oct 06 19:55:57 MSK 2013
>>>>>>> Message received from /10.10.30.19 on /10.10.30.23[Thread-7] at Sun Oct 06 19:55:57 MSK 2013
>>>>>>> Processing response from /10.10.30.19 on /10.10.30.23[RequestResponseStage:4] at Sun Oct 06 19:55:57 MSK 2013;
>>>>>>> ]
>>>>>>>
>>>>>>> SELECT * FROM bookingfile_uids WHERE date=C20131006 AND timeAndUid=195248590_4762ce41-d2d2-448d-be8c-c7fcb6b7394e returned 1 record
>>>>>>>
>>>>>>> The same query 1 second later:
>>>>>>>
>>>>>>> DELETE FROM bookingfile_uids WHERE date=C20131006 AND timeAndUid=195248590_4762ce41-d2d2-448d-be8c-c7fcb6b7394e
>>>>>>>
>>>>>>> ExecutionInfo: [
>>>>>>> triedHosts=/10.10.30.20;
>>>>>>> queriedHost=/10.10.30.20;
>>>>>>> achievedConsistencyLevel=null;
>>>>>>> queryTrace=
>>>>>>> Message received from /10.10.30.20 on /10.10.30.19[Thread-57] at Sun Oct 06 19:55:58 MSK 2013
>>>>>>> Determining replicas for mutation on /10.10.30.20[Native-Transport-Requests:1387705] at Sun Oct 06 19:55:58 MSK 2013
>>>>>>> Acquiring switchLock read lock on /10.10.30.20[MutationStage:43] at Sun Oct 06 19:55:58 MSK 2013
>>>>>>> Appending to commitlog on /10.10.30.20[MutationStage:43] at Sun Oct 06 19:55:58 MSK 2013
>>>>>>> Adding to bookingfile_uids memtable on /10.10.30.20[MutationStage:43] at Sun Oct 06 19:55:58 MSK 2013
>>>>>>> Sending message to /10.10.30.19 on /10.10.30.20[WRITE-/10.10.30.19] at Sun Oct 06 19:55:58 MSK 2013
>>>>>>> Sending message to /10.10.30.23 on /10.10.30.20[WRITE-/10.10.30.23] at Sun Oct 06 19:55:58 MSK 2013
>>>>>>> Message received from /10.10.30.19 on /10.10.30.20[Thread-4] at Sun Oct 06 19:55:58 MSK 2013
>>>>>>> Processing response from /10.10.30.19 on /10.10.30.20[RequestResponseStage:6] at Sun Oct 06 19:55:58 MSK 2013
>>>>>>> Message received from /10.10.30.20 on /10.10.30.23[Thread-18] at Sun Oct 06 19:55:58 MSK 2013
>>>>>>> ]
>>>>>>>
>>>>>>> SELECT * FROM bookingfile_uids WHERE date=C20131006 AND timeAndUid=195248590_4762ce41-d2d2-448d-be8c-c7fcb6b7394e returned 0 records.
>>>>>>>
>>>>>>> * Cassandra's system.log on all 3 nodes lists nothing special during these queries - just some compaction-related INFO entries.
>>>>>>>
>>>>>>> Can anyone help with this? What is our next step?
>>>>>>>
>>>>>>> Thanks in advance,
>>>>>>> Alexander
>
> --
> -----------------
> Nate McCall
> Austin, TX
> @zznate
>
> Co-Founder & Sr. Technical Consultant
> Apache Cassandra Consulting
> http://www.thelastpickle.com
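Mike's first question and Nate's clock remark both come down to how versions of a row are reconciled by write timestamp. The following is a minimal sketch of that idea, not Cassandra's actual code: `Cell` and `reconcile` are hypothetical names, the tie-break in favour of the tombstone is a simplifying assumption, and timestamps are assumed to come from whichever node coordinates each request.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass(frozen=True)
class Cell:
    """One stored version of a row: a live value, or a tombstone when value is None."""
    write_time_us: int           # timestamp attached to the write (assumed coordinator clock)
    value: Optional[str] = None  # None marks a tombstone


def reconcile(versions: Iterable[Cell]) -> Optional[str]:
    """Return the visible value: newest timestamp wins, a tombstone wins a tie."""
    newest = max(versions, key=lambda c: (c.write_time_us, c.value is None))
    return newest.value


# Original entry in one SSTable, tombstone in another: the entry is still on
# disk, but a read that merges both sees the row as deleted.
sstable_a = [Cell(1000, "row")]
sstable_b = [Cell(2000)]
merged_view = reconcile(sstable_a + sstable_b)

# Clock skew: the insert was stamped by a node whose clock runs ahead, the
# later delete by a node whose clock runs behind, so the insert still "wins".
skewed_view = reconcile([Cell(5000, "row"), Cell(4800)])
```

In this model `merged_view` is `None` (deleted) while `skewed_view` is still `"row"`, which is the kind of outcome that would make a successful DELETE appear to do nothing.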
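For the gc_grace_seconds=864000 setting quoted in the thread, a small sketch of the arithmetic may help. It assumes the simplified rule that a tombstone only becomes eligible for removal by compaction once it is older than gc_grace_seconds; `tombstone_purgeable` is a hypothetical helper, not a Cassandra API, and the dates are illustrative.

```python
from datetime import datetime, timedelta

GC_GRACE_SECONDS = 864000  # value from the table definitions in the thread


def tombstone_purgeable(deleted_at: datetime, now: datetime,
                        gc_grace_seconds: int = GC_GRACE_SECONDS) -> bool:
    """True once the tombstone is older than the grace period (simplified rule)."""
    return now - deleted_at > timedelta(seconds=gc_grace_seconds)


grace_days = timedelta(seconds=GC_GRACE_SECONDS).days  # 10

deleted_at = datetime(2013, 10, 6, 19, 55, 57)  # time of the traced DELETE
next_day_check = tombstone_purgeable(deleted_at, datetime(2013, 10, 7, 19, 55, 57))
much_later = tombstone_purgeable(deleted_at, datetime(2013, 10, 17, 0, 0, 0))
```

Under these assumptions a check run one day after the delete falls well inside the ten-day window, so an early tombstone purge would not by itself explain a row reappearing that soon.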
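The "2 out of 3" reasoning in the original post can be written down as a short sketch. The function names are hypothetical, and the ONE/ONE case is included only as an illustration of what happens if the effective consistency level is lower than assumed; the thread does not confirm what the driver default actually was.

```python
def quorum(replication_factor: int) -> int:
    """Number of replicas that make up a quorum: a strict majority."""
    return replication_factor // 2 + 1


def read_overlaps_write(replication_factor: int, write_acks: int, read_replicas: int) -> bool:
    """True when every read set must share at least one replica with every write set."""
    return write_acks + read_replicas > replication_factor


rf = 3
q = quorum(rf)                                    # 2 for the keyspace in the thread
quorum_both = read_overlaps_write(rf, q, q)       # QUORUM write + QUORUM read
one_both = read_overlaps_write(rf, 1, 1)          # ONE write + ONE read
```

With RF=3, QUORUM on both sides gives 2 + 2 > 3, so a read always touches a replica that acknowledged the delete; with ONE on both sides the sets can miss each other. The overlap only helps if the tombstone also carries the newest timestamp, which is why the clock question matters independently.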
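On Alexander's question about checking clock synchronization without the 100-200 msec of manual reaction time: one possible approach is to ask a remote host for its time and correct for the round trip. The sketch below shows only the arithmetic (a Cristian-style estimate); how the remote timestamp is obtained is left open, and `estimate_offset_ms` is a hypothetical helper.

```python
from typing import Tuple


def estimate_offset_ms(sent_ms: float, remote_ms: float, received_ms: float) -> Tuple[float, float]:
    """Estimate (remote clock - local clock) and the uncertainty of that estimate.

    sent_ms / received_ms: local clock readings just before the request and
    just after the reply; remote_ms: the time reported by the remote host.
    """
    round_trip = received_ms - sent_ms
    midpoint = sent_ms + round_trip / 2   # assume the remote read its clock halfway
    return remote_ms - midpoint, round_trip / 2


# Illustrative numbers: a 40 ms round trip and a remote clock about 110 ms ahead.
offset, uncertainty = estimate_offset_ms(1000, 1130, 1040)
```

Here `offset` comes out as 110.0 ms with an uncertainty of 20.0 ms, so the error is bounded by half the network round trip rather than by how quickly someone can click between shell windows.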