incubator-cassandra-user mailing list archives

From Jonathan Ellis <jbel...@gmail.com>
Subject Re: Cassandra reverting deletes?
Date Thu, 29 Apr 2010 04:51:49 GMT
Good! :)

Can you reproduce w/o map/reduce, with raw get_range_slices?
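
A minimal sketch of such a check outside map/reduce, applying the rule described further down the thread (a key whose column list comes back empty counts as a tombstone, anything else as a live row). Everything named here is an assumption made for illustration: `fetch_page` is a hypothetical stand-in for one raw get_range_slices call that returns up to `count` (key, columns) pairs starting at `start_key` inclusive, and `make_fake_fetch` is an in-memory substitute that returns keys in sorted order, which a real cluster may not do. It is not the actual Thrift client API.

```python
from typing import Callable, Dict, List, Tuple

# Stand-in for a Thrift KeySlice: (row key, list of column names).
KeySlice = Tuple[str, List[str]]
FetchPage = Callable[[str, int], List[KeySlice]]


def count_rows_and_tombstones(fetch_page: FetchPage, page_size: int = 1000) -> Dict[str, int]:
    """Page through a whole key range and classify every key.

    A key with no columns is counted under TOMBSTONES, a key with at
    least one column under ROWS (the rule used by the counter job).
    """
    if page_size < 2:
        # The start key is inclusive, so a page of 1 would never advance.
        raise ValueError("page_size must be at least 2")

    counts = {"ROWS": 0, "TOMBSTONES": 0}
    start_key = ""
    first_page = True
    while True:
        page = fetch_page(start_key, page_size)
        # After the first page, the first result repeats the previous
        # page's last key, so skip it to avoid double counting.
        fresh = page if first_page else page[1:]
        for _key, columns in fresh:
            if columns:
                counts["ROWS"] += 1
            else:
                counts["TOMBSTONES"] += 1
        if len(page) < page_size:
            break  # a short page means the range is exhausted
        start_key = page[-1][0]
        first_page = False
    return counts


def make_fake_fetch(rows: Dict[str, List[str]]) -> FetchPage:
    """Hypothetical in-memory replacement for the real range query."""
    ordered = sorted(rows.items())

    def fetch_page(start_key: str, count: int) -> List[KeySlice]:
        return [(k, c) for k, c in ordered if k >= start_key][:count]

    return fetch_page
```

Running this twice against the same data and comparing the two results would show whether the ROWS/TOMBSTONES split drifts between runs without Hadoop in the picture. ROWS plus TOMBSTONES should stay constant across runs, as it does in the counters quoted below.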

On Wed, Apr 28, 2010 at 3:56 PM, Joost Ouwerkerk <joost@openplaces.org> wrote:
> Yes! Reproduced on single-node cluster:
>
> 10/04/28 16:30:24 INFO mapred.JobClient:     ROWS=274884
> 10/04/28 16:30:24 INFO mapred.JobClient:     TOMBSTONES=951083
>
> 10/04/28 16:42:49 INFO mapred.JobClient:     ROWS=166580
> 10/04/28 16:42:49 INFO mapred.JobClient:     TOMBSTONES=1059387
>
> On Wed, Apr 28, 2010 at 10:43 AM, Jonathan Ellis <jbellis@gmail.com> wrote:
>> It sounds like either there is a fairly obvious bug, or you're doing
>> something wrong. :)
>>
>> Can you reproduce against a single node?
>>
>> On Tue, Apr 27, 2010 at 5:14 PM, Joost Ouwerkerk <joost@openplaces.org> wrote:
>>> Update: I ran a test whereby I deleted ALL the rows in a column
>>> family, using a consistency level of ALL.  To do this, I mapped the
>>> ColumnFamily and called remove on each row id.  There were 1.5 million
>>> rows, so 1.5 million rows were deleted.
>>>
>>> I ran a counter job immediately after.  This job maps the same column
>>> family and tests if any data is returned.  If not, it considers the
>>> row a "tombstone".  If yes, it considers the row not deleted.  Below
>>> are the hadoop counters for those jobs.  Note the fluctuation in the
>>> number of rows with data over time, and the increase in time to map
>>> the column family after the destroy job.  No other clients were
>>> accessing cassandra during this time.
>>>
>>> I'm thoroughly confused.
>>>
>>> Count: started 13:02:30 EDT, finished 13:11:33 EDT (9 minutes 2 seconds):
>>>   ROWS:        1,542,479
>>>   TOMBSTONES:  69
>>>
>>> Destroy: started 16:48:45 EDT, finished 17:07:36 EDT (18 minutes 50 seconds)
>>>   DESTROYED:  1,542,548
>>>
>>> Count: started 17:15:42 EDT, finished 17:31:03 EDT (15 minutes 21 seconds)
>>>   ROWS 876,464
>>>   TOMBSTONES   666,084
>>>
>>> Count: started 17:31:32, finished 17:47:16 (15mins, 44 seconds)
>>>   ROWS 1,451,665
>>>   TOMBSTONES   90,883
>>>
>>> Count: started 17:52:34, finished 18:10:28 (17mins, 53 seconds)
>>>   ROWS 1,425,644
>>>   TOMBSTONES   116,904
>>>
>>> On Tue, Apr 27, 2010 at 5:37 PM, Joost Ouwerkerk <joost@openplaces.org> wrote:
>>>> Clocks are in sync:
>>>>
>>>> cluster04:~/cassandra$ dsh -g development "date"
>>>> Tue Apr 27 17:36:33 EDT 2010
>>>> Tue Apr 27 17:36:33 EDT 2010
>>>> Tue Apr 27 17:36:33 EDT 2010
>>>> Tue Apr 27 17:36:33 EDT 2010
>>>> Tue Apr 27 17:36:34 EDT 2010
>>>> Tue Apr 27 17:36:34 EDT 2010
>>>> Tue Apr 27 17:36:34 EDT 2010
>>>> Tue Apr 27 17:36:34 EDT 2010
>>>> Tue Apr 27 17:36:34 EDT 2010
>>>> Tue Apr 27 17:36:35 EDT 2010
>>>> Tue Apr 27 17:36:35 EDT 2010
>>>> Tue Apr 27 17:36:35 EDT 2010
>>>>
>>>> On Tue, Apr 27, 2010 at 5:35 PM, Nathan McCall <nate@vervewireless.com> wrote:
>>>>> Have you confirmed that your clocks are all synced in the cluster?
>>>>> This may be the result of an unintentional read-repair occurring if
>>>>> that were the case.
>>>>>
>>>>> -Nate
>>>>>
>>>>> On Tue, Apr 27, 2010 at 2:20 PM, Joost Ouwerkerk <joost@openplaces.org> wrote:
>>>>>> Hmm... Even after deleting with cl.ALL, I'm getting data back for some
>>>>>> rows after having deleted them.  Which rows return data is
>>>>>> inconsistent from one run of the job to the next.
>>>>>>
>>>>>> On Tue, Apr 27, 2010 at 1:44 PM, Joost Ouwerkerk <joost@openplaces.org> wrote:
>>>>>>> To check that rows are gone, I check that KeySlice.columns is empty.  And as
>>>>>>> I mentioned, immediately after the delete job, this returns the expected
>>>>>>> number.
>>>>>>> Unfortunately I reproduced with QUORUM this morning.  No node outages.  I am
>>>>>>> going to try ALL to see if that changes anything, but I am starting to
>>>>>>> wonder if I'm doing something else wrong.
>>>>>>> On Mon, Apr 26, 2010 at 9:45 PM, Jonathan Ellis <jbellis@gmail.com> wrote:
>>>>>>>>
>>>>>>>> How are you checking that the rows are gone?
>>>>>>>>
>>>>>>>> Are you experiencing node outages during this?
>>>>>>>>
>>>>>>>> DC_QUORUM is unfinished code right now, you should avoid using it.
>>>>>>>> Can you reproduce with normal QUORUM?
>>>>>>>>
>>>>>>>> On Sat, Apr 24, 2010 at 12:23 PM, Joost Ouwerkerk <joost@openplaces.org>
>>>>>>>> wrote:
>>>>>>>> > I'm having trouble deleting rows in Cassandra.  After running a job that
>>>>>>>> > deletes hundreds of rows, I run another job that verifies that the rows
>>>>>>>> > are gone.  Both jobs run correctly.  However, when I run the verification
>>>>>>>> > job an hour later, the rows have re-appeared.  This is not a case of
>>>>>>>> > "ghosting" because the verification job actually checks that there is data
>>>>>>>> > in the columns.
>>>>>>>> >
>>>>>>>> > I am running a cluster with 12 nodes and a replication factor of 3.  I am
>>>>>>>> > using DC_QUORUM consistency when deleting.
>>>>>>>> >
>>>>>>>> > Any ideas?
>>>>>>>> > Joost.
>>>>>>>> >
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> --
>>>>>>>> Jonathan Ellis
>>>>>>>> Project Chair, Apache Cassandra
>>>>>>>> co-founder of Riptano, the source for professional Cassandra support
>>>>>>>> http://riptano.com
>>>>>>>
>>>>>>>
>>>>>>
>>>>>
>>>>
>>>
>>
>>
>>
>> --
>> Jonathan Ellis
>> Project Chair, Apache Cassandra
>> co-founder of Riptano, the source for professional Cassandra support
>> http://riptano.com
>>
>



-- 
Jonathan Ellis
Project Chair, Apache Cassandra
co-founder of Riptano, the source for professional Cassandra support
http://riptano.com
