cassandra-user mailing list archives

From Laxmikant Upadhyay <laxmikant....@gmail.com>
Subject Re: Cassandra2.0.14 : Obsolete files not being deleted after compaction
Date Fri, 07 Feb 2020 05:23:15 GMT
Hi,
Just an update: we deleted the obsolete sstables and it worked fine. However, I
have not been able to find any JIRA ticket for this issue.
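
For anyone repeating this, the check Jeff describes in the quoted thread below
(a "Compacting" line is only obsoleted by a later "Compacted" line from the same
thread, with no restart in between) can be sketched roughly as follows. This is
a minimal sketch, not a tested tool: the patterns are modelled only on the
2.0.x log lines quoted in this thread, it assumes each log entry sits on one
physical line, and `restart_marker` is a hypothetical, caller-supplied string
for whatever identifies a node startup line in your log.

```python
import re

# Patterns modelled on the 2.0.x log lines quoted in this thread; other
# versions may word these lines differently.
COMPACTING = re.compile(r"\[(CompactionExecutor:\d+)\].*?Compacting \[(.*)\]")
COMPACTED = re.compile(
    r"\[(CompactionExecutor:\d+)\].*?Compacted \d+ sstables to \[(.*?)\]"
)
SSTABLE_PATH = re.compile(r"path='([^']+)'")


def obsolete_inputs(log_lines, restart_marker=None):
    """Return (output, inputs) pairs for compactions that finished.

    restart_marker is a hypothetical, caller-supplied substring that
    identifies a node startup line; seeing it discards any compaction
    that had started but not yet finished.
    """
    pending = {}   # thread name -> input sstable paths
    finished = []  # (output sstable, [input sstable paths])
    for line in log_lines:
        if restart_marker and restart_marker in line:
            pending.clear()
            continue
        started = COMPACTING.search(line)
        if started:
            pending[started.group(1)] = SSTABLE_PATH.findall(started.group(2))
            continue
        done = COMPACTED.search(line)
        if done and done.group(1) in pending:
            output = done.group(2).strip().rstrip(",")
            finished.append((output, pending.pop(done.group(1))))
    return finished
```

Feeding it the lines of system.log gives, for each completed compaction, the
output sstable and the input files that should no longer be needed. Treat the
result as a list to review by hand, not as a list to delete blindly.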

On Wed, Jan 22, 2020 at 3:58 PM manish khandelwal <
manishkhandelwal03@gmail.com> wrote:

> Thanks Jeff.
>
> There was no restart between the "Compacting" and "Compacted" log lines, but I
> observed that a full repair (-pr) was running at that time with errors.
>
> *Caused by: java.lang.RuntimeException: java.io.IOException: Cannot
> proceed on repair because a neighbor (/aa.bb.cc.dd) is dead: session failed*
>
> Does anyone remember any JIRA ticket related to obsolete sstables not
> being deleted after compaction?
>
> Regards
> Manish
>
>
>
>
>
> On Wed, Jan 22, 2020 at 11:37 AM Jeff Jirsa <jjirsa@gmail.com> wrote:
>
>>
>>
>> On Tue, Jan 21, 2020 at 8:58 PM manish khandelwal <
>> manishkhandelwal03@gmail.com> wrote:
>>
>>> Thanks for your reply, Nitan.
>>>
>>> I am using the following method to find obsolete sstables, and I want to
>>> make sure that I don't delete live data when I remove them.
>>>
>>> In the following logs I searched for sstable
>>> "keyspace-columnfamily-jb-456789" and found that the
>>> "CompactionExecutor:1957" thread compacted
>>> keyspace-columnfamily-jb-123456-Data.db,
>>> keyspace-columnfamily-jb-234567-Data.db, and
>>> keyspace-columnfamily-jb-345678-Data.db. These files are still present in
>>> my data directory, so I am assuming that they are obsolete. Is my
>>> assumption correct?
>>>
>>
>> The lines from 'Compacting' are the ones obsoleted IF and ONLY IF you see
>> a completed "Compacted" line for the same thread without a restart in
>> between.
>>
>>
>>>
>>> INFO [CompactionExecutor:1957] 2020-01-20 06:44:56,721
>>> CompactionTask.java (line 120) Compacting
>>> [SSTableReader(path='/var/lib/cassandra/data/keyspace/columnfamily/
>>> *keyspace-columnfamily-jb-123456-Data.db*'),
>>> SSTableReader(path='/var/lib/cassandra/data/keyspace/columnfamily/
>>> *keyspace-columnfamily-jb-234567-Data.db*'),
>>> SSTableReader(path='/var/lib/cassandra/data/keyspace/columnfamily/
>>> *keyspace-columnfamily-jb-345678-Data.db*')]
>>>  INFO [CompactionExecutor:1957] 2020-01-20 12:45:23,270
>>> ColumnFamilyStore.java (line 795) Enqueuing flush of
>>> Memtable-compactions_in_progress@519967741(0/0 serialized/live bytes, 1
>>> ops)
>>>  INFO [*CompactionExecutor:1957*] 2020-01-20 12:45:23,502
>>> CompactionTask.java (line 296) Compacted 3 sstables to
>>> [/var/lib/cassandra/data/keyspace/columnfamily/
>>> *keyspace-columnfamily-jb-456789*,].  136,795,757,524 bytes to
>>> 100,529,812,389 (~73% of original) in 21,626,781ms = 4.433055MB/s.
>>>  1,738,999,743 total partitions merged to 1,274,232,528.  Partition merge
>>> counts were {1:1049583261, 2:309997005, 3:23140824, }
>>>
>>>
>> In this case,
>> /var/lib/cassandra/data/keyspace/columnfamily/keyspace-columnfamily-jb-123456-*
>> , /var/lib/cassandra/data/keyspace/columnfamily/keyspace-columnfamily-jb-234567-*,
>> and /var/lib/cassandra/data/keyspace/columnfamily/keyspace-columnfamily-jb-345678-*
>> are all obsolete and should be gc'd "soon". If they're not being gc'd,
>> there's something wrong and you should figure out what's going on. The
>> cases where this happened in 2.0.x (which is what you're running) were
>> usually pretty nasty bugs, so consider this a reason to upgrade.
>>
>> Note that if you just `rm` those files, you'll probably throw
>> FileNotFound exceptions and break the node until you restart, which is bad.
>> You'd have to stop the host, confirm everything is shut down, then remove
>> that 137GB worth of input files if they still exist.
>>
>> Also, please upgrade to 2.1.20. Your life will probably be much easier
>> because of it.
>>
>> As with all things, these are personal opinions. I can't guarantee they're
>> safe. Manually mucking around with database data files is scary, so make
>> sure you have a backup, practice in a lab, etc.
>>
>>
>>> Regards
>>> Manish
>>>
>>>
>>> On Tue, Jan 21, 2020 at 9:09 PM Nitan Kainth <nitankainth@gmail.com>
>>> wrote:
>>>
>>>> If you are certain that you don’t need the data, your plan is good. Make
>>>> sure to delete all the files for any given sequence number, i.e. data,
>>>> index, TOC, etc.
>>>>
>>>> Regards,
>>>>
>>>> Nitan
>>>>
>>>> Cell: 510 449 9629
>>>>
>>>> On Jan 21, 2020, at 5:36 AM, manish khandelwal <
>>>> manishkhandelwal03@gmail.com> wrote:
>>>>
>>>> 
>>>> Hi Team
>>>>
>>>> I am seeing some obsolete files in Cassandra 2.0.14 which have
>>>> already been compacted but were not removed from disk after compaction.
>>>> As per CASSANDRA-7872
>>>> <https://issues.apache.org/jira/browse/CASSANDRA-7872>, after the GC
>>>> grace period has passed the sstables are open for reads again and can lead
>>>> to data resurrection. I am also facing a disk crunch (90% full) and need
>>>> to remove those obsolete files ASAP.
>>>>
>>>>
>>>> What should our strategy be to avoid this? I am thinking along the
>>>> following lines:
>>>> 1. Stop the Cassandra server.
>>>> 2. Remove the obsolete files manually.
>>>> 3. Start the Cassandra server.
>>>>
>>>> Regards
>>>> Manish
>>>>
>>>>
>>>>
>>>>
>>>>

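Nitan's point in the thread above, that every component file for a given
sequence number has to go together, can be sketched as a dry-run listing. This
is only a sketch under assumptions: the `keyspace-table-jb-generation-Component`
naming is taken from the file names quoted in this thread, and the function
name and parameters are hypothetical. It lists files and deletes nothing.

```python
from pathlib import Path


def component_files(data_dir, keyspace, table, generation, version="jb"):
    """List every component file of one sstable generation (dry run only).

    The naming scheme is assumed from the file names quoted in this
    thread, e.g. keyspace-columnfamily-jb-123456-Data.db.
    """
    # The trailing "-" stops generation 123456 from matching 1234567.
    prefix = f"{keyspace}-{table}-{version}-{generation}-"
    return sorted(
        p for p in Path(data_dir).iterdir()
        if p.is_file() and p.name.startswith(prefix)
    )
```

Calling it with the table's data directory, "keyspace", "columnfamily" and
123456 would return the paths of all files for that generation, which can then
be reviewed while the node is stopped, as discussed above.
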
-- 

regards,
Laxmikant Upadhyay
