cassandra-user mailing list archives

From "wxn002@zjqunshuo.com" <wxn...@zjqunshuo.com>
Subject Re: Large temporary files generated during cleaning up
Date Mon, 26 Jun 2017 05:27:14 GMT
Thanks for the reply. I tried "./nodetool cleanup -j 1". It's very useful because it reduces the amount of free space needed: only one SSTable is processed at a time.
 
From: Alain RODRIGUEZ
Date: 2017-06-20 17:03
To: user
Subject: Re: Large temporary files generated during cleaning up
Hi Simon,

I know for sure that cleanup (like compaction) needs to copy the entire SSTable (Data + Index), except for the part being evicted by the cleanup. As SSTables are immutable, to remove data, cleanup (like compaction) must copy the data we want to keep before deleting the old SSTable. Given this, it is understandable that your tmp files take almost as much space as the original ones. That's why you can read in many places that it is good to keep over 50% of the disk space free at any moment.
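As a quick sanity check of that rule of thumb, one could compare the free space on the data volume with the largest SSTable on disk. A rough sketch (the default data directory path and the GNU `find`/`df` flags are my assumptions, not something from this thread):

```shell
# Sketch: warn if free space is smaller than the biggest SSTable data file,
# since a cleanup/compaction of that table could need about that much room.
# Assumes GNU coreutils/findutils (df --output, find -printf).
headroom_check() {
    data_dir=$1
    # Size in bytes of the largest *-Data.db file under the data directory
    largest=$(find "$data_dir" -name '*-Data.db' -printf '%s\n' 2>/dev/null | sort -n | tail -1)
    # Free bytes on the filesystem holding the data directory
    free=$(df --output=avail -B1 "$data_dir" | tail -1)
    if [ "${free:-0}" -gt "${largest:-0}" ]; then
        echo "OK: free space exceeds the biggest SSTable"
    else
        echo "WARNING: free space is smaller than the biggest SSTable"
    fi
}

# Example (default Cassandra data directory is an assumption):
# headroom_check /var/lib/cassandra/data
```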

That being said:

- I am not sure why you have one tmp + one tmplink file for each SSTable (like 'tmplink-lb-59517-big-Data.db' + 'tmp-lb-59517-big-Data.db'). There must be a reason I am not aware of (I don't believe it's a bug; that would be quite gross). Maybe someone else knows about this?

- If the new SSTable size is almost equal (or equal) to the old SSTable size, it means there was not much data to clean up. Remember that cleanup only deletes data outside the token ranges owned by each node (primary + replicas). Such data only appears after adding nodes or otherwise changing the token ranges.

And do I have a choice to do the cleanup with less disk space?

I would say yes. Since you seem to be using C* 2.1+, you can make the cleanup process sequential instead of parallel (as it was before C* 2.1) by running:

'nodetool cleanup -j 1' 

More information: http://cassandra.apache.org/doc/latest/tools/nodetool/cleanup.html

It should reduce the amount of free space needed, as only one cleanup would run at any moment, meaning only one SSTable is processed at a time. In these conditions, the maximum amount of extra disk space used by the process is bounded by the size of the biggest existing SSTable.
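To put numbers on it, a back-of-envelope sketch using the two SSTable sizes from your listing (219G and 245G), under the assumption that each cleanup temporarily needs roughly the size of the SSTable it is rewriting:

```shell
# Back-of-envelope headroom math, sizes in GiB taken from the file listing.
sstable_a=219
sstable_b=245

# Two concurrent cleanups: both SSTables are being rewritten at once,
# so the worst case is roughly the sum of the two.
parallel_need=$((sstable_a + sstable_b))
echo "parallel cleanup needs ~${parallel_need}G free"

# Sequential cleanup (-j 1): only one SSTable is rewritten at a time,
# so the worst case is roughly the largest single SSTable.
max=$(( sstable_a > sstable_b ? sstable_a : sstable_b ))
echo "sequential cleanup needs ~${max}G free"
```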

The other thing I would explore is why Cassandra is maintaining both 'tmplink-lb-59517-big-Data.db' and 'tmp-lb-59517-big-Data.db'. I haven't run a cleanup for a while and I don't know your Cassandra version, which makes it hard to investigate properly.

C*heers,
-----------------------
Alain Rodriguez - @arodream - alain@thelastpickle.com
France

The Last Pickle - Apache Cassandra Consulting
http://www.thelastpickle.com

2017-06-20 6:51 GMT+01:00 wxn002@zjqunshuo.com <wxn002@zjqunshuo.com>:
Hi,
Cleanup is generating temporary files which occupy a lot of disk space. I noticed that for every source SSTable file it generates 4 temporary files, and two of them are almost as large as the source SSTable file. If two cleanup tasks run concurrently, I have to leave free disk space at least twice as large as the combined size of the two SSTables being cleaned up.
Is this expected? And do I have a choice to do the cleanup with less disk space?

Below are the temporary files generated during cleanup:
-rw-r--r-- 2 root root 798M Jun 20 13:34 tmplink-lb-59516-big-Index.db
-rw-r--r-- 2 root root 798M Jun 20 13:34 tmp-lb-59516-big-Index.db
-rw-r--r-- 2 root root 219G Jun 20 13:34 tmplink-lb-59516-big-Data.db
-rw-r--r-- 2 root root 219G Jun 20 13:34 tmp-lb-59516-big-Data.db
-rw-r--r-- 2 root root 978M Jun 20 13:33 tmplink-lb-59517-big-Index.db
-rw-r--r-- 2 root root 978M Jun 20 13:33 tmp-lb-59517-big-Index.db
-rw-r--r-- 2 root root 245G Jun 20 13:34 tmplink-lb-59517-big-Data.db
-rw-r--r-- 2 root root 245G Jun 20 13:34 tmp-lb-59517-big-Data.db

Cheers,
-Simon
