Subject: Re: Cassandra 2.0.7 keeps reporting errors due to no space left on device
From: Yatong Zhang
To: user@cassandra.apache.org
Date: Sun, 4 May 2014 17:10:56 +0800

I am using the latest 2.0.7.
'nodetool tpstats' shows the following:

[root@storage5 bin]# ./nodetool tpstats
> Pool Name                  Active   Pending   Completed   Blocked   All time blocked
> ReadStage                       0         0      628220         0                  0
> RequestResponseStage            0         0     3342234         0                  0
> MutationStage                   0         0     3172116         0                  0
> ReadRepairStage                 0         0       47666         0                  0
> ReplicateOnWriteStage           0         0           0         0                  0
> GossipStage                     0         0      756024         0                  0
> AntiEntropyStage                0         0           0         0                  0
> MigrationStage                  0         0           0         0                  0
> MemoryMeter                     0         0        6652         0                  0
> MemtablePostFlusher             0         0        7042         0                  0
> FlushWriter                     0         0        4023         0                  0
> MiscStage                       0         0           0         0                  0
> PendingRangeCalculator          0         0          27         0                  0
> commitlog_archiver              0         0           0         0                  0
> InternalResponseStage           0         0           0         0                  0
> HintedHandoff                   0         0          28         0                  0
>
> Message type        Dropped
> RANGE_SLICE               0
> READ_REPAIR               0
> PAGED_RANGE               0
> BINARY                    0
> READ                      0
> MUTATION                  0
> _TRACE                    0
> REQUEST_RESPONSE          0
> COUNTER_MUTATION          0

And here is another type of error; these errors seem to occur after the disk is full:

> ERROR [SSTableBatchOpen:2] 2014-04-30 13:47:48,348 CassandraDaemon.java (line 198) Exception in thread Thread[SSTableBatchOpen:2,5,main]
> org.apache.cassandra.io.sstable.CorruptSSTableException: java.io.EOFException
>         at org.apache.cassandra.io.compress.CompressionMetadata.<init>(CompressionMetadata.java:110)
>         at org.apache.cassandra.io.compress.CompressionMetadata.create(CompressionMetadata.java:64)
>         at org.apache.cassandra.io.util.CompressedPoolingSegmentedFile$Builder.complete(CompressedPoolingSegmentedFile.java:42)
>         at org.apache.cassandra.io.sstable.SSTableReader.load(SSTableReader.java:458)
>         at org.apache.cassandra.io.sstable.SSTableReader.load(SSTableReader.java:422)
>         at org.apache.cassandra.io.sstable.SSTableReader.open(SSTableReader.java:203)
>         at org.apache.cassandra.io.sstable.SSTableReader.open(SSTableReader.java:184)
>         at org.apache.cassandra.io.sstable.SSTableReader$1.run(SSTableReader.java:264)
>         at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
>         at java.util.concurrent.FutureTask.run(FutureTask.java:262)
>         at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>         at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>         at java.lang.Thread.run(Thread.java:744)
> Caused by: java.io.EOFException
>         at java.io.DataInputStream.readUnsignedShort(DataInputStream.java:340)
>         at java.io.DataInputStream.readUTF(DataInputStream.java:589)
>         at java.io.DataInputStream.readUTF(DataInputStream.java:564)
>         at org.apache.cassandra.io.compress.CompressionMetadata.<init>(CompressionMetadata.java:85)
>         ... 12 more
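A quick way to check whether compactions are backing up over time, or are failing and leaving temporary SSTables behind (a shell sketch; the /dataN/cass paths are the ones mentioned in this thread and the log location is only a packaged-install default, so adjust both as needed):

# Pool backlog at this instant (watch the Pending/Blocked columns).
nodetool tpstats

# Outstanding compactions and what is currently being compacted.
nodetool compactionstats

# Leftover temporary SSTables from failed or interrupted compactions, largest first.
find /data{1..6}/cass -name '*-tmp-*Data.db' -exec ls -lhS {} + 2>/dev/null | head

# Recent compaction or disk-space errors in the system log.
grep -iE 'compactiontask|fswriteerror|no space left' /var/log/cassandra/system.log | tail -n 20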
On Sun, May 4, 2014 at 4:59 PM, DuyHai Doan wrote:

> The symptoms look like pending compactions are stacking up, or compactions are failing, so temporary files (-tmp-Data.db) are not properly cleaned up.
>
> What is your Cassandra version? Can you do a "nodetool tpstats" and look into the Cassandra logs to see whether there are issues with compactions?
>
> I've found one discussion thread that has the same symptoms:
> http://comments.gmane.org/gmane.comp.db.cassandra.user/22089
>
> On Sun, May 4, 2014 at 10:39 AM, Yatong Zhang wrote:
>
>> Yes, after a while the disk fills up again. So I changed the compaction strategy from 'size tiered' to 'leveled' to reduce the disk usage during compaction, but the problem still occurs.
>>
>> This table gets lots of writes, relatively very few reads, and no updates. Here is the schema of the table:
>>
>> CREATE TABLE mydb.images (
>>   image_id uuid PRIMARY KEY,
>>   available boolean,
>>   message text,
>>   raw_data blob,
>>   time_created timestamp,
>>   url text
>> ) WITH
>>   bloom_filter_fp_chance=0.010000 AND
>>   caching='KEYS_ONLY' AND
>>   comment='' AND
>>   dclocal_read_repair_chance=0.000000 AND
>>   gc_grace_seconds=864000 AND
>>   read_repair_chance=0.100000 AND
>>   replicate_on_write='true' AND
>>   populate_io_cache_on_flush='false' AND
>>   compaction={'sstable_size_in_mb': '192', 'class': 'LeveledCompactionStrategy'} AND
>>   compression={'sstable_compression': 'LZ4Compressor'};
>>
>> On Sun, May 4, 2014 at 4:31 PM, DuyHai Doan wrote:
>>
>>> And after a while the /data6 drive fills up again, right?
>>>
>>> One question: can you please give the CQL3 definition of your "mydb-images-tmp" table?
>>>
>>> What is the access pattern for this table? Lots of writes? Lots of updates?
>>>
>>> On Sun, May 4, 2014 at 10:00 AM, Yatong Zhang wrote:
>>>
>>>> After restarting or running 'cleanup' the big tmp file is gone and everything looks fine:
>>>>
>>>>> -rw-r--r-- 1 root root  19K Apr 30 13:58 mydb_oe-images-tmp-jb-96242-CompressionInfo.db
>>>>> -rw-r--r-- 1 root root 145M Apr 30 13:58 mydb_oe-images-tmp-jb-96242-Data.db
>>>>> -rw-r--r-- 1 root root  64K Apr 30 13:58 mydb_oe-images-tmp-jb-96242-Index.db
>>>>
>>>> [root@node5 images]# df -hl
>>>> Filesystem      Size  Used Avail Use% Mounted on
>>>> /dev/sda1        49G  7.5G   39G  17% /
>>>> tmpfs           7.8G     0  7.8G   0% /dev/shm
>>>> /dev/sda3       3.6T  1.3T  2.1T  38% /data1
>>>> /dev/sdb1       3.6T  1.4T  2.1T  39% /data2
>>>> /dev/sdc1       3.6T  466G  3.0T  14% /data3
>>>> /dev/sdd1       3.6T  1.3T  2.2T  38% /data4
>>>> /dev/sde1       3.6T  1.3T  2.2T  38% /data5
>>>> /dev/sdf1       3.6T  662M  3.4T   1% /data6
>>>>
>>>> I didn't perform repair, not even once.
>>>>
>>>> On Sun, May 4, 2014 at 2:37 PM, DuyHai Doan wrote:
>>>>
>>>>> Hello Yatong
>>>>>
>>>>> "If I restart the node or using 'cleanup', it will resume to normal." --> what does df -hl show for /data6 when you restart or clean up the node?
>>>>>
>>>>> By the way, a single SSTable of 3.6 TB is kind of huge. Do you perform manual repair frequently?
>>>>>
>>>>> On Sun, May 4, 2014 at 1:51 AM, Yatong Zhang wrote:
>>>>>
>>>>>> My Cassandra cluster has plenty of free space; for now only about 30% of the space is used.
>>>>>>
>>>>>> On Sun, May 4, 2014 at 6:36 AM, Yatong Zhang wrote:
>>>>>>
>>>>>>> Hi there,
>>>>>>>
>>>>>>> It was strange that the 'xxx-tmp-xxx.db' file kept increasing until Cassandra threw exceptions with 'No space left on device'. I am using CQL 3 to create a table that stores about 200K ~ 500K of data per record. I have 6 hard disks per node and Cassandra is configured with 6 data directories (ext4 file systems, CentOS 6.5):
>>>>>>>
>>>>>>>> data_file_directories:
>>>>>>>>     - /data1/cass
>>>>>>>>     - /data2/cass
>>>>>>>>     - /data3/cass
>>>>>>>>     - /data4/cass
>>>>>>>>     - /data5/cass
>>>>>>>>     - /data6/cass
>>>>>>>
>>>>>>> And every directory is on a standalone disk.
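A small shell sketch for keeping an eye on per-directory usage in a JBOD layout like the one above (the /dataN/cass paths are the ones from this thread; the 80% threshold is an arbitrary example value):

#!/usr/bin/env bash
# Warn when any data directory crosses a usage threshold and list the
# largest in-progress/leftover temporary SSTables in that directory.
THRESHOLD=80   # percent used, example value

for dir in /data{1..6}/cass; do
    used=$(df -P "$dir" | awk 'NR==2 { sub(/%/, "", $5); print $5 }')
    echo "== $dir: ${used}% used"
    [ "$used" -ge "$THRESHOLD" ] && echo "   WARNING: above ${THRESHOLD}%"
    # Largest -tmp- data files (in-progress or leftover compaction output).
    find "$dir" -name '*-tmp-*Data.db' -printf '%s %p\n' 2>/dev/null \
        | sort -rn | head -3 \
        | awk '{ printf "   %8.1f GB  %s\n", $1/1024/1024/1024, $2 }'
done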
>>>>>>> But here is what I found when the error occurred:
>>>>>>>
>>>>>>> [root@node5 images]# ll -hl
>>>>>>>> total 3.6T
>>>>>>>> drwxr-xr-x 4 root root 4.0K Jan 20 09:44 snapshots
>>>>>>>> -rw-r--r-- 1 root root 456M Apr 30 13:42 mydb-images-tmp-jb-91068-CompressionInfo.db
>>>>>>>> -rw-r--r-- 1 root root 3.5T Apr 30 13:42 mydb-images-tmp-jb-91068-Data.db
>>>>>>>> -rw-r--r-- 1 root root    0 Apr 30 13:42 mydb-images-tmp-jb-91068-Filter.db
>>>>>>>> -rw-r--r-- 1 root root 2.0G Apr 30 13:42 mydb-images-tmp-jb-91068-Index.db
>>>>>>>
>>>>>>> [root@node5 images]# df -hl
>>>>>>> Filesystem      Size  Used Avail Use% Mounted on
>>>>>>> /dev/sda1        49G  7.5G   39G  17% /
>>>>>>> tmpfs           7.8G     0  7.8G   0% /dev/shm
>>>>>>> /dev/sda3       3.6T  1.3T  2.1T  38% /data1
>>>>>>> /dev/sdb1       3.6T  1.4T  2.1T  39% /data2
>>>>>>> /dev/sdc1       3.6T  466G  3.0T  14% /data3
>>>>>>> /dev/sdd1       3.6T  1.3T  2.2T  38% /data4
>>>>>>> /dev/sde1       3.6T  1.3T  2.2T  38% /data5
>>>>>>> /dev/sdf1       3.6T  3.6T     0 100% /data6
>>>>>>>
>>>>>>> mydb-images-tmp-jb-91068-Data.db occupied almost all the disk space (a 4 TB hard disk with about 3.6 TB usable), and the error looks like:
>>>>>>>
>>>>>>>> INFO [FlushWriter:4174] 2014-05-04 05:15:15,744 Memtable.java (line 403) Completed flushing /data3/cass/system/compactions_in_progress/system-compactions_in_progress-jb-16942-Data.db (42 bytes) for commitlog position ReplayPosition(segmentId=1398900356204, position=25024609)
>>>>>>>> INFO [CompactionExecutor:3689] 2014-05-04 05:15:15,745 CompactionTask.java (line 115) Compacting [SSTableReader(path='/data3/cass/system/compactions_in_progress/system-compactions_in_progress-jb-16940-Data.db'), SSTableReader(path='/data3/cass/system/compactions_in_progress/system-compactions_in_progress-jb-16942-Data.db'), SSTableReader(path='/data3/cass/system/compactions_in_progress/system-compactions_in_progress-jb-16941-Data.db'), SSTableReader(path='/data3/cass/system/compactions_in_progress/system-compactions_in_progress-jb-16939-Data.db')]
>>>>>>>> ERROR [CompactionExecutor:1245] 2014-05-04 05:15:15,745 CassandraDaemon.java (line 198) Exception in thread Thread[CompactionExecutor:1245,1,main]
>>>>>>>> FSWriteError in /data2/cass/mydb/images/mydb-images-tmp-jb-92181-Filter.db
>>>>>>>>         at org.apache.cassandra.io.sstable.SSTableWriter$IndexWriter.close(SSTableWriter.java:475)
>>>>>>>>         at org.apache.cassandra.io.util.FileUtils.closeQuietly(FileUtils.java:212)
>>>>>>>>         at org.apache.cassandra.io.sstable.SSTableWriter.abort(SSTableWriter.java:301)
>>>>>>>>         at org.apache.cassandra.db.compaction.CompactionTask.runWith(CompactionTask.java:209)
>>>>>>>>         at org.apache.cassandra.io.util.DiskAwareRunnable.runMayThrow(DiskAwareRunnable.java:48)
>>>>>>>>         at org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:28)
>>>>>>>>         at org.apache.cassandra.db.compaction.CompactionTask.executeInternal(CompactionTask.java:60)
>>>>>>>>         at org.apache.cassandra.db.compaction.AbstractCompactionTask.execute(AbstractCompactionTask.java:59)
>>>>>>>>         at org.apache.cassandra.db.compaction.CompactionManager$BackgroundCompactionTask.run(CompactionManager.java:197)
>>>>>>>>         at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
>>>>>>>>         at java.util.concurrent.FutureTask.run(FutureTask.java:262)
>>>>>>>>         at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>>>>>>>>         at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>>>>>>>>         at java.lang.Thread.run(Thread.java:744)
>>>>>>>> Caused by: java.io.IOException: No space left on device
>>>>>>>>         at java.io.FileOutputStream.write(Native Method)
>>>>>>>>         at java.io.FileOutputStream.write(FileOutputStream.java:295)
>>>>>>>>         at java.io.DataOutputStream.writeInt(DataOutputStream.java:197)
>>>>>>>>         at org.apache.cassandra.utils.BloomFilterSerializer.serialize(BloomFilterSerializer.java:34)
>>>>>>>>         at org.apache.cassandra.utils.Murmur3BloomFilter$Murmur3BloomFilterSerializer.serialize(Murmur3BloomFilter.java:44)
>>>>>>>>         at org.apache.cassandra.utils.FilterFactory.serialize(FilterFactory.java:41)
>>>>>>>>         at org.apache.cassandra.io.sstable.SSTableWriter$IndexWriter.close(SSTableWriter.java:468)
>>>>>>>>         ... 13 more
>>>>>>>> ERROR [CompactionExecutor:1245] 2014-05-04 05:15:15,800 StorageService.java (line 367) Stopping gossiper
>>>>>>>>  WARN [CompactionExecutor:1245] 2014-05-04 05:15:15,800 StorageService.java (line 281) Stopping gossip by operator request
>>>>>>>>  INFO [CompactionExecutor:1245] 2014-05-04 05:15:15,800 Gossiper.java (line 1271) Announcing shutdown
>>>>>>>
>>>>>>> I have changed my table to "LeveledCompactionStrategy" to reduce the disk space needed during compaction, with:
>>>>>>>
>>>>>>>> ALTER TABLE images WITH compaction = { 'class' : 'LeveledCompactionStrategy', 'sstable_size_in_mb' : '192' };
>>>>>>>
>>>>>>> But the problem still exists: the file keeps increasing, and after about 2 or 3 days Cassandra will fail with the 'No space left on device' error. If I restart the node or run 'cleanup', it returns to normal.
>>>>>>>
>>>>>>> I don't know whether it is caused by my configuration or it is just a bug, so would anyone please help to solve this issue?
>>>>>>>
>>>>>>> Thanks
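The restart/'cleanup' workaround described above essentially removes the orphaned -tmp- SSTables that a failed or aborted compaction leaves behind; those files are only written by in-progress compactions and flushes, so removing them while the node is down is generally considered safe (verify this for your exact version before doing so). A rough shell sketch, assuming the data directory layout from this thread and an init script named 'cassandra':

#!/usr/bin/env bash
# Sketch of the manual recovery: stop the node, remove leftover
# temporary compaction output, start the node again.
set -e

service cassandra stop

# Show how much space the leftover -tmp- files occupy before deleting.
find /data{1..6}/cass -name '*-tmp-*' -exec du -ch {} + | tail -n 1

# Remove the leftover temporary SSTable components.
find /data{1..6}/cass -name '*-tmp-*' -delete

service cassandra start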