From "Michael Lee" <mail.list.steel.men...@gmail.com>
Subject Problem with bootstrapping when Cassandra is used on huge nodes
Date Tue, 23 Feb 2010 06:33:39 GMT
Hi, guys:

 

I have a 15-node cluster. Each node has 12 SATA disks of 1TB each, and I built software RAID5 over 11 of the disks to create one large data partition (md0):
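
For reference, here is roughly how the array was assembled (a sketch reconstructed from the mount and mdstat output below, not my exact command history; with RAID5, 10 of the 11 disks hold data, so usable capacity is about 10TB):

[root@xxxx ~]# mdadm --create /dev/md0 --level=5 --raid-devices=11 --chunk=64 /dev/sd[b-l]1
[root@xxxx ~]# mkfs.ext2 /dev/md0
[root@xxxx ~]# mount /dev/md0 /home/store0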

 

[root@xxxx ~]# mount

/dev/sda2 on / type ext2 (rw)

none on /proc type proc (rw)

none on /sys type sysfs (rw)

none on /dev/pts type devpts (rw,gid=5,mode=620)

usbfs on /proc/bus/usb type usbfs (rw)

/dev/sda3 on /home type ext2 (rw)

none on /proc/sys/fs/binfmt_misc type binfmt_misc (rw)

/dev/md0 on /home/store0 type ext2 (rw)

 

[root@xxxx ~]# cat /proc/mdstat 

Personalities : [raid5] 

md0 : active raid5 sdf1[4] sdb1[0] sdl1[10] sdk1[9] sdj1[8] sdi1[7] sdh1[6] sdg1[5] sde1[3] sdd1[2] sdc1[1]

      9767599360 blocks level 5, 64k chunk, algorithm 2 [11/11] [UUUUUUUUUUU]

      

unused devices: <none>

[root@xxx ~]# df

Filesystem           1K-blocks      Used Available Use% Mounted on

/dev/sda2              9076396   4153868   4922528  46% /

/dev/sda3            951353164    761568 902265668   1% /home

/dev/md0             9729280288 3131116128 6598164160  33% /home/store0

 

 

storage-conf.xml:

……

  <CommitLogDirectory>/home/store0/cassandra/commitlog</CommitLogDirectory>

  <DataFileDirectories>

      <DataFileDirectory>/home/store0/cassandra/data</DataFileDirectory>

  </DataFileDirectories>

  <CalloutLocation>/home/store0/cassandra/callouts</CalloutLocation>

  <StagingFileDirectory>/home/store0/cassandra/staging</StagingFileDirectory>

……

 

Problem is:

 

(1)     A cluster cannot be enlarged (no more nodes added) once it is already more than half full:

If every node holds more data than half of its capacity, the admin cannot bootstrap a new node into the cluster, because the old nodes must strip out the data belonging to the new node through anti-compaction, and that process creates a large tmp SSTable file (for streaming) which may be larger than the free disk space of one node. (See the back-of-envelope check after this list.)

(2)     No node can carry a load larger than 3TB:

Because the tmp SSTable file may then be larger than 2TB, and a file that big cannot be created on an ext2/3 filesystem. (A quick probe for this limit is also sketched below.)
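
To make (1) concrete, here is a back-of-envelope check against the df output above, assuming anti-compaction needs temporary space roughly equal to the data it has to rewrite (that is my reading, please correct me if I am wrong):

[root@xxxx ~]# used_kb=3131116128    # df "Used" on /dev/md0, about 3TB
[root@xxxx ~]# avail_kb=6598164160   # df "Available" on /dev/md0, about 6.2TB
[root@xxxx ~]# [ "$used_kb" -le "$avail_kb" ] && echo "tmp copy fits" || echo "past half full: bootstrap fails"
tmp copy fits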
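
And for (2), a quick way to probe the file-size ceiling on the data partition: write a single byte at a 3TB offset into a sparse file (assuming GNU dd, which accepts the T size suffix). On ext2/3 with 4KB blocks, files are capped at 2TB, so the write should fail with "File too large":

[root@xxxx ~]# dd if=/dev/zero of=/home/store0/probe bs=1 count=1 seek=3T
[root@xxxx ~]# rm -f /home/store0/probe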

 

My questions are:

(1)     Is Cassandra designed to waste half of its capacity?

(2)     How should a node with twelve 1TB disks be used?

 

---------END----------

 

