cassandra-user mailing list archives

From Radim Kolar <...@filez.com>
Subject Estimations of memtable size are wrong
Date Fri, 23 Mar 2012 08:44:58 GMT
I wonder why the memtable estimations are so bad.

1. Is it not possible to run them more often? There should be some lower 
bound - run the live/serialized calculation at least once per hour. It 
takes only a few seconds.
2. Why not use the data from the FlushWriter to update the estimations? 
The flusher knows the number of ops and the serialized size after the 
SSTable is written to disk, and these values should be used to update the 
memtable live/serialized ratio (see the sketch below).
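
A minimal sketch of what I mean in point 2, not Cassandra's actual code - 
the class and method names here are hypothetical. After each flush the real 
op count and on-disk serialized size are known, so fold them back into the 
figures used for the next estimate:

/**
 * Hypothetical sketch: refine the per-op serialized size from what the
 * flush actually measured, instead of relying only on periodic estimation.
 */
public class MemtableSizeEstimator {

    private volatile double serializedBytesPerOp = 64.0; // starting guess
    private volatile double liveRatio = 10.0;            // live bytes per serialized byte

    /** Estimate live (in-heap) size for a memtable that has accepted `ops` operations. */
    public long estimateLiveSize(long ops) {
        return (long) (ops * serializedBytesPerOp * liveRatio);
    }

    /** Called after the SSTable is on disk, when the real size and op count are known. */
    public void onFlushCompleted(long ops, long writtenSerializedBytes) {
        if (ops <= 0 || writtenSerializedBytes <= 0) {
            return; // nothing useful to learn from an empty flush
        }
        double measured = (double) writtenSerializedBytes / ops;
        // Blend rather than replace, so one atypical flush (e.g. heavy
        // overwrites of the same keys) does not swing the estimate wildly.
        serializedBytesPerOp = 0.75 * serializedBytesPerOp + 0.25 * measured;
    }
}

For reference, the log output below shows how far off the current estimate is: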

  INFO [OptionalTasks:1] 2012-03-23 09:33:51,765 MeteredFlusher.java (line 62) flushing high-traffic column family CFS(Keyspace='whois', ColumnFamily='ipbans') (estimated 105363280 bytes)
  INFO [OptionalTasks:1] 2012-03-23 09:33:51,796 ColumnFamilyStore.java (line 704) Enqueuing flush of Memtable-ipbans@481336682(1317041/105363280 serialized/live bytes, 16755 ops)
  ** Note that the live/serialized sizes here are ESTIMATES!! **
  INFO [FlushWriter:314] 2012-03-23 09:33:51,796 Memtable.java (line 246) Writing Memtable-ipbans@481336682(1317041/105363280 serialized/live bytes, 16755 ops)
  INFO [FlushWriter:314] 2012-03-23 09:33:51,799 Memtable.java (line 283) Completed flushing /var/lib/cassandra/data/whois/ipbans-hc-16775-Data.db (1355 bytes)
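
To put numbers on it (a quick calculation from the log lines above, nothing 
Cassandra computes itself; the class name is made up):

public class EstimateGap {
    public static void main(String[] args) {
        long estimatedSerialized = 1_317_041L;  // from the "Enqueuing flush" line
        long estimatedLive = 105_363_280L;      // from the MeteredFlusher line
        long actualOnDisk = 1_355L;             // from the "Completed flushing" line

        System.out.printf("serialized estimate: %.0fx the flushed size%n",
                (double) estimatedSerialized / actualOnDisk);  // prints 972
        System.out.printf("live estimate:       %.0fx the flushed size%n",
                (double) estimatedLive / actualOnDisk);        // prints 77759
    }
}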

