cassandra-user mailing list archives

From Jan <cne...@yahoo.com>
Subject Re: Controlling the MAX SIZE of sstables after compaction
Date Mon, 26 Jan 2015 19:20:10 GMT
Parth et al.,
The folks at Netflix seem to have built a solution for your problem. See the Netflix Tech Blog post
"Aegisthus - A Bulk Data Pipeline out of Cassandra" by Charles Smith and Jeff Magnusson
(techblog.netflix.com).


You may want to contact Jeff Magnusson and check whether the solution is open sourced. Please report
back to this forum if you get an answer to the problem.
Hope this helps.
Jan
C* Architect

     On Monday, January 26, 2015 11:25 AM, Robert Coli <rcoli@eventbrite.com> wrote:
   

 On Sun, Jan 25, 2015 at 10:40 PM, Parth Setya <setya.parth@gmail.com> wrote:

1. Is there a way to configure the size of sstables created after compaction?


No, that request was closed as "Won't Fix": https://issues.apache.org/jira/browse/CASSANDRA-4897.
You could use the "sstablesplit" utility on your One Big SSTable to split it into files of
your preferred size. 
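
To illustrate the sstablesplit suggestion, here is a minimal sketch in Python that only builds the command line and does not run it. It assumes the tool's `--size` option (maximum output size in MB) and `--no-snapshot` flag, so check `sstablesplit --help` on your version first. The helper name and the file path are hypothetical. As far as I know, the node has to be stopped before the tool is run.

```python
def build_sstablesplit_command(data_files, max_size_mb=50, take_snapshot=True):
    """Return the argv list for an sstablesplit run without executing it.

    Assumption: the installed sstablesplit accepts --size (in MB) and
    --no-snapshot; verify with `sstablesplit --help` before relying on this.
    """
    if max_size_mb <= 0:
        raise ValueError("max_size_mb must be positive")
    if not data_files:
        raise ValueError("at least one SSTable Data.db file is required")

    command = ["sstablesplit", "--size", str(max_size_mb)]
    if not take_snapshot:
        # Skip the pre-split snapshot of the input SSTables.
        command.append("--no-snapshot")
    command.extend(str(f) for f in data_files)
    return command


if __name__ == "__main__":
    # Hypothetical path to the one big SSTable left by the major compaction.
    big_sstable = "/var/lib/cassandra/data/ks/cf/ks-cf-jb-1-Data.db"
    print(" ".join(build_sstablesplit_command([big_sstable], max_size_mb=100)))
```

The printed command can be reviewed and then run by hand on the stopped node, or passed to `subprocess.run` once it looks right.
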
2. Is there a better approach to generate the report?


The major compaction isn't too bad, but something that understands SSTables as an input format
would be preferable to sstable2json. 
3. What are the flaws with this approach?

sstable2json is slow and transforms your data to JSON.
=Rob

   