cassandra-commits mailing list archives

From "sankalp kohli (JIRA)" <j...@apache.org>
Subject [jira] [Updated] (CASSANDRA-10643) Implement compaction for a specific token range
Date Thu, 28 Jul 2016 22:24:20 GMT

     [ https://issues.apache.org/jira/browse/CASSANDRA-10643?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

sankalp kohli updated CASSANDRA-10643:
--------------------------------------
       Reviewer:   (was: Jason Brown)
    Description: 
We see repeated cases in production (using LCS) where a small number of users generate a large
number of repeated updates or tombstones. Reading such users' data pulls large amounts of data
into the Java process. Apart from the read itself being slow for those users, the excessive
GC affects other users as well.

Our solution so far has been to move from LCS to STCS and back. This takes a long time and is
overkill if the number of outliers is small. For such cases, we can implement point compaction
of a token range: make nodetool compact take a starting and ending token and compact all the
SSTables that fall within that range. We can refuse to compact if the number of SSTables
exceeds a max_limit.

Example: 
nodetool -st 3948291562518219268 -et 3948291562518219269 compact keyspace table
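
The selection step described above (take only SSTables whose token span intersects the requested range, and refuse if too many match) could be sketched as follows. This is an illustrative sketch only, not Cassandra's actual API: the class names, the inclusive-bound interval test, and the maxLimit refusal are assumptions for demonstration.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch of the proposed selection logic: pick only the
// SSTables whose token span intersects [startToken, endToken], and
// refuse to compact if more than maxLimit of them match. SSTableSpan,
// selectForCompaction, and maxLimit are illustrative names, not
// Cassandra's real classes or parameters.
public class TokenRangeCompactionSketch {
    // Minimal stand-in for an SSTable's first/last token metadata.
    static final class SSTableSpan {
        final long first, last;  // inclusive token bounds covered by the SSTable
        SSTableSpan(long first, long last) { this.first = first; this.last = last; }
    }

    // Return the SSTables overlapping [startToken, endToken], or null
    // (meaning "refuse to compact") if more than maxLimit overlap.
    static List<SSTableSpan> selectForCompaction(List<SSTableSpan> sstables,
                                                 long startToken, long endToken,
                                                 int maxLimit) {
        List<SSTableSpan> selected = new ArrayList<>();
        for (SSTableSpan s : sstables) {
            // Two closed intervals overlap iff each starts before the other ends.
            if (s.first <= endToken && s.last >= startToken)
                selected.add(s);
        }
        return selected.size() > maxLimit ? null : selected;
    }

    public static void main(String[] args) {
        List<SSTableSpan> tables = new ArrayList<>();
        tables.add(new SSTableSpan(0, 100));
        tables.add(new SSTableSpan(150, 300));
        tables.add(new SSTableSpan(250, 400));

        // Only the first SSTable intersects [50, 120].
        System.out.println(selectForCompaction(tables, 50, 120, 10).size());

        // All three intersect [90, 260]; with maxLimit 2 we refuse (null).
        System.out.println(selectForCompaction(tables, 90, 260, 2));
    }
}
```

For the narrow single-user case motivating this ticket, the matching set would typically be small, so a max_limit check is enough to keep the point compaction from degenerating into a near-major compaction.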



> Implement compaction for a specific token range
> -----------------------------------------------
>
>                 Key: CASSANDRA-10643
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-10643
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: Compaction
>            Reporter: Vishy Kasar
>            Assignee: Vishy Kasar
>              Labels: lcs
>         Attachments: 10643-trunk-REV01.txt
>



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
