cassandra-commits mailing list archives

From "Edward Capriolo (JIRA)" <j...@apache.org>
Subject [jira] [Issue Comment Edited] (CASSANDRA-1746) Cleanups should be less impacting
Date Sat, 30 Apr 2011 14:59:07 GMT

    [ https://issues.apache.org/jira/browse/CASSANDRA-1746?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13027335#comment-13027335 ]

Edward Capriolo edited comment on CASSANDRA-1746 at 4/30/11 2:58 PM:
---------------------------------------------------------------------

Sorry to re-open. I was thinking about this more. Hinted handoff is a best-effort system.
Additionally, new options added to hinted handoff let you disable hinted handoff
entirely or, more interesting to this debate, stop collecting handoffs after a while.

If we are willing to stop delivering handoffs, minor compactions that remove them are not
much different. 

Write operations at ANY are a problem, but not many use cases write at ANY. If someone
is writing at ANY, they can choose not to use this feature.

Also, common knowledge says "you do not need to run major compaction anymore" because it
creates one large SSTable, which will take longer to remove in the next round of tombstoning.
However, a user cannot avoid this, because once they join a node and call cleanup they
will have that one big table.



      was (Author: appodictic):
    Sorry to re-open. I was thinking about this more. Hinted handoff is a best-effort system.
Additionally, new options added to hinted handoff give the options to disable hinted handoff
entirely or more interesting to this debate, stop collecting handoffs after a while.

If we are willing to stop delivering handoffs, minor compactions that remove them are not
much different. 

Write operations at ANY are a problem, but not many use cases are writing at ANY. If someone
is writing at ANY they can chose not to use this feature.

Also common knowledge says "you do not need to run major compaction anymore" because it will
create one large SSTable which will take longer to remove in the next round of tombstoning.
However a user can not help NOT to do this because once they join a node and call cleanup
they will have that one big table they were trying to avoid.


  
> Cleanups should be less impacting
> ---------------------------------
>
>                 Key: CASSANDRA-1746
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-1746
>             Project: Cassandra
>          Issue Type: New Feature
>            Reporter: Edward Capriolo
>            Priority: Minor
>
> When a new node is added its neighbours require cleanup. Cleanup is very performance
> impacting and for larger data sets takes a long time. You really do not get all the benefits
> of the new node until the neighbours are cleaned up.
> Suggestion:
> Configuration option that can be changed from JMX compaction_auto_cleanup := {true,false}
> set to false by default.
> During non major compaction if compaction_auto_cleanup flag is set to TRUE, we look at
> the natural endpoints for the key we are compacting. If the key does not belong on this machine
> we can remove it.
> This would save us from the heavy hammer of cleanup compaction. It would also be less
> book keeping for administrators.
> Most people would want to leave this at false, join new node, wait a few days. If the
> node has not failed by now, it likely will not. Set the flag to true and cleanup will happen
> over time. Users can still force clean up if they wish.
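
The suggestion above can be sketched roughly as follows. This is a toy model, not Cassandra
code: the ring, the MD5-based token_for, natural_endpoints, and minor_compact are all
hypothetical names made up for illustration, and replica placement is assumed to be "the
first RF nodes clockwise from the key's token". Only the auto_cleanup parameter stands in
for the proposed compaction_auto_cleanup flag.

```python
import bisect
import hashlib

RING_SIZE = 2 ** 32  # toy token space; an assumption for this sketch


def token_for(key):
    """Map a key to a token with MD5 (illustrative, not Cassandra's partitioner)."""
    digest = hashlib.md5(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:4], "big") % RING_SIZE


def natural_endpoints(key_token, ring, replication_factor):
    """Return the first RF nodes clockwise from key_token.

    ring is a list of (token, node_name) pairs sorted by token; a node with
    token T is assumed to own the range (previous token, T].
    """
    tokens = [t for t, _ in ring]
    start = bisect.bisect_left(tokens, key_token) % len(ring)
    count = min(replication_factor, len(ring))
    return [ring[(start + i) % len(ring)][1] for i in range(count)]


def minor_compact(sstables, local_node, ring, replication_factor, auto_cleanup=False):
    """Merge sstables (dicts of key -> (timestamp, value)); newest timestamp wins.

    With auto_cleanup set, keys whose natural endpoints do not include
    local_node are dropped during the merge instead of being rewritten.
    """
    merged = {}
    for table in sstables:
        for key, (ts, value) in table.items():
            if auto_cleanup:
                owners = natural_endpoints(token_for(key), ring, replication_factor)
                if local_node not in owners:
                    continue  # key no longer belongs on this machine
            if key not in merged or ts > merged[key][0]:
                merged[key] = (ts, value)
    return merged
```

With the flag left at false the merge keeps every key, which matches the proposed default;
with it set to true, keys that moved to the new node disappear gradually as ordinary
compactions touch the SSTables that hold them.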

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira
