incubator-cassandra-user mailing list archives

From Donald Smith <Donald.Sm...@audiencescience.com>
Subject Setting gc_grace_seconds to zero and skipping "nodetool repair" (was RE: Timeseries with TTL)
Date Mon, 07 Apr 2014 18:00:44 GMT
This statement is significant: “BTW if you never delete and only ttl your values at a constant
value, you can set gc=0 and forget about periodic repair of the table, saving some space,
IO, CPU, and an operational step.”

Setting gc_grace_seconds to zero means tombstones can be purged by compaction immediately, so nothing prevents deleted data from reappearing if a replica missed the delete, I believe. “Periodic repair” refers to running “nodetool repair” (aka anti-entropy repair).

I too have wondered if setting gc_grace_seconds to zero and skipping “nodetool repair”
are safe.

We’re using C* 2.0.6. In the 2.0.x versions, with vnodes, “nodetool repair …” is very slow (see https://issues.apache.org/jira/browse/CASSANDRA-5220 and https://issues.apache.org/jira/browse/CASSANDRA-6611). We found repairs via “nodetool repair” unacceptably slow, even when we restricted them to one table, and the repairs often hung or failed. We also tried subrange repairs and the other options.

Our app does no deletes and only rarely updates a row (if there was bad data that needed to be replaced). So it’s very tempting to set gc_grace_seconds = 0 in the table definitions and skip anti-entropy repairs.

But the Cassandra documentation warns that repairs are necessary even if you don’t do deletes. For example, http://www.datastax.com/documentation/cassandra/2.0/cassandra/operations/ops_repair_nodes_c.html says:

     Note: If deletions never occur, you should still schedule regular repairs. Be aware that
setting a column to null is a delete.

The apache wiki  https://wiki.apache.org/cassandra/Operations#Frequency_of_nodetool_repair
says:

Unless your application performs no deletes, it is strongly recommended that production clusters
run nodetool repair periodically on all nodes in the cluster.

*IF* your operations team is sufficiently on the ball, you can get by without repair as long
as you do not have hardware failure -- in that case, HintedHandoff<https://wiki.apache.org/cassandra/HintedHandoff>
is adequate to repair successful updates that some replicas have missed. Hinted handoff is
active for max_hint_window_in_ms after a replica fails.

Full repair or re-bootstrap is necessary to re-replicate data lost to hardware failure (see
below).
So, if there are hardware failures, “nodetool repair” is needed.

And http://planetcassandra.org/general-faq/ says:

Anti-Entropy Node Repair – For data that is not read frequently, or to update data on a
node that has been down for an extended period, the node repair process (also referred to
as anti-entropy repair) ensures that all data on a replica is made consistent. Node repair
(using the nodetool utility) should be run routinely as part of regular cluster maintenance
operations.

If RF=2, the read consistency level is ONE, and a write failed to reach the second replica, might a read incorrectly report the data as missing to the app?
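A toy sketch of that scenario (plain Python, not Cassandra code): with RF=2 and consistency ONE, each read is served by a single replica, so a read that happens to land on the replica that missed the write sees nothing.

```python
# Toy illustration (NOT Cassandra code): RF=2, read consistency ONE.
# One replica received the write; the other missed it (e.g. it was down
# and the hint expired). Each read consults one replica at random.

import random

replicas = [
    {"row1": "value"},  # replica that received the write
    {},                 # replica that missed the write
]

random.seed(0)  # deterministic for illustration
results = [random.choice(replicas).get("row1") for _ in range(1000)]
hit = results.count("value")
print(f"{hit}/1000 reads saw the row; {1000 - hit} saw it as missing")
```

Roughly half the reads report the row as missing until repair (or read repair, if it is triggered) brings the second replica up to date.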

It seems to me that the need to run “nodetool repair” reflects a design bug; it should
be automated.

Don

From: Laing, Michael [mailto:michael.laing@nytimes.com]
Sent: Sunday, April 06, 2014 11:31 AM
To: user@cassandra.apache.org
Subject: Re: Timeseries with TTL

Since you are using LeveledCompactionStrategy there is no major/minor compaction - just compaction.

Leveled compaction does more work - your logs don't look unreasonable to me - the real question is whether your nodes can keep up with the IO. SSDs work best.

BTW if you never delete and only ttl your values at a constant value, you can set gc=0 and
forget about periodic repair of the table, saving some space, IO, CPU, and an operational
step.

If your nodes cannot keep up with the IO, switch to SizeTieredCompactionStrategy and monitor read response times. Or add SSDs.

In my experience, for smallish nodes running C* 2 without SSDs, LeveledCompactionStrategy
can cause the disk cache to churn, reducing read performance substantially. So watch out for
that.

Good luck,

Michael

On Sun, Apr 6, 2014 at 10:25 AM, Vicent Llongo <villosil@gmail.com> wrote:
Hi,

Most of the queries to that table are just getting a range of values for a metric:
SELECT val FROM metrics_5min WHERE uid = ? AND metric = ? AND ts >= ? AND ts <= ?

I'm not sure from the logs what kind of compactions they are. This is what I see in system.log
(grepping for that specific table):

...
INFO [CompactionExecutor:742] 2014-04-06 13:30:11,223 CompactionTask.java (line 105) Compacting
[SSTableReader(path='/mnt/disk1/cassandra/data/keyspace/metrics_5min/keyspace-metrics_5min-ic-14991-Data.db'),
SSTableReader(path='/mnt/disk1/cassandra/data/keyspace/metrics_5min/keyspace-metrics_5min-ic-14990-Data.db')]
INFO [CompactionExecutor:753] 2014-04-06 13:35:22,495 CompactionTask.java (line 105) Compacting
[SSTableReader(path='/mnt/disk1/cassandra/data/keyspace/metrics_5min/keyspace-metrics_5min-ic-14992-Data.db'),
SSTableReader(path='/mnt/disk1/cassandra/data/keyspace/metrics_5min/keyspace-metrics_5min-ic-14993-Data.db')]
INFO [CompactionExecutor:770] 2014-04-06 13:41:09,146 CompactionTask.java (line 105) Compacting
[SSTableReader(path='/mnt/disk1/cassandra/data/keyspace/metrics_5min/keyspace-metrics_5min-ic-14995-Data.db'),
SSTableReader(path='/mnt/disk1/cassandra/data/keyspace/metrics_5min/keyspace-metrics_5min-ic-14994-Data.db')]
INFO [CompactionExecutor:783] 2014-04-06 13:46:21,250 CompactionTask.java (line 105) Compacting
[SSTableReader(path='/mnt/disk1/cassandra/data/keyspace/metrics_5min/keyspace-metrics_5min-ic-14996-Data.db'),
SSTableReader(path='/mnt/disk1/cassandra/data/keyspace/metrics_5min/keyspace-metrics_5min-ic-14997-Data.db')]
INFO [CompactionExecutor:798] 2014-04-06 13:51:28,369 CompactionTask.java (line 105) Compacting
[SSTableReader(path='/mnt/disk1/cassandra/data/keyspace/metrics_5min/keyspace-metrics_5min-ic-14998-Data.db'),
SSTableReader(path='/mnt/disk1/cassandra/data/keyspace/metrics_5min/keyspace-metrics_5min-ic-14999-Data.db')]
INFO [CompactionExecutor:816] 2014-04-06 13:57:17,585 CompactionTask.java (line 105) Compacting
[SSTableReader(path='/mnt/disk1/cassandra/data/keyspace/metrics_5min/keyspace-metrics_5min-ic-15000-Data.db'),
SSTableReader(path='/mnt/disk1/cassandra/data/keyspace/metrics_5min/keyspace-metrics_5min-ic-15001-Data.db')]
...

As you can see, every ~5 minutes there's a compaction going on.


On Sun, Apr 6, 2014 at 4:33 PM, Sergey Murylev <sergeymurylev@gmail.com> wrote:
Hi Vincent,



Is that a good pattern for Cassandra? Are there compaction tunings I should take into account?
Actually, it depends on how you use Cassandra :). If you use it as a key-value store, TTL works fine. But if you run more complex CQL queries against this table, I'm not sure it would perform as well.



With this structure it's obvious that after one week of inserting data, there are going to be new expired columns every 5 minutes in that table. Because of that I've noticed that this table is being compacted every 5 minutes.
Compaction isn't triggered when a column expires. It is triggered according to the compaction strategy, and expired data only becomes purgeable after the gc_grace_seconds timeout. You can find a more detailed description of LeveledCompactionStrategy in the following article: Leveled compaction in Cassandra<http://www.datastax.com/dev/blog/leveled-compaction-in-apache-cassandra>.

There are two types of compaction, minor and major. Which kind of compaction do you see, and how did you conclude that compaction is triggered every 5 minutes? If you are seeing major compactions, the situation is very bad; otherwise it is the normal case.

--
Thanks,
Sergey


On 06/04/14 15:48, Vicent Llongo wrote:
Hi there,
I have this table where I'm inserting timeseries values with a TTL of 86400*7 (1 week):

CREATE TABLE metrics_5min (
  object_id varchar,
  metric varchar,
  ts timestamp,
  val double,
  PRIMARY KEY ((object_id, metric), ts)
)
WITH gc_grace_seconds = 86400
AND compaction = {'class': 'LeveledCompactionStrategy', 'sstable_size_in_mb' : 100};
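For reference, the timing implied by this schema can be worked out directly (a quick Python sketch of the arithmetic, using the TTL and gc_grace_seconds values above):

```python
# Timeline arithmetic for the schema above (values taken from the thread).
ttl_seconds = 86400 * 7   # one-week TTL on each inserted value
gc_grace = 86400          # gc_grace_seconds on the table

print(ttl_seconds)        # seconds until a cell expires after its write

# An expired cell turns into a tombstone at write_time + TTL, and that
# tombstone only becomes purgeable by compaction gc_grace_seconds later:
purgeable_after = ttl_seconds + gc_grace
print(purgeable_after)    # seconds after the write until it can be purged
```

So each value lingers on disk for roughly eight days after its write before compaction is even allowed to drop it.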

With this structure it's obvious that after one week of inserting data, there are going to be new expired columns every 5 minutes in that table. Because of that I've noticed that this table is being compacted every 5 minutes.

Is that a good pattern for Cassandra? Are there compaction tunings I should take into account?
Thanks!



