incubator-cassandra-user mailing list archives

From Steppacher Ralf <ralf.steppac...@derivativepartners.com>
Subject RE: How does a healthy node look like?
Date Fri, 10 May 2013 16:20:05 GMT
Re timeouts:
I receive the following exception from Hector: me.prettyprint.hector.api.exceptions.HTimedOutException:
TimedOutException(acknowledged_by:0) 
I assumed that this is a server-side timeout, also because increasing the xxx_request_timeout_in_ms
parameter values made the exception go away.
I went from 20 to 60 seconds for the timeouts. Now I am no longer getting any
HTimedOutException.

Re cores:
Yes, we have one node on a server with 6 cores.

Re tombstones:
Deletion is a new trick for us. Up until two weeks ago we always truncated all column families
in the early morning, and the write timeouts already occurred back then.
No, we do not do range slices over deleted rows. We also set the gc_grace parameter to 0 for
all column families, as we are running a single node at the moment. So even if we were to
do range slices over deleted rows, the tombstones should be very short-lived?

Re cfstats/cfhistograms:
They are attached. I created the histograms for the column families that store the event type
that occurs most often.

Re GC logging:
I went all in and activated all output. I ran it through gcviewer but it complained a lot
about non-parsable lines, so I am not sure how reliable the output is. It claims that on average
about 500MB are collected, but at the same time that the average heap usage after GC is only
about 100MB.
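
As a cross-check on gcviewer, the same two averages could be computed directly from the log. The following is only a minimal sketch: the line shape is an assumption (ParNew lines as written by JDK 7 with -XX:+PrintGCDetails), and the function name summarize_gc is made up for illustration. Lines that do not match are skipped rather than guessed at, which is one way of dealing with the non-parsable lines.

```python
import re

# Assumed line shape (JDK 7, -XX:+PrintGCDetails, ParNew young collections):
#   [GC [ParNew: 524288K->0K(613440K), 0.01 secs] 614400K->102400K(7864320K), 0.01 secs]
# The last "before->after(capacity)" triple on such a line is the whole heap.
HEAP_RE = re.compile(r"(\d+)K->(\d+)K\((\d+)K\)")


def summarize_gc(lines):
    """Return (collections, avg MB freed, avg MB heap used after GC)."""
    freed_mb, after_mb = [], []
    for line in lines:
        # Only plain ParNew lines; "promotion failed" variants are ignored.
        if "[ParNew:" not in line:
            continue
        triples = HEAP_RE.findall(line)
        if len(triples) < 2:
            continue  # truncated or interleaved line: skip it
        before_kb, after_kb, _capacity_kb = map(int, triples[-1])
        freed_mb.append((before_kb - after_kb) / 1024.0)
        after_mb.append(after_kb / 1024.0)
    if not freed_mb:
        return 0, 0.0, 0.0
    n = len(freed_mb)
    return n, sum(freed_mb) / n, sum(after_mb) / n
```

Note that the two figures are not necessarily in conflict: the amount collected is heap-before minus heap-after, so a collection that starts at about 600MB and ends at about 100MB frees about 500MB.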

Side note: We added more RAM to the machine, so now Cassandra starts with a bit more than
8GB by default.


Thanks!
Ralf

________________________________________
From: aaron morton [aaron@thelastpickle.com]
Sent: Monday, May 06, 2013 10:43
To: user@cassandra.apache.org
Subject: Re: How does a healthy node look like?

Confirm whether your write timeouts are client-side socket timeouts or the TimedOutException from
the server.

Typically write latency is related to GC problems, like you are seeing.

I'm unsure how much CPU resources each cassandra instance has. Is there one node on a machine
with 6 cores ?
How many rows are on the node and how wide are the rows ? cfstats or cfhistograms will help.
Enable the gc logging, or use something like DataStax OpsCenter, to see how low the heap
gets after a CMS GC.

> The write-timeouts correlate with the hours of high (ca. >450/h) "GC for ParNew".
> I never saw any read-timeouts. I set all timeouts to 20 seconds in cassandra.yaml.
That'll do it.

> To do so we iterate over all rows in the three time-line column families and load the
> value of the column that is most recent given a cut-off timestamp.
…
> Every night we delete all events that are older than 2 days. Again in batches of 100
> rows.
Are you deleting rows from the CFs that you then do a range slice on ?
The tombstones may be hurting you on the range scans. Can you remove them ?

Hope that helps.

-----------------
Aaron Morton
Freelance Cassandra Consultant
New Zealand

@aaronmorton
http://www.thelastpickle.com

On 3/05/2013, at 9:25 PM, Steppacher Ralf <ralf.steppacher@derivativepartners.com> wrote:

> Sure, I can do that.
>
> My main concern is write latency and the write timeouts we are experiencing. Read latency
is secondary, as long as we do not introduce timeouts on read and do not exceed our sampling
intervals (see below).
>
> We are running Cassandra 1.2.1 on Ubuntu 12.04 with JDK 1.7.0_17 (64bit).
> The hardware is virtual but so far we are the only tenant on the physical host.
>
> Hardware:
> - 1x6 cores with 2.3GHz
> - 30GB RAM
> - 1 physical disk for both the tx log and the data files
> - 2 x 1GB Ethernet combined into one virtual interface
>
> Cassandra Config:
> Cassandra runs with
> - 7.5GB of heap and
> - 600MB of new gen space
> as calculated by the cassandra-env script.
> I have adjusted all cassandra.yaml settings where clear guidance is given, e.g. <factor>
x <num_cores>.
> I have tried to increase and decrease heap (between 6 and 8GB) and new gen size (between
300MB and 1.1GB).
> I have tried compaction_throughput_mb_per_sec values between 16 and 48.
> I have disabled key caches.
>
> Unfortunately Cassandra has to share the host with other Java processes, the most resource
demanding being ActiveMQ 5.8.
>
> Log Output:
> Over the course of a day (08:00 to 22:00) I see in the logs
> - between 280 and 760 "GC for ParNew" per hour (most around 300/h)
> - between 60 and 180 "Completed flushing" per hour (most around 100/h)
> - between 17 and 46 "Compacted N sstables to" per hour (most around 35/h)
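
Per-hour counts like the ones above could be produced with a small script over the Cassandra log. This is only a sketch under assumptions: it expects each log line to carry a timestamp of the form "2013-05-03 09:12:01,123" (the stock log4j layout is assumed, not confirmed in this thread), and the function name count_per_hour and the exact search phrases are chosen for illustration.

```python
import re
from collections import Counter

# Assumed timestamp shape inside each log line: "YYYY-MM-DD HH:MM:SS,mmm".
# Only the date and the hour are captured, so events are bucketed per hour.
HOUR_RE = re.compile(r"(\d{4}-\d{2}-\d{2} \d{2}):\d{2}:\d{2}")

# Substrings to look for; "Compacted" is meant to catch
# "Compacted N sstables to" regardless of N.
PHRASES = ("GC for ParNew", "Completed flushing", "Compacted")


def count_per_hour(lines, phrases=PHRASES):
    """Return {phrase: Counter({'YYYY-MM-DD HH': occurrences})}."""
    counts = {phrase: Counter() for phrase in phrases}
    for line in lines:
        match = HOUR_RE.search(line)
        if match is None:
            continue  # continuation line or stack trace without timestamp
        hour = match.group(1)
        for phrase in phrases:
            if phrase in line:
                counts[phrase][hour] += 1
    return counts
```

Hours in which the "GC for ParNew" counter exceeds roughly 450 can then be compared against the hours in which write timeouts were seen.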
>
> Data Model:
> The data model is made up of 6 column families. 3 are dynamic to capture the time-line
of 3 event types; each event creates a new column and the value is the row key of the event.
3 have a static schema and store the event itself.
> The largest event message has 16 attributes. All are short text identifiers, floating
point numbers and timestamps. For storage in Cassandra every attribute is converted to a string
and stored with the utf8 validator.
>
> Timeouts and Memory pressure:
> The write-timeouts correlate with the hours of high (ca. >450/h) "GC for ParNew".
I never saw any read-timeouts. I set all timeouts to 20 seconds in cassandra.yaml.
> Cassandra comes under memory pressure ("Flushing CFS X to relieve memory pressure") between
3 and 5 times a day. The tendency is for it to happen in the afternoon and evening. But also
sometimes right after 08:00 in the morning. In about 75% of the cases it flushes one of the
event column families, in 25% a time-line column family.
>
> Write Load:
> We collect events for a theoretical universe of 2.2 million items, so there is a
max of 2.2 million rows in each of the time-line column families, but I never saw an estimated
row count in the cfstats of more than 1 million.
> Roughly 1/3 of the entities receive a maximum of 3 events, one of each event type, in
every 15-minute interval from 08:00 to 22:00. The other 2/3 receive 3 events 3 times a day. About
16'000 entities receive only one event type, but about once every 3 minutes.
> On a typical day the load adds up to about 70 to 80 million messages.
> Not all messages are original though. The sources will re-send an event in every interval
if there are no new events. I do not know the noise ratio; I guesstimate it to be at least
50%. In case of a repeat the existing time-line column and event row are updated with their
previous values.
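
To put the daily total into perspective, it can be converted into an average write rate. The sketch below assumes that the whole load falls into the 08:00 to 22:00 window and is spread evenly over it, which the description above does not guarantee, so real peaks will be higher. The function name is illustrative only.

```python
# Assumption: all 70 to 80 million daily messages arrive between 08:00
# and 22:00 and are spread evenly over that window.
WINDOW_HOURS = 22 - 8


def avg_writes_per_second(messages_per_day, window_hours=WINDOW_HOURS):
    """Average incoming messages per second over the active window."""
    return messages_per_day / (window_hours * 3600.0)


for total in (70_000_000, 80_000_000):
    print(round(avg_writes_per_second(total)))  # prints 1389, then 1587
```

Since each message updates both a time-line column and an event row, the number of mutations per second reaching the node would be higher than this message rate.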
>
> Read Load:
> In one hour intervals we sample a time coherent snapshot of the events. To do so we iterate
over all rows in the three time-line column families and load the value of the column that
is most recent given a cut-off timestamp. The value is the row key of the actual event, which
we then load as well. We do that in batches of 100 rows at a time.
>
> Deletes:
> Every night we delete all events that are older than 2 days. Again in batches of 100
rows.
>
>
> Thanks for helping!
> Ralf
>
>
> From: Alain RODRIGUEZ [arodrime@gmail.com]
> Sent: Thursday, May 02, 2013 09:12
> To: user@cassandra.apache.org
> Subject: Re: How does a healthy node look like?
>
> Well, maybe you should describe your hardware and the C* release you are using. Also
give us some metrics.
> On 30 Apr 2013 18:48, "Steppacher Ralf" <ralf.steppacher@derivativepartners.com>
wrote:
> Hi,
>
> I have trouble finding quantitative information as to what a healthy Cassandra node
should look like (CPU usage, number of flushes, SSTables, compactions, GC), given a certain
hardware spec and read/write load. I have trouble gauging whether our first and only Cassandra
node needs tuning or is simply overloaded.
> If anyone could point me to some data that would be very helpful.
>
> (So far I have run the node with the default settings in cassandra.yaml and cassandra-env.
The log claims that the server is occasionally under memory pressure and I get frequent timeouts
for writes.  I see what I think are many flushes, compactions and GCs in the log. Some toying
with heap and new gen sizes, key cache, and the compaction throughput settings did not improve
the overall situation much.)
>
>
> Thanks!
> Ralf

