cassandra-commits mailing list archives

From "Paulo Motta (JIRA)" <>
Subject [jira] [Commented] (CASSANDRA-11028) Streaming errors caused by corrupt tables need more logging
Date Mon, 18 Jan 2016 19:08:40 GMT


Paulo Motta commented on CASSANDRA-11028:

Thanks for the report [~autocracy]. While working on CASSANDRA-10961 I added more detailed
debug logging to the stream writer and reader. It prints the source sstable, keyspace, table and
faulty partition key when an error or corruption is detected on the receiver side, so this should
be improved in upcoming releases. Some additional logging was also added in CASSANDRA-9294.

> Streaming errors caused by corrupt tables need more logging
> -----------------------------------------------------------
>                 Key: CASSANDRA-11028
>                 URL:
>             Project: Cassandra
>          Issue Type: Bug
>            Reporter: Jeff Ferland
> Example output: ERROR [STREAM-IN-/] 2016-01-17 16:01:38,431 - [Stream #e6ca4590-bc66-11e5-84be-571ffcecc993] Streaming error occurred
> java.lang.IllegalArgumentException: Unknown type 0
> In some cases logging shows a message more like:
> ERROR [STREAM-IN-/] 2016-01-05 14:44:38,690 - [Stream #472d28e0-b347-11e5-8b40-bb4d80df86f4] Streaming error occurred
> Too many retries for Header (cfId: 6b262d58-8730-36ca-8e3e-f0a40beaf92f, #0, version: ka, estimated keys: 58880, transfer size: 2159040, compressed?: true, repairedAt:
> In the majority of cases, however, no information identifying the column family is shown,
> and the source file that was being streamed is never identified.
> Errors do not stop the streaming process, but they do mark the stream as failed at the end.
> This usually results in a log message pattern like:
> INFO  [StreamReceiveTask:252] 2016-01-18 04:45:01,190 - [Stream #e6ca4590-bc66-11e5-84be-571ffcecc993] Session with / is complete
> WARN  [StreamReceiveTask:252] 2016-01-18 04:45:01,215 - [Stream #e6ca4590-bc66-11e5-84be-571ffcecc993] Stream failed
> ERROR [main] 2016-01-18 04:45:01,217 - Exception encountered during startup
> ... which is highly confusing given the error occurred hours before.
> Request: more detail in logging messages for stream failures, indicating which column family
> was being streamed and, if possible, distinguishing network issues from corrupt file issues.
> The fix for the actual cause of the errors is running nodetool scrub on the offending node.
> Scrubbing the whole space blindly is rather expensive compared with targeting the affected
> tables. In our particular case, out-of-order keys were caused by a bug in a previous version
> of Cassandra.
>     WARN  [CompactionExecutor:19552] 2016-01-18 16:02:10,155 - 378490 out of order rows found while scrubbing SSTableReader(path='/mnt/cassandra/data/keyspace/cf-888a52f96d1d389790ee586a6100916c/keyspace-cf-ka-133-Data.db'); Those have been written (in order) to a new sstable (SSTableReader(path='/mnt/cassandra/data/keyspace/cf-888a52f96d1d389790ee586a6100916c/keyspace-cf-ka-179-Data.db'))
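
Until the improved logging is available, one way to narrow down which tables to scrub is to correlate the existing messages by stream ID. The following is only a rough sketch, not part of Cassandra: the function name summarize_streams and the regular expressions are hypothetical, and it assumes each log record starts on its own line with a level (ERROR, WARN, INFO) and that any "cfId: ..." detail appears on a line following the "Streaming error occurred" record, as in the samples quoted above.

```python
import re
from collections import defaultdict

# Patterns inferred from the log samples in this report (assumptions, not a Cassandra API).
STREAM_ID = re.compile(r"\[Stream #([0-9a-f-]{36})\]")
CF_ID = re.compile(r"cfId: ([0-9a-f-]{36})")
LEVEL = re.compile(r"^(ERROR|WARN|INFO)\b")


def summarize_streams(lines):
    """Group streaming errors by stream ID and collect any cfIds mentioned."""
    sessions = defaultdict(lambda: {"errors": 0, "cf_ids": set(), "failed": False})
    current = None  # stream ID of the most recent log record, if it had one
    for raw in lines:
        line = raw.strip()
        if LEVEL.match(line):
            # A new log record: remember its stream ID (or forget the old one).
            match = STREAM_ID.search(line)
            current = match.group(1) if match else None
            if current is None:
                continue
            if "Streaming error occurred" in line:
                sessions[current]["errors"] += 1
            elif "Stream failed" in line:
                sessions[current]["failed"] = True
        elif current is not None:
            # Continuation line (exception text): look for a column family ID.
            match = CF_ID.search(line)
            if match:
                sessions[current]["cf_ids"].add(match.group(1))
    return dict(sessions)
```

Feeding the node's system log through summarize_streams (for example, by passing an open file object) gives, per stream ID, the number of errors, whether the stream ultimately failed, and whichever cfIds happened to be logged. For the many errors that carry no cfId at all, the set stays empty, which is exactly the gap this ticket describes.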
