cassandra-commits mailing list archives

From "Stefania (JIRA)" <j...@apache.org>
Subject [jira] [Comment Edited] (CASSANDRA-9102) Consistency levels such as non-local quorum need better tests
Date Tue, 23 Jun 2015 08:58:01 GMT

    [ https://issues.apache.org/jira/browse/CASSANDRA-9102?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14597362#comment-14597362 ]

Stefania edited comment on CASSANDRA-9102 at 6/23/15 8:57 AM:
--------------------------------------------------------------

Thanks for your input. I agree with you on testing races in the kitchen sink harness. The
reason I used parallel threads in TestAccuracy is to have the test complete in a reasonable
amount of time; each thread uses different partitions, so the threads do not interfere with one another.
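
To illustrate the shape of that approach, here is a rough sketch only, with hypothetical names rather than the actual TestAccuracy code: each worker is handed its own disjoint range of partition keys, so no two workers ever touch the same partition and the total runtime is divided by the number of threads.

{code:java}
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class ParallelPartitionsSketch
{
    static final int THREADS = 8;
    static final int PARTITIONS_PER_THREAD = 100;

    public static void main(String[] args) throws Exception
    {
        ExecutorService pool = Executors.newFixedThreadPool(THREADS);
        List<Future<Void>> futures = new ArrayList<>();

        for (int t = 0; t < THREADS; t++)
        {
            final int thread = t;
            futures.add(pool.submit((Callable<Void>) () ->
            {
                // Each worker owns a disjoint range of partition keys,
                // so no two workers ever touch the same partition.
                for (int p = 0; p < PARTITIONS_PER_THREAD; p++)
                    exerciseConsistencyLevels(thread * PARTITIONS_PER_THREAD + p);
                return null;
            }));
        }

        for (Future<Void> f : futures)
            f.get(); // propagate any failure from the workers

        pool.shutdown();
    }

    // Placeholder for the per-partition write/read/verify logic (hypothetical).
    static void exerciseConsistencyLevels(int partitionKey)
    {
    }
}
{code}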

As for unit testing, I attached code coverage results after running some relevant dtests:
*consistency_test.py, paxos_test.py, batch_test.py, counter_test.py and secondary_indexes_test.py*.

I analysed the files you mentioned and here are some quick observations:

||File||Percentage||Missing coverage||
|AbstractReadExecutor|97%|RetryType.ALWAYS|
|ReadCallback|55%|Exception handling or generation (ReadTimeout, ReadFailure, DigestMismatch)|
|StorageProxy|56%|Exception handling (timeouts, failures), hinting due to exception handling and max limit of hints reached, logged batches and triggers, some estimation of ranges due to indexes and range slices, describe cluster (a nodetool command), truncation of data, paxos contentions, jmx methods (but I think we have some nodetool tests I did not run and they are trivial)|
|AbstractRowResolver|90%|OK|
|RowDigestResolver|88%|OK|
|RowDataResolver|79%|replyCount == 1|
|RangeSliceResponseResolver|19%|Everything|

The percentages aren't entirely accurate because they refer only to the main class, whereas the missing
components were identified by looking at the entire files. The analysis also assumes there was no contention
between different Cassandra processes sharing the same JaCoCo coverage file; if there was, the actual
coverage would be better than reported.

I think we can easily add dtests to cover logged batches, triggers, range slices and single
replica replies. For exception handling, we would have to add debug options that instruct a
process to time out or return a failure. This is easy to do via a system property, and one
already exists for write failures, but changing it requires restarting the node. The alternative
would be to use JMX.
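
For illustration, a failure-injection hook gated by a system property could look roughly like the sketch below. The property name and the integration point are hypothetical; the point is simply that a JVM property is fixed at startup, whereas a JMX-exposed setter could be flipped at runtime without a restart.

{code:java}
// Hypothetical sketch of a read-failure injection hook controlled by a system
// property; the property name and the integration point are made up here.
public class ReadFailureInjection
{
    // Read once at startup: changing it means restarting the node.
    private static volatile boolean failReads = Boolean.getBoolean("cassandra.test.fail_reads");

    // Would be called from the read path of the process we want to misbehave.
    public static void maybeFail()
    {
        if (failReads)
            throw new RuntimeException("injected read failure for testing");
    }

    // A JMX-exposed setter like this (sketch) would avoid the restart.
    public static void setFailReads(boolean value)
    {
        failReads = value;
    }
}
{code}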

However, even with timeout and failure generation, I am not sure we would be able to test
Paxos contention very well. Also, dtests are always slower and harder to debug than unit
tests. Perhaps we should bite the bullet and adopt a mock framework for MessagingService,
at which point it should be possible to test StorageProxy via unit tests.
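
As a very rough sketch of what that could look like, assuming we put a small, mockable interface in front of MessagingService and use something like Mockito: the interface, the method names and the test below are all hypothetical, not existing Cassandra APIs.

{code:java}
import static org.mockito.Mockito.any;
import static org.mockito.Mockito.doAnswer;
import static org.mockito.Mockito.mock;

import java.net.InetAddress;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical seam: a narrow interface that the coordinator code would call
// instead of talking to MessagingService directly.
interface MessageSender
{
    void sendReadRequest(InetAddress replica, Runnable onResponse);
}

public class StorageProxyUnitTestSketch
{
    public void quorumReadWithSingleReplyHitsTimeoutPath() throws Exception
    {
        MessageSender sender = mock(MessageSender.class);
        AtomicInteger responses = new AtomicInteger();

        // Only the first replica ever answers; the others stay silent, so the
        // coordinator logic under test is forced down its timeout handling.
        doAnswer(invocation -> {
            if (responses.getAndIncrement() == 0)
                ((Runnable) invocation.getArguments()[1]).run();
            return null;
        }).when(sender).sendReadRequest(any(InetAddress.class), any(Runnable.class));

        // The code under test would be constructed with 'sender' instead of the
        // real MessagingService and exercised here.
    }
}
{code}

With a seam like this we could deterministically drive timeouts, failures, digest mismatches and single-replica replies from plain unit tests, without spinning up multiple nodes.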

What do you think?



> Consistency levels such as non-local quorum need better tests
> -------------------------------------------------------------
>
>                 Key: CASSANDRA-9102
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-9102
>             Project: Cassandra
>          Issue Type: Test
>            Reporter: Ariel Weisberg
>            Assignee: Stefania
>         Attachments: jacoco.diff, jacoco.tar.gz
>
>
> We didn't catch unit testing for this functionality. There is dtest consistency_test but it doesn't cover non-local functionality.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
