cassandra-commits mailing list archives

From "Patrick Mackinlay (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (CASSANDRA-2870) dynamic snitch + read repair off can cause LOCAL_QUORUM reads to return spurious UnavailableException
Date Wed, 13 Jul 2011 14:41:05 GMT

    [ https://issues.apache.org/jira/browse/CASSANDRA-2870?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13064609#comment-13064609 ]

Patrick Mackinlay commented on CASSANDRA-2870:
----------------------------------------------

In the default configuration of 0.7.6-2 (and other versions), LOCAL_QUORUM reads don't work.
This is not a minor bug and should be fixed in the next release.
By default configuration I mean the tarball that is distributed on the Cassandra website.
The fact that it is not a regression just shows that this functionality was never properly
tested.


> dynamic snitch + read repair off can cause LOCAL_QUORUM reads to return spurious UnavailableException
> -----------------------------------------------------------------------------------------------------
>
>                 Key: CASSANDRA-2870
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-2870
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Core
>    Affects Versions: 0.7.0
>            Reporter: Jonathan Ellis
>            Assignee: Jonathan Ellis
>            Priority: Minor
>             Fix For: 0.7.8, 0.8.2
>
>         Attachments: 2870.txt
>
>
> When Read Repair is off, we want to avoid doing requests to more nodes than necessary
> to satisfy the ConsistencyLevel.  ReadCallback does this here:
> {code}
>         this.endpoints = repair || resolver instanceof RowRepairResolver
>                        ? endpoints
>                        : endpoints.subList(0, Math.min(endpoints.size(), blockfor)); // min so as to not throw exception until assureSufficient is called
> {code}
> You can see that it is assuming that the "endpoints" list is sorted in order of preferred-ness
> for the read.
> Then the LOCAL_QUORUM code in DatacenterReadCallback checks to see if we have enough
> nodes to do the read:
> {code}
>         int localEndpoints = 0;
>         for (InetAddress endpoint : endpoints)
>         {
>             if (localdc.equals(snitch.getDatacenter(endpoint)))
>                 localEndpoints++;
>         }
>         if (localEndpoints < blockfor)
>             throw new UnavailableException();
> {code}
> So if repair is off (so we truncate our endpoints list) AND dynamic snitch has decided
> that nodes in another DC are to be preferred over local ones, we'll throw UE even if all the
> replicas are healthy.
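
The interaction described in the quoted report can be reproduced in isolation. The following is only a minimal sketch, not the Cassandra code: the class name, the endpoint names, the datacenter names and the map standing in for the snitch are all hypothetical, and endpoints are plain strings rather than InetAddress. It applies the same truncate-then-count logic to a list in which a remote node has been sorted ahead of the local replicas.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class LocalQuorumTruncationSketch
{
    // Hypothetical stand-in for snitch.getDatacenter(endpoint).
    static final Map<String, String> DATACENTER_OF = new HashMap<String, String>();
    static
    {
        DATACENTER_OF.put("local1", "DC1");
        DATACENTER_OF.put("local2", "DC1");
        DATACENTER_OF.put("local3", "DC1");
        DATACENTER_OF.put("remote1", "DC2");
        DATACENTER_OF.put("remote2", "DC2");
        DATACENTER_OF.put("remote3", "DC2");
    }

    // Mirrors the truncation quoted from ReadCallback: with repair off,
    // keep only the first blockfor endpoints of the sorted list.
    static List<String> truncate(List<String> endpoints, int blockfor, boolean repair)
    {
        return repair ? endpoints : endpoints.subList(0, Math.min(endpoints.size(), blockfor));
    }

    // Mirrors the counting loop quoted from DatacenterReadCallback.
    static int countLocal(List<String> endpoints, String localdc)
    {
        int localEndpoints = 0;
        for (String endpoint : endpoints)
        {
            if (localdc.equals(DATACENTER_OF.get(endpoint)))
                localEndpoints++;
        }
        return localEndpoints;
    }

    // Returns false in the case where the real code would throw UnavailableException.
    static boolean isAvailable(List<String> sortedEndpoints, int blockfor, boolean repair, String localdc)
    {
        return countLocal(truncate(sortedEndpoints, blockfor, repair), localdc) >= blockfor;
    }

    public static void main(String[] args)
    {
        // Assumed ordering: the dynamic snitch has ranked one remote node first.
        List<String> sorted = Arrays.asList("remote1", "local1", "local2", "local3", "remote2", "remote3");
        int blockfor = 2; // quorum of the 3 local replicas

        System.out.println("repair off: available = " + isAvailable(sorted, blockfor, false, "DC1"));
        System.out.println("repair on:  available = " + isAvailable(sorted, blockfor, true, "DC1"));
    }
}
```

With repair off the list is cut to its first two entries, only one of which is local, so the check fails even though all three local replicas are up. With repair on the full list is counted and the check passes.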

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

        
