cassandra-user mailing list archives

From kurt Greaves <k...@instaclustr.com>
Subject Re: lots of DigestMismatchException in cassandra3
Date Wed, 23 Nov 2016 03:58:48 GMT
dclocal_read_repair_chance and read_repair_chance are only really relevant
when using a consistency level below QUORUM. If you set them to 0 and the
mismatches disappear, that just means you are no longer read repairing
any data. Whether or not that's OK is up to you: can you handle
inconsistencies in your data? If not, I'd recommend not doing that (and
probably using QUORUM). It sounds like a separate issue is causing the
inconsistencies in your data, leading to the high number of mismatches. Are
you seeing dropped mutations on the nodes? (Run nodetool tpstats to check.)
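
The dropped-mutation check can be scripted. Below is a minimal Python sketch,
assuming the "Message type / Dropped" section that nodetool tpstats prints in
Cassandra 3.x. The function name dropped_messages and the SAMPLE text are
hypothetical and made up for illustration; other versions may print extra
columns, so treat the parsing as an assumption rather than a fixed format.

```python
def dropped_messages(tpstats_output: str) -> dict:
    """Return {message_type: dropped_count} parsed from `nodetool tpstats` text."""
    dropped = {}
    in_dropped_section = False
    for line in tpstats_output.splitlines():
        fields = line.split()
        if not fields:
            continue
        # The dropped-message table starts at the "Message type  Dropped" header.
        if fields[:2] == ["Message", "type"]:
            in_dropped_section = True
            continue
        # Only the first two columns are read; any extra columns are ignored.
        if in_dropped_section and len(fields) >= 2 and fields[1].isdigit():
            dropped[fields[0]] = int(fields[1])
    return dropped


# Illustrative sample only; these are not real numbers from the cluster in this thread.
SAMPLE = """\
Pool Name                    Active   Pending      Completed   Blocked  All time blocked
MutationStage                     0         0         123456         0                 0
ReadStage                         0         0          65432         0                 0

Message type           Dropped
READ                         0
MUTATION                    42
READ_REPAIR                  3
"""

counts = dropped_messages(SAMPLE)
print(counts.get("MUTATION", 0))  # prints 42
```

A non-zero MUTATION count on any node would suggest that some replicas are
missing writes, which is consistent with digest mismatches on later reads.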

If you're writing at CL=ONE/LOCAL_ONE you may not be "seeing" failures, but
writes might not be propagating to all nodes.
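
To make the point about write consistency concrete, here is a small Python
sketch of the usual overlap rule: a read is guaranteed to touch at least one
replica that acknowledged the write only when read replicas + write replicas >
replication factor. The helper names and the replication factor of 3 are
assumptions for illustration (the thread does not state the RF), and the
sketch assumes a single datacenter so that LOCAL_ONE behaves like ONE.

```python
def quorum(replication_factor: int) -> int:
    """Number of replicas that make up a quorum."""
    return replication_factor // 2 + 1


def replicas_for(level: str, replication_factor: int) -> int:
    """Replicas that must respond for a consistency level (single-DC assumption)."""
    required = {
        "ONE": 1,
        "LOCAL_ONE": 1,
        "QUORUM": quorum(replication_factor),
        "ALL": replication_factor,
    }
    return required[level]


def reads_overlap_writes(write_cl: str, read_cl: str, replication_factor: int) -> bool:
    """True when every read must include a replica that acknowledged the write."""
    return (
        replicas_for(write_cl, replication_factor)
        + replicas_for(read_cl, replication_factor)
        > replication_factor
    )


RF = 3  # hypothetical replication factor
print(reads_overlap_writes("QUORUM", "QUORUM", RF))  # True:  2 + 2 > 3
print(reads_overlap_writes("ONE", "QUORUM", RF))     # False: 1 + 2 is not > 3
```

Even when the overlap rule holds, a replica that dropped a mutation still
holds stale data until it is repaired, so digest mismatches can appear at
QUORUM as well.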

Kurt Greaves
kurt@instaclustr.com
www.instaclustr.com

On 23 November 2016 at 02:50, <Adeline.Pan@thomsonreuters.com> wrote:

> Hi Kurt,
>
> Thank you for the suggestion. I ran repair on all 4 nodes. After
> the repair, the error “Corrupt empty row found in unfiltered partition”
> disappeared, but the “Mismatch” stopped only for a little while and then
> came back.
>
> When we changed both “dclocal_read_repair_chance” and
> “read_repair_chance” to 0.0, the “Mismatch” stopped. Is it OK to do that?
> Does it mean that when an inconsistency is found while reading data,
> Cassandra won’t do the repair and we will just get the inconsistent data?
> You said the cause is that not all replicas are receiving all the writes.
> I think that is reasonable, but the strange thing is that I didn’t notice
> any failed writes. Another cause I can think of is inserts, updates, and
> deletes on the same record at the same time. Is that a possibility?
>
>
>
> --
>
> Regards, Adeline
>
> *From:* kurt Greaves [mailto:kurt@instaclustr.com]
> *Sent:* Wednesday, November 23, 2016 6:51 AM
> *To:* Pan, Adeline (TR Technology & Ops)
> *Cc:* user@cassandra.apache.org
> *Subject:* Re: lots of DigestMismatchException in cassandra3
>
>
>
> Yes, it could potentially impact performance if there are lots of them. The
> mismatch occurs on a read and the error occurs on a write, which is why
> the times wouldn't line up. As I mentioned, the messages are caused by a
> digest mismatch between replicas, which in turn is caused by inconsistent
> data (not all replicas receiving all writes). You should run a repair and
> see if the number of mismatches is reduced.
>
>
> Kurt Greaves
>
> kurt@instaclustr.com
>
> www.instaclustr.com
>
>
>
> On 22 November 2016 at 06:30, <Adeline.Pan@thomsonreuters.com> wrote:
>
> Hi Kurt,
>
> Thank you for the information, but the error “Corrupt empty row found in
> unfiltered partition” does not seem related to the “Mismatch”; the times
> they occurred didn’t match. We use the “QUORUM” consistency level for both
> reads and writes, and I didn’t notice any failed writes in the log. Is there
> any other cause you can think of? Would it cause a performance issue if lots
> of these “Mismatch” messages occur?
>
>
>
> --
>
> Regards, Adeline
>
> *From:* kurt Greaves [mailto:kurt@instaclustr.com]
> *Sent:* Monday, November 21, 2016 5:02 PM
> *To:* user@cassandra.apache.org
> *Cc:* tommy.stendahl@ericsson.com
> *Subject:* Re: lots of DigestMismatchException in cassandra3
>
>
>
> Actually, just saw the error message in those logs and what you're looking
> at is probably https://issues.apache.org/jira/browse/CASSANDRA-12694
>
>
> Kurt Greaves
>
> kurt@instaclustr.com
>
> www.instaclustr.com
>
>
>
> On 21 November 2016 at 08:59, kurt Greaves <kurt@instaclustr.com> wrote:
>
> That's a debug message. From the sound of it, it's triggered on a read when
> there is a digest mismatch between replicas. As to whether it's normal,
> well, that depends on your cluster. Are the nodes reporting lots of dropped
> mutations, and are you writing at <QUORUM?
>
>
>
>
>
