Subject: Re: sync commitlog in batch mode lose data
From: Peter Schuller
To: user@cassandra.apache.org
Date: Fri, 3 Jun 2011 10:49:59 +0200

> I disable the disk cache of RAID controller, unfortunately it still
> lost some data.

Disabling caching shouldn't be necessary so much as ensuring that all
layers honor write barriers properly. A battery backed cache that
survives a power outage need not be disabled (and usually, if you have
battery backed caching, you don't want to disable it, since doing so has
a considerable performance impact).

To re-address your original post: Yes, given QUORUM @ RF=2 (meaning
that QUORUM is equivalent to ALL), any *successful* write is supposed
to be guaranteed to be visible by a subsequent read. In this case that
holds even for a read at CL.ONE, since RF was 2 and QUORUM was
equivalent to ALL.

If this is not what you're seeing, likely causes are either (a) a
problem with your test, (b) a Cassandra bug, or (c) a kernel/hardware
misconfiguration or bug that causes fsync() to be broken with respect
to power outages.

In order to eliminate (a), can you share the actual test? Even if the
test looks good, you'd be surprised how often (c) turns out to be the
case.

If you are satisfied that the test is correct, one way to eliminate
Cassandra as a cause for the problem may be to restart your server by
a reset instead of cutting power, so that the power supply never
disappears from your storage device. If you are no longer able to
reproduce the problem, it would indicate that fsync() is at least
causing I/O to reach a device (that is, to leave the operating system).
If it still fails, you're none the wiser.

Whether or not you have a battery backed cache, one test you can do is
run this (on a system which is otherwise idle):

   http://distfiles.scode.org/mlref/fsynctime.py

The first argument is a filename which will be created/over-written.
It will then start printing the number of milliseconds each fsync()
takes. If you do not have battery backed caching, you should be seeing
numbers in the 5-25 ms range depending on circumstances. If you see
very low values, that indicates that fsync() is not working and the
writes are not forced to persistent storage.
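
For readers who cannot fetch the script linked above, the following is a minimal sketch of the same kind of measurement. It is not the contents of fsynctime.py, which are not reproduced here. It assumes Python 3, and the function name time_fsyncs, the 512-byte payload and the iteration count are all illustrative choices rather than anything taken from the original script.

```python
import os
import sys
import time


def time_fsyncs(path, count=10, payload=b"x" * 512):
    """Write `payload` to `path` `count` times; return each fsync() time in ms.

    Hypothetical helper for illustration only; not the linked fsynctime.py.
    """
    durations_ms = []
    # O_CREAT | O_TRUNC: the file is created, or over-written if it exists.
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o644)
    try:
        for _ in range(count):
            os.write(fd, payload)
            start = time.perf_counter()
            os.fsync(fd)  # ask the OS to force the write to the storage device
            durations_ms.append((time.perf_counter() - start) * 1000.0)
    finally:
        os.close(fd)
    return durations_ms


if __name__ == "__main__" and len(sys.argv) > 1:
    # First argument: the file to create/over-write.
    for ms in time_fsyncs(sys.argv[1], count=100):
        print(f"{ms:.2f} ms")
```

Saved under a name of your choosing and run with a filename on the storage device under test, it prints one fsync() duration per line, which can be compared against the ranges described above.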
(If battery backed caching exists, you will legitimately get very low
values without it indicating anything is wrong.)

-- 
/ Peter Schuller