From: Chris Lohfink
Date: Sat, 19 Oct 2019 10:27:08 -0500
Subject: Re: loosing data during saving data from java
To: User
Reply-To: user@cassandra.apache.org

If the writes are coming in fast enough that the commitlog can't keep up, it will block applying mutations to the memtable (even in periodic mode, once it hits >1.5x the flush time). Things will queue up and possibly time out, but they will not be acknowledged until applied. If you do it enough, fast enough, you can dump a lot into the mutation queue and cause the node to OOM or GC thrash, but it won't acknowledge the writes, so you won't lose the data.

If you are firing off async writes, not waiting for acknowledgement, and assuming they succeeded, you may lose data if C* did not succeed (which you will be notified of via a WriteFailure, WriteTimeout, or an OperationTimeout). A simple write like that can be idempotent, so you can just try again on failure.
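
To make the difference concrete, here is a minimal sketch in plain Java (9+), with `CompletableFuture` standing in for the driver's async write. The class `AckedWrites` and its methods `writeAsync`, `fireAndForget`, `saveWithRetry` and `saveAll` are hypothetical names for illustration, not Cassandra driver or Spring Data APIs, and the failure on every third attempt is a simulated timeout. It assumes the write is an idempotent upsert by key.

```java
import java.util.Map;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical stand-in for an async write path; not a real driver API. */
class AckedWrites {
    // Simulated table: key -> value. An upsert by key is idempotent.
    static final Map<String, String> table = new ConcurrentHashMap<>();
    static final AtomicInteger attempts = new AtomicInteger();

    static void reset() {
        table.clear();
        attempts.set(0);
    }

    // Simulated async write: every third attempt fails, like a write timeout.
    static CompletableFuture<Void> writeAsync(String key, String value) {
        if (attempts.incrementAndGet() % 3 == 0) {
            return CompletableFuture.failedFuture(
                new RuntimeException("simulated write timeout"));
        }
        table.put(key, value);
        return CompletableFuture.completedFuture(null);
    }

    // Fire and forget: the returned future is ignored, so failures go unnoticed.
    static int fireAndForget(int n) {
        reset();
        for (int i = 0; i < n; i++) {
            writeAsync("k" + i, "v" + i);
        }
        return table.size();
    }

    // Retry the same idempotent write until acknowledged or out of attempts.
    static CompletableFuture<Void> saveWithRetry(String key, String value, int maxAttempts) {
        CompletableFuture<CompletableFuture<Void>> next =
            writeAsync(key, value).handle((ok, err) -> {
                if (err == null) {
                    return CompletableFuture.<Void>completedFuture(null);
                }
                if (maxAttempts <= 1) {
                    return CompletableFuture.<Void>failedFuture(err);
                }
                return saveWithRetry(key, value, maxAttempts - 1);
            });
        return next.thenCompose(f -> f);
    }

    // Writes n rows and waits for every acknowledgement before counting.
    static int saveAll(int n) {
        reset();
        CompletableFuture<?>[] pending = new CompletableFuture<?>[n];
        for (int i = 0; i < n; i++) {
            pending[i] = saveWithRetry("k" + i, "v" + i, 3);
        }
        CompletableFuture.allOf(pending).join(); // throws if any write finally failed
        return table.size();
    }

    public static void main(String[] args) {
        System.out.println("fire-and-forget rows: " + fireAndForget(100));
        System.out.println("acknowledged rows:    " + saveAll(100));
    }
}
```

In this simulation `fireAndForget(100)` leaves 67 rows, because 33 failed writes were never noticed, while `saveAll(100)` leaves all 100.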

Chris

On Sat, Oct 19, 2019 at 1:26 AM adrien ruffie <adriennolarsen@hotmail.fr> wrote:
Thanks Jeff 🙂

But if you save a lot of data too fast with the Cassandra repository, and Cassandra can't keep up and inserts more slowly,
what is the behavior? Does Cassandra store the overflow in an additional buffer? Can no data be lost on Cassandra's side?

Thanks a lot.

Adrian

From: Jeff Jirsa <jjirsa@gmail.com>
Sent: Saturday, October 19, 2019 00:41
To: cassandra <user@cassandra.apache.org>
Subject: Re: loosing data during saving data from java

There is no buffer in Cassandra that is known to (or suspected to) lose acknowledged writes if it's overwhelmed.

There may be a client bug where you send so many async writes that they overwhelm a bounded queue, or otherwise get dropped or time out, but those would be client bugs, and I'm not sure this list can help you with them.
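
One common way to avoid overwhelming a bounded client-side queue is to cap the number of writes in flight. The sketch below shows the idea with a `Semaphore` in plain Java. The class `BoundedInFlight`, the method `sendAll`, and the two-thread pool with a bounded queue are assumptions made for illustration; this is not how the Cassandra driver or Spring Data throttles internally, and the "write" is an empty simulated task.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.Semaphore;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical sketch: cap in-flight async writes so a bounded queue never overflows. */
class BoundedInFlight {
    // Sends n simulated writes with at most maxInFlight outstanding.
    // Returns how many were acknowledged.
    static int sendAll(int n, int maxInFlight) {
        // Client-side pool with a bounded queue; a submission that finds the
        // queue full would be rejected with RejectedExecutionException.
        ThreadPoolExecutor pool = new ThreadPoolExecutor(
            2, 2, 0L, TimeUnit.MILLISECONDS, new ArrayBlockingQueue<>(maxInFlight));
        Semaphore permits = new Semaphore(maxInFlight);
        AtomicInteger acked = new AtomicInteger();
        try {
            for (int i = 0; i < n; i++) {
                permits.acquireUninterruptibly(); // wait for a free slot
                CompletableFuture
                    .runAsync(() -> { /* simulated write */ }, pool)
                    .whenComplete((ok, err) -> {
                        if (err == null) {
                            acked.incrementAndGet(); // count only acknowledged writes
                        }
                        permits.release(); // free the slot on success or failure
                    });
            }
            // Taking every permit back means no write is still outstanding.
            permits.acquireUninterruptibly(maxInFlight);
        } finally {
            pool.shutdown();
        }
        return acked.get();
    }

    public static void main(String[] args) {
        System.out.println(sendAll(1000, 8));
    }
}
```

Because the producer blocks on the semaphore before each submission, the queue never holds more than `maxInFlight` tasks, so `sendAll(1000, 8)` should report all 1000 writes as acknowledged.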



On Fri, Oct 18, 2019 at 3:16 PM adrien ruffie <adriennolarsen@hotmail.fr> wrote:
Hello all,

I have a Cassandra table into which I quickly insert many Java entities,
about 15,000 entries per minute. But when the process ends, I only
have, for example, 199,921 entries instead of 312,212.
If I truncate the table and relaunch the process, I sometimes get 199,354
or 189,012 entries ... the number of entries saved is never the same.

Several coworkers tell me they have heard about a buffer which can sometimes be overwhelmed,
losing several entities queued for insertion ...
Is that right?
Because I don't understand why these inserts are being lost ...
And my Java code is very simple, like below:

myEntitiesList.forEach(myEntity -> {
  try {
    myEntitiesRepository.save(myEntity).subscribe();
  } catch (Exception e) {
    e.printStackTrace();
  }
});

And the repository is a:
public interface MyEntityRepository extends ReactiveCassandraRepository<MyEntity, String> {
}


Has anyone already heard about this problem?

Thank you very much and best regards

Adrian