Subject: Re: Spark Cassandra Java Connector: records missing despite consistency=ALL
From: Alex Popescu
Date: Wed, 13 Jan 2016 05:50:29 -0800
To: user@cassandra.apache.org
Cc: user@spark.apache.org

Dennis,

You'll have better chances to get an answer on the spark-cassandra-connector mailing list
https://groups.google.com/a/lists.datastax.com/forum/#!forum/spark-connector-user or on IRC #spark-cassandra-connector

On Wed, Jan 13, 2016 at 4:17 AM, Dennis Birkholz wrote:
> Hi together,
>
> we use Cassandra to log event data and process it every 15 minutes with
> Spark. We are using the Cassandra Java Connector for Spark.
>
> Randomly, our Spark runs produce too few output records because no data is
> returned from Cassandra for a several-minute window of input data. When
> querying the data (with cqlsh), after multiple tries, the data eventually
> becomes available.
>
> To solve the problem, we tried to use consistency=ALL when reading the
> data in Spark. We use the CassandraJavaUtil.javaFunctions().cassandraTable()
> method and have set "spark.cassandra.input.consistency.level"="ALL" on the
> config when creating the Spark context. The problem persists, but according
> to http://stackoverflow.com/a/25043599, using a consistency level of ONE on
> the write side (which we use) and ALL on the read side should be sufficient
> for data consistency.
>
> I would really appreciate it if someone could give me a hint on how to fix
> this problem, thanks!
>
> Greets,
> Dennis
>
> P.S.:
> Some information about our setup:
> Cassandra 2.1.12 in a two-node configuration with replication factor=2
> Spark 1.5.1
> Cassandra Java Driver 2.2.0-rc3
> Spark Cassandra Java Connector 2.10-1.5.0-M2

--
Bests,
Alex Popescu | @al3xandru
Sen. Product Manager @ DataStax
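[For readers of the archive: the StackOverflow answer Dennis cites rests on Cassandra's overlap rule — reads are guaranteed to see the latest write when the replicas contacted on write plus the replicas contacted on read exceed the replication factor (W + R > RF). A minimal sketch of that arithmetic for the setup described (the helper names below are illustrative, not connector or driver API):]

```java
public class ConsistencyCheck {

    // Replicas contacted for a given consistency level, given replication
    // factor rf. Only the levels mentioned in this thread are modeled.
    static int replicas(String level, int rf) {
        switch (level) {
            case "ONE":    return 1;
            case "QUORUM": return rf / 2 + 1;
            case "ALL":    return rf;
            default: throw new IllegalArgumentException("unknown level: " + level);
        }
    }

    // Strong consistency requires the read and write replica sets to
    // overlap in at least one node: W + R > RF.
    static boolean stronglyConsistent(String writeCl, String readCl, int rf) {
        return replicas(writeCl, rf) + replicas(readCl, rf) > rf;
    }

    public static void main(String[] args) {
        int rf = 2; // two-node cluster, replication factor 2, as in the post

        // Write ONE + read ALL: 1 + 2 = 3 > 2, so reads should be consistent.
        System.out.println(stronglyConsistent("ONE", "ALL", rf)); // prints "true"

        // Write ONE + read ONE: 1 + 1 = 2, not > 2, so stale reads are possible.
        System.out.println(stronglyConsistent("ONE", "ONE", rf)); // prints "false"
    }
}
```

Since ONE/ALL with RF=2 does satisfy the overlap rule, intermittent missing rows would suggest the setting is not actually taking effect on the reads (or the writes are not yet acknowledged by any replica when the job runs), rather than a flaw in the consistency math itself.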