From: Ryan King
Date: Fri, 29 Jul 2011 17:08:57 -0700
Subject: Re: Cassandra Pig with network topology and data centers.
To: user@cassandra.apache.org, dev@cassandra.apache.org

It'd be great if we had different settings for inter- and intra-DC read repair.

-ryan

On Fri, Jul 29, 2011 at 5:06 PM, Jake Luciani wrote:
> Yes, it's read repair. You can lower the read repair chance to tune this.
>
> On Jul 29, 2011, at 6:31 PM, Aaron Griffith wrote:
>
>> I currently have a 9-node Cassandra cluster set up as follows:
>>
>> DC1: Six nodes
>> DC2: Three nodes
>>
>> The tokens alternate between the two datacenters.
>>
>> I have Hadoop installed as tasktracker/datanodes on the
>> three Cassandra nodes in DC2.
>>
>> There is another non-Cassandra node that is used as the Hadoop namenode /
>> job tracker.
>>
>> When running Pig scripts pointed to a node in DC2 using LOCAL_QUORUM as
>> read consistency, I am seeing network and CPU spikes on the nodes in DC1.
>> I was not expecting any impact on those nodes when local quorum is used.
>>
>> Can read repair be causing the traffic/CPU spikes?
>>
>> The replication setting for DC1 is 5, and for DC2 is 1.
>>
>> When looking at the map tasks I am seeing input splits for computers in
>> both data centers. I am not sure what this means. My thought is
>> that it should only be getting data from the nodes in DC2.
>>
>> Thanks
>>
>> Aaron
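
To put rough numbers on the thread above, here is a minimal Python sketch. It assumes a simplified model that is not stated in the thread: whenever read repair fires for a read, the coordinator contacts every replica in every data center, not only the local ones. The function names and the example chance values are hypothetical illustrations, not Cassandra APIs or recommended settings.

```python
def local_quorum(rf_local: int) -> int:
    """Replicas in the local DC that must answer a LOCAL_QUORUM read."""
    return rf_local // 2 + 1


def expected_remote_replicas(rf_by_dc: dict, local_dc: str,
                             read_repair_chance: float) -> float:
    """Expected number of remote-DC replicas touched per read.

    Assumption (simplified model): with probability read_repair_chance
    the read is sent to all replicas in all data centers.
    """
    remote = sum(rf for dc, rf in rf_by_dc.items() if dc != local_dc)
    return read_repair_chance * remote


# Replication settings from Aaron's message: DC1 = 5, DC2 = 1.
rf = {"DC1": 5, "DC2": 1}

# LOCAL_QUORUM in DC2 needs only the single local replica.
quorum_dc2 = local_quorum(rf["DC2"])

# If every read triggers read repair, each read also touches 5 DC1 replicas.
remote_always = expected_remote_replicas(rf, "DC2", 1.0)

# Lowering the chance scales the cross-DC work proportionally.
remote_lowered = expected_remote_replicas(rf, "DC2", 0.1)
```

Under this model, LOCAL_QUORUM only controls how many local replicas must answer before the read returns; the background read repair traffic to DC1 is governed separately by the read repair chance, which is why lowering it is the suggested tuning knob.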