From: Benyi Wang <bewang.tech@gmail.com>
Date: Thu, 8 Sep 2016 15:15:23 -0700
Subject: Re: Is it safe to change RF in this situation?
To: user@cassandra.apache.org

Thanks a lot. Will do as you suggested.

On Thu, Sep 8, 2016 at 3:08 PM, Hannu Kröger <hkroger@gmail.com> wrote:
Ok, so I have to say that I’m not 100% sure how many replicas of the data it is trying to maintain, but it should not blow up (if the repair crashes or something, it’s ok). So it should be safe to do.

When the repair has run, you can start with the plan I suggested and run repairs afterwards.

Hannu
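
The plan referred to above (switch reads to LOCAL_QUORUM, then raise the RF, then repair) can be sketched as an ordered checklist. This is only an illustration: the keyspace name `my_keyspace`, the datacenter name `dc1`, and the use of NetworkTopologyStrategy are assumptions, since the thread does not say which names or replication strategy the cluster uses.

```python
# Hypothetical names; replace with the real keyspace and datacenter.
KEYSPACE = "my_keyspace"
DATACENTER = "dc1"


def alter_rf_statement(keyspace: str, datacenter: str, rf: int) -> str:
    """Build a CQL statement that sets the replication factor for one DC.

    Assumes NetworkTopologyStrategy, which the thread does not confirm.
    """
    return (
        f"ALTER KEYSPACE {keyspace} WITH replication = "
        f"{{'class': 'NetworkTopologyStrategy', '{datacenter}': {rf}}};"
    )


# Order matters: reads move to LOCAL_QUORUM before the RF is raised.
steps = [
    "Deploy the client with reads at LOCAL_QUORUM instead of LOCAL_ONE",
    "In cqlsh: " + alter_rf_statement(KEYSPACE, DATACENTER, 3),
    f"Run repair, for example: nodetool repair {KEYSPACE}",
    "Later, move the batch job from ALL to LOCAL_QUORUM",
]

for number, step in enumerate(steps, start=1):
    print(f"{number}. {step}")
```

The list only prints the steps; nothing here talks to a cluster.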
On 8 Sep 2016, at 18:01, Benyi Wang <bewang.tech@gmail.com> wrote:

Thanks. What about this situation:

* Change RF 2 => 3
* Start repair
* Roll back RF 3 => 2
* Repair is still running

I'm wondering what the repair is trying to do. Is it repairing for RF=2, or is it still repairing as if RF=3?

On Thu, Sep 8, 2016 at 2:53 PM, Hannu Kröger <hkroger@gmail.com> wrote:
Yep, you can fix it by running repair, or even faster by changing the consistency level to LOCAL_QUORUM and deploying the new version of the app.

Hannu

On 8 Sep 2016, at 17:51, Benyi Wang <bewang.tech@gmail.com> wrote:

Thanks Hannu,

Unfortunately, we had already started changing the RF from 2 to 3, and we saw the rate of empty results go up. I assume that "if a LOCAL_ONE read hits the new replica, which does not have the data yet, the CQL query will return nothing." Is my assumption correct?
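
A minimal sketch of why this can happen, under simplifying assumptions: one DC, a row that was written to both original replicas (the batch job writes at ALL), a third replica that is still empty because repair has not finished, and no read repair. The replica names A, B and C are hypothetical.

```python
from itertools import combinations

# Hypothetical replicas: A and B held the row under RF=2,
# C was added by the RF=3 change and is still empty before repair.
HAS_ROW = {"A": True, "B": True, "C": False}


def possible_outcomes(replicas_contacted: int) -> set:
    """Collect whether the row is found, over every replica subset of this size."""
    outcomes = set()
    for subset in combinations(HAS_ROW, replicas_contacted):
        # The read finds the row if at least one contacted replica has it.
        outcomes.add(any(HAS_ROW[name] for name in subset))
    return outcomes


one = possible_outcomes(1)      # LOCAL_ONE: a single replica answers
quorum = possible_outcomes(2)   # LOCAL_QUORUM at RF=3: two replicas answer

print("LOCAL_ONE can miss the row:", False in one)
print("LOCAL_QUORUM can miss the row:", False in quorum)
```

In this model a single-replica read can land on C alone and come back empty, while any two of the three replicas always include A or B.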

On Thu, Sep 8, 2016 at 11:49 AM, Hannu Kröger <hkroger@gmail.com> wrote:
Hi,

If you change RF=2 -> 3 first, the LOCAL_ONE reads might hit the new replica, which does not have the data yet. So I would change LOCAL_ONE -> LOCAL_QUORUM first, then change the RF, and then run the repair. LOCAL_QUORUM is effectively ALL in your case (RF=2) if you have just one DC, so you can change the batch CL later.

Cheers,
Hannu
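
The remark that LOCAL_QUORUM is effectively ALL at RF=2 follows from the quorum size, floor(RF / 2) + 1, counted over the replicas in the local DC. A small sketch of the arithmetic, assuming a single DC so that the keyspace RF is the local RF:

```python
def quorum(rf: int) -> int:
    """Replicas that must respond for QUORUM / LOCAL_QUORUM: floor(RF / 2) + 1."""
    return rf // 2 + 1


for rf in (2, 3):
    needed = quorum(rf)
    if needed == rf:
        note = "same as ALL"
    else:
        note = f"tolerates {rf - needed} replica down"
    print(f"RF={rf}: LOCAL_QUORUM needs {needed} of {rf} replicas ({note})")
```

At RF=2 the quorum is 2 of 2, so switching the reads to LOCAL_QUORUM before the RF change behaves like ALL; at RF=3 it is 2 of 3.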

> On 8 Sep 2016, at 14:42, Benyi Wang <bewang.tech@gmail.com> wrote:
>
> * I have a keyspace with RF=3D2;
> * The client reads the table using LOCAL_ONE;
> * There is a batch job loading data into the tables using ALL.
>
> I want to change the RF to 3 and have both the client and the batch job use LOCAL_QUORUM.
>
> My question is "Will the client still read the correct data while the repair is running and my batch load job is running at the same time?"
>
> Or should I change to LOCAL_QUORUM first?
>
> Thanks.




