Message-ID: <4EBC578E.4000406@morningstar.com>
Date: Thu, 10 Nov 2011 17:00:30 -0600
From: Jeremiah Jordan
CC: user@cassandra.apache.org
Subject: Re: Data retrieval inconsistent

No, that is what I thought you wanted.  I was thinking your machines in DC1 had extra disk space or something...

(I stopped replying to the dev list)

On 11/10/2011 04:09 PM, Subrahmanya Harve wrote:

Thanks Ed and Jeremiah for that useful info.
"I am pretty sure the way you have K1 configured it will be placed across
both DC's as if you had large ring.  If you want it only in DC1 you need to
say DC1:1, DC2:0."
In fact I do want K1 to be available across both DCs as if I had one large
ring. I just do not want the data to replicate across DCs. Also, I did try
DC1:1, DC2:0 as you suggested, but won't that mean all my data
goes into DC1 irrespective of whether it is written through the nodes of
DC1 or DC2, thereby creating a "hot DC"? Since the volume of data for this
case is huge, that might create a load imbalance on DC1. (Am I missing
something?)


On Thu, Nov 10, 2011 at 1:30 PM, Jeremiah Jordan <
jeremiah.jordan@morningstar.com> wrote:

> I am pretty sure the way you have K1 configured it will be placed across
> both DC's as if you had large ring.  If you want it only in DC1 you need to
> say DC1:1, DC2:0.
> If you are writing and reading at ONE you are not guaranteed to get the
> data if RF > 1.  If RF = 2 and you write with ONE, your data could be
> written to server 1 and then read from server 2 before it gets over there.
>
> The differing server times will only really matter for TTLs.  Most
> everything else works off comparing user-supplied timestamps.
>
> -Jeremiah
>
>
> On 11/10/2011 02:27 PM, Subrahmanya Harve wrote:
>
>>
>> I am facing an issue in 0.8.7 cluster -
>>
>> - I have two clusters in two DCs (or rather, one cross-DC cluster) and two
>> keyspaces. I have configured only one keyspace to replicate data to the
>> other DC; the other keyspace should not replicate to the other DC.
>> This is how I created the keyspaces:
>>    create keyspace K1 with placement_strategy='org.apache.cassandra.locator.SimpleStrategy'
>>    and strategy_options = [{replication_factor:1}];
>>    create keyspace K2 with placement_strategy='org.apache.cassandra.locator.NetworkTopologyStrategy'
>>    and strategy_options = [{DC1:2, DC2:2}];
>>
>> I had to do this because I expect that K1 will get a large volume of data
>> and I do not want this sent over the wire to the other DC.
>>
>> I am writing the data at CL=ONE and reading the data at CL=ONE. I am
>> seeing an issue where sometimes I get the data and other times I do not.
>> Does anyone know what could be going on here?
>>
>> A second, larger question: I am migrating from 0.7.4 to 0.8.7. I can
>> see that there are large changes in the yaml file, but my specific question
>> is: how do I configure disk_access_mode the way it was in 0.7.4?
>>
>> One observation I have made is that some nodes of the cross-DC cluster
>> are at different system times. This is something to fix, but could it be
>> why data is sometimes retrieved and other times not? Or is something else
>> going on?
>>
>> Would appreciate a quick response.
>>
>
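
The point made earlier in the thread about writing and reading at ONE can be sketched with the usual overlap rule: a read is only sure to touch a replica that took the write when (replicas read) + (replicas written) > RF. The following is a rough illustration only, not Cassandra code; the function names are made up for this example, and the RF of 4 assumes K2's total across DC1:2 and DC2:2.

```python
def quorum(rf: int) -> int:
    """Replicas needed for QUORUM: a strict majority of rf."""
    return rf // 2 + 1


def read_overlaps_write(rf: int, write_replicas: int, read_replicas: int) -> bool:
    """True when any read set must share a replica with any write set (R + W > RF)."""
    return read_replicas + write_replicas > rf


if __name__ == "__main__":
    rf = 4  # assumption: K2 in this thread, DC1:2 + DC2:2
    q = quorum(rf)
    # Write at ONE, read at ONE: the read may land on a replica the write has not reached yet.
    print("ONE/ONE overlaps:", read_overlaps_write(rf, 1, 1))
    # Write at QUORUM, read at QUORUM: the two sets always share at least one replica.
    print("QUORUM/QUORUM overlaps:", read_overlaps_write(rf, q, q))
```

By this rule alone, a keyspace with replication_factor 1 passes the check even at ONE, since there is only one replica to read from, so this sketch would only account for missed reads on a keyspace with RF > 1.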
