Mailing-List: contact user-help@cassandra.apache.org; run by ezmlm
Precedence: bulk
Reply-To: user@cassandra.apache.org
Received-SPF: pass (athena.apache.org: domain of
 JEREMIAH.JORDAN@morningstar.com designates 216.228.224.32 as permitted
 sender)
Message-ID: <4EBC578E.4000406@morningstar.com>
Date: Thu, 10 Nov 2011 17:00:30 -0600
From: Jeremiah Jordan <jeremiah.jordan@morningstar.com>
User-Agent: Mozilla/5.0 (X11; Linux x86_64;
 rv:7.0.1) Gecko/20110929 Thunderbird/7.0.1
MIME-Version: 1.0
CC: user@cassandra.apache.org
Subject: Re: Data retrieval inconsistent
References: 
 <CAKMSimHwcgQhF5hMXS3H4n7La15wtVAYM8vjUXMhNcixCaRaCQ@mail.gmail.com><4EBC4293.8090705@morningstar.com>
 <CAKMSimEP6+dvK0aKUm2UVvr7XPPYqFGQhLJ=zEsFxUoxkwNHqQ@mail.gmail.com>
In-Reply-To: 
 <CAKMSimEP6+dvK0aKUm2UVvr7XPPYqFGQhLJ=zEsFxUoxkwNHqQ@mail.gmail.com>
Content-Type: multipart/alternative;
 boundary="------------050204040608000902090809"

This is a multi-part message in MIME format.
--------------050204040608000902090809
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
Content-Transfer-Encoding: 7bit

No, that is what I thought you wanted.  I was thinking your machines in 
DC1 had extra disk space or something...

(I stopped replying to the dev list)
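
For the archive, a rough sketch of what "placed across both DCs as if you
had a large ring" means for K1. This is a toy model, not Cassandra's code:
the 0..99 token range, the four-node ring, and every name below are
assumptions made up for illustration. The only point is that SimpleStrategy
walks the ring clockwise and does not look at data center labels.

```python
from bisect import bisect_left
from collections import Counter
from hashlib import md5

# Hypothetical 4-node ring: (token, node name, data center). Tokens are evenly
# spaced over a toy 0..99 range and alternate between the two DCs.
RING = [(0, "n1", "DC1"), (25, "n2", "DC2"), (50, "n3", "DC1"), (75, "n4", "DC2")]
TOKENS = [token for token, _, _ in RING]


def token_for(key: str) -> int:
    """Toy partitioner: hash the key into the 0..99 token range."""
    return int(md5(key.encode()).hexdigest(), 16) % 100


def replicas_for_token(token: int, rf: int = 1):
    """Pick rf nodes clockwise from the first node whose token is >= the
    given token, wrapping around the ring and ignoring DC labels."""
    start = bisect_left(TOKENS, token) % len(RING)
    return [RING[(start + i) % len(RING)] for i in range(rf)]


def dc_spread(keys, rf: int = 1) -> Counter:
    """Count how many replicas land in each data center."""
    counts = Counter()
    for key in keys:
        for _, _, dc in replicas_for_token(token_for(key), rf):
            counts[dc] += 1
    return counts


if __name__ == "__main__":
    # With RF=1 each row lives on exactly one node, and in this toy ring
    # the rows end up split between DC1 and DC2.
    print(dc_spread(f"row{i}" for i in range(1000)))
```

In this model each row is stored once, on whichever node owns its token,
so both DCs hold a share of the data and nothing is copied between them.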

On 11/10/2011 04:09 PM, Subrahmanya Harve wrote:
>
> Thanks Ed and Jeremiah for that useful info.
> "I am pretty sure the way you have K1 configured it will be placed across
> both DC's as if you had large ring.  If you want it only in DC1 you need to
> say DC1:1, DC2:0."
> In fact I do want K1 to be available across both DCs as if I had a large
> ring. I just do not want it to replicate across DCs. Also, I did try
> doing it like you said (DC1:1, DC2:0), but won't that mean that all my data
> goes into DC1 irrespective of whether it comes in through the nodes of
> DC1 or DC2, thereby creating a "hot DC"? Since the volume of data for this
> case is huge, that might create a load imbalance on DC1. (Am I missing
> something?)
>
>
> On Thu, Nov 10, 2011 at 1:30 PM, Jeremiah Jordan <
> jeremiah.jordan@morningstar.com> wrote:
>
> > I am pretty sure the way you have K1 configured it will be placed across
> > both DCs as if you had a large ring.  If you want it only in DC1 you need to
> > say DC1:1, DC2:0.
> > If you are writing and reading at ONE you are not guaranteed to get the
> > data if RF > 1.  If RF = 2, and you write with ONE, your data could be
> > written to server 1, and then read from server 2 before it gets over there.
> >
> > The differing server times will only really matter for TTLs.  Most
> > everything else works off comparing user-supplied timestamps.
> >
> > -Jeremiah
> >
> >
> > On 11/10/2011 02:27 PM, Subrahmanya Harve wrote:
> >
> >>
> >> I am facing an issue in a 0.8.7 cluster -
> >>
> >> - I have two clusters in two DCs (rather, one cross-DC cluster) and two
> >> keyspaces. But I have only configured one keyspace to replicate data to the
> >> other DC and the other keyspace to not replicate over to the other DC.
> >> Basically this is the way I ran the keyspace creation -
> >>    create keyspace K1 with placement_strategy='org.apache.cassandra.locator.SimpleStrategy'
> >>      and strategy_options = [{replication_factor:1}];
> >>    create keyspace K2 with placement_strategy='org.apache.cassandra.locator.NetworkTopologyStrategy'
> >>      and strategy_options = [{DC1:2, DC2:2}];
> >>
> >> I had to do this because I expect that K1 will get a large volume of data
> >> and I do not want this sent over the wire to the other DC.
> >>
> >> I am writing the data at CL=ONE and reading the data at CL=ONE. I am
> >> seeing an issue where sometimes I get the data and other times I do not see
> >> the data. Does anyone know what could be going on here?
> >>
> >> A second, larger question is - I am migrating from 0.7.4 to 0.8.7, and I can
> >> see that there are large changes in the yaml file, but a specific question
> >> I had was - how do I configure disk_access_mode like it used to be in 0.7.4?
> >>
> >> One observation I have made is that some nodes of the cross-DC cluster
> >> are at different system times. This is something to fix, but could this be
> >> why data is sometimes retrieved and other times not? Or is there something
> >> else to it?
> >>
> >> Would appreciate a quick response.
> >>
> >
>
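
A small sketch of the read/write overlap rule behind the "writing and
reading at ONE" remark quoted above. This is only the usual arithmetic
(replicas written + replicas read > RF), not Cassandra's implementation;
the function names are made up for illustration and only ONE, QUORUM and
ALL are modelled.

```python
def replicas_required(level: str, rf: int) -> int:
    """Number of replicas that must answer for a given consistency level."""
    if level == "ONE":
        return 1
    if level == "QUORUM":
        return rf // 2 + 1
    if level == "ALL":
        return rf
    raise ValueError(f"unsupported consistency level: {level}")


def read_overlaps_write(write_cl: str, read_cl: str, rf: int) -> bool:
    """True when every read must touch at least one replica that took the
    write, i.e. written + read > RF."""
    return replicas_required(write_cl, rf) + replicas_required(read_cl, rf) > rf


if __name__ == "__main__":
    for rf in (1, 2, 4):
        print(rf, read_overlaps_write("ONE", "ONE", rf))
```

With RF = 1 a ONE/ONE pair always overlaps, since there is only one replica
to ask. With RF = 2, or with the four total replicas K2 has under
DC1:2, DC2:2, ONE/ONE does not overlap, so a read can land on a replica the
write has not reached yet. Under these assumptions QUORUM/QUORUM would
overlap for any RF.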

--------------050204040608000902090809--