Mailing-List: contact user-help@cassandra.apache.org; run by ezmlm
Precedence: bulk
Reply-To: user@cassandra.apache.org
Received-SPF: pass (athena.apache.org: domain of arthur.zubarev@aol.com
 designates 64.12.237.10 as permitted sender)
Message-ID: <51CBB43C.7080402@aol.com>
Date: Wed, 26 Jun 2013 23:40:44 -0400
From: Arthur Zubarev <arthur.zubarev@aol.com>
Reply-To: arthur.zubarev@aol.com
User-Agent: Mozilla/5.0 (X11; Linux x86_64;
 rv:17.0) Gecko/20130623 Thunderbird/17.0.7
MIME-Version: 1.0
To: Tony Anecito <adanecito@yahoo.com>
CC: Robert Coli <rcoli@eventbrite.com>,
 Users-Cassandra <user@cassandra.apache.org>
Subject: Re: Creating an "Index" column...
References: <1372264767.85242.YahooMailNeo@web121805.mail.ne1.yahoo.com>
 <1372267240.54860.YahooMailNeo@web121806.mail.ne1.yahoo.com>
 <CAEDUwd3vadhLbw5wxpgbiQ0J6R7tSNN+QNnW=rgV=wPYtTxxtw@mail.gmail.com>
 <1372269336.582.YahooMailNeo@web121805.mail.ne1.yahoo.com>
 <0A14ED78871B49B3B3805B67FE99142E@vig.local>
 <1372281703.13985.YahooMailNeo@web121805.mail.ne1.yahoo.com>
In-Reply-To: <1372281703.13985.YahooMailNeo@web121805.mail.ne1.yahoo.com>
Content-Type: multipart/alternative;
 boundary="------------030101000600000109040003"
X-AOL-SCOLL-URL_COUNT: 0  
X-AOL-REROUTE: YES 
x-aol-sid: 3039ac1d294251cbb43d3643
X-AOL-IP: 99.238.22.30
X-Virus-Checked: Checked by ClamAV on apache.org
X-Old-Spam-Flag: YES

This is a multi-part message in MIME format.
--------------030101000600000109040003
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
Content-Transfer-Encoding: 7bit

Appreciate your thoughts, Tony.

In our DW there are composite keys, say 500K of them per customer. To 
produce a report, the client program needs to page through the entire 
set, collecting data as it goes, probably into yet another desktop db.

At this point the purpose of having a NoSQL has been defeated.
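
A minimal sketch of the kind of client-side paging described above, 
assuming the composite keys arrive in sorted order and each page resumes 
just after the last key of the previous one (the way a slice query is 
usually restarted). The names iter_pages, keys and the ("cust-1", n) 
key shape are hypothetical, and an in-memory list stands in for the 
column family; no real Cassandra client call is shown.

```python
from bisect import bisect_right


def iter_pages(sorted_keys, page_size):
    """Yield pages of keys, restarting each slice after the last key seen."""
    last = None
    while True:
        # First page starts at the beginning; later pages start just
        # past the last key returned by the previous page.
        start = 0 if last is None else bisect_right(sorted_keys, last)
        page = sorted_keys[start:start + page_size]
        if not page:
            return
        yield page
        last = page[-1]


# Hypothetical composite keys (customer_id, item_id), already sorted.
keys = [("cust-1", i) for i in range(10)]
pages = list(iter_pages(keys, page_size=4))
```

Every key is visited exactly once, so a report over 500K keys costs 
500K / page_size round trips, which is the overhead complained about 
above.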

On 06/26/2013 05:21 PM, Tony Anecito wrote:
> Thanks Arthur.
>
> Interesting that you think NoSQL does not fit large volumes of data. 
> That is what it is touted to do.
> I have heard PKs are needed, but I thought that is what the "key" 
> column is for, and composite key support is there also.
>
> The only issue I see is all that duplicate data and the need to keep 
> it in sync. So for example if the movie title "Superman" changed to 
> "Superman the Man of Steel" you have to go change all those duplicate 
> values. An easy problem to solve but the data modeler has to get past 
> that. lol
>
> ACID transactions are the other issue, but I think the supplier of the 
> info has to think about that one.
>
> I have response times in my RDBMS of several hundred microseconds, and 
> keeping that the same or better is the really important requirement 
> for me.
>
> Just some thoughts on the matter.
> -Tony
>
> ------------------------------------------------------------------------
> *From:* Arthur Zubarev <Arthur.Zubarev@Aol.com>
> *To:* Tony Anecito <adanecito@yahoo.com>; Robert Coli 
> <rcoli@eventbrite.com>; Users-Cassandra <user@cassandra.apache.org>
> *Sent:* Wednesday, June 26, 2013 3:08 PM
> *Subject:* Re: Creating an "Index" column...
>
> Tony hi,
> Yes. In some scenarios (e.g. a DW) that lack proper PKs or indexes 
> (these are hard to envision because you need to think of future 
> queries first), getting through large volumes of data makes NoSQL, 
> IMHO, hard to fit in.
> But you have other choices:
> 1) pagination or
> 2) slice queries.
> Both are covered here:
> http://pkghosh.wordpress.com/2012/03/04/cassandra-range-query-made-simple/
> Hope that helps.
> /Arthur
> *From:* Tony Anecito <mailto:adanecito@yahoo.com>
> *Sent:* Wednesday, June 26, 2013 1:55 PM
> *To:* Robert Coli <mailto:rcoli@eventbrite.com> ; Users-Cassandra 
> <mailto:user@cassandra.apache.org>
> *Subject:* Re: Creating an "Index" column...
> Hi Robert,
>
> Actually that is what I did. I did that in my RDBMS data model. In 
> Cassandra or NoSQL, without joins or nested selects, I have to do two 
> queries. Also, batching is not supported on the server side, which 
> makes the performance worse.
>
> I just started learning Cassandra but I am learning fast and there are 
> some challenges when moving to a new data model driven by these factors.
>
> Regards,
> -Tony
>
> ------------------------------------------------------------------------
> *From:* Robert Coli <rcoli@eventbrite.com>
> *To:* user@cassandra.apache.org; Tony Anecito <adanecito@yahoo.com>
> *Sent:* Wednesday, June 26, 2013 11:32 AM
> *Subject:* Re: Creating an "Index" column...
>
> On Wed, Jun 26, 2013 at 10:20 AM, Tony Anecito <adanecito@yahoo.com 
> <mailto:adanecito@yahoo.com>> wrote:
> > Never mind I figured it out. I found it via a search for Secondary
> > indexes.
>
> In general unless you actually need atomic update of the row and its
> secondary index, you are probably better off creating your own pseudo
> secondary index column family.
>
> =Rob
>
>
>
>


-- 

Regards,

Arthur


--------------030101000600000109040003--