Mailing-List: contact user-help@cassandra.apache.org; run by ezmlm
Precedence: bulk
Reply-To: user@cassandra.apache.org
MIME-Version: 1.0
In-Reply-To: 
 <FAC2E7C084A5114C9A2B5CE27DABFF6902CD3C01@winxbede38.exchange.xchg>
References: 
 <FAC2E7C084A5114C9A2B5CE27DABFF6902CD3C01@winxbede38.exchange.xchg>
Date: Mon, 3 May 2010 09:35:09 -0500
Message-ID: <s2r3bb46cdd1005030735x983f31e9v2a011c15717dfef@mail.gmail.com>
Subject: Re: Search Sample and Relation question because UDDI as Key
From: Jonathan Shook <jshook@gmail.com>
To: user@cassandra.apache.org
Content-Type: multipart/alternative; boundary=00c09f9db04ff9b0590485b17f36

--00c09f9db04ff9b0590485b17f36
Content-Type: text/plain; charset=ISO-8859-1

I am only speaking to your second question.

It may be helpful to think of modeling your storage layout in terms of
* lists
* sets
* hash maps
... and certain combinations of these.
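
As a rough sketch of what I mean (plain Python dicts standing in for
column families, not client code; the names timeline, tags, posts,
as_list and as_set are made up for illustration), a row is a map of
column name to value, and the other two shapes fall out of how you
choose the column names:

```python
# In-memory stand-ins only: each "column family" is a dict of
# row key -> {column name: value}. No Cassandra client is used, and
# every name below is hypothetical.
from collections import defaultdict

timeline = defaultdict(dict)
tags = defaultdict(dict)
posts = defaultdict(dict)

# List: pick column names that sort in the order you want to read back.
timeline["alice"]["0001"] = "post-a"
timeline["alice"]["0002"] = "post-b"

# Set: the member is the column name; the value is left empty.
tags["cassandra"]["post-a"] = ""
tags["cassandra"]["post-b"] = ""
tags["cassandra"]["post-a"] = ""  # adding it again changes nothing

# Hash map: a row used directly as column name -> value.
posts["post-a"]["title"] = "Hello"
posts["post-a"]["author"] = "alice"


def as_list(row):
    """Read a row as a list, ordered by column name."""
    return [row[name] for name in sorted(row)]


def as_set(row):
    """Read a row as a set of its column names."""
    return set(row)
```

The sorting is done by hand here; in Cassandra the columns of a row
come back ordered by column name, which is what makes the list shape
workable.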

Since there are no schema-defined relations, your relations are implicit
in how different views or "copies" of your data line up with each other.
A relation is only as explicit as your application makes it, either by
how it uses the data or (in some cases) by enforcing it in a boundary
layer in your software.

For accessing data by value, you can try to do your bookkeeping (indexing)
as you go, by maintaining auxiliary maps directly via your application.
Scanning by value is really not a strong point for Cassandra, and in fact is
one of the trade-offs made when moving to a DHT (
http://en.wikipedia.org/wiki/Distributed_hash_table) data store.
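
For the blog example in your question, the bookkeeping could look
roughly like this. It is only a sketch under assumptions: plain Python
dicts stand in for two column families, nothing here is Hector or
Thrift API, and the names blog_entries, entries_by_author, save_post
and posts_by_author are hypothetical.

```python
# Sketch of "index as you go": every write to the main data also
# updates an auxiliary map keyed by the value you want to look up.
# Plain dicts stand in for column families; all names are hypothetical.
from collections import defaultdict

blog_entries = {}                      # post key -> {column name: value}
entries_by_author = defaultdict(dict)  # author -> {post key: ""}


def save_post(post_key, columns):
    """Store a post and keep the author index in step with it.

    `columns` must contain an "author" entry.
    """
    old = blog_entries.get(post_key)
    if old is not None and old["author"] != columns["author"]:
        # The author changed, so drop the stale index entry first.
        entries_by_author[old["author"]].pop(post_key, None)
    blog_entries[post_key] = dict(columns)
    entries_by_author[columns["author"]][post_key] = ""


def posts_by_author(author):
    """Look up post keys in the index, then fetch each post by key."""
    keys = sorted(entries_by_author.get(author, {}))
    return [blog_entries[key] for key in keys]
```

The point is that reading by author becomes two key lookups instead of
a scan over values. The cost is that the application owns the
consistency of the two maps, since the two writes are not atomic.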

There has been discussion about adding some form of value indexing at
some point in the future, but the plans appear indefinite. Even then,
it would move work onto the server nodes that might otherwise be better
handled on the client side.


On Sun, May 2, 2010 at 4:33 PM, CleverCross | Falk Wolsky <
falk.wolsky@clevercross.eu> wrote:

> Hello,
>
> 1) Can you provide a solution or a sample for full-text searching in
> Columns and SuperColumns?
> What is the way to do this? Hadoop/MapReduce? Do you see a possibility to
> build or use an index for columns?
>
> Why I ask: in a given data model we "must" use UUIDs as keys, and so we
> have no way to search by the values in "Columns" (or do we?).
>
> 2) How can we implement a "relation"?
>
> For example (http://arin.me/blog/wtf-is-a-supercolumn-cassandra-data-model):
> Arin describes a simple data model for a blog well. But how can we
> read (filter) all posts in "BlogEntries" by a single author?
> (That is, filter the SuperColumns by a column inside each SuperColumn.)
>
> The "relation" in this example is Author -> BlogEntries...
> To filter the data, we would need to specify a column/value combination
> in a "get(...)" function...
>
> I know well that Cassandra is not a "relational database"! But without
> these relations, its usage is very "limited" (specialized).
>
> Thanks in advance, and thanks for Cassandra!
> With Hector I am building an (Apache) Cocoon transformer...
>
> With Kind Regards,
> Falk Wolsky
>

--00c09f9db04ff9b0590485b17f36--