Mailing-List: contact user-help@cassandra.apache.org; run by ezmlm
Precedence: bulk
Reply-To: user@cassandra.apache.org
From: aaron morton <aaron@thelastpickle.com>
Mime-Version: 1.0 (Apple Message framework v1257)
Content-Type: multipart/alternative;
 boundary="Apple-Mail=_BE78C827-4AA3-434C-A42E-F4FD310E59E4"
Subject: Re: Data Modeling
Date: Tue, 21 Feb 2012 21:08:01 +1300
In-Reply-To: <0EA361E3-A99B-40FA-9938-D5EB5DA23EEF@gmail.com>
To: user@cassandra.apache.org
References: <1329734994547-7300846.post@n2.nabble.com>
 <3C0D4074-C4CD-46B6-912A-A19DAAF86040@thelastpickle.com>
 <0EA361E3-A99B-40FA-9938-D5EB5DA23EEF@gmail.com>
Message-Id: <F99F8383-D46B-4B40-9D2A-EBEDF9A8897E@thelastpickle.com>


--Apple-Mail=_BE78C827-4AA3-434C-A42E-F4FD310E59E4
Content-Transfer-Encoding: quoted-printable
Content-Type: text/plain;
	charset=iso-8859-1

> For CF PropertyValues, instead of <property_value:customer_id> should
> I do <customer_id:property_value> to preserve the same order for each
> property_value? (There will be a custom null value.)
Whatever works best for you.
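
To make the trade-off concrete, here is a rough sketch (plain Python, not
Hector or Cassandra API code; the customer IDs and values are made up) of
how the two composite layouts would sort, assuming the comparator compares
the first component and then the second:

```python
# Hypothetical (customer_id, country) pairs, in arbitrary arrival order.
customers = [("c2", "FR"), ("c1", "EN"), ("c3", "FR")]


def columns_value_first(rows):
    """Column names laid out as <property_value:customer_id>."""
    return sorted((value, cid) for cid, value in rows)


def columns_id_first(rows):
    """Column names laid out as <customer_id:property_value>."""
    return sorted((cid, value) for cid, value in rows)


# Value first: columns are grouped by value, then by customer.
print(columns_value_first(customers))
# ID first: columns follow customer order, whatever the value is.
print(columns_id_first(customers))
```

With the value first, a slice over one value returns the customers that
have it. With the customer ID first, every property row is in customer
order, so the nth column lines up across rows as long as every customer
has a column in every row.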

> Why is using only column names faster? It seems that it's not
> possible to retrieve column names without column values in Hector, for
> example, so even after reading your article (great, by the way), I
> don't get it.

Not sure what you mean.

Cheers

-----------------
Aaron Morton
Freelance Developer
@aaronmorton
http://www.thelastpickle.com

On 21/02/2012, at 10:17 AM, alexis coudeyras wrote:

> Thanks a lot Aaron,
>
> I will try your idea tomorrow.
>
> For CF PropertyValues, instead of <property_value:customer_id> should
> I do <customer_id:property_value> to preserve the same order for each
> property_value? (There will be a custom null value.)
>
> Why is using only column names faster? It seems that it's not
> possible to retrieve column names without column values in Hector, for
> example, so even after reading your article (great, by the way), I
> don't get it.
>
>
> On 20 Feb 2012, at 20:41, aaron morton wrote:
>
>> If you want to read all possible values for a field, where the field
>> has 1 million possible values, it's going to take time, no matter
>> what data model you use.
>>
>> That said, the first model I would use is:
>>
>> CF: Customer
>> Use this as a canonical record of the properties a customer has.
>> row_key: <customer_id>
>> cols: <property_name> = <property_value>
>>
>> CF: PropertyValues
>> Use this to build the reverse index. Column names are a composite
>> value of property value and customer ID.
>> row_key: <property_name>
>> cols: <property_value:customer_id> = EMPTY
>>
>> * To Insert: It is good if you can work out the delta. Just update
>> what you need to in the customer, delete the old values from the
>> PropertyValues CF and insert the new ones. Note: I would insert when
>> you get the new data,
>>
>> * To Read:
>>>   - I need to retrieve all values of a field (all firstNames, all
>>> lastNames,
>> Get all the values from the appropriate row.
>>> 	- The faster the better (1 to 3 seconds)
>> Things take time:
>> http://thelastpickle.com/2011/07/04/Cassandra-Query-Plans/
>>> 	- It must preserve order: if I retrieve all countries and then all
>>> lastNames, the nth country and the nth lastName should correspond to
>>> the same customer.
>> This can only be guaranteed if every customer has a value for every
>> field, or if you use a custom null value.
>>> 	- Sometimes I will have to retrieve all values of multiple fields
>>> (< 10)
>> There is no provision for server-side joins. If you have a query you
>> use often, it is best to materialise the result.
>>
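
The write and read paths described above can be sketched roughly as
follows. This is an in-memory Python model, not client code: the dicts
stand in for the two CFs, and NULL, upsert_customer and read_property are
hypothetical names. It assumes the <customer_id:property_value> column
order from the follow-up question, so that every PropertyValues row sorts
by customer.

```python
NULL = ""  # hypothetical custom null marker: "customer has no value"

customer_cf = {}         # customer_id -> {property_name: property_value}
property_values_cf = {}  # property_name -> set of (customer_id, value)


def upsert_customer(customer_id, new_props, all_props):
    """Apply one full customer record, touching only the columns that changed."""
    old = customer_cf.get(customer_id, {})
    for name in all_props:
        new_value = new_props.get(name, NULL)
        old_value = old.get(name)
        if new_value == old_value:
            continue  # no delta for this property
        row = property_values_cf.setdefault(name, set())
        if old_value is not None:
            row.discard((customer_id, old_value))  # delete the old index column
        row.add((customer_id, new_value))          # insert the new one
    customer_cf[customer_id] = {n: new_props.get(n, NULL) for n in all_props}


def read_property(name):
    """All values of one property, in customer_id order."""
    return [value for _, value in sorted(property_values_cf.get(name, ()))]


PROPS = ["country", "lastName"]
upsert_customer("c2", {"country": "EN", "lastName": "Doe"}, PROPS)
upsert_customer("c1", {"lastName": "Smith"}, PROPS)  # country stored as NULL
upsert_customer("c1", {"country": "FR", "lastName": "Smith"}, PROPS)
print(read_property("country"))
print(read_property("lastName"))
```

Because every customer gets a column in every property row (NULL when
there is no value), the nth entry of each list belongs to the same
customer. The third call only rewrites the country column for c1, which
is the delta idea from the insert step.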
>> Hope that helps.
>>
>> -----------------
>> Aaron Morton
>> Freelance Developer
>> @aaronmorton
>> http://www.thelastpickle.com
>>
>> On 20/02/2012, at 11:49 PM, acoudeyras wrote:
>>
>>> Hi,
>>>
>>> I'm new to Cassandra and I'm looking for the best way to handle my
>>> use case.
>>>
>>> My entities look like:
>>>
>>> customers : [{
>>> 	id: 3F2504E0-4F89-11D3-9A0C-0305E82C3301,
>>> 	firstName: "Carl",
>>> 	lastName: "Smith",
>>> 	country:"FR"
>>> },{
>>> 	id:21EC2020-3AEA-1069-A2DD-08002B30309D,
>>> 	firstName: "John",
>>> 	lastName: "Doe",
>>> 	country:"EN"
>>> }]
>>>
>>> I will use the term "field" to describe a property of a customer
>>> (lastName, for example).
>>>
>>> I will have 1 million customers and more than 300 fields (firstName,
>>> lastName, ...) for each customer.
>>>
>>> I have two requirements:
>>>
>>> - I need to retrieve all values of a field (all firstNames, all
>>> lastNames, ...).
>>> 	- The faster the better (1 to 3 seconds)
>>> 	- It must preserve order: if I retrieve all countries and then all
>>> lastNames, the nth country and the nth lastName should correspond to
>>> the same customer.
>>> 	- Sometimes I will have to retrieve all values of multiple fields
>>> (< 10)
>>>
>>> - Data will be updated (insert, delete, update) every 10 or 20
>>> minutes in bulk; just a small number of entities will change each
>>> time. When an update occurs, I have the whole entity as input (a full
>>> customer with all its fields). Performance is important, but less
>>> than in the previous case (10 seconds for updating is OK).
>>>
>>> - Retrieving a customer by id or retrieving a list of customers with
>>> some specific criteria is *not* a requirement.
>>>
>>> ---
>>> Solution 1:
>>>
>>> Column Family: customers
>>> One row for each customer: 1 million rows
>>> One column for each field: 300 fields per row.
>>>
>>> Benefits: easy to update
>>> Problem: As far as I understand, it doesn't seem to fit the
>>> Cassandra model; getting all values will be slow.
>>>
>>> ---
>>> Solution 2:
>>>
>>> Wide row for the whole entity
>>>
>>> Column Family: datas
>>> One row: customers
>>> Composite columns: (fieldName, ID) = fieldValue
>>>
>>> Customers : [{
>>> 	("country", "3F2504E0-4F89-11D3-9A0C-0305E82C3301") = "FR",
>>> 	("country", "21EC2020-3AEA-1069-A2DD-08002B30309D") = "EN",
>>> 	("firstName", "3F2504E0-4F89-11D3-9A0C-0305E82C3301") = "Carl",
>>> 	("firstName", "21EC2020-3AEA-1069-A2DD-08002B30309D") = "John",
>>> 	("lastName", "3F2504E0-4F89-11D3-9A0C-0305E82C3301") = "Smith",
>>> 	("lastName", "21EC2020-3AEA-1069-A2DD-08002B30309D") = "Doe",
>>> ...
>>> }]
>>>
>>>
>>> As far as I understand, this seems to be the fastest way to retrieve
>>> all values of a field in the same order.
>>> To update, I don't need to read before writing.
>>>
>>> Problem: the row will be very large: 300,000,000 columns. I can split
>>> it into different rows based on the value of a specific field, for
>>> example country.
>>>
>>> ---
>>> Solution 3:
>>>
>>> Wide row by field
>>>
>>> Column Family: customers
>>> One row per field: so 300 rows
>>> Columns: ID = FieldValue
>>>
>>> Benefits:
>>> The rows will be smaller: 1,000,000 columns each.
>>>
>>> Problem:
>>> Updates seem more expensive: for every customer to update, I need to
>>> update 300 rows.
>>>
>>> ---
>>>
>>> Which solution seems to be the right one? Is Cassandra really a good
>>> fit for this use case?
>>>
>>> Thanks
>>>
>>> Alexis Coudeyras
>>>
>>> --
>>> View this message in context:
>>> http://cassandra-user-incubator-apache-org.3065146.n2.nabble.com/Data-Modeling-tp7300846p7300846.html
>>> Sent from the cassandra-user@incubator.apache.org mailing list
>>> archive at Nabble.com.
>>
>


--Apple-Mail=_BE78C827-4AA3-434C-A42E-F4FD310E59E4--