incubator-cassandra-user mailing list archives

From Chidambaran Subramanian <chi...@gmail.com>
Subject Re: Blob vs. "normal" columns (internals) difference?
Date Thu, 04 Apr 2013 01:34:21 GMT
On Thu, Apr 4, 2013 at 6:58 AM, aaron morton <aaron@thelastpickle.com> wrote:

> > 1. Does one Tweet take more storage in either layout?
> If you store the data in one blob then we only store one column name and
> the blob. If they are in different cols then we store the column names and
> their values.
>
> > 2. Does either choice affect read/write performance at large
> > scale?
> If you store data in a blob you can only read and update it as a blob, so
> chances are you will be wasting effort as you do read-modify-write
> operations. Unless you have a good reason, split things up and store them as
> columns.
>
If it's mostly read-only data that can be cached outside Cassandra, storing
it in one column looks like a good idea to me. What is the downside, anyway?
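
To make the trade-off concrete, here is a rough sketch of what a user-name
change costs in each layout. It uses plain Python dicts as hypothetical
stand-ins for a Tweet row; it is not Cassandra client code, and the field
names and sample values are made up for illustration.

```python
import json
import uuid

# Hypothetical sample user; field names and values are illustrative only.
user = {"id": str(uuid.uuid4()), "name": "alan", "bio": "x" * 200}
tweet_id = str(uuid.uuid4())

# Layout 1: the whole User object serialized into a single blob column.
blob_row = {"id": tweet_id, "user": json.dumps(user).encode("utf-8")}

# Layout 2: the User fields flattened into prefixed columns.
flat_row = {"id": tweet_id}
for key, value in user.items():
    flat_row["user_" + key] = value


def rename_in_blob(row, new_name):
    """Read-modify-write: decode the whole blob, change one field, re-encode."""
    decoded = json.loads(row["user"].decode("utf-8"))
    decoded["name"] = new_name
    row["user"] = json.dumps(decoded).encode("utf-8")
    return len(row["user"])  # bytes that have to be written back


def rename_in_columns(row, new_name):
    """Overwrite only the user_name column; nothing has to be read first."""
    row["user_name"] = new_name
    return len(new_name.encode("utf-8"))  # bytes of the new value


blob_bytes = rename_in_blob(blob_row, "alan_r")
flat_bytes = rename_in_columns(flat_row, "alan_r")
print("blob rewrite:", blob_bytes, "bytes; column rewrite:", flat_bytes, "bytes")
```

If the data is read-mostly, the blob path is rarely exercised and the single
column stays simple; if fields change often, the per-column write is the
smaller one, at the cost of storing a column name per field.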



> cheers
>
> -----------------
> Aaron Morton
> Freelance Cassandra Consultant
> New Zealand
>
> @aaronmorton
> http://www.thelastpickle.com
>
> On 3/04/2013, at 1:08 PM, Alan Ristić <alan.ristic@gmail.com> wrote:
>
> > Hi guys,
> >
> > Here is an example (fictional) model I have for learning purposes...
> >
> > I'm currently storing the "User" object in a Tweet as a blob value, i.e.
> > taking the JSON of 'User' and storing it as a blob. I'm wondering why this
> > would be better than just prefixing and flattening the column names?
> >
> > Tweet {
> >  id uuid,
> >  user blob
> > }
> >
> > vs.
> >
> > Tweet {
> >  id uuid,
> >  user_id uuid,
> >  user_name text,
> >  ....
> > }
> >
> > In one or the other:
> >
> > 1. Does one Tweet take more storage in either layout?
> > 2. Does either choice affect read/write performance at large
> > scale?
> > 3. Anything else I should be considering here? Your view/thinking would
> > be great.
> >
> > Here is my understanding:
> > For 'ease' of update: if, for example, a user changes their name, I'm aware
> > I need to (re)write the whole object in all Tweets in the first "blob"
> > example, and only the user_name column in the second 'flattened' example.
> > So if I actually wanted to do this updating/rewriting for every Tweet, I'd
> > use the second 'flattened' example, since the payload of only user_name is
> > smaller than the whole User blob for every Tweet, right?
> >
> > Nothing urgent, any input is valuable, tnx guys :)
> >
> >
> >
> > Thanks and best regards,
> > Alan Ristić
> >
> > w: personal blog
> >  t: @alanristic
> >  l: linkedin.com/alanristic
> > m: 068 15 73 88
>
>
