cassandra-user mailing list archives

From aaron morton <>
Subject Re: custom reconciling columns?
Date Tue, 28 Jun 2011 22:55:06 GMT
Can you provide some more info:

- how big are the rows, e.g. number of columns and column size?
- how much data are you asking for?
- what sort of read query are you using?
- what sort of numbers are you seeing?
- are you deleting columns or using TTL?

I would consider issues with the data churn, data model and query before looking at serialisation.


Aaron Morton
Freelance Cassandra Developer

On 29 Jun 2011, at 10:37, Yang wrote:

> I can see that as my user history grows, the read time grows proportionally (or faster than linear).
> If my business requirements ask me to keep a month's history for each user, it could become too slow. I suspect that it's actually the serializing and deserializing that's taking the time (I can definitely see it's CPU bound).
> On Tue, Jun 28, 2011 at 3:04 PM, aaron morton <> wrote:
> There is no facility to do custom reconciliation for a column. An append-style operation would run into many of the same problems as the Counter type, e.g. not every node may get an append, and there is a chance of lost appends unless you go to all the trouble Counters do.
> I would go with using a row for the user and columns for each item. Then you can have fast "no look" writes (no read before the write).
> What problems are you seeing with the reads?
> Cheers
> -----------------
> Aaron Morton
> Freelance Cassandra Developer
> @aaronmorton
> On 29 Jun 2011, at 04:20, Yang wrote:
> > For example, say I have an application that needs to read a user's browsing history, and I model the user ID as the key and the history data within the row. With the current approach, I could model each visit as a column.
> > The possible issue is that *possibly* (I'm still doing a lot of profiling to verify this) a lot of time is spent on serialization into and out of the message. Also, I do not need the full features provided by the column: for example, I do not need a timestamp on each visit.
> > So it might be faster to put the entire history in a blob, where each visit only takes up a few bytes, and have my code manipulate the blob.
> >
> > The problem is that I still need to avoid the read-before-write, so I send only the latest visit and let Cassandra do the reconcile, which appends the visit to the blob. This needs custom reconcile behavior.
> >
> > Is there a way to incorporate such a custom reconcile under the current code framework? (I see custom sorting, but no custom reconcile.)
> >
> > thanks
> > yang
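
The trade-off discussed in this thread (one column per visit versus a single history blob) can be sketched with a toy in-memory model. This is only an illustration, not Cassandra code: the `Column` and `Row` types, the `reconcile` and `merge_rows` helpers, and the `visit:<timestamp>` column names are all hypothetical, and reconciliation is simplified to "higher timestamp wins".

```python
from typing import Dict, Tuple

# Hypothetical toy types, not Cassandra's internal classes.
Column = Tuple[int, bytes]   # (write timestamp, value)
Row = Dict[str, Column]      # column name -> column


def reconcile(a: Column, b: Column) -> Column:
    """Simplified last-write-wins: keep the column with the higher timestamp."""
    return a if a[0] >= b[0] else b


def merge_rows(left: Row, right: Row) -> Row:
    """Merge two replicas' views of one row, reconciling column by column."""
    merged = dict(left)
    for name, col in right.items():
        merged[name] = reconcile(merged[name], col) if name in merged else col
    return merged


# One column per visit: each replica received a different blind write.
# The column names differ, so both visits survive the merge.
replica_a: Row = {"visit:1001": (1001, b"/home")}
replica_b: Row = {"visit:1002": (1002, b"/cart")}
per_visit = merge_rows(replica_a, replica_b)

# Single blob column: both blind writes target the same column name,
# so reconciliation keeps only the newer value and the older visit is lost.
blob_a: Row = {"history": (1001, b"/home")}
blob_b: Row = {"history": (1002, b"/cart")}
blob = merge_rows(blob_a, blob_b)

print(sorted(per_visit))   # both visit columns are present
print(blob["history"])     # only the newer write remains
```

In this model the per-visit layout needs no read before the write, because distinct column names never conflict. The blob layout would need either a read-modify-write or an append-aware reconcile, which is the facility the thread says does not exist.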
