incubator-cassandra-user mailing list archives

From Yang <teddyyyy...@gmail.com>
Subject Re: custom reconciling columns?
Date Thu, 30 Jun 2011 15:56:26 GMT
Thanks.

But then the client application is responsible for sorting the 3
segments (assuming that I need the "user browsing history" in the example
to be ordered), so I guess the total time would not be significantly
different. It also results in 3 times as many seeks, where the original way
needs only one seek. This is probably fine if my cluster is mostly idle, but
if it's mostly busy, the load is going to increase.
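
A minimal sketch of the client-side work described above, assuming each of the 3 segments comes back already sorted by visit timestamp. The class `SegmentMerge` and its method are hypothetical names for illustration, not part of Cassandra or any client library, and the visits are reduced to bare `long` timestamps.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.PriorityQueue;

// Hypothetical client-side helper; not part of Cassandra or any client library.
class SegmentMerge {
    // Tracks the read position inside one already-sorted segment.
    private static final class Cursor {
        final List<Long> segment;
        int pos;
        Cursor(List<Long> segment) { this.segment = segment; }
        long head() { return segment.get(pos); }
    }

    // Merges segments that are each sorted ascending into one sorted list.
    // Cost is O(n log k) for n items in k segments, on top of the k reads.
    static List<Long> mergeSorted(List<List<Long>> segments) {
        Comparator<Cursor> byHead = Comparator.comparingLong(Cursor::head);
        PriorityQueue<Cursor> heap = new PriorityQueue<>(byHead);
        int total = 0;
        for (List<Long> s : segments) {
            total += s.size();
            if (!s.isEmpty()) heap.add(new Cursor(s));
        }
        List<Long> out = new ArrayList<>(total);
        while (!heap.isEmpty()) {
            Cursor c = heap.poll();
            out.add(c.head());
            c.pos++;
            // Re-insert the cursor so the heap sees its new head element.
            if (c.pos < c.segment.size()) heap.add(c);
        }
        return out;
    }
}
```

If the segments cover disjoint time ranges, the merge degenerates into a concatenation in range order, but the extra reads per request remain.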

My thinking now is that the read path does not really need a map (the Thrift
API returns a sorted list of columns anyway), so constructing a map (in fact
a SortedMap) internally is a luxury. We could very well just use a sorted
list on the read path, which would be much faster.
(Hacking on this idea today ...)
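
A rough sketch of the sorted-list idea, assuming two column sources that are each already sorted by column name (for example a memtable slice and an on-disk slice). `SortedColumnMerge` and `Col` are hypothetical stand-ins, not Cassandra's classes, and "higher timestamp wins" is a simplification of the real reconcile rules (no tombstones or TTL handling).

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-ins for illustration; these are not Cassandra's classes.
class SortedColumnMerge {
    static final class Col {
        final String name;
        final long timestamp;
        final String value;
        Col(String name, long timestamp, String value) {
            this.name = name;
            this.timestamp = timestamp;
            this.value = value;
        }
    }

    // Merges two lists sorted by column name into one sorted list in a single
    // linear pass, instead of inserting every column into a sorted map.
    static List<Col> merge(List<Col> a, List<Col> b) {
        List<Col> out = new ArrayList<>(a.size() + b.size());
        int i = 0, j = 0;
        while (i < a.size() && j < b.size()) {
            Col x = a.get(i), y = b.get(j);
            int cmp = x.name.compareTo(y.name);
            if (cmp < 0) {
                out.add(x); i++;
            } else if (cmp > 0) {
                out.add(y); j++;
            } else {
                // Same column in both sources: simplified reconcile,
                // keep the version with the higher timestamp.
                out.add(x.timestamp >= y.timestamp ? x : y);
                i++; j++;
            }
        }
        while (i < a.size()) out.add(a.get(i++));
        while (j < b.size()) out.add(b.get(j++));
        return out;
    }
}
```

The merge is O(n) with appends to an array-backed list, whereas building a skip-list map costs O(log n) per inserted column plus node allocation.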

yang

On Thu, Jun 30, 2011 at 8:27 AM, Jeremiah Jordan <
JEREMIAH.JORDAN@morningstar.com> wrote:

> The reason to break it up is that the information will then be on different
> servers, so you can have server 1 spending time retrieving row 1, while you
> have server 2 retrieving row 2, and server 3 retrieving row 3...  So instead
> of getting 3000 things from one server, you get 1000 from 3 servers in
> parallel...
>
>  ------------------------------
> *From:* Yang [mailto:teddyyyy123@gmail.com]
> *Sent:* Wednesday, June 29, 2011 12:07 AM
> *To:* user@cassandra.apache.org
> *Subject:* Re: custom reconciling columns?
>
> OK, here is the profiling result. I think it is consistent (I have been
> trying to relearn how to use YourKit effectively ...). See the attached
> picture.
>
> Since I do not actually go through the Thrift interface, but call
> thrift.CassandraServer directly and run my code in the same JVM as
> Cassandra, with the whole thing on a single box, there is no message
> serialization/deserialization cost. But more columns still added up to
> more time.
>
> the time was spent in the ConcurrentSkipListMap operations that implement
> the memtable.
>
>
> Regarding breaking up the row, I'm not sure it would reduce my run time,
> since our requirement is to read the entire rolling-window history. (We
> already have TTL enabled, so the history is limited to a certain length,
> but it is quite long: over 1000 columns, and in some cases 5000 or more.)
> I think accessing roughly 1000 items is not an uncommon requirement for
> many applications. In our case, each column has about 30 bytes of data,
> besides metadata such as TTL and timestamp.
> At a history length of 3000, the read takes about 12 ms (remember this is
> completely in memory, with no disk access).
>
> I just took a look at the expiring-column logic. It looks like expiration
> does not come into play until the
> CassandraServer.internal_get() ==> thriftifyColumns() call path is reached,
> so the memtable access time above is still spent. So yes, breaking up the
> row would help, but only to the extent that it avoids accessing expired
> columns. (By the way, it would be nicer if this were built into the
> Cassandra code: instead of spending multiple key lookups, I would locate
> the row once, and within the row there would be different "generation"
> buckets, so that old generation buckets past expiration are not read.)
> Currently, just accessing the 3000 live columns is already quite slow.
>
> I'm trying to see whether there is an easy magic bullet: a drop-in
> replacement for ConcurrentSkipListMap...
>
> Yang
>
>
>
>
> On Tue, Jun 28, 2011 at 4:18 PM, Nate McCall <nate@datastax.com> wrote:
>
>> I agree with Aaron's suggestion on data model and query here. Since
>> there is a time component, you can split the row on a fixed duration
>> for a given user, so the row key would become userId_[timestamp
>> rounded to day].
>>
>> This provides you an easy way to roll up the information for the date
>> ranges you need since the key suffix can be created without a read.
>> This also benefits from spreading the read load over the cluster
>> instead of just the replicas since you have 30 rows in this case
>> instead of one.
>>
>> On Tue, Jun 28, 2011 at 5:55 PM, aaron morton <aaron@thelastpickle.com>
>> wrote:
>> > Can you provide some more info:
>> > - how big are the rows, e.g. number of columns and column size  ?
>> > - how much data are you asking for ?
>> > - what sort of read query are you using ?
>> > - what sort of numbers are you seeing ?
>> > - are you deleting columns or using TTL ?
>> > I would consider issues with the data churn, data model and query before
>> > looking at serialisation.
>> > Cheers
>> > -----------------
>> > Aaron Morton
>> > Freelance Cassandra Developer
>> > @aaronmorton
>> > http://www.thelastpickle.com
>> > On 29 Jun 2011, at 10:37, Yang wrote:
>> >
>> > I can see that as my user history grows, the read time grows
>> > proportionally (or faster than linearly).
>> > If my business requirements ask me to keep a month's history for each
>> > user, it could become too slow. I was suspecting that it's actually the
>> > serializing and deserializing that's taking time (I can definitely say
>> > it's CPU bound).
>> >
>> >
>> > On Tue, Jun 28, 2011 at 3:04 PM, aaron morton <aaron@thelastpickle.com>
>> > wrote:
>> >>
>> >> There is no facility to do custom reconciliation for a column. An
>> >> append-style operation would run into many of the same problems as the
>> >> Counter type, e.g. not every node may get an append, and there is a
>> >> chance of lost appends unless you go to all the trouble Counters do.
>> >>
>> >> I would go with using a row for the user and columns for each item.
>> >> Then you can have fast no-look writes.
>> >>
>> >> What problems are you seeing with the reads ?
>> >>
>> >> Cheers
>> >>
>> >>
>> >> -----------------
>> >> Aaron Morton
>> >> Freelance Cassandra Developer
>> >> @aaronmorton
>> >> http://www.thelastpickle.com
>> >>
>> >> On 29 Jun 2011, at 04:20, Yang wrote:
>> >>
>> >> > For example, say I have an application that needs to read a user's
>> >> > browsing history, and I model the user ID as the key and the history
>> >> > data within the row. With the current approach, I could model each
>> >> > visit as a column. The possible issue is that *possibly* (I'm still
>> >> > doing a lot of profiling to verify this) a lot of time is spent on
>> >> > serialization into and out of the message. Plus, I do not need the
>> >> > full features provided by the column: for example, I do not need a
>> >> > timestamp on each visit, etc. So it might be faster to put the entire
>> >> > history in a blob, where each visit takes up only a few bytes and my
>> >> > code manipulates the blob.
>> >> >
>> >> > The problem is that I still need to avoid the read-before-write, so I
>> >> > send only the latest visit and let Cassandra do the reconcile, which
>> >> > appends the visit to the blob. This needs custom reconcile behavior.
>> >> >
>> >> > Is there a way to incorporate such custom reconcile under the current
>> >> > code framework? (I see custom sorting, but no custom reconcile.)
>> >> >
>> >> > thanks
>> >> > yang
>> >>
>> >
>> >
>> >
>>
>
>
