incubator-cassandra-user mailing list archives

From "Hiller, Dean" <Dean.Hil...@nrel.gov>
Subject Re: questions related to the SSTable file
Date Tue, 17 Sep 2013 14:11:36 GMT
Netflix built file streaming into Astyanax for Cassandra specifically because writing too
big a column cell is a bad thing. The limit really depends on the use case. Do you have
servers writing thousands of 200 MB files at the same time? If so, Astyanax streaming may be
a better way to go, since it divides the file up among cells and rows.
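
To illustrate the chunking idea only, here is a minimal Python sketch. It is not Astyanax's API or its actual storage layout: the function names, the "name:bucket" row-key scheme, and the chunk_size and chunks_per_row parameters are all assumptions made for the example.

```python
from typing import Dict, Tuple

# (row_key, column_index) -> chunk bytes. Hypothetical layout, not Astyanax's.
Cells = Dict[Tuple[str, int], bytes]


def split_into_cells(name: str, data: bytes, chunk_size: int,
                     chunks_per_row: int) -> Cells:
    """Split one large payload into small cells spread over several rows."""
    if chunk_size <= 0 or chunks_per_row <= 0:
        raise ValueError("chunk_size and chunks_per_row must be positive")
    cells: Cells = {}
    for index, offset in enumerate(range(0, len(data), chunk_size)):
        # Every `chunks_per_row` chunks start a new row, so no row grows too wide.
        row_key = f"{name}:{index // chunks_per_row}"
        column = index % chunks_per_row
        cells[(row_key, column)] = data[offset:offset + chunk_size]
    return cells


def reassemble(cells: Cells) -> bytes:
    """Rebuild the payload by reading cells in (row bucket, column) order."""
    def order(item):
        (row_key, column), _chunk = item
        bucket = int(row_key.rsplit(":", 1)[1])
        return (bucket, column)

    return b"".join(chunk for _key, chunk in sorted(cells.items(), key=order))
```

With this layout no single cell is larger than chunk_size bytes, and no row holds more than chunks_per_row cells, which is the property that makes large files manageable.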

As far as I know, the limit on row size is really your hard disk space, and the column count, if I remember
correctly, goes into the billions. Realistically, though, I think going beyond 10 million columns might slow
things down a bit. All I know is that we tested up to 10 million columns with no issues in our use case.

So you mean at this time, I could get 2 SSTable files, both contain column "Blue" for the
same row key, right?

Yes

In this case, I should be fine, as the value of the "Blue" column contains the timestamp to help
me find out which is the latest change, right?

Yes

In the MR world, each file COULD be processed by a different Mapper, but both records will be sent to the same
reducer because they share the same key.

If that is the way you are writing it, then yes
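
A rough sketch of that flow in plain Python, assuming each SSTable file has already been parsed into (row_key, column, value, timestamp) tuples. This is not Hadoop or Cassandra code: map_file, shuffle and reduce_row are hypothetical names, and the shuffle step merely imitates what the MapReduce framework does when it groups mapper output by key.

```python
from collections import defaultdict
from typing import Dict, Iterable, Iterator, List, Tuple

Record = Tuple[str, str, str, int]   # (row_key, column, value, timestamp)
Cell = Tuple[str, str, int]          # (column, value, timestamp)


def map_file(records: Iterable[Record]) -> Iterator[Tuple[str, Cell]]:
    """One mapper per file: emit each cell keyed by its row key."""
    for row_key, column, value, timestamp in records:
        yield row_key, (column, value, timestamp)


def shuffle(outputs: Iterable[Iterable[Tuple[str, Cell]]]) -> Dict[str, List[Cell]]:
    """Group all mapper output by key, as the framework would before reducing."""
    grouped: Dict[str, List[Cell]] = defaultdict(list)
    for output in outputs:
        for row_key, cell in output:
            grouped[row_key].append(cell)
    return grouped


def reduce_row(row_key: str, cells: Iterable[Cell]):
    """Keep, for each column, the value with the highest timestamp."""
    latest: Dict[str, Tuple[str, int]] = {}
    for column, value, timestamp in cells:
        # Simplification: on equal timestamps the first cell seen is kept.
        if column not in latest or timestamp > latest[column][1]:
            latest[column] = (value, timestamp)
    return row_key, latest
```

Both copies of "Blue" for the same row key end up in a single reduce_row call, so the reducer can compare their timestamps and keep the newer one.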

Dean

From: Shahab Yunus <shahab.yunus@gmail.com>
Reply-To: "user@cassandra.apache.org" <user@cassandra.apache.org>
Date: Tuesday, September 17, 2013 7:54 AM
To: "user@cassandra.apache.org" <user@cassandra.apache.org>
Subject: Re: questions related to the SSTable file

derstand if following changes apply to the same row key as above example, additional SSTable
file could be generated. That is
