incubator-cassandra-user mailing list archives

From dir dir <sikerasa...@gmail.com>
Subject Re: Regarding Cassandra Scalability
Date Sat, 17 Apr 2010 13:10:25 GMT
> I think you might be forgetting just how tiny tweets are. The last numbers
> I heard, Twitter gets 55,000,000 messages a day. They've been around for
> roughly 4 years.

I read a news item on the internet saying that, in the beginning, Twitter
used MySQL as its RDBMS until it reached about 1 million tweets per day.
As twitter.com's user base exploded, twitter.com decided to use the
Cassandra database and shut down its MySQL database.

I want to ask the advanced users and experienced software developers in
this forum: why did twitter.com choose Cassandra? Would you tell me the
reasons behind twitter.com's decision (from the feature and technical
aspects)? Why did twitter.com not use Oracle 11g or db4o, for example?
(Please leave aside the Oracle 11g license fee, because we should discuss
this from the feature and technical aspects.) Thank you.


Dir.


On Fri, Apr 16, 2010 at 11:17 PM, Mike Gallamore <
mike.e.gallamore@googlemail.com> wrote:

>  On 04/16/2010 01:38 AM, dir dir wrote:
>
> I hear Facebook.com and twitter.com are using the Cassandra database. In
> my opinion Facebook and Twitter have hundreds of TB of data, because
> their users number in the hundreds of millions.
>
> I think you might be forgetting just how tiny tweets are. The last numbers
> I heard, Twitter gets 55,000,000 messages a day. They've been around for
> roughly 4 years. Even assuming they always had that number of messages
> (which isn't the case), that would still only be roughly 11TB of data if
> everyone sent the maximum tweet length. Sure, add a bit to each message for
> a timestamp and the user that posted it, but still I'd be surprised if all
> the tweets, including metadata, came to much more than 20TB.
>
> Similarly with Facebook. I think the friend list search is what they
> really use it for. Regardless, how much text is on your Facebook page?
> Maybe 1MB if you are a very, very active user. I wouldn't think they
> would load the images directly into Cassandra, but I could be wrong. I
> would suspect that they pull an old database trick and have the filesystem
> store the images while the "database" just stores the path to each one.
>
> There could be a lot of other data floating around, some of which might be
> in Cassandra, but I don't know. I think just the core data that the sites
> have said they use Cassandra for is probably in the very low 10's of TB.
>
> Lastly, sites like Facebook and Twitter count hundreds of millions of
> users, but a lot of them are people who sign in, send a few tweets or
> connect to a few friends, and then don't use the site again. When the
> company needs to make itself look valuable, it counts every single person
> who ever logged in, even if they only did it once or haven't used the site
> for years. They want to sell large numbers because advertisers and
> potential acquirers base the price on those large numbers.
>
>
> Dir.
>
>
> On Fri, Apr 16, 2010 at 1:28 PM, Linton N <gabrialmarialinton@gmail.com> wrote:
>
>> Hi,
>>          I have been working with Hadoop for the past year, but I am
>> quite new to Cassandra. I would like to clarify a few things regarding
>> the scalability of Cassandra. Can it scale up to TBs of data?
>>
>> Please provide me with some links regarding this.
>>
>>
>> --
>> --
>> With Love
>>  Lin N
>>
>
>
>
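
The storage estimate quoted above (55,000,000 messages a day for roughly 4
years, at the maximum tweet length) can be checked with a short calculation.
This is only a rough sketch: the 140-byte message size (140 characters at
one byte each), the 365-day year, and 1 TB = 10**12 bytes are assumptions
made for illustration, not figures given in the thread.

```python
# Back-of-the-envelope check of the tweet storage estimate in the thread.
# Assumptions (not from the original post): 140 bytes per maximum-length
# tweet, 365 days per year, and 1 TB = 10**12 bytes.
MESSAGES_PER_DAY = 55_000_000  # figure quoted in the thread
YEARS = 4                      # "roughly 4 years"
MAX_TWEET_BYTES = 140          # assumed: 140 characters, 1 byte each
BYTES_PER_TB = 10**12          # assumed: decimal terabytes


def estimated_terabytes(bytes_per_message: int) -> float:
    """Total storage in TB if every message had the given size."""
    total_messages = MESSAGES_PER_DAY * 365 * YEARS
    return total_messages * bytes_per_message / BYTES_PER_TB


text_only_tb = estimated_terabytes(MAX_TWEET_BYTES)
print(f"text only: {text_only_tb:.1f} TB")
```

Under these assumptions the text alone comes to a little over 11 TB, which
is consistent with the "roughly 11TB" figure in the quoted message. Passing
a larger per-message size to estimated_terabytes() shows how much room
metadata would need to take before the total approaches 20TB.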
