cassandra-user mailing list archives

From Edward Capriolo <edlinuxg...@gmail.com>
Subject Re: Facebook messaging and choice of HBase over Cassandra - what can we learn?
Date Sun, 21 Nov 2010 17:13:58 GMT
On Sun, Nov 21, 2010 at 12:10 PM, André Fiedler
<fiedler.andre@googlemail.com> wrote:
> Facebook Messaging – HBase Comes of Age
>
> http://facility9.com/2010/11/18/facebook-messaging-hbase-comes-of-age
>
> 2010/11/21 David Boxenhorn <david@lookin2.com>
>>
>> Eventual consistency is not good enough for instant messaging.
>>
>> On Sun, Nov 21, 2010 at 6:32 PM, Simon Reavely <simon.reavely@gmail.com> wrote:
>>>
>>> (Posting this to both user + dev lists)
>>>
>>> I was reviewing the blog post on the facebook engineering blog from nov 15th
>>> http://www.facebook.com/note.php?note_id=454991608919#
>>> The Underlying Technology of Messages
>>> by Kannan Muthukkaruppan <http://www.facebook.com/Kannan>
>>>
>>> As a cassandra user I think the key sentence for this community is:
>>> "We found Cassandra's eventual consistency model to be a difficult pattern
>>> to reconcile for our new Messages infrastructure."
>>>
>>> I think it would be useful to find out more about this statement from Kannan
>>> and the facebook team. Does anyone have any contacts in the Facebook team?
>>>
>>> My goal here is to understand usage patterns and whether or not the
>>> Cassandra community can learn from this decision; maybe even understand
>>> whether the Cassandra roadmap should be influenced by this decision to
>>> address a target user base. Of course we might also conclude that its just
>>> "not a Cassandra use-case"!
>>>
>>> Cheers,
>>> Simon
>>> --
>>> Simon Reavely
>>> simon.reavely@gmail.com
>>
>
>

Jonathan Ellis pointed out a term that I like better: "tunable
consistency". It seems that "eventual consistency" confuses everyone,
or else it is an easy target for an anti-Cassandra public relations
campaign. If you want consistency, use one of:

WRITE.ALL + READ.ONE (hinted handoff off)
WRITE.QUORUM + READ.QUORUM
WRITE.ONE + READ.ALL
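
A minimal sketch of why those three combinations give consistent reads,
assuming the usual rule that a read overlaps the latest write whenever
(replicas written) + (replicas read) > replication factor. The function
names below are hypothetical and are not part of Cassandra or any client
library; hinted handoff is ignored here.

```python
# Hypothetical helpers for illustration only; not a Cassandra API.
LEVELS = ("ONE", "QUORUM", "ALL")


def replicas_required(level: str, replication_factor: int) -> int:
    """Replicas that must respond for a given consistency level."""
    if level == "ONE":
        return 1
    if level == "QUORUM":
        # A quorum is a strict majority of the replicas.
        return replication_factor // 2 + 1
    if level == "ALL":
        return replication_factor
    raise ValueError(f"unknown consistency level: {level}")


def reads_see_latest_write(write_level: str, read_level: str,
                           replication_factor: int) -> bool:
    """True when every read set must intersect every write set (W + R > N)."""
    w = replicas_required(write_level, replication_factor)
    r = replicas_required(read_level, replication_factor)
    return w + r > replication_factor


if __name__ == "__main__":
    rf = 3  # assumed replication factor for the example
    for w in LEVELS:
        for r in LEVELS:
            ok = reads_see_latest_write(w, r, rf)
            print(f"WRITE.{w:<6} + READ.{r:<6} -> "
                  f"{'consistent' if ok else 'eventual'}")
```

With a replication factor of 3, WRITE.ONE + READ.QUORUM touches only
1 + 2 = 3 replicas, so it does not satisfy the rule, which is why it is
not in the list above.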

Also, I believe saying HBase is consistent is not true. This can happen:
write to region server -> region server acknowledges client -> write
to WAL -> region server fails = write lost

I wonder how Facebook will reconcile that. :)

Not trying to be nitpicky. At Hadoop World in NYC I got to sit with
lots of the HBase guys, and we all had a great time talking about the
mutual issues and happiness both of our communities share.

We cannot speak for Facebook, but they likely chose HBase because they
have several Hadoop core developers and a large Hadoop deployment. I
would say the decision was probably based on several things. The
current Cassandra release does not do online schema updates. I am sure
Facebook does not want to restart 10,000 Cassandra servers for a
schema change. The current release does not have memtable tuning per
column family. The upcoming Cassandra release has support for both of
these things and many, many more awesome things.

Facebook is on the high end of how much data they have to manage and
how many servers they have. Most people do not share that use case.
What we can learn is that Facebook chose software that was good for
them based on their use case and the experience they have in house.
That is something everyone should do.
