incubator-cassandra-user mailing list archives

From Jagan Ranganathan
Subject Re: Queuing System
Date Sun, 23 Feb 2014 08:02:11 GMT
Thanks Joe. That's a nice pointer. Will explore the possibility. I am just concerned about
the leader swap time window, but maybe that's the tradeoff between data consistency and availability.


---- On Sat, 22 Feb 2014 23:08:00 +0530 Joe Stein wrote ----

 Without them you have no durability.  

 With them you have guarantees... more than any other system with messaging features. It
is a durable CP commit log. It works very well for data pipelines with AP systems like Cassandra,
which is a different system solving different problems. When a Kafka leader fails you
might block and wait for 10ms while a new leader is elected, but writes can be guaranteed.

 The consumers then read and process data and write to Cassandra. Your app then reads
from Cassandra for what was processed.

 This is a very typical architecture at scale.
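
 The flow described above (append to a durable log, have a consumer process records and write them to a store, then read from the store) can be sketched roughly as follows. This is a minimal in-memory illustration, not Kafka or Cassandra client code: `CommitLog`, `Consumer`, the `store` dict, and the event names are all hypothetical stand-ins, and the "commit the offset only after the write" ordering is an assumption about how such a consumer would typically be arranged.

```python
from dataclasses import dataclass, field


@dataclass
class CommitLog:
    """Toy append-only log standing in for a single log partition (hypothetical)."""
    records: list = field(default_factory=list)

    def append(self, record):
        self.records.append(record)
        return len(self.records) - 1  # offset of the appended record

    def read_from(self, offset):
        # Return (offset, record) pairs starting at the given offset.
        return list(enumerate(self.records))[offset:]


@dataclass
class Consumer:
    log: CommitLog
    store: dict         # stands in for a table keyed by event id
    committed: int = 0  # next offset to read

    def poll(self):
        for offset, (event_id, payload) in self.log.read_from(self.committed):
            # "Process" the record, then upsert it by key.
            self.store[event_id] = payload.upper()
            # Advance the offset only after the write succeeded.
            self.committed = offset + 1


log = CommitLog()
log.append(("e1", "signup"))
log.append(("e2", "login"))

store = {}
consumer = Consumer(log, store)
consumer.poll()

# Simulate a crash that lost the last offset commit: the record at
# offset 1 is replayed, which is harmless because writes are upserts.
consumer.committed = 1
consumer.poll()
print(store)
```

 Keying the writes by event id is what makes a replay after a failure safe in this sketch; a consumer that appended rows instead of upserting them would produce duplicates on replay.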

  Joe Stein
  Founder, Principal Consultant
  Big Data Open Source Security LLC
  Twitter: @allthingshadoop

On Feb 22, 2014, at 11:49 AM, Jagan Ranganathan wrote:

    Hi Joe, 

 If my understanding is right, Kafka does not satisfy the high availability/replication requirement
well because of its reliance on a leader and in-sync replicas.

---- On Sat, 22 Feb 2014 22:02:27 +0530 Joe Stein wrote ----

   If performance and availability for messaging are requirements, then use Apache Kafka.
You can pass the same Thrift/Avro objects through the Kafka commit log, or strings, or whatever
you want.

  Joe Stein
  Founder, Principal Consultant
  Big Data Open Source Security LLC
  Twitter: @allthingshadoop

On Feb 22, 2014, at 11:13 AM, Jagan Ranganathan wrote:

    Hi Michael, 

 Yes, I am planning to use RabbitMQ for my messaging system. But I wonder which will give better
performance: writing directly into Rabbit with ack support, or writing to a temporary queue in Cassandra
first and then dequeuing and publishing to Rabbit.

 Weighing the complexity of handling scenarios like Rabbit connection failures against Cassandra's
write performance and replication with hinted handoff support makes me wonder whether the latter
is the better path.

---- On Sat, 22 Feb 2014 21:01:14 +0530 Michael Laing wrote ----

   We use RabbitMQ for queuing and Cassandra for persistence. 

 RabbitMQ with clustering and/or federation should meet your high availability needs.


 On Sat, Feb 22, 2014 at 10:25 AM, DuyHai Doan wrote:

   Queue-like data structures are known to be one of the worst anti patterns for Cassandra:

 On Sat, Feb 22, 2014 at 4:03 PM, Jagan Ranganathan wrote:

  I need to decouple some of the work being processed from the user thread to provide a better
user experience. For that I need a queuing system with the following properties:
    High Availability
    No Data Loss
    Better Performance

 The following are some libraries that were considered, along with the limitations I see:
    Redis - Data loss.
    ZooKeeper - Not advised for a queue system.
    TokyoCabinet/SQLite/LevelDB - Of these, LevelDB seems to perform best. With the replication
    requirement, I probably have to look at Apache ActiveMQ+LevelDB.

 After checking on the third option above, I wonder whether Cassandra with Leveled Compaction
offers a similar system. Do you see any issues with such usage, or are there other, better solutions?

 It would be great to get insights on this.






