From: Aaron Morton <aaron@thelastpickle.com>
To: user@cassandra.apache.org
Subject: Re: Using Cassandra for transaction logging, good idea?
Date: Mon, 1 Aug 2011 13:02:15 +1200
Message-Id: <3578B739-0F55-4068-9E42-4A10DF37F9A9@thelastpickle.com>

If you are doing insert only it should be OK. If you want a unique and roughly ordered transaction id, consider a TimeUUID in the first instance; they are as ordered as the clocks generating the UUIDs, which is about as good as Snowflake does. I cannot remember what resolution the two use.
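As a rough illustration (a sketch, not anything specific to your setup): a version 1 UUID is the time-based kind that Cassandra's TimeUUIDType stores, and it embeds a timestamp counted in 100-nanosecond units; Cassandra's TimeUUIDType comparator orders columns by that embedded timestamp, not by the UUID's string form. Python's standard uuid module can generate one; how you actually write it to Cassandra depends on your client library and is not shown here.

import uuid

# Version 1 (time-based) UUID, i.e. what Cassandra calls a TimeUUID.
tx_id = uuid.uuid1()
print(tx_id, tx_id.version)        # prints the id and "1"

# The embedded timestamp counts 100 ns intervals since 1582-10-15;
# Cassandra's TimeUUIDType comparator sorts columns by this value.
GREGORIAN_TO_UNIX = 0x01B21DD213814000   # offset between the two epochs, in 100 ns units
unix_seconds = (tx_id.time - GREGORIAN_TO_UNIX) / 1e7
print(unix_seconds)                # roughly the wall-clock time at generation

If memory serves, Snowflake ids embed a millisecond timestamp plus a worker id and sequence number, so both schemes are only as ordered as the clocks generating them.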
Be aware that writes are not isolated: if your write involves multiple columns, readers may see some of the columns before all are written. This may be an issue if you are doing reads around the area where the new transactions are being written.

Cheers

-----------------
Aaron Morton
Freelance Cassandra Developer
@aaronmorton
http://www.thelastpickle.com

On 1 Aug 2011, at 10:15, Kent Närling wrote:

> Sounds interesting.
>
> Reading a bit on Snowflake, it seems a bit uncertain whether it fulfills the A & B criteria?
>
> i.e.:
>
> >     A, eventually return all known transactions
> >     B, not return the same transaction more than once
>
> Also, any reflections on the general idea of using Cassandra like this?
>
> It would seem to me that if you set the write consistency very high, it should be possible to get reliability comparable to a classic transactional database?
>
> Since we are talking business transactions here, it is VERY important that once we write a transaction we know that it will not be lost or partially written etc.
> On the other hand, we also know that it is insert only (i.e. no updates) and the insert operation is atomic, so it could fit the Cassandra model quite well?
>
> Are any other people using Cassandra to store business data like this?
>
> On Sun, Jul 31, 2011 at 10:20 PM, Lior Golan [via [hidden email]] <[hidden email]> wrote:
> How about using Snowflake to generate the transaction ids: https://github.com/twitter/snowflake
>
> From: Kent Narling [mailto:[hidden email]]
> Sent: Thursday, July 28, 2011 5:46 PM
> To: [hidden email]
> Subject: Using Cassandra for transaction logging, good idea?
>
> Hi!
>
> I am considering using Cassandra for clustered transaction logging in a project.
>
> What I need are, in principle, 3 functions:
>
> 1 - Log a transaction with a unique (but possibly non-sequential) id
> 2 - Fetch a transaction with a specific id
> 3 - Fetch X new transactions "after" a specific cursor/transaction
>     This function must be guaranteed to:
>     A, eventually return all known transactions
>     B, not return the same transaction more than once
>     The order of the transaction fetches does not have to be strictly time-sorted,
>     but in practice it probably has to be based on some time-oriented order to be able to support cursors.
>
> I can see that 1 & 2 are trivial to solve in Cassandra, but is there any elegant way to solve 3?
> Since there might be multiple nodes logging transactions, their clocks might not be perfectly synchronized (to the millisecond level) etc., so sorting on time is not stable.
> Possibly creating a synchronized incremental id might be one option, but that could create a cluster bottleneck etc.?
>
> Another alternative might be to use Cassandra for 1 & 2 and then store an ordered list of ids in a standard DB. This might be a reasonable compromise since 3 is less critical from a HA point of view, but maybe someone can point me to a more elegant solution using Cassandra?
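On the third function in the quoted message, one layout worth sketching (an assumption on my part, not something settled in this thread) is a wide row per time bucket whose column names are TimeUUIDs: the cursor is then simply the last TimeUUID returned, and each fetch slices columns strictly after it. The plain-Python sketch below only models that slice to show the paging logic; the real call would be a column slice in your client with the start set just past the cursor, and the names log_transaction / fetch_after are made up for illustration.

import bisect
import uuid

# Hypothetical in-memory stand-in for one wide row: column "names" are
# the TimeUUID timestamps, values are the transaction payloads.
columns = []        # sorted timestamps, like TimeUUIDType column ordering
payloads = {}       # timestamp -> payload

def log_transaction(payload):
    """Function 1: insert with a unique, roughly time-ordered id."""
    tx_id = uuid.uuid1()
    bisect.insort(columns, tx_id.time)
    payloads[tx_id.time] = payload
    return tx_id

def fetch_after(cursor, limit):
    """Function 3: up to `limit` transactions strictly after the cursor.
    In Cassandra this would be a column slice starting just past the
    cursor; here bisect plays that role."""
    start = bisect.bisect_right(columns, cursor)
    return [(t, payloads[t]) for t in columns[start:start + limit]]

# The cursor for the next fetch is the timestamp of the last item returned.
log_transaction("tx-1")
log_transaction("tx-2")
page = fetch_after(0, 10)
next_cursor = page[-1][0] if page else 0

The weak point is criterion A: with several writers whose clocks drift, a transaction stamped slightly in the past can land behind a cursor that has already moved on. So a reader would either re-read a small overlap window and de-duplicate, or the ids would need a single ordered source, as the original question suggests.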