To: user@cassandra.apache.org
From: Aaron Morton
Subject: Re: Cassandra vs MongoDB
Date: Wed, 28 Jul 2010 14:30:43 -0700 (PDT)

> If you are looking to store web logs and then do ad hoc queries you might/should be using Hadoop (depending on how big your logs are)

I agree, take a look at Cloudera's Hadoop distribution, CDH3. It includes an app called Flume for moving data...

"As a result, we designed and built Flume. Flume is a distributed service that makes it very easy to collect and aggregate your data into a persistent store such as HDFS. Flume can read data from almost any source – log files, Syslog packets, the standard output of any Unix process – and can deliver it to a batch processing system like Hadoop or a real-time data store like HBase. All this can be configured dynamically from a single, central location – no more tedious configuration file editing and process restarting. Flume will collect the data from wherever existing applications are storing it, and whisk it away for further analysis and processing."

(I wonder if this could deliver into Cassandra :) )
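
To make the collect-and-deliver idea concrete, here is a minimal sketch of the pattern the quote describes: lines are read from several named sources and handed to a pluggable sink. This is not Flume's API. Every name in it (collect, InMemoryStore, the web01/web02 sources) is hypothetical, and the in-memory store is only a stand-in for a real target such as HDFS, HBase, or a Cassandra column family.

```python
from collections import defaultdict
from typing import Callable, Dict, Iterable, List

# Hypothetical names throughout: this illustrates the pattern, not Flume itself.
Sink = Callable[[str, str], None]  # (source_name, line) -> None


def collect(sources: Dict[str, Iterable[str]], sink: Sink) -> int:
    """Read every line from every named source and hand it to the sink.

    Returns the number of non-blank lines delivered.
    """
    delivered = 0
    for name, lines in sources.items():
        for line in lines:
            line = line.rstrip("\n")
            if line:  # skip blank lines
                sink(name, line)
                delivered += 1
    return delivered


class InMemoryStore:
    """Stand-in for a real data store; keeps one list of lines per source."""

    def __init__(self) -> None:
        self.rows: Dict[str, List[str]] = defaultdict(list)

    def write(self, source_name: str, line: str) -> None:
        # Lines are appended in arrival order under their source name.
        self.rows[source_name].append(line)


# Usage: two pretend web servers, each contributing a few log lines.
store = InMemoryStore()
sources = {
    "web01": ["GET /index.html 200\n", "\n", "GET /missing 404\n"],
    "web02": ["POST /login 302\n"],
}
count = collect(sources, store.write)
print(count, dict(store.rows))
```

Because the sink is just a callable, pointing the same collection loop at a different store would only mean passing a different write function, which is the property the question above is really asking about.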

If it's straight log file processing, Hadoop may be a better fit.

Aaron