From: Jeremy Hanna
To: user@cassandra.apache.org
Subject: Re: Cassandra vs MongoDB
Date: Wed, 28 Jul 2010 16:59:22 -0500

> "As a result, we designed and built Flume...
> (I wonder if this could deliver into Cassandra :) )

Yes - apparently it's pretty easy to do - I was thinking of doing it but haven't found the time yet.
https://issues.cloudera.org//browse/FLUME-20

On Jul 28, 2010, at 4:30 PM, Aaron Morton wrote:

>
>> If you are looking to store web logs and then do ad hoc queries you might/should be using Hadoop (depending on how big your logs are)
>
> I agree, take a look at Cloudera's Hadoop distribution (CDH3); it includes an app called Flume for moving data...
>
> "As a result, we designed and built Flume. Flume is a distributed service that makes it very easy to collect and aggregate your data into a persistent store such as HDFS.
> Flume can read data from almost any source – log files, Syslog packets, the standard output of any Unix process – and can deliver it to a batch processing system like Hadoop or a real-time data store like HBase. All this can be configured dynamically from a single, central location – no more tedious configuration file editing and process restarting. Flume will collect the data from wherever existing applications are storing it, and whisk it away for further analysis and processing."
>
> (I wonder if this could deliver into Cassandra :) )
>
> If it's straight log file processing Hadoop may be a better fit.
>
> Aaron