From: Aaron Morton
To: user@cassandra.apache.org
Subject: Re: [Q] MapReduce behavior and Cassandra's scalability for petabytes of data
Date: Fri, 22 Oct 2010 22:37:44 +1300

I may be wrong about which nodes the task is sent to.  

Others here know more about hadoop integration.

Aaron
  

On 22 Oct 2010, at 21:30, Takayuki Tsunakawa <tsunakawa.takay@jp.fujitsu.com> wrote:

Hello, Aaron,
 
Thank you for all the information (especially the pointers, which look interesting).
 
> So you would not have 1,000 tasks sent to each of the 1,000 cassandra nodes.
 
Yes, I meant that one map task would be sent to each task tracker, resulting in 1,000 concurrent map tasks across the cluster. ColumnFamilyInputFormat cannot identify which nodes actually hold data, so the job tracker will send map tasks to all 1,000 nodes. This is wasteful and time-consuming if only 200 nodes hold data for the keyspace.
 
> When the task runs on the cassandra node it will iterate through all of the rows in the specified ColumnFamily with keys in the Token range the Node is responsible for.
 
I hope ColumnFamilyInputFormat will allow us to set a KeyRange to select the rows passed to map.
 
I'll read the web pages you gave me. Thank you.
All, any other advice or comments would be appreciated.
 
Regards,
Takayuki Tsunakawa
 
----- Original Message -----
From: aaron morton
To: user@cassandra.apache.org
Sent: Friday, October 22, 2010 4:05 PM
Subject: Re: [Q] MapReduce behavior and Cassandra's scalability for petabytes of data
 

For plain old log analysis the Cloudera Hadoop distribution may be a better match. Flume is designed to help with streaming data into HDFS, the LZO compression extensions would help with the data size, and Pig would make the analysis easier (IMHO).
http://www.cloudera.com/hadoop/
http://www.cloudera.com/blog/2010/09/using-flume-to-collect-apache-2-web-server-logs/
http://www.cloudera.com/blog/2009/11/hadoop-at-twitter-part-1-splittable-lzo-compression/
 

I'll try to answer your questions; others, please jump in if I'm wrong.
 

1. Data in a keyspace will be distributed to all nodes in the cassandra cluster. AFAIK the Job Tracker should only send one task to each task tracker, and normally you would have a task tracker running on each cassandra node. The task tracker can then throttle how many concurrent tasks can run. So you would not have 1,000 tasks sent to each of the 1,000 cassandra nodes.
 

When the task runs on the cassandra node it will iterate through all of the rows in the specified ColumnFamily with keys in the Token range the Node is responsible for. If cassandra is using the RandomPartitioner, data will be spread around the cluster. So, for example, a Map-Reduce job that only wants to read the last week's data may have to read from every node. Obviously this depends on how the data is broken up between rows / columns.
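
To illustrate the point about the RandomPartitioner, here is a minimal sketch in Python. It is a simplified stand-in, not Cassandra's actual code: it assumes the token is just the MD5 hash of the row key reduced onto a 2**127 ring, that the nodes have evenly spaced tokens, and that row keys look like "date:host" (the key format, host names and node count are all made up for the example).

```python
import hashlib
from bisect import bisect_left
from datetime import date, timedelta

RING_SIZE = 2 ** 127  # size of the token ring assumed for this sketch


def token_for(key):
    """Map a row key to a token: MD5 of the key, reduced onto the ring (simplified)."""
    digest = hashlib.md5(key.encode("utf-8")).digest()
    return int.from_bytes(digest, "big") % RING_SIZE


def node_tokens(node_count):
    """Evenly spaced tokens; node i owns the range that ends at tokens[i]."""
    return [(i + 1) * RING_SIZE // node_count for i in range(node_count)]


def owner_of(token, tokens):
    """Index of the first node whose token is >= the given token."""
    return bisect_left(tokens, token)


# Hypothetical cluster of 10 nodes.
tokens = node_tokens(10)

# Hypothetical row keys for one week of logs from 100 hosts.
start = date(2010, 10, 16)
keys = [
    "{}:host{:03d}".format((start + timedelta(days=d)).isoformat(), h)
    for d in range(7)
    for h in range(100)
]

# Which nodes would a job reading only this week's rows have to touch?
touched = {owner_of(token_for(k), tokens) for k in keys}
print("nodes holding last week's rows:", sorted(touched))
```

Because the hash discards any ordering in the keys, rows for consecutive days land on unrelated parts of the ring, so even this narrow one-week slice ends up touching nodes all over the cluster.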
 
 
 

2. Some of the other people from riptano.com or rackspace may be able to help with Cassandra's outer limits. A 400 node cluster is planned: http://www.riptano.com/blog/riptano-and-digital-reasoning-form-partnership
 

Hope that helps.
Aaron