cassandra-commits mailing list archives

From Apache Wiki <wikidi...@apache.org>
Subject [Cassandra Wiki] Update of "HadoopSupport" by JonathanEllis
Date Thu, 01 Apr 2010 04:05:49 GMT
Dear Wiki user,

You have subscribed to a wiki page or wiki category on "Cassandra Wiki" for change notification.

The "HadoopSupport" page has been changed by JonathanEllis.
http://wiki.apache.org/cassandra/HadoopSupport?action=diff&rev1=4&rev2=5

--------------------------------------------------

- Cassandra version 0.6 and later support running Hadoop jobs against data in Cassandra, out
of the box.  See https://svn.apache.org/repos/asf/cassandra/trunk/contrib/word_count/ for
an example.  (Inserting the ''output'' of a Hadoop job into Cassandra has always been possible.)
 Cassandra rows or row fragments (that is, pairs of (key, `SortedMap` of columns) are input
to Map tasks for processing by your job, as specified by a `SlicePredicate` that describes
which columns to fetch from each row.  Here's how this looks in the word_count example, which
selects just one configurable columnName from each row:
+ Cassandra version 0.6 and later support running Hadoop jobs against data in Cassandra, out
of the box.  See https://svn.apache.org/repos/asf/cassandra/trunk/contrib/word_count/ for
an example.  (Inserting the ''output'' of a Hadoop job into Cassandra has always been possible.)
 Cassandra rows or row fragments (that is, pairs of key + `SortedMap` of columns) are input
to Map tasks for processing by your job, as specified by a `SlicePredicate` that describes
which columns to fetch from each row.  Here's how this looks in the word_count example, which
selects just one configurable columnName from each row:
  
  {{{
              ConfigHelper.setColumnFamily(job.getConfiguration(), KEYSPACE, COLUMN_FAMILY);
