cassandra-commits mailing list archives

From Apache Wiki <wikidi...@apache.org>
Subject [Cassandra Wiki] Trivial Update of "HadoopSupport" by SilvereLestang
Date Thu, 28 Apr 2011 12:58:07 GMT
Dear Wiki user,

You have subscribed to a wiki page or wiki category on "Cassandra Wiki" for change notification.

The "HadoopSupport" page has been changed by SilvereLestang.
The comment on this change is: Fix and add URLs.
http://wiki.apache.org/cassandra/HadoopSupport?action=diff&rev1=30&rev2=31

--------------------------------------------------

  <<Anchor(Overview)>>
  
  == Overview ==
- Cassandra 0.6+ enables certain Hadoop functionality against Cassandra's data store.  Specifically,
support has been added for [[http://hadoop.apache.org/mapreduce/|MapReduce]], [[http://pig.apache.org|Pig]]
and [[http://hive.apache.org/|Hive]].
+ Cassandra 0.6+ enables certain [[http://hadoop.apache.org/|Hadoop]] functionality against
Cassandra's data store.  Specifically, support has been added for [[http://hadoop.apache.org/mapreduce/|MapReduce]],
[[http://pig.apache.org|Pig]] and [[http://hive.apache.org/|Hive]].
  
  [[#Top|Top]]
  
@@ -22, +22 @@

  
  == MapReduce ==
  ==== Input from Cassandra ====
- Cassandra 0.6+ adds support for retrieving data from Cassandra.  This is based on implementations
of [[http://hadoop.apache.org/common/docs/current/api/org/apache/hadoop/mapreduce/InputSplit.html|InputSplit]],
[[http://hadoop.apache.org/common/docs/current/api/org/apache/hadoop/mapreduce/InputFormat.html|InputFormat]],
and [[http://hadoop.apache.org/common/docs/current/api/org/apache/hadoop/mapreduce/RecordReader.html|RecordReader]]
so that Hadoop !MapReduce jobs can retrieve data from Cassandra.  For an example of how this
works, see the contrib/word_count example in 0.6 or later.  Cassandra rows or row  fragments
(that is, pairs of key + `SortedMap`  of columns) are input to Map tasks for  processing by
your job, as specified by a `SlicePredicate`  that describes which columns to fetch from each
row.
+ Cassandra 0.6+ adds support for retrieving data from Cassandra.  This is based on implementations
of [[http://hadoop.apache.org/mapreduce/docs/current/api/org/apache/hadoop/mapreduce/InputSplit.html|InputSplit]],
[[http://hadoop.apache.org/mapreduce/docs/current/api/org/apache/hadoop/mapreduce/InputFormat.html|InputFormat]],
and [[http://hadoop.apache.org/mapreduce/docs/current/api/org/apache/hadoop/mapreduce/RecordReader.html|RecordReader]]
so that Hadoop !MapReduce jobs can retrieve data from Cassandra.  For an example of how this
works, see the contrib/word_count example in 0.6 or later.  Cassandra rows or row  fragments
(that is, pairs of key + `SortedMap`  of columns) are input to Map tasks for  processing by
your job, as specified by a `SlicePredicate`  that describes which columns to fetch from each
row.
  
  Here's how this looks in the word_count example, which selects just one  configurable columnName
from each row:
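
The word_count source is not reproduced in this notification. As a stand-in, here is a minimal, library-free Java sketch of the data model described above: each row is a key plus a `SortedMap` of columns, and a predicate picks which columns reach the map step. It does not use the Cassandra or Hadoop APIs; the class `SliceSketch` and the methods `slice` and `countWords` are hypothetical names introduced only for illustration.

```java
import java.util.Collections;
import java.util.Map;
import java.util.Set;
import java.util.SortedMap;
import java.util.TreeMap;

public class SliceSketch {

    // Hypothetical stand-in for a SlicePredicate: keep only the named columns of one row.
    static SortedMap<String, String> slice(SortedMap<String, String> row, Set<String> columnNames) {
        SortedMap<String, String> selected = new TreeMap<>();
        for (Map.Entry<String, String> column : row.entrySet()) {
            if (columnNames.contains(column.getKey())) {
                selected.put(column.getKey(), column.getValue());
            }
        }
        return selected;
    }

    // Hypothetical map step: for every (key, columns) pair, count the words
    // found in the single configured column, skipping rows that lack it.
    static Map<String, Integer> countWords(Map<String, SortedMap<String, String>> rows, String columnName) {
        Map<String, Integer> counts = new TreeMap<>();
        for (Map.Entry<String, SortedMap<String, String>> row : rows.entrySet()) {
            SortedMap<String, String> columns = slice(row.getValue(), Collections.singleton(columnName));
            String text = columns.get(columnName);
            if (text == null) {
                continue;
            }
            for (String word : text.split("\\s+")) {
                if (!word.isEmpty()) {
                    counts.merge(word, 1, Integer::sum);
                }
            }
        }
        return counts;
    }

    public static void main(String[] args) {
        SortedMap<String, String> columns = new TreeMap<>();
        columns.put("text", "hello hadoop hello cassandra");
        columns.put("author", "hello"); // not selected, so not counted

        Map<String, SortedMap<String, String>> rows = new TreeMap<>();
        rows.put("row1", columns);

        System.out.println(countWords(rows, "text"));
    }
}
```

In a real job the row pairs would come from the `InputFormat` and `RecordReader` implementations mentioned above rather than from an in-memory map.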
  
