hbase-user mailing list archives

From Rohit Kelkar <rohitkel...@gmail.com>
Subject mapreduce on two tables
Date Mon, 07 Nov 2011 11:02:17 GMT
I need some feedback on the best way to implement the following.
In my document table the row-id is the documentid, and each row stores
content:author and content:text. I want to process all documents
belonging to each author in a map-reduce job, i.e. the processing step
should receive key=author and values="all documentids written by that
author". To do this, I think I would first have to find all distinct
authors and store them, with their documentids, in another table, and
then run the map-reduce job on that second table. Am I thinking in the
right direction, or is there a better way to achieve this?
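
To make the grouping concrete, here is a minimal sketch of the key/value shape I have in mind. It is plain Java with in-memory maps standing in for the document table and for the grouping step; it does not use any HBase or Hadoop API. The names AuthorGrouping and groupByAuthor and the sample rows are made up for illustration only.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical illustration only: no HBase or Hadoop classes are involved.
class AuthorGrouping {

    // Input: documentid -> author (stand-in for row-id and content:author).
    // Output: author -> all documentids written by that author.
    static Map<String, List<String>> groupByAuthor(Map<String, String> authorByDocId) {
        // TreeMap keeps authors in sorted order so the result is deterministic.
        Map<String, List<String>> docIdsByAuthor = new TreeMap<>();
        for (Map.Entry<String, String> row : authorByDocId.entrySet()) {
            String docId = row.getKey();
            String author = row.getValue();
            // Create the author's list on first sight, then append the documentid.
            docIdsByAuthor.computeIfAbsent(author, k -> new ArrayList<>()).add(docId);
        }
        return docIdsByAuthor;
    }

    public static void main(String[] args) {
        // Made-up sample rows; LinkedHashMap preserves insertion order.
        Map<String, String> table = new LinkedHashMap<>();
        table.put("doc1", "alice");
        table.put("doc2", "bob");
        table.put("doc3", "alice");

        groupByAuthor(table).forEach(
            (author, ids) -> System.out.println(author + " -> " + ids));
    }
}
```

With the three sample rows, main prints "alice -> [doc1, doc3]" followed by "bob -> [doc2]", which is the per-author view I want each processing call to see.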
- Rohit Kelkar
