hadoop-common-user mailing list archives

From David Alves <dr-al...@criticalsoftware.com>
Subject Skip Reduce Phase
Date Thu, 07 Feb 2008 17:35:01 GMT
Hi all,
	First of all, since this is my first post, congratulations on a great
piece of software (both Hadoop and HBase).
	I've been using Hadoop and HBase for a while and I have a question. Let
me first explain my setup:

I have an HBase database holding information that I want to process in a
Map/Reduce job, but the data needs some preprocessing first.

So I built another Map/Reduce job that uses a specific (filtered)
TableInputFormat and preprocesses the information in its map phase.

As I don't need any of the intermediate phases (like the merge sort) and
I don't need to do anything in the reduce phase, I was wondering if I
could just save the map phase output and start the second Map/Reduce job
using that as its input (while still saving the splits to DFS for
backtracking/reliability reasons).

Is this possible?

Thanks in advance, and again great piece of software.
David Alves
