hadoop-common-user mailing list archives

From Mehmet Tepedelenlioglu <mehmets...@gmail.com>
Subject Re: HADOOP MapReduce sorting
Date Thu, 08 Sep 2011 17:34:40 GMT
If you have a set of key-value pairs that you want to have in the same reducer, label them
with an index key like so:

<1,RNA1-STRUCT1>
<1,RNA2-STRUCT2>
<1,RNA3-STRUCT3>

In this case RNA1, RNA2, and RNA3, with their corresponding structures, will end up in the same reducer.
So your mappers won't use RNAi as the key, but another grouping key.
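
A minimal sketch of this idea in Python, in case it helps. Sharing a key only puts the records in the same reducer; it does not by itself fix the order of the values, so this sketch assumes the mapper can also attach each sequence's position in the input (the index field) and lets the reducer sort on it. The tab-separated record layout and the names map_line, reduce_records, and predict_structure are made up for illustration; predict_structure is only a stand-in for your folding executable.

```python
from itertools import groupby

GROUP_KEY = "1"  # single grouping key, as in the <1,RNAi-STRUCTi> pairs above


def predict_structure(sequence):
    # Hypothetical placeholder for the external secondary-structure program.
    return "STRUCT(" + sequence + ")"


def map_line(index, sequence):
    """Emit one record: grouping key, input index, sequence, structure."""
    fields = [GROUP_KEY, str(index), sequence, predict_structure(sequence)]
    return "\t".join(fields)


def reduce_records(records):
    """Group by key, restore input order via the index, then concatenate."""
    parsed = [record.split("\t") for record in records]
    parsed.sort(key=lambda f: f[0])  # stand-in for the shuffle: groups by key only
    lines = []
    for _, group in groupby(parsed, key=lambda f: f[0]):
        rows = sorted(group, key=lambda f: int(f[1]))  # numeric sort on the index
        sequences = "".join("<" + f[2] + ">" for f in rows)
        structures = "".join("<" + f[3] + ">" for f in rows)
        lines.append(sequences + "\t" + structures)
    return lines


if __name__ == "__main__":
    # Records arrive out of order, as they may from independent mappers.
    shuffled = [map_line(2, "RNA-2"), map_line(1, "RNA-1"), map_line(3, "RNA-3")]
    for line in reduce_records(shuffled):
        print(line)
```

The index is sorted as an integer so that RNA-10 does not land before RNA-2. With a single grouping key all records go through one reducer, which is fine for concatenation but gives up parallelism in the reduce step.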

On Sep 8, 2011, at 10:07 AM, Daniel Yehdego wrote:

> 
> Hi, 
> I want to use an input file of RNA sequences, one per line, in which each line
> is passed to the mapper (an executable program that determines the secondary structure
> of that sequence). I am also using a reducer which concatenates the output lines from
> the mapper. My problem is that the final output is not sorted in the same order
> as the input sequences (RNA-1, RNA-2, RNA-3, ...).
> STDIN INPUT FILE: RNA-1  RNA-2  RNA-3 .....
>
> MAPPER OUTPUT:
> MAP1 <RNA-2><STRUCTURE-2>
> MAP2 <RNA-1><STRUCTURE-1>
> MAP3 <RNA-3><STRUCTURE-3>
>
> REDUCER OUTPUT:
> <RNA-2><RNA-1><RNA-3>\t<STRUCTURE-1><STRUCTURE-2><STRUCTURE-3>\n
> OR
> <RNA-3><RNA-2><RNA-1>\t<STRUCTURE-1><STRUCTURE-2><STRUCTURE-3>\n
>
> What I am looking for is to reduce in the following ordered manner:
> <RNA-1><RNA-2><RNA-3>\t<STRUCTURE-1><STRUCTURE-2><STRUCTURE-3>\n
>
> Looking forward to your input.
> 
> Regards, 
> 
> Daniel T. Yehdego
> Computational Science Program 
> University of Texas at El Paso, UTEP 
> dtyehdego@miners.utep.edu 		 	   		  

