hadoop-common-user mailing list archives

From C G <parallel...@yahoo.com>
Subject JOIN-type operations with Hadoop...
Date Thu, 13 Sep 2007 14:10:47 GMT
Consider two row-based files.  The first has fields:
   
      A B C
   
  the second has fields:
   
     B D E 
   
  I want to join these files on the key B, to create records of the form:
   
    A B C D E
   
  So B can be thought of as a primary key, and the second file will only contain distinct values of
B, i.e. no repeats.
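
  To make the desired result concrete, here is a small sketch of the join semantics in plain Python (not Hadoop code). The function name join_on_b and the sample values are made up for illustration, and dropping first-file rows whose B has no match in the second file is an assumption on my part:

```python
def join_on_b(first_rows, second_rows):
    """Join (A, B, C) rows with (B, D, E) rows on B, giving (A, B, C, D, E)."""
    # B is unique in the second file, so one dict entry per B is enough.
    lookup = {b: (d, e) for b, d, e in second_rows}
    joined = []
    for a, b, c in first_rows:
        # Assumption: rows with no matching B are dropped (inner join).
        if b in lookup:
            d, e = lookup[b]
            joined.append((a, b, c, d, e))
    return joined


# Hypothetical sample data; B may repeat in the first file only.
first = [("a1", "b1", "c1"), ("a2", "b2", "c2"), ("a3", "b1", "c3")]
second = [("b1", "d1", "e1"), ("b2", "d2", "e2")]
result = join_on_b(first, second)
```

  This only works in memory on one machine; what I am after is the equivalent when both files are large and processed as Hadoop inputs.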
   
  I'm trying to reason through how to do this type of join operation in Hadoop but am unsure
how to proceed with different "types" of files, i.e. inputs whose records have different fields.
   
  Does the community have any wisdom to share?
   
  Thanks,
  C G

       