hive-dev mailing list archives

From Suraj Nayak
Subject Reading 2 table data in MapReduce for Performing Join
Date Mon, 16 Mar 2015 19:16:36 GMT

I tried reading data via HCatalog for 1 Hive table in MapReduce using
something similar to
I was able to read it successfully.

Now I am trying to read 2 tables, since the requirement is to join them. I
did not find an API similar to *FileInputFormat.addInputPaths* in
*HCatInputFormat*. What is the equivalent in HCatalog?

I have previously performed a join using FileInputFormat on HDFS (by getting
split information in the mapper). This article helped me code the join.
Can someone suggest how I can perform a join operation using HCatalog?

Briefly, the aim is to

   - Read 2 tables (with almost identical schemas).
   - If a key exists in both tables, send its records to the same reducer.
   - Do some processing on the records in the reducer.
   - Save the output to a file or a Hive table.
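
To make the steps above concrete, here is a minimal sketch of the reduce-side join logic in plain Java, with no Hadoop or HCatalog dependencies. It only simulates the idea: each record is tagged with its source table, records are grouped by key (as the shuffle would do), and the "reducer" emits output only for keys present in both tables. The class name, the table tags `table_a` and `table_b`, and the sample rows are hypothetical and chosen for illustration; this is not the HCatalog API.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

public class ReduceSideJoinSketch {

    /** A value tagged with the (hypothetical) table it was read from. */
    static final class Tagged {
        final String table;
        final String value;

        Tagged(String table, String value) {
            this.table = table;
            this.value = value;
        }
    }

    /** Simulated map + shuffle: tag every row and group it under its join key. */
    static void mapAndShuffle(String table, Map<String, String> rows,
                              Map<String, List<Tagged>> grouped) {
        for (Map.Entry<String, String> row : rows.entrySet()) {
            grouped.computeIfAbsent(row.getKey(), k -> new ArrayList<>())
                   .add(new Tagged(table, row.getValue()));
        }
    }

    /** Simulated reduce: pair up values from both tables for a single key. */
    static List<String> reduce(String key, List<Tagged> values) {
        List<String> fromA = new ArrayList<>();
        List<String> fromB = new ArrayList<>();
        for (Tagged t : values) {
            if (t.table.equals("table_a")) {
                fromA.add(t.value);
            } else {
                fromB.add(t.value);
            }
        }
        // If either side is empty, the key is not in both tables: emit nothing.
        List<String> out = new ArrayList<>();
        for (String a : fromA) {
            for (String b : fromB) {
                out.add(key + "\t" + a + "\t" + b);
            }
        }
        return out;
    }

    /** Runs the simulated job over two in-memory "tables" keyed by join key. */
    static List<String> join(Map<String, String> tableA, Map<String, String> tableB) {
        Map<String, List<Tagged>> grouped = new TreeMap<>(); // sorted by key, like the shuffle
        mapAndShuffle("table_a", tableA, grouped);
        mapAndShuffle("table_b", tableB, grouped);
        List<String> out = new ArrayList<>();
        for (Map.Entry<String, List<Tagged>> e : grouped.entrySet()) {
            out.addAll(reduce(e.getKey(), e.getValue()));
        }
        return out;
    }

    public static void main(String[] args) {
        Map<String, String> a = Map.of("k1", "a1", "k2", "a2");
        Map<String, String> b = Map.of("k2", "b2", "k3", "b3");
        // Only k2 appears in both sample tables.
        for (String line : join(a, b)) {
            System.out.println(line);
        }
    }
}
```

In a real job the tag would presumably be attached in each table's mapper, and the per-key processing would replace the simple pairing done in `reduce`.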

*P.S.: The reason for using MapReduce to perform the join is a complex
requirement that can't be solved via Hive/Pig directly.*

Any help will be greatly appreciated :)

Thanks & Regards
Suraj Nayak M
