hadoop-general mailing list archives

From Oded Rosen <o...@legolas-media.com>
Subject MultipleInputs in 0.20
Date Sun, 09 May 2010 21:08:37 GMT
From what I've gathered from various sites around the web (the Hadoop wiki,
Cloudera <http://www.cloudera.com/blog/2009/05/what%E2%80%99s-new-in-hadoop-core-020/>,
the mail archives, etc.),
the MultipleInputs class that was available in Hadoop 0.18-0.19
was not ported to the new 0.20 API.
(Neither was MultipleOutputs, but that's another story.)

I would like to know whether there is a way around this: to use two different
paths with two different input formats (sequence file, text file) as sources
for the same job,
with a dedicated mapper for each input type, using the Hadoop 0.20 API. I think
that writing a new job against the 0.19 API only means more trouble later, once
it is officially deprecated.
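
To make the request concrete, here is a minimal sketch of the setup as it
would look with the old API, assuming the class
org.apache.hadoop.mapred.lib.MultipleInputs from 0.19 is on the classpath.
The job class, the two mapper classes, and the three path arguments are
hypothetical names used only for illustration; the sequence file is assumed
to hold Text keys and Text values.

```java
import java.io.IOException;

import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapred.FileOutputFormat;
import org.apache.hadoop.mapred.JobClient;
import org.apache.hadoop.mapred.JobConf;
import org.apache.hadoop.mapred.MapReduceBase;
import org.apache.hadoop.mapred.Mapper;
import org.apache.hadoop.mapred.OutputCollector;
import org.apache.hadoop.mapred.Reporter;
import org.apache.hadoop.mapred.SequenceFileInputFormat;
import org.apache.hadoop.mapred.TextInputFormat;
import org.apache.hadoop.mapred.lib.MultipleInputs;

// Hypothetical job: two inputs, two formats, one mapper per input (old API).
public class TwoInputsJob {

    // Hypothetical mapper for the plain-text input (byte offset, line).
    public static class TextLineMapper extends MapReduceBase
            implements Mapper<LongWritable, Text, Text, Text> {
        public void map(LongWritable offset, Text line,
                        OutputCollector<Text, Text> out, Reporter reporter)
                throws IOException {
            out.collect(new Text("text"), line);
        }
    }

    // Hypothetical mapper for the sequence-file input (assumed Text/Text).
    public static class SeqRecordMapper extends MapReduceBase
            implements Mapper<Text, Text, Text, Text> {
        public void map(Text key, Text value,
                        OutputCollector<Text, Text> out, Reporter reporter)
                throws IOException {
            out.collect(key, value);
        }
    }

    public static void main(String[] args) throws IOException {
        JobConf conf = new JobConf(TwoInputsJob.class);
        conf.setJobName("two-inputs");

        // Each call binds one path to its own input format and mapper.
        MultipleInputs.addInputPath(conf, new Path(args[0]),
                TextInputFormat.class, TextLineMapper.class);
        MultipleInputs.addInputPath(conf, new Path(args[1]),
                SequenceFileInputFormat.class, SeqRecordMapper.class);

        // Both mappers emit Text/Text; no reducer is set, so the
        // old API's default identity reducer is used.
        conf.setOutputKeyClass(Text.class);
        conf.setOutputValueClass(Text.class);
        FileOutputFormat.setOutputPath(conf, new Path(args[2]));

        JobClient.runJob(conf);
    }
}
```

Everything here lives in org.apache.hadoop.mapred (JobConf, MapReduceBase,
OutputCollector), which is exactly the dependency on the old API that the
question is trying to avoid.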

I saw there is a JIRA issue
(MAPREDUCE-1170) <https://issues.apache.org/jira/browse/MAPREDUCE-1170> open
for this, with a patch, marked as "Won't Fix".
If someone out there can help me with this, I would be most thankful.

Cheers,
-- 
Oded
