hadoop-common-user mailing list archives

From "W.P. McNeill" <bill...@gmail.com>
Subject Re: How do I read from tab-delimited text files in the new Hadoop API?
Date Wed, 06 Apr 2011 21:19:20 GMT
I'm using Cloudera Hadoop version 0.20.2_320.  How do I read from raw
tab-delimited text input files using this version?

I'm surprised that the answer isn't obvious. This seems like a huge feature
for the Hadoop API to drop.

On Wed, Apr 6, 2011 at 2:05 PM, Harsh J <qwertymaniac@gmail.com> wrote:

> What version of Hadoop are you using? 0.20.x? 0.20.x is a bit lacking
> in New API components. The "deprecated" (not anymore from 0.20.3+),
> stable API is probably better to use if you are on Apache Hadoop
> 0.20.x.
>
> On Thu, Apr 7, 2011 at 2:29 AM, W.P. McNeill <billmcn@gmail.com> wrote:
> > I have some raw text input files in which both key and value are Text and
> > are delimited by a tab character. In the old API I would
> > use KeyValueTextInputFormat, but it appears to no longer be supported. How
> > do I handle this kind of input in the new API?
> >
>
> KeyValueTextInputFormat for the new API does come in 0.21 and further:
>
> http://hadoop.apache.org/mapreduce/docs/r0.21.0/api/org/apache/hadoop/mapreduce/lib/input/KeyValueTextInputFormat.html
>
> --
> Harsh J
> http://harshj.com
>
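
For anyone who needs to stay on the new API in 0.20.x, one possible
workaround (an assumption, not something confirmed in this thread) is to
read lines with the new-API TextInputFormat and split each line on the
first tab inside the mapper. The sketch below shows only the splitting
step in plain Java, so it runs without Hadoop on the classpath.
TabSeparatedLine and split are hypothetical names. Treating a line with no
tab as "whole line is the key, value is empty" is intended to mirror how
KeyValueTextInputFormat is understood to behave.

```java
public class TabSeparatedLine {

    /**
     * Splits a line into {key, value} at the FIRST tab character.
     * Hypothetical helper for illustration; not part of the Hadoop API.
     * Any later tabs stay inside the value. If the line has no tab,
     * the whole line becomes the key and the value is empty.
     */
    public static String[] split(String line) {
        int tab = line.indexOf('\t');
        if (tab < 0) {
            return new String[] { line, "" };
        }
        return new String[] { line.substring(0, tab), line.substring(tab + 1) };
    }

    public static void main(String[] args) {
        String[] kv = split("apple\tred\tround");
        System.out.println("key=" + kv[0]);
        System.out.println("value=" + kv[1]);
    }
}
```

In a mapper declared with LongWritable keys and Text values, the same
logic would be applied to value.toString(), and the two halves written out
as Text key and Text value.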
