hadoop-common-user mailing list archives

From Ted Yu <yuzhih...@gmail.com>
Subject Re: Fully distribute TextInputFormat...
Date Mon, 10 May 2010 16:35:39 GMT
NLineInputFormat seems to fit your need.
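
For illustration, here is a rough sketch of the idea behind NLineInputFormat: instead of handing a whole block of the file to one map task, the input is cut into splits of N lines each, and every split goes to its own task. The code below is plain Java, not Hadoop code, and the class and method names (LineSplitSketch, splitByLines) are made up for the example. In a real job you would set NLineInputFormat as the input format and leave the lines-per-map setting (mapred.line.input.format.linespermap in the old mapred API) at its default of 1 to get one task per line.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical illustration only; this is not Hadoop's implementation.
class LineSplitSketch {

    // Group the input lines into chunks of at most linesPerSplit lines,
    // one chunk per (simulated) map task.
    static List<List<String>> splitByLines(List<String> lines, int linesPerSplit) {
        if (linesPerSplit < 1) {
            throw new IllegalArgumentException("linesPerSplit must be >= 1");
        }
        List<List<String>> splits = new ArrayList<>();
        for (int start = 0; start < lines.size(); start += linesPerSplit) {
            int end = Math.min(start + linesPerSplit, lines.size());
            splits.add(new ArrayList<>(lines.subList(start, end)));
        }
        return splits;
    }

    public static void main(String[] args) {
        List<String> lines = Arrays.asList("job-1", "job-2", "job-3");
        // With one line per split, each line becomes its own task.
        List<List<String>> splits = splitByLines(lines, 1);
        System.out.println(splits.size() + " splits: " + splits);
    }
}
```

Keep in mind that one task per line only pays off when each line represents a fair amount of work, since every task carries its own startup cost.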
On Mon, May 10, 2010 at 6:05 AM, Pierre ANCELOT <pierreact@gmail.com> wrote:

> Simple, plain ASCII text. One line == one task to run.
>
>
>
> On Mon, May 10, 2010 at 2:52 PM, Jeff Zhang <zjffdu@gmail.com> wrote:
>
> > What's the format of this file? gzip cannot be split.
> >
> >
> >
> > On Mon, May 10, 2010 at 5:21 AM, Pierre ANCELOT <pierreact@gmail.com>
> > wrote:
> > > Hi folks :)
> > > I have one big file... I read it with FileInputFormat, but this generates
> > > only one task, so of course the work doesn't get distributed across the
> > > cluster nodes.
> > > Should I use another Input class, or do I have a bug in my implementation?
> > >
> > > The desired behavior is one task per line.
> > >
> > > Thanks.
> > >
> > >
> > >
> > > --
> > > http://www.neko-consulting.com
> > > Ego sum quis ego servo
> > > "Je suis ce que je protège"
> > > "I am what I protect"
> > >
> >
> >
> >
> > --
> > Best Regards
> >
> > Jeff Zhang
> >
>
>
>
>
