hadoop-common-user mailing list archives

From Alex Baranov <alex.barano...@gmail.com>
Subject Re: Fully distribute TextInputFormat...
Date Mon, 10 May 2010 20:27:11 GMT
If I'm not mistaken, LZO compression is the better fit when splitting is
needed, not gzip.
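
To illustrate the point about splitting, here is a minimal Python sketch (not
Hadoop code). It assumes a simplified rule in which a line belongs to the
split where its first byte falls. The helper names read_split and
can_decompress_from are made up for this example. Plain text can be picked up
at any byte offset by skipping to the next newline, while a gzip stream
cannot be entered in the middle.

```python
import gzip
import zlib


def read_split(data: bytes, start: int, end: int) -> list:
    """Return the lines whose first byte lies in [start, end) of plain text.

    Hypothetical helper: a reader that does not begin at offset 0 skips the
    partial line it lands in, because the previous split owns that line.
    """
    pos = start
    if start > 0:
        # Look for the newline that ends the line covering byte start - 1.
        nl = data.find(b"\n", start - 1)
        if nl == -1:
            return []
        pos = nl + 1
    lines = []
    while pos < end and pos < len(data):
        nl = data.find(b"\n", pos)
        stop = len(data) if nl == -1 else nl + 1
        lines.append(data[pos:stop])
        pos = stop
    return lines


def can_decompress_from(blob: bytes, offset: int) -> bool:
    """Try to start gzip decompression at an arbitrary byte offset."""
    try:
        # wbits=31 tells zlib to expect a gzip header at the start.
        zlib.decompressobj(31).decompress(blob[offset:])
        return True
    except zlib.error:
        return False


if __name__ == "__main__":
    text = b"".join(b"line %d\n" % i for i in range(100))
    mid = len(text) // 2

    # Two independent readers cover the plain-text file without overlap.
    first = read_split(text, 0, mid)
    second = read_split(text, mid, len(text))
    print(len(first), len(second), first + second == text.splitlines(True))

    # The same file gzipped can only be read from the beginning.
    blob = gzip.compress(text)
    print(can_decompress_from(blob, 0), can_decompress_from(blob, len(blob) // 2))
```

The second reader never needs the bytes before its offset, which is what lets
separate tasks work on one uncompressed file. The gzip stream has no such
entry points, so the whole file ends up in a single task.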

Alex Baranau

http://sematext.com

On Mon, May 10, 2010 at 3:52 PM, Jeff Zhang <zjffdu@gmail.com> wrote:

> What's the format of this file ? gzip can been split.
>
>
>
> On Mon, May 10, 2010 at 5:21 AM, Pierre ANCELOT <pierreact@gmail.com>
> wrote:
> > Hi folks :)
> > I have one big file... I read it with FileInputFormat, this generates only
> > one task and of course, this doesn't get distributed across the cluster
> > nodes.
> > Should I use another Input class or do I have a bug in my implementation?
> >
> > The desired behavior is one task per line.
> >
> > Thanks.
> >
> >
> >
> > --
> > http://www.neko-consulting.com
> > Ego sum quis ego servo
> > "Je suis ce que je protège"
> > "I am what I protect"
> >
>
>
>
> --
> Best Regards
>
> Jeff Zhang
>
