hadoop-common-user mailing list archives

From Ted Yu <yuzhih...@gmail.com>
Subject Re: problem w/ data load
Date Sun, 02 May 2010 20:52:52 GMT
You can find a sample config at
http://code.google.com/p/hadoop-gpl-compression/wiki/FAQ
Look for the io.compression.codecs property there.
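
As a rough sketch of what such an entry in core-site.xml can look like: the thread does not say which compression format the files use, so the codec list below is an assumption. The first three classes ship with Hadoop; the two com.hadoop.compression.lzo classes come from the separately installed hadoop-gpl-compression package and should be dropped if it is not present on the cluster.

```xml
<!-- core-site.xml fragment (illustrative; list only codecs installed on your cluster) -->
<property>
  <name>io.compression.codecs</name>
  <!-- Assumed list: the built-in codecs plus the LZO codecs from hadoop-gpl-compression -->
  <value>org.apache.hadoop.io.compress.DefaultCodec,org.apache.hadoop.io.compress.GzipCodec,org.apache.hadoop.io.compress.BZip2Codec,com.hadoop.compression.lzo.LzoCodec,com.hadoop.compression.lzo.LzopCodec</value>
</property>
<property>
  <!-- Needed only when the LZO codecs above are listed -->
  <name>io.compression.codec.lzo.class</name>
  <value>com.hadoop.compression.lzo.LzoCodec</value>
</property>
```

Hadoop picks the codec by file name suffix (for example .gz or .lzo), so the compressed files should keep their original extension when they are put into HDFS.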

On Sun, May 2, 2010 at 1:28 PM, Susanne Lehmann <
susanne.lehmann@metamarketsgroup.com> wrote:

> No, I didn't. Can you specify what exactly I have to do?
> Thank you so much for your help!
>
>
>
>
> On Sun, May 2, 2010 at 1:11 PM, Ted Yu <yuzhihong@gmail.com> wrote:
> > Did you add the codec for the compressed files to io.compression.codecs in
> > the Hadoop configuration file (core-site.xml)?
> >
> > On Sun, May 2, 2010 at 11:22 AM, Susanne Lehmann <
> > susanne.lehmann@metamarketsgroup.com> wrote:
> >
> >> Hi,
> >>
> >> I want to load data from HDFS into Hive; the data is in compressed files.
> >> The data is stored in flat files, and the delimiter is ^A (ctrl-A).
> >> As long as I use uncompressed files, everything works fine. Since
> >> ctrl-A is the default delimiter, I don't even need to specify
> >> it. I do the following:
> >>
> >>
> >> hadoop dfs -put /test/file new
> >>
> >> hive>  DROP TABLE test_new;
> >> OK
> >> Time taken: 0.057 seconds
> >> hive>    CREATE TABLE test_new(
> >>    >        bla  int,
> >>    >        bla            string,
> >>    >        etc
> >>    >        bla      string);
> >> OK
> >> Time taken: 0.035 seconds
> >> hive> LOAD DATA INPATH "/test/file" INTO TABLE test_new;
> >> Loading data to table test_new
> >> OK
> >> Time taken: 0.063 seconds
> >>
> >> But if I do the same with the same file compressed, it no longer
> >> works. I tried many different table definitions with the
> >> delimiter specified, but none of them helped. The load itself succeeds,
> >> but the data is always NULL, so I conclude there is a delimiter problem.
> >>
> >>  Any help is greatly appreciated!
> >>
> >
>
