hive-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Petter von Dolwitz (Hem)" <petter.von.dolw...@gmail.com>
Subject Re: problem with delimiters (control A)
Date Thu, 22 May 2014 15:00:46 GMT
Hi Jack,

could it be that your source file does not use \001 as delimiter but the
character sequence "^A"? I get this feeling when looking at your cat
printout and also when you say you can do a search and replace. If I do a
print out of a file with the same delimiter I get the following

2014-04-097846478510
2014-04-107851178558

These are actually three columns but you cannot see the ^A because it is
not a printable character. Check your file in a hex editor or something to
verify that you have the correct delimiter.

Br,
Petter











2014-05-18 6:40 GMT+02:00 Jack Yang <jiey@uow.edu.au>:

> Hi All,
>
> I have a local file called mytest.txt (restored in hdfs already). The
> content is like this:
>
> $ cat -A  HDFSLOAD_DIR/mytest.txt
>
> 49139801^A25752451^Aunknown$
>
> 49139801^A24751754^Aunknown$
>
> 49139801^A2161696^Anice$
>
>
>
> To load this raw data above, I then defined the table like this in HQL:
>
>
>
> create table my_test(
>
> userid          BIGINT,
>
> movieId         BIGINT,
>
> comment         STRING
>
> )
>
> ROW FORMAT DELIMITED
>
> FIELDS TERMINATED BY '\001'
>
> STORED AS TEXTFILE;
>
>
>
> My problem is that when I “SELECT * FROM my_test;” , I got this:
>
>
>
> NULL    NULL    NULL
>
> NULL    NULL    NULL
>
> NULL    NULL    NULL
>
>
>
> I then replace “^A” with “^” in mytest.txt, and also re-defined my table
> structure by using :
>
> FIELDS TERMINATED BY '^’
>
>
>
> So when I select all, I got the correct results.
>
>
>
> Any thoughts??? Thanks in advance.
>
>
>
>
>
> Best regards,
>
> Jack
>
>
>
>
>

Mime
View raw message