hive-dev mailing list archives

From "a bc (JIRA)" <j...@apache.org>
Subject [jira] [Updated] (HIVE-5573) sql result messes up if char '0x0d ' in the database data file
Date Thu, 17 Oct 2013 05:40:45 GMT

     [ https://issues.apache.org/jira/browse/HIVE-5573?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

a bc updated HIVE-5573:
-----------------------

    Description: 
A select statement returns an incorrect, garbled result when the table's data file contains a '0x0d' (carriage return) char.


================================
Test case:
The table is defined as:
hive> desc tblfoo;
OK
sessionid       string
userid  string
groupid int
docid   string
channeltype     string
licensetype     string
issuitelicense  string
activationstatus        string
licensedproduct string


The *wrong* result when selecting from this table:
hive> select * from tblfoo;
OK
7B179D5F-6D1A-4B00-89CC-5FE9F64A6D74    9EF52B8C-10D2-4C79-9460-6C295C9D5E7A   90       SUBSCRIPTION    Trial   SUITE   NULL    NULL
        Trial   NULL    NULL    NULL    NULL    NULL    NULL    NULL
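
A possible explanation for the split row, offered as an assumption rather than a confirmed diagnosis: if the table uses Hive's default text storage, the underlying Hadoop line reader ends a record at a lone CR (0x0d) as well as at LF. The sketch below imitates that rule in plain shell on a copy of the row rebuilt from the hex dump in this report. `fields_per_record` is a hypothetical helper, not part of Hive.

```shell
#!/bin/sh
# Field separator (Ctrl-A, 0x01) and the stray carriage return (0x0d).
soh=$(printf '\001')
cr=$(printf '\015')

# The single row from li.dat, rebuilt from the od dump: CR sits right after SUITE.
row="7B179D5F-6D1A-4B00-89CC-5FE9F64A6D74${soh}9EF52B8C-10D2-4C79-9460-6C295C9D5E7A${soh}9${soh}0${soh}SUBSCRIPTION${soh}Trial${soh}SUITE${cr}${soh}Trial${soh}NOVALUE"

# Hypothetical helper: treat CR as a record terminator (as LF is), then
# print how many Ctrl-A separated fields each resulting record has.
fields_per_record() {
    tr '\015' '\n' | awk -F "$soh" '{ print NF }'
}

printf '%s\n' "$row" | fields_per_record
```

Under that assumption the row splits into one record with 7 fields and one with 3. That is consistent with the output above: the 9-column table pads the missing columns with NULL, and the third field of the second record (NOVALUE) cannot be read as an int.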


The *right* result should be:
hive> select * from tblfoo;
OK
7B179D5F-6D1A-4B00-89CC-5FE9F64A6D74    9EF52B8C-10D2-4C79-9460-6C295C9D5E7A   90       SUBSCRIPTION    Trial   SUITE   Trial   NOVALUE


If I remove the '0x0d' char from the data file, I get the correct result.
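
The workaround described above (dropping the 0x0d byte before loading) could be scripted roughly as follows. This is a minimal sketch, not a fix for Hive itself: it assumes carriage returns carry no meaning in the data, and `strip_cr` and the demo file names are hypothetical.

```shell
#!/bin/sh
# Sketch: remove every carriage return (0x0d) from a data file before loading it.
# Assumption: CR bytes are noise in this data set, not meaningful field content.
strip_cr() {
    # $1 = input file, $2 = output file
    tr -d '\015' < "$1" > "$2"
}

# Demo on a small file shaped like the tail of the row in this report:
# fields separated by Ctrl-A (octal 001), with a stray CR (octal 015) after SUITE.
demo_dir=$(mktemp -d)
printf 'Trial\001SUITE\015\001Trial\001NOVALUE\n' > "$demo_dir/with_cr.dat"
strip_cr "$demo_dir/with_cr.dat" "$demo_dir/without_cr.dat"

# The cleaned copy should be exactly one byte shorter.
wc -c < "$demo_dir/with_cr.dat"
wc -c < "$demo_dir/without_cr.dat"
```

The cleaned file can then be used in place of the original in the load data step.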

================================
Steps to reproduce this bug:
1. create table:
hive> create table tblFoo (
    > sessionid               string,
    > userid                  string,
    > groupid                 int   ,
    > docid                   string,
    > channeltype             string,
    > licensetype             string,
    > issuitelicense          string,
    > activationstatus        string,
    > licensedproduct         string);

2. hive> load data local inpath '/tmp/li.dat' overwrite into table tblFoo;
3. hive> select * from tblfoo;


================================
Example files (attached):
1. li.dat is the data file that contains a '0x0d' char (it appears in the word 010d at offset 0000140, since od -x prints byte-swapped 16-bit words here)
-bash-4.1$ od -x li.dat
0000000 4237 3731 4439 4635 362d 3144 2d41 4234
0000020 3030 382d 4339 2d43 4635 3945 3646 4134
0000040 4436 3437 3901 4645 3235 3842 2d43 3031
0000060 3244 342d 3743 2d39 3439 3036 362d 3243
0000100 3539 3943 3544 3745 0141 0139 0130 5553
0000120 5342 5243 5049 4954 4e4f 5401 6972 6c61
0000140 5301 4955 4554 010d 7254 6169 016c 4f4e
0000160 4156 554c 0a45
0000166
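
Because od -x groups bytes into 16-bit words, the 0x0d byte is easy to miss in the dump above. A byte-wise dump makes it easier to locate. The following is a sketch only: `first_cr_offset` is a hypothetical helper, and the demo file is rebuilt here from the dump rather than being the attached li.dat.

```shell
#!/bin/sh
# Hypothetical helper: print the 0-based byte offset of the first 0x0d in a file
# (prints nothing if the file contains no CR).
first_cr_offset() {
    # od -tx1 prints one hex pair per byte; put each pair on its own line and count.
    od -An -v -tx1 "$1" | tr -s ' ' '\n' |
        awk 'NF { if ($1 == "0d") { print n + 0; exit } n++ }'
}

# Rebuild the row from the dump: Ctrl-A (0x01) separators, CR (0x0d) after SUITE.
soh=$(printf '\001')
cr=$(printf '\015')
demo_file=$(mktemp)
printf '%s\n' "7B179D5F-6D1A-4B00-89CC-5FE9F64A6D74${soh}9EF52B8C-10D2-4C79-9460-6C295C9D5E7A${soh}9${soh}0${soh}SUBSCRIPTION${soh}Trial${soh}SUITE${cr}${soh}Trial${soh}NOVALUE" > "$demo_file"

wc -c < "$demo_file"           # total size in bytes
first_cr_offset "$demo_file"   # offset of the stray CR
```

For the rebuilt row this reports a size of 118 bytes, matching the ls -l listing for li.dat, and a CR at offset 102 (octal 0000146), which falls in the 0000140 line of the dump.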

2. li2.dat is the same data file without the '0x0d' char.
-bash-4.1$ od -x li2.dat
0000000 4237 3731 4439 4635 362d 3144 2d41 4234
0000020 3030 382d 4339 2d43 4635 3945 3646 4134
0000040 4436 3437 3901 4645 3235 3842 2d43 3031
0000060 3244 342d 3743 2d39 3439 3036 362d 3243
0000100 3539 3943 3544 3745 0141 0139 0130 5553
0000120 5342 5243 5049 4954 4e4f 5401 6972 6c61
0000140 5301 4955 4554 5401 6972 6c61 4e01 564f
0000160 4c41 4555 000a

-bash-4.1$ ls -l
-rwxr-xr-x  1 hdfs    hdfs     118 Oct 17 10:32 li.dat
-rw-r--r--  1 hdfs    hdfs     117 Oct 17 11:08 li2.dat




  was:
with select statement, the returned result is not correct, totally messed up!

For example:
for table, we have:
hive> desc li2;
OK
sessionid       string
userid  string
groupid int
docid   string
channeltype     string
licensetype     string
issuitelicense  string
activationstatus        string
licensedproduct string

The real result when select this table :
hive> select * from tblfoo;
OK
7B179D5F-6D1A-4B00-89CC-5FE9F64A6D74    9EF52B8C-10D2-4C79-9460-6C295C9D5E7A   90       SUBSCRIPTION    Trial   SUITE   NULL    NULL
        Trial   NULL    NULL    NULL    NULL    NULL    NULL    NULL

and the right result should be:
hive> select * from tblfoo;
OK
7B179D5F-6D1A-4B00-89CC-5FE9F64A6D74    9EF52B8C-10D2-4C79-9460-6C295C9D5E7A   90       SUBSCRIPTION    Trial   SUITE   Trial   NOVALUE

If I remove the char '0x0d' in the database data file, then I can get the correct result.


Steps to reproduce this bug:
1. create table with createTable.hql
2. hive> load data local inpath '/tmp/li.dat' overwrite into table tblFoo;
3. hive> select * from tblfoo;

And I will attach some example files:
1.  create the table:
hive> create table tblFoo (
    > sessionid               string,
    > userid                  string,
    > groupid                 int   ,
    > docid                   string,
    > channeltype             string,
    > licensetype             string,
    > issuitelicense          string,
    > activationstatus        string,
    > licensedproduct         string);


2. li.dat is the database file with a '0x0d' char in file
-bash-4.1$ od -x li.dat
0000000 4237 3731 4439 4635 362d 3144 2d41 4234
0000020 3030 382d 4339 2d43 4635 3945 3646 4134
0000040 4436 3437 3901 4645 3235 3842 2d43 3031
0000060 3244 342d 3743 2d39 3439 3036 362d 3243
0000100 3539 3943 3544 3745 0141 0139 0130 5553
0000120 5342 5243 5049 4954 4e4f 5401 6972 6c61
0000140 5301 4955 4554 010d 7254 6169 016c 4f4e
0000160 4156 554c 0a45
0000166

3. li2.dat is the database file without a '0x0d' char in file.
-bash-4.1$ od -x li2.dat
0000000 4237 3731 4439 4635 362d 3144 2d41 4234
0000020 3030 382d 4339 2d43 4635 3945 3646 4134
0000040 4436 3437 3901 4645 3235 3842 2d43 3031
0000060 3244 342d 3743 2d39 3439 3036 362d 3243
0000100 3539 3943 3544 3745 0141 0139 0130 5553
0000120 5342 5243 5049 4954 4e4f 5401 6972 6c61
0000140 5301 4955 4554 5401 6972 6c61 4e01 564f
0000160 4c41 4555 000a

-bash-4.1$ ls -l
-rwxr-xr-x  1 hdfs    hdfs     118 Oct 17 10:32 li.dat
-rw-r--r--  1 hdfs    hdfs     117 Oct 17 11:08 li2.dat





> sql result messes up if char '0x0d ' in the database data file
> --------------------------------------------------------------
>
>                 Key: HIVE-5573
>                 URL: https://issues.apache.org/jira/browse/HIVE-5573
>             Project: Hive
>          Issue Type: Bug
>          Components: Database/Schema
>    Affects Versions: 0.10.0
>         Environment: CDH4.3
>            Reporter: a bc
>            Priority: Critical
>
> With select statement, the returned result is not correct, totally messed up!
> ================================
> Test case:
> for table, we have:
> hive> desc tblfoo;
> OK
> sessionid       string
> userid  string
> groupid int
> docid   string
> channeltype     string
> licensetype     string
> issuitelicense  string
> activationstatus        string
> licensedproduct string
> The *wrong* result when select this table :
> hive> select * from tblfoo;
> OK
> 7B179D5F-6D1A-4B00-89CC-5FE9F64A6D74    9EF52B8C-10D2-4C79-9460-6C295C9D5E7A   90       SUBSCRIPTION    Trial   SUITE   NULL    NULL
>         Trial   NULL    NULL    NULL    NULL    NULL    NULL    NULL
> and the *right* result should be:
> hive> select * from tblfoo;
> OK
> 7B179D5F-6D1A-4B00-89CC-5FE9F64A6D74    9EF52B8C-10D2-4C79-9460-6C295C9D5E7A   90       SUBSCRIPTION    Trial   SUITE   Trial   NOVALUE
> If I remove the char '0x0d' in the database data file, then I can get the correct result.
> ================================
> Steps to reproduce this bug:
> 1. create table:
> hive> create table tblFoo (
>     > sessionid               string,
>     > userid                  string,
>     > groupid                 int   ,
>     > docid                   string,
>     > channeltype             string,
>     > licensetype             string,
>     > issuitelicense          string,
>     > activationstatus        string,
>     > licensedproduct         string);
> 2. hive> load data local inpath '/tmp/li.dat' overwrite into table tblFoo;
> 3. hive> select * from tblfoo;
> ================================
> And I will attach some example files:
> 1. li.dat is the database file with a '0x0d' char in file
> -bash-4.1$ od -x li.dat
> 0000000 4237 3731 4439 4635 362d 3144 2d41 4234
> 0000020 3030 382d 4339 2d43 4635 3945 3646 4134
> 0000040 4436 3437 3901 4645 3235 3842 2d43 3031
> 0000060 3244 342d 3743 2d39 3439 3036 362d 3243
> 0000100 3539 3943 3544 3745 0141 0139 0130 5553
> 0000120 5342 5243 5049 4954 4e4f 5401 6972 6c61
> 0000140 5301 4955 4554 010d 7254 6169 016c 4f4e
> 0000160 4156 554c 0a45
> 0000166
> 2. li2.dat is the database file without a '0x0d' char in file.
> -bash-4.1$ od -x li2.dat
> 0000000 4237 3731 4439 4635 362d 3144 2d41 4234
> 0000020 3030 382d 4339 2d43 4635 3945 3646 4134
> 0000040 4436 3437 3901 4645 3235 3842 2d43 3031
> 0000060 3244 342d 3743 2d39 3439 3036 362d 3243
> 0000100 3539 3943 3544 3745 0141 0139 0130 5553
> 0000120 5342 5243 5049 4954 4e4f 5401 6972 6c61
> 0000140 5301 4955 4554 5401 6972 6c61 4e01 564f
> 0000160 4c41 4555 000a
> -bash-4.1$ ls -l
> -rwxr-xr-x  1 hdfs    hdfs     118 Oct 17 10:32 li.dat
> -rw-r--r--  1 hdfs    hdfs     117 Oct 17 11:08 li2.dat



--
This message was sent by Atlassian JIRA
(v6.1#6144)
