hive-dev mailing list archives

From "Phabricator (JIRA)" <j...@apache.org>
Subject [jira] [Updated] (HIVE-4618) show create table creating unusable DDL when field delimiter is \001
Date Fri, 31 May 2013 01:58:20 GMT

     [ https://issues.apache.org/jira/browse/HIVE-4618?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Phabricator updated HIVE-4618:
------------------------------

    Attachment: HIVE-4618.D11007.1.patch

navis requested code review of "HIVE-4618 [jira] show create table creating unusable DDL when
field delimiter is \001".

Reviewers: JIRA

HIVE-4618 show create table creating unusable DDL when field delimiter is \001

When a create statement includes a "fields terminated by" clause and the delimiter is written
as \001, Hive stores it as \u0001, which is correct. However, show create table then gives you
DDL that does not work, because the parser changes the \u0001 into u0001 when that DDL is run.

Example:

hive> create table j1 (a string) row format delimited fields terminated by '\001';

hive> show create table j1;
CREATE  TABLE j1(
  a string)
ROW FORMAT DELIMITED
  FIELDS TERMINATED BY '\u0001'
STORED AS INPUTFORMAT
  'org.apache.hadoop.mapred.TextInputFormat'
OUTPUTFORMAT
  'org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat'
LOCATION
  'hdfs://forza-1.cloud.rtp.cloudera.com:8020/user/hive/warehouse/j1'
TBLPROPERTIES (
  'transient_lastDdlTime'='1369664999')

hive> desc formatted j1;
…shortened to save space
Storage Desc Params:
	field.delim         	\u0001
	serialization.format	\u0001

hive> drop table j1;

hive> CREATE  TABLE j1(
    >   a string)
    > ROW FORMAT DELIMITED
    >   FIELDS TERMINATED BY '\u0001'
    > STORED AS INPUTFORMAT
    >   'org.apache.hadoop.mapred.TextInputFormat'
    > OUTPUTFORMAT
    >   'org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat'
    > LOCATION
    >   'hdfs://forza-1.cloud.rtp.cloudera.com:8020/user/hive/warehouse/j1'
    > TBLPROPERTIES (
    >   'transient_lastDdlTime'='1369664999');

hive> desc formatted j1;
…shortened to save space
Storage Desc Params:
	field.delim         	u0001
	serialization.format	u0001
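
The transcript above is consistent with a string-literal unescaper that understands
three-digit octal escapes such as \001 but has no case for the backslash-u form, so it
drops the backslash and keeps the rest. The following is a minimal sketch in Java of that
behaviour, under stated assumptions: the class UnescapeSketch and its methods are hypothetical
names invented for illustration, and nothing here is taken from BaseSemanticAnalyzer.java
or from the attached patch.

```java
public class UnescapeSketch {

    // True if s holds `count` digits of the given radix starting at index `from`.
    static boolean isDigits(String s, int from, int count, int radix) {
        if (from + count > s.length()) {
            return false;
        }
        for (int k = from; k < from + count; k++) {
            if (Character.digit(s.charAt(k), radix) < 0) {
                return false;
            }
        }
        return true;
    }

    // Hypothetical literal unescaper (an assumption, not Hive's code).
    // handleUnicode=false mimics the reported symptom; handleUnicode=true
    // additionally decodes a backslash, 'u', and four hex digits.
    static String unescape(String s, boolean handleUnicode) {
        StringBuilder out = new StringBuilder();
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            if (c != '\\' || i + 1 >= s.length()) {
                out.append(c);
                continue;
            }
            if (isDigits(s, i + 1, 3, 8)) {
                // Three-digit octal escape: decode it to a single character.
                out.append((char) Integer.parseInt(s.substring(i + 1, i + 4), 8));
                i += 3;
            } else if (handleUnicode && s.charAt(i + 1) == 'u' && isDigits(s, i + 2, 4, 16)) {
                // Unicode-style escape: decode the four hex digits.
                out.append((char) Integer.parseInt(s.substring(i + 2, i + 6), 16));
                i += 5;
            } else {
                // Unknown escape: drop the backslash, keep the next character.
                out.append(s.charAt(i + 1));
                i += 1;
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        String octalForm = "\\001";     // the four characters \ 0 0 1
        String unicodeForm = "\\u0001"; // the six characters \ u 0 0 0 1

        // Without a unicode case, the second form degrades to the text u0001.
        System.out.println(unescape(unicodeForm, false));
        // With the unicode case, both spellings decode to the same character.
        System.out.println(unescape(octalForm, true).equals(unescape(unicodeForm, true)));
    }
}
```

In this sketch, calling unescape with handleUnicode set to false turns the six-character
unicode form into the five characters u0001, matching the field.delim value shown after the
table is re-created. Whether the real parser fails for the same reason is an assumption.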

TEST PLAN
  EMPTY

REVISION DETAIL
  https://reviews.facebook.net/D11007

AFFECTED FILES
  ql/src/java/org/apache/hadoop/hive/ql/parse/BaseSemanticAnalyzer.java
  ql/src/test/queries/clientpositive/unicode_notation.q
  ql/src/test/results/clientpositive/unicode_notation.q.out

To: JIRA, navis

                
> show create table creating unusable DDL when field delimiter is \001
> --------------------------------------------------------------------
>
>                 Key: HIVE-4618
>                 URL: https://issues.apache.org/jira/browse/HIVE-4618
>             Project: Hive
>          Issue Type: Bug
>          Components: CLI
>    Affects Versions: 0.10.0
>         Environment: CDH4.2
> Hive 0.10
>            Reporter: Johndee Burks
>            Assignee: Navis
>            Priority: Minor
>         Attachments: HIVE-4618.D11007.1.patch

