hive-dev mailing list archives

From "Rui Li (JIRA)" <>
Subject [jira] [Commented] (HIVE-5871) Use multiple-characters as field delimiter
Date Wed, 10 Sep 2014 05:48:28 GMT


Rui Li commented on HIVE-5871:

[~leftylev] That's perfect. Thanks a lot!

> Use multiple-characters as field delimiter
> ------------------------------------------
>                 Key: HIVE-5871
>                 URL:
>             Project: Hive
>          Issue Type: Improvement
>          Components: Contrib
>    Affects Versions: 0.12.0
>            Reporter: Rui Li
>            Assignee: Rui Li
>              Labels: TODOC14
>             Fix For: 0.14.0
>         Attachments: HIVE-5871.2.patch, HIVE-5871.3.patch, HIVE-5871.4.patch, HIVE-5871.5.patch,
HIVE-5871.6.patch, HIVE-5871.patch
> By default, Hive allows only a single character as the field delimiter. Although
RegexSerDe can be used to specify a multiple-character delimiter, it can be daunting to use,
especially for newcomers.
> The patch adds a new SerDe named MultiDelimitSerDe. With MultiDelimitSerDe, users can
specify a multiple-character field delimiter when creating tables, in much the same way as
a typical table creation. For example:
> {code}
> create table test (id string,hivearray array<binary>,hivemap map<string,int>)
ROW FORMAT SERDE 'org.apache.hadoop.hive.contrib.serde2.MultiDelimitSerDe' WITH SERDEPROPERTIES
> {code}
> where {{field.delim}} is the field delimiter, and {{collection.delim}} and {{mapkey.delim}}
are the delimiters for collection items and key-value pairs, respectively. Among these delimiters,
{{field.delim}} is mandatory and can consist of multiple characters, while {{collection.delim}}
and {{mapkey.delim}} are optional and support only a single character.
> To use MultiDelimitSerDe, you have to add the hive-contrib jar to the classpath, e.g.
with the {{add jar}} command.
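> A minimal sketch of how the pieces described above might fit together. The property names and
the SerDe class name come from the description; the jar path and the three delimiter values are
assumptions chosen only for illustration and are not taken from the patch:
> {code}
> -- Hypothetical path; point this at the hive-contrib jar of your installation.
> ADD JAR /path/to/hive-contrib.jar;
>
> -- field.delim is mandatory and may be several characters long (assumed here: "[,]").
> -- collection.delim and mapkey.delim are optional, one character each (assumed: ":" and "@").
> CREATE TABLE test (id string, hivearray array<binary>, hivemap map<string,int>)
> ROW FORMAT SERDE 'org.apache.hadoop.hive.contrib.serde2.MultiDelimitSerDe'
> WITH SERDEPROPERTIES ("field.delim"="[,]", "collection.delim"=":", "mapkey.delim"="@");
> {code}
> With these assumed values, each row would be split into fields on the literal string {{[,]}},
array items on {{:}}, and map keys from their values on {{@}}.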

This message was sent by Atlassian JIRA
