hive-dev mailing list archives

From "Rui Li (JIRA)" <>
Subject [jira] [Commented] (HIVE-5871) Use multiple-characters as field delimiter
Date Tue, 26 Aug 2014 09:02:12 GMT


Rui Li commented on HIVE-5871:

Hi [~brocknoland],

LazyBinary only intends to decode valid base64 data: {{byte[] decoded = arrayByteBase64 ? Base64.decodeBase64(recv) : recv;}}.
The original data is returned if it contains a non-base64 character.
Therefore, I think it's natural to also return the original data if decoding fails. Otherwise,
why would we bother to check {{arrayByteBase64}} before decoding?
I've asked Chinna to help check whether this is correct, and I'm waiting for his comments.
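
To make the proposed fallback concrete, here is a minimal sketch; it is not the patch itself. It assumes {{java.util.Base64}} from the JDK in place of the Commons Codec call quoted above, because the JDK decoder throws on invalid input. The class and method names are hypothetical:

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.Base64;

final class Base64Fallback {
    /**
     * Tries to decode recv as base64. If the input is not valid base64,
     * the original bytes are returned unchanged instead of failing.
     * (Hypothetical helper for illustration; not taken from the Hive code base.)
     */
    static byte[] decodeOrOriginal(byte[] recv) {
        try {
            return Base64.getDecoder().decode(recv);
        } catch (IllegalArgumentException e) {
            // Decoding failed: fall back to the data as received.
            return recv;
        }
    }

    public static void main(String[] args) {
        byte[] valid = "aGVsbG8=".getBytes(StandardCharsets.UTF_8);
        byte[] invalid = "not*base64".getBytes(StandardCharsets.UTF_8);

        // Valid base64 is decoded.
        System.out.println(new String(decodeOrOriginal(valid), StandardCharsets.UTF_8));
        // Invalid input comes back untouched.
        System.out.println(Arrays.equals(decodeOrOriginal(invalid), invalid));
    }
}
```

Whether silently returning the original bytes is acceptable is exactly the open question here; a caller that needs to tell the two cases apart would need a different return type.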

But anyway, if you think this is incorrect, or that we should minimize changes to the code base,
I'll find a way to avoid it :)


> Use multiple-characters as field delimiter
> ------------------------------------------
>                 Key: HIVE-5871
>                 URL:
>             Project: Hive
>          Issue Type: Improvement
>          Components: Contrib
>    Affects Versions: 0.12.0
>            Reporter: Rui Li
>            Assignee: Rui Li
>         Attachments: HIVE-5871.2.patch, HIVE-5871.3.patch, HIVE-5871.4.patch, HIVE-5871.5.patch, HIVE-5871.6.patch, HIVE-5871.patch
>
> By default, Hive only allows users to use a single character as the field delimiter. Although RegexSerDe can be used to specify a multiple-character delimiter, it can be daunting to use, especially for amateurs.
> In the patch, I add a new SerDe named MultiDelimitSerDe. With MultiDelimitSerDe, users can specify a multiple-character field delimiter when creating tables, in a way very similar to typical table creation.
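
As an illustration of the problem the issue describes, the following minimal sketch splits a row on a literal multiple-character delimiter in plain Java. It is not MultiDelimitSerDe's implementation; the class name, the method name, and the {{|#|}} delimiter are hypothetical and chosen only for this example:

```java
import java.util.Arrays;
import java.util.regex.Pattern;

final class MultiDelimiterSplit {
    /**
     * Splits a row on a literal delimiter of any length.
     * Pattern.quote makes regex metacharacters such as '|' match literally,
     * and the -1 limit keeps trailing empty fields.
     */
    static String[] splitFields(String row, String delimiter) {
        return row.split(Pattern.quote(delimiter), -1);
    }

    public static void main(String[] args) {
        String[] fields = splitFields("1|#|Alice|#|30", "|#|");
        System.out.println(Arrays.toString(fields));
    }
}
```

Without the quoting step, the same delimiter would be read as a regular expression, which is the kind of detail that makes a regex-based approach harder for new users.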

This message was sent by Atlassian JIRA
