hive-dev mailing list archives

From "Brock Noland (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HIVE-5871) Use multiple-characters as field delimiter
Date Thu, 14 Aug 2014 18:14:14 GMT

    [ https://issues.apache.org/jira/browse/HIVE-5871?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14097318#comment-14097318 ]

Brock Noland commented on HIVE-5871:
------------------------------------

One quick question, I noticed the following change to TestLazyPrimitive:

{noformat}
diff --git serde/src/test/org/apache/hadoop/hive/serde2/lazy/TestLazyPrimitive.java serde/src/test/org/apache/hadoop/hive/serde2/lazy/TestLazyPrimitive.java
index 7cd1805..3d7f11e 100644
--- serde/src/test/org/apache/hadoop/hive/serde2/lazy/TestLazyPrimitive.java
+++ serde/src/test/org/apache/hadoop/hive/serde2/lazy/TestLazyPrimitive.java
@@ -388,7 +388,7 @@ public void testLazyBinary() {
     initLazyObject(ba, new byte[] {'2', '?', '3'}, 0, 3);
     assertEquals(new BytesWritable(new byte[] {'2', '?', '3'}), ba.getWritableObject());
     initLazyObject(ba, new byte[] {'\n'}, 0, 1);
-    assertEquals(new BytesWritable(new byte[] {}), ba.getWritableObject());
+    assertEquals(new BytesWritable(new byte[] {'\n'}), ba.getWritableObject());
   }
{noformat}

I am concerned that this change is backwards incompatible. [~chinnalalam] originally added this
years ago in HIVE-2465. Chinna, what are your thoughts on this change?
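
To make the compatibility concern concrete, here is a minimal sketch of the two behaviors the old and new assertions describe for a single newline byte. The class and method names (NewlineBinaryDemo, decodeOld, decodeNew) are hypothetical and exist only for illustration. This is not Hive's LazyBinary code; it only reproduces the observable difference for the one input used in the test.

```java
import java.util.Arrays;

public class NewlineBinaryDemo {
    // Hypothetical stand-in for the behavior the old assertion expected:
    // a lone newline byte yields an empty byte array.
    public static byte[] decodeOld(byte[] data) {
        if (data.length == 1 && data[0] == '\n') {
            return new byte[0];
        }
        return Arrays.copyOf(data, data.length);
    }

    // Hypothetical stand-in for the behavior the new assertion expects:
    // the input bytes are returned unchanged.
    public static byte[] decodeNew(byte[] data) {
        return Arrays.copyOf(data, data.length);
    }

    public static void main(String[] args) {
        byte[] input = {'\n'};
        System.out.println(decodeOld(input).length); // prints 0
        System.out.println(decodeNew(input).length); // prints 1
    }
}
```

Any caller that relied on receiving an empty value for such input would see a one-byte value instead, which is the kind of difference worth confirming before the patch goes in.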

> Use multiple-characters as field delimiter
> ------------------------------------------
>
>                 Key: HIVE-5871
>                 URL: https://issues.apache.org/jira/browse/HIVE-5871
>             Project: Hive
>          Issue Type: Improvement
>          Components: Contrib
>    Affects Versions: 0.12.0
>            Reporter: Rui Li
>            Assignee: Rui Li
>         Attachments: HIVE-5871.2.patch, HIVE-5871.3.patch, HIVE-5871.4.patch, HIVE-5871.5.patch, HIVE-5871.6.patch, HIVE-5871.patch
>
>
> By default, Hive only allows a single character as the field delimiter. Although
> RegexSerDe can handle a multiple-character delimiter, it can be daunting to use, especially
> for newcomers.
> In the patch, I add a new SerDe named MultiDelimitSerDe. With MultiDelimitSerDe, users
> can specify a multiple-character field delimiter when creating tables, in a way that closely
> resembles typical table creation.
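
As a rough illustration of what splitting a row on a multiple-character delimiter involves, here is a minimal standalone sketch. It is not the implementation in the patch: the class name MultiCharDelimiterDemo, the helper splitFields, and the sample delimiter "|#|" are assumptions made for the example.

```java
import java.util.Arrays;
import java.util.regex.Pattern;

public class MultiCharDelimiterDemo {
    // Splits a row on a literal delimiter of any length.
    // Pattern.quote keeps regex metacharacters such as '|' literal, and the
    // limit of -1 preserves trailing empty fields instead of dropping them.
    public static String[] splitFields(String row, String delimiter) {
        return row.split(Pattern.quote(delimiter), -1);
    }

    public static void main(String[] args) {
        String[] fields = splitFields("a|#|b|#|", "|#|");
        System.out.println(fields.length);            // prints 3
        System.out.println(Arrays.toString(fields));  // prints [a, b, ]
    }
}
```

Quoting the delimiter is the step that a raw regular expression makes easy to get wrong, which is part of why a dedicated SerDe is friendlier than RegexSerDe for this case.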



--
This message was sent by Atlassian JIRA
(v6.2#6252)
