hive-dev mailing list archives

From "Lefty Leverenz (JIRA)" <>
Subject [jira] [Commented] (HIVE-7142) Hive multi serialization encoding support
Date Thu, 14 Aug 2014 08:56:12 GMT


Lefty Leverenz commented on HIVE-7142:

No new JIRA required, and no patch either.  Just edit the doc and let others review it.  Reviewers
can ask you for changes or make changes themselves.  Doc JIRAs are generally for major tasks,
or tasks that don't have an associated code-changing JIRA.

Once you have wiki edit privilege, an Edit button will appear in the upper right corner of
each wikidoc (next to Share and Tools).  That takes you to the Edit window, where you should
enter a brief note about the nature of your changes in a field at the bottom ("What did you
change?") and then edit the doc.

A bar across the top has fairly obvious GUI symbols and there's help via the "?" button, top

In the bottom right corner there's a Preview button (which toggles with Edit) and a Save button.
 The editor auto-saves and keeps your draft, which you can find in the drop-down list on your
ID picture (upper right).  When you use the Save button, email gets sent to everyone watching
that wikidoc unless you've unchecked "Notify watchers" at the bottom.

If you save something and then decide it's all wrong, you can go to the page history and revert
it.  Page history is in the Tools drop-down list on every wiki page (but not in the editing

I usually put a doc comment on the JIRA that I'm documenting, with a link to the doc, so people
watching that JIRA can review the doc changes and future JIRA trawlers can find the docs easily.

One more thing:  please remember to include Hive version information with your changes, because
the wiki covers all releases of Hive.  You can mention the version in the text or use a Version
info box ("+" icon drop-down list, "Info" fifth from the bottom, enter "Version" or "Version
information" as the title).

Thanks for asking about this -- I should add this information to "How to edit the website."

> Hive multi serialization encoding support
> -----------------------------------------
>                 Key: HIVE-7142
>                 URL:
>             Project: Hive
>          Issue Type: Improvement
>          Components: Serializers/Deserializers
>            Reporter: Chengxiang Li
>            Assignee: Chengxiang Li
>              Labels: TODOC14
>             Fix For: 0.14.0
>         Attachments: HIVE-7142.1.patch.txt, HIVE-7142.2.patch, HIVE-7142.3.patch, HIVE-7142.4.patch
> Currently Hive only supports serializing data into UTF-8 bytes and deserializing from UTF-8
> bytes, but real-world users may want to load data in other encodings into Hive directly.
> This JIRA adds support for serializing and deserializing data in other encodings in the SerDe.
> Users only need to configure the serialization encoding at the table level, by setting it
> through a SerDe parameter, for example:
> {code:sql}
> CREATE TABLE person(id INT, name STRING, desc STRING)
> ROW FORMAT SERDE 'org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe'
> WITH SERDEPROPERTIES ("serialization.encoding"='GBK');
> {code}
> or
> {code:sql}
> ALTER TABLE person SET SERDEPROPERTIES ('serialization.encoding'='GBK'); 
> {code}
> LIMITATIONS: Only LazySimpleSerDe supports the "serialization.encoding" property in this patch.
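
To illustrate why the declared encoding matters, here is a minimal sketch in plain Python. It is not Hive code and does not reflect how the patch is implemented. The helper names decode_field and encode_field are hypothetical, and the sample value is made up. It only shows that bytes written as GBK read back correctly when decoded as GBK, but not when decoded as UTF-8.

```python
# Illustration only: this is not Hive code. decode_field/encode_field are
# hypothetical helpers standing in for what a SerDe does with field bytes.

def encode_field(value: str, encoding: str = "utf-8") -> bytes:
    """Serialize one field's text into bytes using the given encoding."""
    return value.encode(encoding)


def decode_field(raw: bytes, encoding: str = "utf-8") -> str:
    """Deserialize one field's bytes; undecodable bytes become U+FFFD."""
    return raw.decode(encoding, errors="replace")


name = "张三"                             # made-up sample value
gbk_bytes = encode_field(name, "gbk")     # what a GBK-encoded data file would hold

as_gbk = decode_field(gbk_bytes, "gbk")   # decoded with the matching encoding
as_utf8 = decode_field(gbk_bytes)         # decoded as UTF-8, the old default

print(as_gbk == name)    # matching encoding round-trips the text
print(as_utf8 == name)   # mismatched encoding does not
```

In this analogy, setting 'serialization.encoding'='GBK' on the table plays the role of passing "gbk" instead of relying on the UTF-8 default.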

This message was sent by Atlassian JIRA