hive-dev mailing list archives

From "Kevin Wilfong (JIRA)" <>
Subject [jira] [Updated] (HIVE-4005) Column truncation
Date Fri, 22 Feb 2013 19:12:15 GMT


Kevin Wilfong updated HIVE-4005:

    Attachment: HIVE-4005.5.patch.txt
> Column truncation
> -----------------
>                 Key: HIVE-4005
>                 URL:
>             Project: Hive
>          Issue Type: New Feature
>          Components: CLI
>    Affects Versions: 0.11.0
>            Reporter: Kevin Wilfong
>            Assignee: Kevin Wilfong
>         Attachments: HIVE-4005.1.patch.txt, HIVE-4005.2.patch.txt, HIVE-4005.3.patch.txt, HIVE-4005.4.patch.txt, HIVE-4005.5.patch.txt
> Column truncation allows users to remove data for columns that are no longer useful.
> This is done by removing the data for the column and setting the length of the column
> data and related lengths to 0 in the RC file header.
> RC file was fixed to recognize columns with a length of zero as empty and to treat them
> as if the column doesn't exist in the data: a null is returned for every value of that
> column in every row. This is the same thing that happens when more columns are selected
> than exist in the file.
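
The read-side behavior described above can be illustrated with a minimal sketch. This is not Hive's RCFile code: `RowGroup`, `truncate_column`, and `read_column` are hypothetical names, and the "header" here is just a list of per-column byte lengths. It only shows the rule that a zero-length column and a column beyond the end of the file both read back as nulls.

```python
from typing import List, Optional


class RowGroup:
    """Toy stand-in for one row group: column values plus header lengths.

    Hypothetical layout for illustration only, not the real RC file format.
    """

    def __init__(self, columns: List[List[bytes]]):
        self.num_rows = len(columns[0]) if columns else 0
        self.columns = columns
        # "Header": total byte length of each column's data.
        self.column_lengths = [sum(len(v) for v in col) for col in columns]


def truncate_column(group: RowGroup, index: int) -> None:
    """Drop the column's data and record a length of 0 in the header."""
    group.columns[index] = []
    group.column_lengths[index] = 0


def read_column(group: RowGroup, index: int) -> List[Optional[bytes]]:
    """Return one value per row; None stands in for SQL NULL."""
    missing = index >= len(group.column_lengths)
    if missing or group.column_lengths[index] == 0:
        # Zero-length and nonexistent columns are handled the same way.
        return [None] * group.num_rows
    return list(group.columns[index])
```

In this toy model a column whose values are all empty would also have a length of 0; how the real format tells the two cases apart is not covered here.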
> A new command was added to the CLI
> This launches a map-only job where each mapper rewrites a single file without the unnecessary
> column data and with the adjusted headers. It does not decompress or deserialize the data, so
> it is much faster than rewriting the data with NULLs.
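
Why the rewrite can skip decompression may be easier to see in a small sketch. It assumes, purely for illustration, that each column is stored as its own compressed buffer with its length in a header; `CompressedRowGroup`, `make_group`, and `rewrite_without` are hypothetical names, and zlib stands in for whatever codec the file actually uses. Kept columns are copied as opaque bytes, and dropped columns get an empty buffer and a length of 0.

```python
import zlib
from dataclasses import dataclass
from typing import List, Set


@dataclass
class CompressedRowGroup:
    column_lengths: List[int]  # header: compressed byte length per column
    column_data: List[bytes]   # one compressed buffer per column


def make_group(columns: List[bytes]) -> CompressedRowGroup:
    """Build a toy row group by compressing each column separately."""
    data = [zlib.compress(col) for col in columns]
    return CompressedRowGroup([len(d) for d in data], data)


def rewrite_without(group: CompressedRowGroup, drop: Set[int]) -> CompressedRowGroup:
    """Copy the row group, omitting the data of the columns in `drop`.

    No buffer is decompressed: kept columns are passed through as-is.
    """
    lengths: List[int] = []
    data: List[bytes] = []
    for i, buf in enumerate(group.column_data):
        if i in drop:
            lengths.append(0)   # adjusted header entry
            data.append(b"")    # column data removed
        else:
            lengths.append(group.column_lengths[i])
            data.append(buf)    # opaque copy of the compressed bytes
    return CompressedRowGroup(lengths, data)
```

For example, `rewrite_without(make_group([b"a" * 100, b"b" * 100]), {1})` returns a group whose second column has length 0 while the first column's compressed bytes are unchanged.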
