hive-dev mailing list archives

From "Harsh J (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HIVE-5747) Hcat alter table add parttition: add skip header/row feature
Date Tue, 05 Nov 2013 12:34:18 GMT

    [ https://issues.apache.org/jira/browse/HIVE-5747?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13813890#comment-13813890 ]

Harsh J commented on HIVE-5747:
-------------------------------

P.S. Doesn't the {{alter table add partition}} clause just alter metadata? Adding a skip option
to it may not make sense. Perhaps you mean to add it more generally to a {{load data into table}}
or an {{insert into}} clause?

> Hcat alter table add parttition: add skip header/row feature
> ------------------------------------------------------------
>
>                 Key: HIVE-5747
>                 URL: https://issues.apache.org/jira/browse/HIVE-5747
>             Project: Hive
>          Issue Type: Improvement
>          Components: HCatalog
>    Affects Versions: 0.10.0
>            Reporter: Rekha Joshi
>            Priority: Minor
>
> Creating an HCatalog table with create table followed by alter table add partition is the most commonly used approach. However, incoming files sometimes arrive with a header row of column names.
> In such cases it would be a good feature to be able to skip the header row(s).
> Suggestions below:
> hcat "alter table rawevents add partition (ds='20100819') location 'hdfs://data/rawevents/20100819/data' -skip header"
> hcat "alter table rawevents add partition (ds='20100819') location 'hdfs://data/rawevents/20100819/data' -skip [n]"
> hcat "alter table rawevents add partition (ds='20100819') location 'hdfs://data/rawevents/20100819/data'" -DskipRow=1
> -- can choose with bounded array (rows) for selecting rows for table
> hcat "alter table rawevents add partition (ds='20100819') location 'hdfs://data/rawevents/20100819/data' -rows[2:]"  // from first row till all
> hcat "alter table rawevents add partition (ds='20100819') location 'hdfs://data/rawevents/20100819/data' -rows[2:100]"  // from first row till 100 rows
> Is the correct place for this feature in Hive or in HCat? Or could it be handled in hcat with a -D option?
> Thanks
> Rekha
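
For illustration only, the row selection proposed above can be sketched outside Hive as a read-time filter. This is a minimal Python sketch, not an hcat or Hive feature: the helper names skip_rows and select_rows are hypothetical, and it assumes that -rows[start:end] would mean 1-based, inclusive line numbers (so that rows[2:] drops a single header line), which the issue description does not actually specify.

```python
from itertools import islice
from typing import Iterable, Iterator, Optional


def skip_rows(lines: Iterable[str], n: int = 1) -> Iterator[str]:
    """Drop the first n lines (assumed meaning of the proposed '-skip [n]')."""
    return islice(lines, n, None)


def select_rows(lines: Iterable[str], start: int = 1,
                end: Optional[int] = None) -> Iterator[str]:
    """Keep lines start..end, 1-based and inclusive (assumed meaning of
    the proposed '-rows[start:end]'); end=None keeps everything after start."""
    if start < 1:
        raise ValueError("start must be >= 1")
    # islice uses 0-based, end-exclusive bounds, so only start is shifted.
    return islice(lines, start - 1, end)


# Hypothetical sample data: one header line followed by three records.
sample = ["id,name", "1,a", "2,b", "3,c"]
without_header = list(skip_rows(sample, 1))
first_two_records = list(select_rows(sample, 2, 3))
```

Note that both helpers filter the data as it is read, which fits the point in the comment above: a partition's location is only metadata, so any skipping would have to happen when the files are loaded or scanned.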



--
This message was sent by Atlassian JIRA
(v6.1#6144)
