hive-user mailing list archives

From Elliot West <tea...@gmail.com>
Subject Re: Writing hive column headers in 'Insert overwrite query'
Date Wed, 13 Jan 2016 10:35:57 GMT
I created an issue in the Hive Jira related to this. You may wish to vote
on it or watch it if you believe it to be relevant.

https://issues.apache.org/jira/browse/HIVE-12860
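
Until something like that exists, the UNION workaround mentioned below can be
sketched roughly as follows. This is a minimal, untested sketch and not the
exact query from the Stack Overflow answer: the helper name
header_union_query, the row_order column, and the example table, columns and
path are all made up for illustration. It assumes a Hive version that accepts
a SELECT with no FROM clause for the header branch, and it relies on the
inner ORDER BY (which, as far as I know, runs on a single reducer) to place
the header row first.

```python
def header_union_query(table, columns, out_dir):
    """Build a HiveQL statement that writes `columns` of `table` to `out_dir`
    with a header row placed first.

    Hypothetical helper: names are interpolated as-is (no quoting or
    escaping), so only pass trusted identifiers.
    """
    col_list = ", ".join(columns)
    # Header branch: each column name as a string literal, aliased to itself.
    header = ", ".join("'{0}' AS {0}".format(c) for c in columns)
    # Data branch: cast to STRING so both UNION ALL branches share one schema.
    data = ", ".join("CAST({0} AS STRING) AS {0}".format(c) for c in columns)
    return (
        "INSERT OVERWRITE DIRECTORY '{out_dir}'\n"
        "SELECT {cols} FROM (\n"
        "  SELECT row_order, {cols} FROM (\n"
        "    SELECT 0 AS row_order, {header}\n"
        "    UNION ALL\n"
        "    SELECT 1 AS row_order, {data} FROM {table}\n"
        "  ) unioned\n"
        "  ORDER BY row_order\n"  # synthesised rank column: header sorts first
        ") sorted"
    ).format(out_dir=out_dir, cols=col_list, header=header,
             data=data, table=table)


# Example with made-up names; paste the printed statement into hive -e or a script.
print(header_union_query("sales", ["id", "amount"], "/tmp/report"))
```

The outer SELECT only exists to drop row_order from the written files. Whether
the row order survives that projection is an assumption worth checking on your
own cluster before relying on it.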



On 13 January 2016 at 09:43, Elliot West <teabot@gmail.com> wrote:

> Unfortunately there appears to be no nice way of doing this. I've seen
> others achieve a workaround by UNIONing with a table of the same
> schema, containing a single row of the header names, and then finally
> sorting by a synthesised rank column (see:
> http://stackoverflow.com/a/25214480/74772).
>
> I believe headers should really be an option on INSERT OVERWRITE DIRECTORY
> as it is often a far neater way of generating reports for third party
> consumption. Additionally it is not always possible to elegantly capture
> the headers generated by the CLI option in some environments such as Oozie
> scheduled tasks.
>
> There are arguments against including headers in such Hive generated
> datasets. However, these datasets are often reports at the end of a
> pipeline whose format is mandated by the intended third party recipient. It
> seems a shame to introduce yet another tool into the pipeline purely to
> add a header row; we'd rather just do so from within Hive.
>
> There is some background information on the current header implementation
> here: https://issues.apache.org/jira/browse/HIVE-138
>
> Cheers - Elliot.
>
> On Wednesday, 13 January 2016, Sreenath <sreenaths1923@gmail.com> wrote:
>
>> Hey,
>>
>> This will work, but let's say I want to write the output to an HDFS
>> location using INSERT OVERWRITE DIRECTORY '<Query>'. In this case, even if
>> we set *hive.cli.print.header=true*, the headers don't get written. Is
>> there a way to write the headers in this case?
>>
>> On 13 January 2016 at 12:04, Ankit Bhatnagar <ankitb@yahoo-inc.com>
>> wrote:
>>
>>> Are you looking for
>>>
>>> hive -e "*set hive.cli.print.header=true*; < query> " > output
>>>
>>>
>>> On Tuesday, January 12, 2016 10:14 PM, Sreenath <sreenaths1923@gmail.com>
>>> wrote:
>>>
>>>
>>> Hi All,
>>>
>>> Is there a way to write the Hive column headers along with the
>>> output when we write a query's output to an HDFS or local
>>> directory?
>>>
>>>
>>> --
>>> Sreenath S Kamath
>>> Bangalore
>>> Ph No:+91-9590989106
>>>
>>>
>>>
>>
>>
>> --
>> Sreenath S Kamath
>> Bangalore
>> Ph No:+91-9590989106
>>
>
