hadoop-hive-dev mailing list archives

From yongqiang he <heyongqiang...@gmail.com>
Subject Re: How to output SeqFile
Date Thu, 07 Oct 2010 01:34:53 GMT
Can you try:
set hive.query.result.fileformat=sequencefile;

If that does not work, you can also try:
set hive.default.fileformat=sequencefile;
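
A minimal sketch of how the first setting might be combined with the query from the original question. The directory 'd' and the table y are the placeholders used earlier in this thread, and whether either setting actually changes the output of insert overwrite directory likely depends on the Hive version, so treat this as something to try rather than a confirmed fix:

```sql
-- Compression settings carried over from the original question.
SET hive.exec.compress.output=true;
SET mapred.output.compression.type=BLOCK;
SET mapred.output.compression.codec=org.apache.hadoop.io.compress.GzipCodec;

-- Suggested setting: ask Hive to write query results as SequenceFile.
SET hive.query.result.fileformat=sequencefile;

-- 'd' and y are placeholders from the thread, not real paths or tables.
INSERT OVERWRITE DIRECTORY 'd'
SELECT * FROM y;
```

One way to check the result is to look at the first bytes of a part file in the output directory: a SequenceFile begins with the three characters SEQ, while gzipped text does not.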

thanks
yongqiang
On Wed, Oct 6, 2010 at 2:29 PM, gaurav jain <jainy_gaurav@yahoo.com> wrote:
>
>
> Thanks Yang. I thought about it as well. But as you said, it's a hack.
>
> hive-dev@, can you please verify if this is possible?
>
>
>
> ----- Original Message ----
> From: Yang <teddyyyy123@gmail.com>
> To: hive-user@hadoop.apache.org
> Sent: Wed, October 6, 2010 1:52:21 PM
> Subject: Re: How to output SeqFile
>
> If this is indeed a feature that is still missing, I have a hack:
>
> Create a temp table in SeqFile format, dump to that table, and then,
> since you know its location, just copy the part files from there.
> Then delete that partition from the table manually. Of course, you may
> run into issues such as "partition already exists" when you insert into
> the temp table the next time, so you may need to do an explicit delete
> from the temp table too.
>
> Y
>
> On Wed, Oct 6, 2010 at 1:46 PM, gaurav jain <jainy_gaurav@yahoo.com> wrote:
>> I was hoping there would be a configuration where I can set the outputformat
>> for my query.
>>
>> Regards,
>> Gaurav Jain
>>
>>
>>
>> ----- Original Message ----
>> From: Jacob R Rideout <apache@jacobrideout.net>
>> To: hive-user@hadoop.apache.org
>> Sent: Wed, October 6, 2010 1:42:57 PM
>> Subject: Re: How to output SeqFile
>>
>> On Wed, Oct 6, 2010 at 2:35 PM, gaurav jain <jainy_gaurav@yahoo.com> wrote:
>>> I do have that.
>>>
>>> However I am not writing directly to the table partition. Instead, I first
>>> write my data in a tmp directory (eventually moved to the hdfs table partition)
>>> and then publish that partition using alter table statement in metastore.
>>>
>>> Something like this:
>>>
>>> -- create table x ... stored as SeqFile
>>> -- insert overwrite directory 'd' select * from table y
>>> -- distcp 'd'  x/dateint=.../hour=...
>>> -- alter table x add partition ....
>>>
>>> In the second step above I need to produce SeqFile.
>>>
>>>
>>> Thanks for prompt reply.
>>> Gaurav Jain
>>>
>>>
>>> ----- Original Message ----
>>> From: Yang <teddyyyy123@gmail.com>
>>> To: jainy_gaurav@yahoo.com
>>> Sent: Wed, October 6, 2010 1:28:42 PM
>>> Subject: Re: How to output SeqFile
>>>
>>> Gaurav:
>>>
>>> Not sure if I understand your question correctly, but when you create
>>> the output table, there is an option to set the output table's SerDe.
>>>
>>> Regards
>>> Yang
>>>
>>> On Wed, Oct 6, 2010 at 1:18 PM, gaurav jain <jainy_gaurav@yahoo.com> wrote:
>>>>
>>>>
>>>>
>>>>
>>>> How can I produce a sequence file from query
>>>>
>>>> insert overwrite directory ....
>>>>
>>>>
>>>> I have set:
>>>>
>>>> SET io.seqfile.compression.type=BLOCK;
>>>> SET hive.exec.compress.output=true;
>>>> set mapred.output.compression.type=BLOCK;
>>>> set mapred.output.compression.codec=org.apache.hadoop.io.compress.GzipCodec;
>>>>
>>>>
>>>>
>>>> It seems to produce Text .gz format files.
>>>>
>>>>
>>>>
>>>> Regards,
>>>> Gaurav Jain
>>
>>
>> If you are inserting into the directory rather than the table, Hive
>> won't know to look at the metadata description of the table.
>>
>> You need something like:
>> insert overwrite table x select * from table y
>>
