hadoop-hive-dev mailing list archives

From gaurav jain <jainy_gau...@yahoo.com>
Subject Re: How to output SeqFile
Date Wed, 06 Oct 2010 21:29:34 GMT


Thanks Yang. I thought about it as well. But as you said, it's a hack.

hive-dev@, can you please verify if this is possible?



----- Original Message ----
From: Yang <teddyyyy123@gmail.com>
To: hive-user@hadoop.apache.org
Sent: Wed, October 6, 2010 1:52:21 PM
Subject: Re: How to output SeqFile

If this is indeed a feature that is still missing, I have a hack:

Create a temp table stored in SeqFile format and dump your query output into
that table. Since you know the table's location, just copy the part files
from there, then delete that partition from the table manually. Of course, you
may run into issues such as "partition already exists" the next time you
insert into the temp table, so you may also need to do an explicit delete
from the temp table.
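
A minimal sketch of the workaround described above, assuming a source table
named y (as used elsewhere in this thread) and a hypothetical staging table
named tmp_seq. It uses an unpartitioned staging table that is dropped and
recreated on each run, which is one way to sidestep the "partition already
exists" issue. This is an illustration only, not a confirmed way to make
insert overwrite directory emit SeqFile output, and syntax availability may
depend on the Hive version.

```sql
-- Compression settings taken from the original question in this thread.
SET hive.exec.compress.output=true;
SET io.seqfile.compression.type=BLOCK;
SET mapred.output.compression.type=BLOCK;
SET mapred.output.compression.codec=org.apache.hadoop.io.compress.GzipCodec;

-- tmp_seq is a hypothetical staging table name; y is the source table.
-- Dropping it first keeps repeated runs from colliding with old data.
DROP TABLE IF EXISTS tmp_seq;

-- Create-table-as-select, so the staging data is written as SequenceFile.
CREATE TABLE tmp_seq
STORED AS SEQUENCEFILE
AS SELECT * FROM y;

-- Prints table metadata, including the HDFS location of the part files.
DESCRIBE EXTENDED tmp_seq;
```

The part files under the reported location could then be copied (for example
with distcp, as in the workflow quoted below) and published with an alter
table ... add partition statement.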

Y

On Wed, Oct 6, 2010 at 1:46 PM, gaurav jain <jainy_gaurav@yahoo.com> wrote:
> I was hoping there would be a configuration where I can set the outputformat
> for my query.
>
> Regards,
> Gaurav Jain
>
>
>
> ----- Original Message ----
> From: Jacob R Rideout <apache@jacobrideout.net>
> To: hive-user@hadoop.apache.org
> Sent: Wed, October 6, 2010 1:42:57 PM
> Subject: Re: How to output SeqFile
>
> On Wed, Oct 6, 2010 at 2:35 PM, gaurav jain <jainy_gaurav@yahoo.com> wrote:
>> I do have that.
>>
>> However I am not writing directly to the table partition. Instead, I first
>> write my data in a tmp directory (eventually moved to the hdfs table
>> partition) and then publish that partition using alter table statement in
>> metastore.
>>
>> Something like this:
>>
>> -- create table x ... stored as SeqFile
>> -- insert overwrite directory 'd' select * from table y
>> -- distcp 'd'  x/dateint=.../hour=...
>> -- alter table x add partition ....
>>
>> In the second step above I need to produce SeqFile.
>>
>>
>> Thanks for prompt reply.
>> Gaurav Jain
>>
>>
>> ----- Original Message ----
>> From: Yang <teddyyyy123@gmail.com>
>> To: jainy_gaurav@yahoo.com
>> Sent: Wed, October 6, 2010 1:28:42 PM
>> Subject: Re: How to output SeqFile
>>
>> Gaurav:
>>
>> not sure if I understand your question correctly....
>> when you create the output table, that has an option to set the
>> output table SerDe
>>
>> Regards
>> Yang
>>
>> On Wed, Oct 6, 2010 at 1:18 PM, gaurav jain <jainy_gaurav@yahoo.com> wrote:
>>>
>>>
>>>
>>>
>>> How can I produce a sequence file from query
>>>
>>> insert overwrite directory ....
>>>
>>>
>>> I have set:
>>>
>>> SET io.seqfile.compression.type=BLOCK;
>>> SET hive.exec.compress.output=true;
>>> set mapred.output.compression.type=BLOCK;
>>> set mapred.output.compression.codec=org.apache.hadoop.io.compress.GzipCodec;
>>>
>>>
>>>
>>> It seems to produce Text .gz format files.
>>>
>>>
>>>
>>> Regards,
>>> Gaurav Jain
>>>
>>>
>>>
>>>
>>
>>
>>
>>
>>
>
>
> if you are inserting into the directory rather than the table, hive
> won't know to look at the metadata description of the table
>
> you need something like:
> insert overwrite table x select * from table y
>
>
>
>
>



      
