hive-user mailing list archives

From Nitin Pawar <nitinpawar...@gmail.com>
Subject Re: Loading multiple file format in hive
Date Tue, 25 Aug 2015 06:23:56 GMT
Is it possible for you to write the data into a staging area, run a job on
that, and then convert it into a Parquet table?
So you are looking at two tables: a temp table that holds the data for up to
15 minutes, and a job that loads this temp data into your Parquet-backed
table.
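
The staging-table flow described above could be sketched roughly as follows.
This is only an illustration, not a tested setup: the table names, columns,
and HDFS path are hypothetical, and it assumes the JSON events share a known
set of top-level fields and that the HCatalog JsonSerDe is available on the
cluster. The Python helper only builds the HiveQL statements a scheduled job
might submit; it does not talk to Hive.

```python
def staging_to_parquet_statements(staging_table, target_table, columns, raw_location):
    """Return HiveQL statements for a staging-then-convert flow.

    columns is a list of (name, hive_type) pairs. All names passed in are
    illustrative assumptions, not taken from the original thread.
    """
    column_defs = ", ".join(f"{name} {hive_type}" for name, hive_type in columns)
    column_names = ", ".join(name for name, _ in columns)
    return [
        # Staging table: reads the raw JSON files where the streaming job drops them.
        f"CREATE EXTERNAL TABLE IF NOT EXISTS {staging_table} ({column_defs}) "
        "ROW FORMAT SERDE 'org.apache.hive.hcatalog.data.JsonSerDe' "
        f"STORED AS TEXTFILE LOCATION '{raw_location}'",
        # Real table: Parquet-backed, same columns.
        f"CREATE TABLE IF NOT EXISTS {target_table} ({column_defs}) STORED AS PARQUET",
        # Periodic job: copy whatever is currently staged into the Parquet table.
        f"INSERT INTO TABLE {target_table} SELECT {column_names} FROM {staging_table}",
    ]


if __name__ == "__main__":
    # Hypothetical names and path, for illustration only.
    for statement in staging_to_parquet_statements(
        "events_staging",
        "events_parquet",
        [("event_id", "STRING"), ("payload", "STRING")],
        "/data/raw/events",
    ):
        print(statement + ";")
```

Note that the INSERT appends everything currently in the staging location, so
the job would also need to move or delete the raw files it has already
converted, otherwise the next run loads them again.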

Sorry for my misunderstanding. You can set the file format at the
partition level, but then you need to entirely redesign your table to have
a staging partition and a real data partition.
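
A rough sketch of that per-partition variant, again with hypothetical names:
it assumes a table partitioned by a made-up key (here `stage`) whose default
storage is the JSON SerDe, so only the converted partition needs its format
changed. ALTER TABLE ... PARTITION ... SET FILEFORMAT only changes metadata;
the files in that partition's directory must already be Parquet. Whether the
partition's SerDe is switched along with the file format may depend on the
Hive version, so it is worth checking with DESCRIBE FORMATTED.

```python
def partition_format_statements(table, partition_key, partition_value, file_format, location):
    """Return HiveQL statements that register one partition and set its file format.

    All identifiers passed in are illustrative; nothing here is executed against Hive.
    """
    spec = f"{partition_key}='{partition_value}'"
    return [
        # Register the partition and point it at its own directory.
        f"ALTER TABLE {table} ADD IF NOT EXISTS PARTITION ({spec}) LOCATION '{location}'",
        # Override the storage format for this partition only (metadata change).
        f"ALTER TABLE {table} PARTITION ({spec}) SET FILEFORMAT {file_format}",
        # Inspect the resulting input format, output format, and SerDe.
        f"DESCRIBE FORMATTED {table} PARTITION ({spec})",
    ]


if __name__ == "__main__":
    # Hypothetical table, partition key, and path.
    for statement in partition_format_statements(
        "events", "stage", "converted", "PARQUET", "/data/events/converted"
    ):
        print(statement + ";")
```

The raw partition (for example stage='raw') would be added the same way
without the SET FILEFORMAT step, so it keeps the table's default JSON
storage.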

On Tue, Aug 25, 2015 at 11:46 AM, Jeetendra G <jeetendra.g@housing.com>
wrote:

> Thanks, Nitin, for the reply.
>
> I have data coming from RabbitMQ, and I use the Spark Streaming API to take
> these events and dump them into HDFS.
> I can't really convert the events to a format like Parquet/ORC because
> I don't have a schema here.
> Once I dump to HDFS, I run a job that reads this data and converts it
> into Parquet.
> By this time I will have some raw events, right?
>
>
>
>
> On Tue, Aug 25, 2015 at 11:35 AM, Nitin Pawar <nitinpawar432@gmail.com>
> wrote:
>
>> File format in Hive is a table-level property.
>> I am not sure why you would load data at 15-minute intervals into your
>> actual table instead of into a staging table and do the conversion there,
>> or have the raw file in the format you want and load it directly into the
>> table.
>>
>> On Tue, Aug 25, 2015 at 11:27 AM, Jeetendra G <jeetendra.g@housing.com>
>> wrote:
>>
>>> I tried searching for how to set multiple formats with multiple
>>> partitions, but could not find much detail.
>>> Could you please share some good material on this if you have any?
>>>
>>> On Mon, Aug 24, 2015 at 10:49 PM, Daniel Haviv
>>> <daniel.haviv@veracity-group.com> wrote:
>>>
>>>> Hi,
>>>> You can set a different file format per partition.
>>>> You can't mix files in the same directory (You could theoretically
>>>> write some kind of custom SerDe).
>>>>
>>>> Daniel.
>>>>
>>>>
>>>>
>>>> On Mon, Aug 24, 2015 at 6:15 PM, Jeetendra G <jeetendra.g@housing.com>
>>>> wrote:
>>>>
>>>>> Can anyone shed some light on this, please?
>>>>>
>>>>> On Mon, Aug 24, 2015 at 12:32 PM, Jeetendra G <jeetendra.g@housing.com>
>>>>> wrote:
>>>>>
>>>>>> Hi all,
>>>>>>
>>>>>> I have a directory with JSON-formatted files and Parquet files in the
>>>>>> same folder. Can Hive load these?
>>>>>>
>>>>>> I am getting JSON data and storing it in HDFS. Later I run a job to
>>>>>> convert the JSON to Parquet (every 15 minutes), so we will always have
>>>>>> up to 15 minutes of JSON data.
>>>>>>
>>>>>> Can I provide multiple SerDes in Hive?
>>>>>>
>>>>>> regards
>>>>>> Jeetendra
>>>>>>
>>>>>
>>>>>
>>>>
>>>
>>
>>
>> --
>> Nitin Pawar
>>
>
>


-- 
Nitin Pawar
