hive-user mailing list archives

From "Owen O'Malley" <omal...@apache.org>
Subject Re: Trying to write a custom HiveOutputFormat
Date Mon, 13 May 2013 16:38:54 GMT
You could also look at the OrcSerde and how it works.

https://github.com/apache/hive/blob/trunk/ql/src/java/org/apache/hadoop/hive/ql/io/orc/OrcSerde.java

Basically, OrcSerde on "serialize" just wraps the row and object inspector
in a fake writable. That is passed down to the OutputFormat. On
"deserialize" it does the reverse and just passes back the object from the
InputFormat.
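
A minimal sketch of that wrapping idea follows. It is not the actual OrcSerde code. RowInspector and RowHolder are hypothetical stand-ins for Hive's ObjectInspector and for the "fake writable", so the example runs without Hive or Hadoop on the classpath. In a real SerDe the holder would implement Hadoop's Writable, and the OutputFormat's record writer would cast the Writable it receives back to the holder type.

```java
import java.util.Arrays;
import java.util.List;

public class FakeWritableSketch {

    /** Hypothetical stand-in for Hive's ObjectInspector: reads fields out of a row. */
    interface RowInspector {
        List<String> fieldNames();
        Object fieldValue(Object row, int index);
    }

    /** Hypothetical "fake writable": carries the row and its inspector, no byte encoding. */
    static final class RowHolder {
        final Object row;
        final RowInspector inspector;

        RowHolder(Object row, RowInspector inspector) {
            this.row = row;
            this.inspector = inspector;
        }
    }

    /** SerDe side, "serialize": wrap the row instead of flattening it to text. */
    static RowHolder serialize(Object row, RowInspector inspector) {
        return new RowHolder(row, inspector);
    }

    /** SerDe side, "deserialize": hand back the object the input format produced. */
    static Object deserialize(Object fromInputFormat) {
        return fromInputFormat;
    }

    /** Output-format side: read each field through the inspector, no string splitting. */
    static String writeRecord(RowHolder holder) {
        StringBuilder sb = new StringBuilder();
        List<String> names = holder.inspector.fieldNames();
        for (int i = 0; i < names.size(); i++) {
            if (i > 0) {
                sb.append(';');
            }
            sb.append(names.get(i)).append('=')
              .append(holder.inspector.fieldValue(holder.row, i));
        }
        return sb.toString();
    }

    /** Inspector for rows stored as Object[] (an assumption made for this sketch). */
    static RowInspector arrayInspector(String... names) {
        final List<String> fieldNames = Arrays.asList(names);
        return new RowInspector() {
            public List<String> fieldNames() {
                return fieldNames;
            }

            public Object fieldValue(Object row, int index) {
                return ((Object[]) row)[index];
            }
        };
    }

    public static void main(String[] args) {
        Object[] row = {"hello world", 42};
        RowHolder holder = serialize(row, arrayInspector("name", "count"));
        System.out.println(writeRecord(holder)); // prints "name=hello world;count=42"
    }
}
```

Because the output side reads each field through the inspector, a string field that contains spaces stays a single value.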

-- Owen


On Mon, May 13, 2013 at 6:54 AM, Edward Capriolo <edlinuxguru@gmail.com> wrote:

> You need to use a combination of output format and serde. This might allow
> you to do something like present struct objects to the input format rather
> than Text objects.
>
> You may want to take a look at the protobuf input format we use:
> https://github.com/edwardcapriolo/hive-protobuf/
>
> You could reverse the logic here and design an output format.
>
>
> On Mon, May 13, 2013 at 8:14 AM, Rui Martins <ruibmartins@gmail.com> wrote:
>
>> Hi guys,
>>
>> I'm currently writing my own HiveOutputFormat, as I would like to write the
>> output of Hive queries into a specific protobuf format my team is using.
>> I have managed to do this; however, the Writable object I get from Hive as
>> a result of a SELECT query is of type Text. This means that I have to split
>> the string to find my fields, but that's very error-prone, especially if some
>> fields are strings that may contain spaces.
>>
>> My question is:
>> 1) How do I get a Hive Writable that gives me each field of each result
>> row?
>>
>> Thank you,
>> rui
>>
>
>
