hadoop-mapreduce-user mailing list archives

From Viraj Bhat <vi...@yahoo-inc.com>
Subject RE: Best format to use
Date Tue, 09 Apr 2013 19:05:57 GMT
Pig supports an AvroStorage() UDF for both loading and storing; it currently resides in
Piggybank:
http://svn.apache.org/repos/asf/pig/branches/branch-0.11/contrib/piggybank/java/src/main/java/org/apache/pig/piggybank/storage/avro/
There is also a version on GitHub that is currently being ported to trunk:
https://github.com/josephadler/fast-avro-storage
Regards
Viraj

From: Nitin Pawar [mailto:nitinpawar432@gmail.com]
Sent: Tuesday, April 09, 2013 12:00 PM
To: user@hadoop.apache.org
Subject: Re: Best format to use

Not sure about Pig or Impala, but Hive has this:
https://cwiki.apache.org/Hive/avroserde-working-with-avro-from-hive.html


On Wed, Apr 10, 2013 at 12:26 AM, Mark <static.void.dev@gmail.com> wrote:
Avro is pretty sweet, but is it supported by Hive, Pig and Impala? Is it splittable?
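
On the splittability question: Avro object container files are written as a
series of blocks, each followed by a 16-byte sync marker, which is what lets a
reader start from an arbitrary byte offset. The sketch below is a toy
illustration of that idea only. It is not Avro's real file layout or reader,
and the names write_blocks and read_split, as well as the simplified
"payload + marker" layout, are hypothetical.

```python
SYNC_SIZE = 16  # Avro container files use a 16-byte sync marker


def write_blocks(blocks, sync):
    """Toy layout (not real Avro): each block is followed by the sync marker."""
    return b"".join(block + sync for block in blocks)


def read_split(data, sync, start, end):
    """Return the blocks owned by the byte range [start, end).

    A block is owned by the split that contains the start of the sync
    marker preceding it; the first block is owned by the split at offset 0.
    """
    if start == 0:
        pos = 0
    else:
        # Skip forward to the first sync marker that begins at or after start.
        idx = data.find(sync, start)
        if idx == -1:
            return []
        pos = idx + len(sync)

    blocks = []
    # pos - len(sync) is where the marker preceding this block begins.
    while pos < len(data) and pos - len(sync) < end:
        nxt = data.find(sync, pos)
        if nxt == -1:
            break
        blocks.append(data[pos:nxt])
        pos = nxt + len(sync)
    return blocks


if __name__ == "__main__":
    sync = bytes(range(SYNC_SIZE))  # fixed marker for the demo; Avro picks a random one per file
    data = write_blocks([b"day1-logs", b"day2-logs", b"day3-logs"], sync)
    mid = len(data) // 2
    # Two independent readers, each given one half of the file.
    print(read_split(data, sync, 0, mid))
    print(read_split(data, sync, mid, len(data)))
```

Because every block is claimed by exactly one byte range, the two readers
together see each block once, wherever the split boundary happens to fall.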

On Apr 9, 2013, at 10:58 AM, Roman Shaposhnik <rvs@apache.org> wrote:

> On Tue, Apr 9, 2013 at 9:50 AM, Mark <static.void.dev@gmail.com> wrote:
>> Forgetting Impala, what format would be best to use with daily logs?
>>
>> Block-compressed sequence files?
>
> I'd actually use Avro-encoded files.
>
> Thanks,
> Roman.



--
Nitin Pawar
