hadoop-pig-dev mailing list archives

From "Olga Natkovich (JIRA)" <j...@apache.org>
Subject [jira] Commented: (PIG-794) Use Avro serialization in Pig
Date Wed, 06 May 2009 01:57:30 GMT

    [ https://issues.apache.org/jira/browse/PIG-794?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12706278#action_12706278 ]

Olga Natkovich commented on PIG-794:
------------------------------------

Hi Rakesh,

Thanks for the update. A few comments:

(1) Thanks for adding comments. They need to be Javadoc-style so that we get free documentation
from them. You can see examples in other files.
(2) It looks like at least one System.out.println statement got in, I assume by mistake.
(3) It looks like some traces are logged with log.error instead of log.debug.
(4) You need to attach the Avro library separately. Patches don't work well with binary data.
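
As a rough illustration of points (1) through (3), here is a minimal sketch. It is not taken
from the patch: the RecordCounter class and its increment method are hypothetical, and
java.util.logging (where Level.FINE plays the role of debug) is used only to keep the
example self-contained. In Pig itself the equivalent call would be log.debug.

```java
import java.util.logging.Level;
import java.util.logging.Logger;

/**
 * Hypothetical helper used only to illustrate the review points above.
 * It is not part of the AvroBinStorage patch.
 */
class RecordCounter {
    private static final Logger LOG = Logger.getLogger(RecordCounter.class.getName());

    /** Number of tuples seen so far. */
    private long count;

    /**
     * Records that one more tuple has been read.
     *
     * @return the number of tuples read so far, including this one
     */
    long increment() {
        count++;
        // Routine trace: logged at debug level (FINE here), not at error level,
        // and not written with System.out.println.
        LOG.log(Level.FINE, "Read tuple number {0}", count);
        return count;
    }
}
```

A doc comment opened with /** is picked up by the javadoc tool, whereas an ordinary // or /* comment is not.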

Also, I am curious whether removing the wrapper class made a performance difference.

> Use Avro serialization in Pig
> -----------------------------
>
>                 Key: PIG-794
>                 URL: https://issues.apache.org/jira/browse/PIG-794
>             Project: Pig
>          Issue Type: Improvement
>          Components: impl
>    Affects Versions: 0.2.0
>            Reporter: Rakesh Setty
>         Attachments: AvroBinStorage.patch, AvroStorage.patch
>
>
> We would like to use Avro serialization in Pig to pass data between MR jobs instead of
> the current BinStorage. Attached is an implementation of AvroBinStorage which performs significantly
> better compared to BinStorage on our benchmarks.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

