hadoop-pig-dev mailing list archives

From "Jeff Hammerbacher (JIRA)" <j...@apache.org>
Subject [jira] Commented: (PIG-833) Storage access layer
Date Tue, 18 Aug 2009 01:29:14 GMT

    [ https://issues.apache.org/jira/browse/PIG-833?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12744323#action_12744323 ]

Jeff Hammerbacher commented on PIG-833:
---------------------------------------

Hey,

Raghu, you mention that a design document is forthcoming. It would be great to have a PDF
design document, like Matei's for the fair scheduler, in addition to the Javadoc and wiki
page. Any progress on that front? I'm quite interested in learning more about Zebra's use
and implementation.

On a larger note, it would be great if Pig moved to the Hadoop model for new features, where
a design document and test plan are required to commit. See https://issues.apache.org/jira/browse/HADOOP-5587.
It's tough to digest the bulk dumps of Owl, Zebra, and Giraffe, though we certainly appreciate
the work Yahoo has done on these projects!

Thanks,
Jeff

> Storage access layer
> --------------------
>
>                 Key: PIG-833
>                 URL: https://issues.apache.org/jira/browse/PIG-833
>             Project: Pig
>          Issue Type: New Feature
>            Reporter: Jay Tang
>         Attachments: hadoop20.jar.bz2, PIG-833-zebra.patch, PIG-833-zebra.patch.bz2,
> PIG-833-zebra.patch.bz2, TEST-org.apache.hadoop.zebra.pig.TestCheckin1.txt, test.out, zebra-javadoc.tgz
>
>
> A layer is needed to provide a high level data access abstraction and a tabular view
> of data in Hadoop, and could free Pig users from implementing their own data storage/retrieval
> code.  This layer should also include a columnar storage format in order to provide fast data
> projection, CPU/space-efficient data serialization, and a schema language to manage physical
> storage metadata.  Eventually it could also support predicate pushdown for further performance
> improvement.  Initially, this layer could be a contrib project in Pig and become a hadoop
> subproject later on.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

