hadoop-pig-dev mailing list archives

From "He Yongqiang (JIRA)" <j...@apache.org>
Subject [jira] Commented: (PIG-833) Storage access layer
Date Wed, 19 Aug 2009 13:32:14 GMT

    [ https://issues.apache.org/jira/browse/PIG-833?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12745052#action_12745052 ]

He Yongqiang commented on PIG-833:
----------------------------------

By schema format, I mean the string used to define column names and column types. What data
types does Zebra support, and what is the format? It would be great to have some examples
and an explanation.
By storage format, I mean the string used to define column groups, such as
"[r.f12, f1, m#{b}]; [m#{a}, r.f11]" in TestBasicTableMapSplits.java.
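
To make the shape of that storage-format string concrete, here is a minimal sketch that splits it into column groups. It is not Zebra's parser: the class ColumnGroupSketch and its parse method are hypothetical names introduced only for illustration, and the sketch assumes nothing beyond what the quoted string shows, namely that groups are bracketed, separated by ";", and list comma-separated column references. Reading r.f12 as a field of a record column and m#{b} as a key of a map column is an interpretation of the example, not something confirmed in this thread.

```java
import java.util.ArrayList;
import java.util.List;

public class ColumnGroupSketch {
    // Hypothetical helper, not part of Zebra: splits a storage-format string
    // such as "[a, b]; [c]" into one list of column references per group.
    static List<List<String>> parse(String spec) {
        List<List<String>> groups = new ArrayList<>();
        for (String part : spec.split(";")) {
            String group = part.trim();
            if (group.isEmpty()) {
                continue; // tolerate a trailing ';'
            }
            if (!group.startsWith("[") || !group.endsWith("]")) {
                throw new IllegalArgumentException("Expected a [..] group: " + group);
            }
            List<String> columns = new ArrayList<>();
            // Assumes no commas inside a single column reference.
            for (String col : group.substring(1, group.length() - 1).split(",")) {
                String name = col.trim();
                if (!name.isEmpty()) {
                    columns.add(name);
                }
            }
            groups.add(columns);
        }
        return groups;
    }

    public static void main(String[] args) {
        // The string quoted from TestBasicTableMapSplits.java.
        String spec = "[r.f12, f1, m#{b}]; [m#{a}, r.f11]";
        List<List<String>> groups = parse(spec);
        for (int i = 0; i < groups.size(); i++) {
            System.out.println("group " + i + ": " + groups.get(i));
        }
    }
}
```

Under these assumptions the example string yields two column groups, the first holding r.f12, f1 and m#{b}, the second holding m#{a} and r.f11. Which data types each reference may have, and any further syntax, is exactly what the question above asks the Zebra authors to document.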

> Storage access layer
> --------------------
>
>                 Key: PIG-833
>                 URL: https://issues.apache.org/jira/browse/PIG-833
>             Project: Pig
>          Issue Type: New Feature
>            Reporter: Jay Tang
>         Attachments: hadoop20.jar.bz2, PIG-833-zebra.patch, PIG-833-zebra.patch.bz2,
>                      PIG-833-zebra.patch.bz2, TEST-org.apache.hadoop.zebra.pig.TestCheckin1.txt, test.out, zebra-javadoc.tgz
>
>
> A layer is needed to provide a high level data access abstraction and a tabular view
> of data in Hadoop, and could free Pig users from implementing their own data storage/retrieval
> code.  This layer should also include a columnar storage format in order to provide fast data
> projection, CPU/space-efficient data serialization, and a schema language to manage physical
> storage metadata.  Eventually it could also support predicate pushdown for further performance
> improvement.  Initially, this layer could be a contrib project in Pig and become a Hadoop
> subproject later on.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

