hadoop-pig-dev mailing list archives

From "He Yongqiang (JIRA)" <j...@apache.org>
Subject [jira] Commented: (PIG-833) Storage access layer
Date Fri, 21 Aug 2009 02:23:14 GMT

    [ https://issues.apache.org/jira/browse/PIG-833?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12745770#action_12745770 ]

He Yongqiang commented on PIG-833:
----------------------------------

Thanks, Jing.
Yes, I know the design of column groups and projection.
The reason I asked is that I saw this usage at line 251 of TestBasicTable.java:
{noformat}
doReadWrite(path, 2, 100, "SF_a,SF_b,SF_c,SF_d,SF_e", "[SF_a,SF_b,SF_c];[SF_d,SF_e]", "SF_f,SF_a,SF_c,SF_d",
true, false);
{noformat}
where "SF_f,SF_a,SF_c,SF_d" is passed as the projection. Is a column "SF_f" defined anywhere?
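
To make the question concrete, here is a minimal sketch that compares the projection string from that call against the column list passed before it. It is not Zebra code: the class and method names (ProjectionCheck, parseColumns, undefinedColumns) are hypothetical, and it assumes the first string in the call is the table schema. It only reports which projected names are missing from that list; it says nothing about how Zebra itself treats such a column.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Illustration only: this is not Zebra's API. All names here are hypothetical.
final class ProjectionCheck {

    // Split a comma-separated column list such as "SF_a,SF_b" into trimmed names.
    static List<String> parseColumns(String csv) {
        List<String> names = new ArrayList<>();
        for (String part : csv.split(",")) {
            String name = part.trim();
            if (!name.isEmpty()) {
                names.add(name);
            }
        }
        return names;
    }

    // Return the projected columns that do not appear in the schema,
    // in the order they occur in the projection.
    static List<String> undefinedColumns(String schema, String projection) {
        Set<String> defined = new LinkedHashSet<>(parseColumns(schema));
        List<String> missing = new ArrayList<>();
        for (String column : parseColumns(projection)) {
            if (!defined.contains(column)) {
                missing.add(column);
            }
        }
        return missing;
    }

    public static void main(String[] args) {
        // Strings copied from the doReadWrite call quoted above.
        String schema = "SF_a,SF_b,SF_c,SF_d,SF_e";
        String projection = "SF_f,SF_a,SF_c,SF_d";
        System.out.println(undefinedColumns(schema, projection)); // prints [SF_f]
    }
}
```

With the strings from the test, the only name reported is SF_f, which is the column I am asking about.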

By the way, can you give more detail about the design of Partition? ColumnGroup is much like
a projection in C-Store, so it is easier to understand.

> Storage access layer
> --------------------
>
>                 Key: PIG-833
>                 URL: https://issues.apache.org/jira/browse/PIG-833
>             Project: Pig
>          Issue Type: New Feature
>            Reporter: Jay Tang
>         Attachments: hadoop20.jar.bz2, PIG-833-zebra.patch, PIG-833-zebra.patch.bz2,
>                      PIG-833-zebra.patch.bz2, TEST-org.apache.hadoop.zebra.pig.TestCheckin1.txt, test.out, zebra-javadoc.tgz
>
>
> A layer is needed to provide a high level data access abstraction and a tabular view
> of data in Hadoop, and could free Pig users from implementing their own data storage/retrieval
> code.  This layer should also include a columnar storage format in order to provide fast data
> projection, CPU/space-efficient data serialization, and a schema language to manage physical
> storage metadata.  Eventually it could also support predicate pushdown for further performance
> improvement.  Initially, this layer could be a contrib project in Pig and become a hadoop
> subproject later on.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

