pig-commits mailing list archives

From Apache Wiki <wikidi...@apache.org>
Subject [Pig Wiki] Update of "LoadStoreMigrationGuide" by PradeepKamath
Date Sat, 20 Feb 2010 18:09:28 GMT
Dear Wiki user,

You have subscribed to a wiki page or wiki category on "Pig Wiki" for change notification.

The "LoadStoreMigrationGuide" page has been changed by PradeepKamath.
http://wiki.apache.org/pig/LoadStoreMigrationGuide?action=diff&rev1=25&rev2=26

--------------------------------------------------

  This page describes how to migrate from the old !LoadFunc and !StoreFunc interfaces (Pig
0.1.0 through Pig 0.6.0) to the new interfaces proposed in http://wiki.apache.org/pig/LoadStoreRedesignProposal
and planned for release in Pig 0.7.0. Besides the example on this page, users can also
look at the !LoadFunc and !StoreFunc implementations in the piggybank codebase (contrib/piggybank/java/src/main/java/org/apache/pig/piggybank/storage)
for examples of migration. For example, !MultiStorage implements a custom !OutputFormat.
  
- '''A general note applicable to both !LoadFunc and !StoreFunc implementations is that the
implementation should use the new Hadoop 20 API based classes (!InputFormat/!OutputFormat
and related classes) in org.apache.hadoop.mapreduce package instead of the old org.apache.hadoop.mapred
package.'''
+ '''A general note applicable to both !LoadFunc and !StoreFunc implementations is that the
implementation should use the new Hadoop 20 API based classes (!InputFormat/OutputFormat and
related classes) in org.apache.hadoop.mapreduce package instead of the old org.apache.hadoop.mapred
package.'''
  
  The main motivation for these changes is to move closer to using Hadoop's !InputFormat and
!OutputFormat classes. This way, Pig users and developers can create new !LoadFunc and !StoreFunc
implementations based on existing Hadoop !InputFormat and !OutputFormat classes with minimal
code. The complexity of reading the data and creating a record will now lie in the !InputFormat,
and likewise, on the writing end, the complexity of writing will lie in the !OutputFormat.
This enables !Pig to easily read and write data in new storage formats as soon as a Hadoop !InputFormat
and !OutputFormat are available for them.
  
