pig-commits mailing list archives

From Apache Wiki <wikidi...@apache.org>
Subject [Pig Wiki] Update of "LoadStoreMigrationGuide" by PradeepKamath
Date Wed, 17 Feb 2010 01:23:29 GMT
Dear Wiki user,

You have subscribed to a wiki page or wiki category on "Pig Wiki" for change notification.

The "LoadStoreMigrationGuide" page has been changed by PradeepKamath.
http://wiki.apache.org/pig/LoadStoreMigrationGuide?action=diff&rev1=18&rev2=19

--------------------------------------------------

- This page describes how to migrate from the old !LoadFunc and !StoreFunc interface (as of
Pig 0.6.0) to the new interfaces proposed in http://wiki.apache.org/pig/LoadStoreRedesignProposal
and planned to be released in Pig 0.7.0.
+ This page describes how to migrate from the old !LoadFunc and !StoreFunc interfaces (Pig
0.1.0 through Pig 0.6.0) to the new interfaces proposed in http://wiki.apache.org/pig/LoadStoreRedesignProposal
and planned for release in Pig 0.7.0. Besides the example on this page, users can also
look at the !LoadFunc and !StoreFunc implementations in the piggybank codebase (contrib/piggybank/java/src/main/java/org/apache/pig/piggybank/storage)
for examples of migration. For example, MultiStorage implements a custom OutputFormat.
  
  A general note applicable to both !LoadFunc and !StoreFunc implementations is that the implementation
should use the new Hadoop 20 API based on org.apache.hadoop.mapreduce package instead of the
old org.apache.hadoop.mapred package.
+ 
+ The main motivation for these changes is to move closer to using !Hadoop's !InputFormat
and !OutputFormat classes. This way, Pig users and developers can create new !LoadFunc and !StoreFunc
implementations based on existing !Hadoop !InputFormat and !OutputFormat classes with minimal
code. The complexity of reading the data and creating a record now lies in the !InputFormat;
likewise, on the writing end, the complexity of writing lies in the !OutputFormat.
This enables !Pig to read and write data in new storage formats as soon as a !Hadoop
!InputFormat and !OutputFormat are available for them.
+ 
  
  = LoadFunc Migration =
  The methods in the old !LoadFunc have been split among a !LoadFunc abstract class which
has the main methods for loading data and 3 new interfaces 
@@ -199, +202 @@

                  break;
  
              case 'x':
+                fieldDel =
+                     Integer.valueOf(delimiter.substring(2), 16).byteValue();
+                break;
+ 
              case 'u':
                  this.fieldDel =
                      Integer.valueOf(delimiter.substring(2)).byteValue();
@@ -488, +495 @@

                  break;
  
              case 'x':
+                fieldDel =
+                     Integer.valueOf(delimiter.substring(2), 16).byteValue();
+                break;
              case 'u':
                  this.fieldDel =
                      Integer.valueOf(delimiter.substring(2)).byteValue();
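
For readers following the two hunks above, the added 'x' case can be read as "parse everything after the two-character prefix as a hexadecimal byte value". The following is a minimal, self-contained sketch of that idea, not the code from the wiki page: the class name DelimiterSketch, the method parseFieldDelimiter, the single-character and '\t' branches, and the error handling are all assumptions made for illustration. The 'u' branch mirrors the diff context, which parses the digits as decimal.

```java
// Hypothetical helper; names and surrounding structure are illustrative only.
final class DelimiterSketch {
    private DelimiterSketch() {}

    /**
     * Converts a delimiter specification into a single byte.
     * A one-character string is used as-is; a backslash prefix selects
     * an escape form, as in the switch shown in the diff.
     */
    static byte parseFieldDelimiter(String delimiter) {
        if (delimiter.length() == 1) {
            // Plain single-character delimiter, e.g. ","
            return (byte) delimiter.charAt(0);
        }
        if (delimiter.length() > 1 && delimiter.charAt(0) == '\\') {
            switch (delimiter.charAt(1)) {
                case 't':
                    // Assumed convenience escape for tab.
                    return (byte) '\t';
                case 'x':
                    // The case added in the diff: digits after "\x" are hexadecimal.
                    return Integer.valueOf(delimiter.substring(2), 16).byteValue();
                case 'u':
                    // As in the diff context: digits after "\u" are parsed as decimal.
                    return Integer.valueOf(delimiter.substring(2)).byteValue();
                default:
                    throw new IllegalArgumentException(
                        "Unknown delimiter escape: " + delimiter);
            }
        }
        throw new IllegalArgumentException("Unsupported delimiter: " + delimiter);
    }
}
```

Under these assumptions, the four-character string backslash, 'x', '7', 'C' maps to byte 124 (the '|' character), and backslash, 'x', '0', '1' maps to byte 1. Malformed digits surface as a NumberFormatException from Integer.valueOf.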
