pig-commits mailing list archives

From Apache Wiki <wikidi...@apache.org>
Subject [Pig Wiki] Update of "Pig070IncompatibleChanges" by PradeepKamath
Date Thu, 18 Feb 2010 21:21:48 GMT
Dear Wiki user,

You have subscribed to a wiki page or wiki category on "Pig Wiki" for change notification.

The "Pig070IncompatibleChanges" page has been changed by PradeepKamath.
http://wiki.apache.org/pig/Pig070IncompatibleChanges?action=diff&rev1=19&rev2=20

--------------------------------------------------

  
  First, in the initial (0.7.0) release, '''we will not support the optimization''' in which,
when streaming follows a load of a compatible format or is followed by a store of a compatible
format, the data is not parsed but passed in chunks from the loader or to the store. The main
reason we are not porting the optimization is that the work is not trivial and the optimization
was never documented, so it is unlikely to be in use.
  
- Second, '''you can no longer use load/store functions for (de)serialization.''' A new interface
has been defined that has to be implemented for custom (de)serializations. The default (PigStorage)
format will continue to work. This format is now implemented by a class called org.apache.pig.impl.streaming.PigStreaming
that can also be used directly in the streaming statement. Details of the new interface are
described in http://wiki.apache.org/pig/LoadStoreRedesignProposal.
+ Second, '''you can no longer use load/store functions for (de)serialization.''' A new interface
has been defined that has to be implemented for custom (de)serializations. The default (PigStorage)
format will continue to work. This format is now implemented by a class called org.apache.pig.impl.streaming.PigStreaming
that can also be used directly in the streaming statement. Note that this class handles arbitrary
delimiters. For example, your statement could look like:
+ {{{
+  `perl StreamScript.pl` input(stdin using org.apache.pig.impl.streaming.PigStreaming(','))
output(stdout using org.apache.pig.impl.streaming.PigStreaming(';')) <...remaining options...>;
+ }}}
+ Details of the new interface are described in http://wiki.apache.org/pig/LoadStoreRedesignProposal.
  
  We have also removed the org.apache.pig.builtin.BinaryStorage loader/store function and
org.apache.pig.builtin.PigDump, which were only used from within streaming. They can be restored
if needed; we would just need to implement the corresponding Input/OutputFormats.
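
For readers who want to see the PigStreaming fragment from the change above in context, the following is a minimal sketch of a complete script, not text from the wiki page. The alias names (CMD, A, B), the file paths, and the field schema are hypothetical and chosen only for illustration, and the "remaining options" elided in the original statement are left out.

```pig
-- Hypothetical alias CMD; comma-delimited input to the script,
-- semicolon-delimited output read back from it.
DEFINE CMD `perl StreamScript.pl`
    INPUT(stdin USING org.apache.pig.impl.streaming.PigStreaming(','))
    OUTPUT(stdout USING org.apache.pig.impl.streaming.PigStreaming(';'));

-- Hypothetical input path and schema.
A = LOAD 'input.csv' USING PigStorage(',') AS (name:chararray, score:int);

-- Each tuple of A is written to the script's stdin with ',' between fields;
-- each line the script prints is split on ';' into the fields of B.
B = STREAM A THROUGH CMD AS (name:chararray, score:int);

STORE B INTO 'output';
```

In this sketch the script is expected to emit one semicolon-delimited record per line; if it emits a different delimiter, the argument to the OUTPUT PigStreaming would need to change accordingly.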
  
