pig-commits mailing list archives

From Apache Wiki <wikidi...@apache.org>
Subject [Pig Wiki] Update of "Pig070IncompatibleChanges" by PradeepKamath
Date Tue, 23 Feb 2010 19:45:58 GMT
Dear Wiki user,

You have subscribed to a wiki page or wiki category on "Pig Wiki" for change notification.

The "Pig070IncompatibleChanges" page has been changed by PradeepKamath.
http://wiki.apache.org/pig/Pig070IncompatibleChanges?action=diff&rev1=20&rev2=21

--------------------------------------------------

  
  First, in the initial (0.7.0) release, '''we will not support optimization''' in which, when streaming
follows a load of a compatible format or is followed by a store of a compatible format, the data is not
parsed but passed in chunks from the loader or to the store. The main reason we are not porting
the optimization is that the work is not trivial and the optimization was never documented,
so it is unlikely to be in use.
  
- Second, '''you can no longer use load/store functions for (de)serialization.''' A new interface
has been defined that has to be implemented for custom (de)serializations. The default (PigStorage)
format will continue to work. This format is now implemented by a class called org.apache.pig.impl.streaming.PigStreaming
that can be also used directly in the streaming statement. Note that this class handles arbitrary
delimiters: For example your statement could look like:
+ Second, '''you can no longer use load/store functions for (de)serialization.''' A new interface
has been defined that has to be implemented for custom (de)serializations. The default (PigStorage)
format will continue to work. This format is now implemented by a class called org.apache.pig.builtin.PigStreaming
that can be also used directly in the streaming statement. Note that this class handles arbitrary
delimiters: For example your statement could look like:
  {{{
-  `perl StreamScript.pl` input(stdin using org.apache.pig.impl.streaming.PigStreaming(','))
output(stdout using org.apache.pig.impl.streaming.PigStreaming(';')) <...remaining options...>;
+  `perl StreamScript.pl` input(stdin using PigStreaming(',')) output(stdout PigStreaming(';'))
<...remaining options...>;
  }}}Details of the new interface are described in http://wiki.apache.org/pig/LoadStoreRedesignProposal.
  
  We have also removed org.apache.pig.builtin.BinaryStorage loader/store function and org.apache.pig.builtin.PigDump
which were only used from within streaming. They can be restored if needed - we would just
need to implement the corresponding Input/OutputFormats.
@@ -70, +70 @@

  
  == Removing Custom Comparators ==
  
- This functionality was added to deal with gap in Pig's early functionality - lack of numeric
comparison in order by as well as lack of descending sort. This functionality has been present
in last 4 releases and custom comparators has been depricated in the last several releases.
They functionality is removed in this release.
+ This functionality was added to deal with gap in Pig's early functionality - lack of numeric
comparison in order by as well as lack of descending sort. This functionality has been present
in last 4 releases and custom comparators has been deprecated in the last several releases.
They functionality is removed in this release.
  
  == Merge Join ==
  In Pig 0.6.0 there was a pre-condition for merge join: "The loadfunc for the right input
of the join should implement the !SamplableLoader interface" - instead the !LoadFunc should
now implement the !OrderedLoadFunc interface in Pig 0.7.0. All other pre-conditions still hold.
