pig-commits mailing list archives

From Apache Wiki <wikidi...@apache.org>
Subject [Pig Wiki] Update of "LoadStoreMigrationGuide" by PradeepKamath
Date Thu, 18 Feb 2010 21:25:08 GMT
Dear Wiki user,

You have subscribed to a wiki page or wiki category on "Pig Wiki" for change notification.

The "LoadStoreMigrationGuide" page has been changed by PradeepKamath.
http://wiki.apache.org/pig/LoadStoreMigrationGuide?action=diff&rev1=23&rev2=24

--------------------------------------------------

  ||No equivalent method ||setLocation() ||!LoadFunc ||This method is called by Pig to communicate
the load location to the loader. The loader should use this method to communicate the same
information to the underlying !InputFormat. Pig calls this method multiple times, so
implementations should ensure that repeated calls have no inconsistent side effects. ||
  ||bindTo() ||prepareToRead() ||!LoadFunc ||bindTo() was the old method which would provide
an !InputStream among other things to the !LoadFunc. The !LoadFunc implementation would then
read from the !InputStream in getNext(). In the new API, reading of the data is through the
!InputFormat provided by the !LoadFunc. So the equivalent call is prepareToRead() wherein
the !RecordReader associated with the !InputFormat provided by the !LoadFunc is passed to
the !LoadFunc. The !RecordReader can then be used by the implementation in getNext() to return
a tuple representing a record of data back to pig. ||
  ||getNext() ||getNext() ||!LoadFunc ||The meaning of getNext() has not changed and is called
by Pig runtime to get the next tuple in the data - in the new API, this is the method wherein
the implementation will use the underlying !RecordReader to construct a tuple ||
- ||bytesToInteger(),...bytesToBag() ||bytesToInteger(),...bytesToBag() ||!LoadCaster ||The
meaning of these methods has not changed and is called by Pig runtime to cast a !DataByteArray
fields to the right type when needed. In the new API, a !LoadFunc implementation should give
a !LoadCaster object back to pig as the return value of getLoadCaster() method so that it
can be used for casting. If a null is returned then casting from !DataByteArray to any other
type (implicitly or explicitly) in the pig script will not be possible ||
+ ||bytesToInteger(),...bytesToBag() ||bytesToInteger(),...bytesToBag() ||!LoadCaster ||The
meaning of these methods has not changed and is called by Pig runtime to cast a !DataByteArray
fields to the right type when needed. In the new API, a !LoadFunc implementation should give
a !LoadCaster object back to pig as the return value of getLoadCaster() method so that it
can be used for casting. The default implementation in !LoadFunc returns an instance of !UTF8StorageConvertor
which can handle casting from UTF-8 bytes to different types. If a null is returned then casting
from !DataByteArray to any other type (implicitly or explicitly) in the pig script will not
be possible ||
  
  
  An example of how a simple !LoadFunc implementation based on the old interface can be converted
to the new interfaces is shown in the Examples section below.
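
Separately from the wiki's own Examples section, the call sequence described in the table (setLocation(), then prepareToRead(), then repeated getNext() calls, with getLoadCaster() supplying the caster) can be sketched roughly as follows. This is only an illustration under stated assumptions: FakeJob, FakeRecordReader, Utf8Caster and TabSplitLoader are hypothetical stand-ins written so the sketch compiles without Pig or Hadoop on the classpath. They are not the real Pig or Hadoop types, and their signatures are simplified. Returning null from getNext() at end of input and from bytesToInteger() on unparsable input are choices made for this sketch.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

// Hypothetical stand-ins only: none of these classes are the real Pig/Hadoop API.
final class LoadSketch {

    /** Stand-in for the job configuration that carries the load location. */
    static final class FakeJob {
        String inputPath;
    }

    /** Stand-in for a RecordReader: yields one line of text per record. */
    static final class FakeRecordReader {
        private final Iterator<String> lines;
        private String current;

        FakeRecordReader(List<String> lines) {
            this.lines = lines.iterator();
        }

        boolean nextKeyValue() {
            if (!lines.hasNext()) {
                return false;
            }
            current = lines.next();
            return true;
        }

        String getCurrentValue() {
            return current;
        }
    }

    /** Stand-in for a caster that interprets field bytes as UTF-8 text. */
    static final class Utf8Caster {
        Integer bytesToInteger(byte[] bytes) {
            if (bytes == null) {
                return null;
            }
            try {
                return Integer.valueOf(new String(bytes, StandardCharsets.UTF_8).trim());
            } catch (NumberFormatException e) {
                return null; // sketch choice: unparsable input becomes null
            }
        }
    }

    /** Stand-in loader that splits each record on tabs into raw byte fields. */
    static final class TabSplitLoader {
        private FakeRecordReader reader;

        // May be called several times; it only overwrites one property,
        // so repeated calls leave the job in the same state.
        void setLocation(String location, FakeJob job) {
            job.inputPath = location;
        }

        // Receives the reader that getNext() will pull records from.
        void prepareToRead(FakeRecordReader reader) {
            this.reader = reader;
        }

        // Returns the next record as a list of raw fields, or null at end of input.
        List<byte[]> getNext() {
            if (!reader.nextKeyValue()) {
                return null;
            }
            List<byte[]> tuple = new ArrayList<>();
            for (String field : reader.getCurrentValue().split("\t", -1)) {
                tuple.add(field.getBytes(StandardCharsets.UTF_8));
            }
            return tuple;
        }

        // Hands back the caster used to convert raw fields on demand.
        Utf8Caster getLoadCaster() {
            return new Utf8Caster();
        }
    }

    public static void main(String[] args) {
        FakeJob job = new FakeJob();
        TabSplitLoader loader = new TabSplitLoader();
        loader.setLocation("/data/in", job);
        loader.setLocation("/data/in", job); // a repeated call changes nothing
        loader.prepareToRead(new FakeRecordReader(Arrays.asList("a\t1", "b\t2")));

        Utf8Caster caster = loader.getLoadCaster();
        List<byte[]> tuple;
        while ((tuple = loader.getNext()) != null) {
            String name = new String(tuple.get(0), StandardCharsets.UTF_8);
            System.out.println(name + " -> " + caster.bytesToInteger(tuple.get(1)));
        }
    }
}
```

Keeping setLocation() down to a single property assignment is one simple way to make repeated calls harmless. A loader that opened files or accumulated state in that method would need extra care.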
