pig-commits mailing list archives

From Apache Wiki <wikidi...@apache.org>
Subject [Pig Wiki] Update of "PigMemory" by AlanGates
Date Wed, 20 May 2009 17:41:18 GMT
Dear Wiki user,

You have subscribed to a wiki page or wiki category on "Pig Wiki" for change notification.

The following page has been changed by AlanGates:
http://wiki.apache.org/pig/PigMemory

------------------------------------------------------------------------------
  == Problem Statement ==
  
   1. Pig hogs memory.  In the 0.2.0 version, the expansion factor of data on disk to data
in memory is 2-3x.  This makes it hard for Pig to execute users' programs efficiently.  It is
caused largely by the extensive use of Java objects (Integer, etc.) to store internal data.
-  1. Java memory management and its garbage collector are poorly suited to the workload of
intensive data processing.  Pig needs better control over where data is stored and when memory
is deallocated.  For a complete discussion of this issue see M. A. Shah et. al., ''Java Support
for Data-Intensive Systems:  Experiences Building the Telegraph Dataflow System''.
+  1. Java memory management and its garbage collector are poorly suited to the workload of
intensive data processing.  Pig needs better control over where data is stored and when memory
is deallocated.  For a complete discussion of this issue see M. A. Shah et al., ''Java Support
for Data-Intensive Systems:  Experiences Building the Telegraph Dataflow System''.  In particular,
this paper points out that a memory allocation and garbage collection scheme that is beyond
the control of the programmer is a bad fit for a large data processing system.
+  1. Currently Pig waits until memory is low to begin spilling bags to disk.  This has two
issues:
+    a. It is difficult to accurately determine when available memory is too low.
+    b. The system tries to spill and continue processing simultaneously.  Sometimes the continued
processing outruns the spilling and the system still runs out of memory.
+    
  
  == Proposed Solution ==
  Switching from using Java containers and objects to using large memory buffers and a page
cache will address these issues.
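  As a rough illustration of the idea (a minimal sketch; the class and method names below are
invented for illustration and are not part of the proposal), a tuple's fields would be decoded
on demand out of a large shared buffer rather than held as individual boxed objects:

      import java.nio.ByteBuffer;

      // Sketch only: field values live in a shared buffer and are read by offset,
      // so no Integer/Long objects exist until a caller actually asks for one.
      public class BufferBackedTuple {
          private final ByteBuffer page;   // slice of a large, shared memory buffer
          private final int offset;        // where this tuple's data begins

          public BufferBackedTuple(ByteBuffer page, int offset) {
              this.page = page;
              this.offset = offset;
          }

          // Read the i-th int field directly from the buffer, assuming
          // fixed-width 4-byte fields for simplicity.
          public int getIntField(int i) {
              return page.getInt(offset + i * 4);
          }
      }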
@@ -93, +97 @@

          // number of tuples with references in the buffer
          private int refCnt;
          private int nextOffset;
+         private boolean isDirty;
+         private boolean isOnDisk;
          // Package level so others can see it without the overhead of a read call.
          DataBuffer data;
          File diskCache;
@@ -106, +112 @@

              diskCache = null;
              nextOffset = 0;
              refCnt = 0;
+             isDirty = false;
+             isOnDisk = false;
          }
              
          /**
@@ -131, +139 @@

           */
          int write(byte[] data) {
              bringIntoMemory();
+             isDirty = true;
              write data into this.data;
              move nextOffset;
              if insufficient space return -1
@@ -142, +151 @@

           * before the tuple begins to read the data.
           */
          void bringIntoMemory() {
-             if data on disk {
+             if (isOnDisk) {
                  data = MemoryManager.getMemoryManager().getDataBuffer();
                  read into memory
-                 diskCache = null;
+                 isDirty = false;
                  // pushes the buffer back onto the full queue, so it can be
                  // flushed again if necessary.
                  MemoryManager.getMemoryManager().markFull();
+                 isOnDisk = false;
              }
          }
  
@@ -183, +193 @@

           * @return 
           */
          DataBuffer flush() {
-             diskCache = new File;
+             if (isDirty) {
+                 isOnDisk = true;
+                 open(diskCache);
+                 write buffer to diskCache;
-             diskCache.deleteOnExit();
+                 diskCache.deleteOnExit();
-             write data to diskCache;
-             return data;
+                 return data;
-         }
+             }
- 
+         }
      }
  
      /**
@@ -424, +436 @@

  that by totally circumventing the Java garbage collector they achieved roughly a 2.5x speedup
of their system.  So it might be worth
  investigating.
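  Pieced together from the write/bringIntoMemory/flush fragments above, a buffer page with the
new isDirty/isOnDisk flags might behave roughly as follows.  This is a simplified, self-contained
sketch: it uses a plain byte array and java.io directly, fills in the not-dirty path of flush(),
and leaves out the MemoryManager bookkeeping, so it illustrates the intended behavior rather than
the actual implementation.

      import java.io.*;

      // Simplified sketch of a spillable buffer page: data lives in memory until
      // flush() writes it to a temp file, and bringIntoMemory() reads it back.
      class SpillablePage {
          private byte[] data;            // in-memory contents, null while spilled
          private final int pageSize;
          private int nextOffset;         // next free byte in the page
          private boolean isDirty;        // modified since last written to disk
          private boolean isOnDisk;       // a valid copy of the page exists in diskCache
          private File diskCache;         // backing file, created lazily

          SpillablePage(int pageSize) {
              this.pageSize = pageSize;
              this.data = new byte[pageSize];
          }

          // Append bytes; returns the offset written at, or -1 if the page is full.
          int write(byte[] bytes) throws IOException {
              bringIntoMemory();
              if (nextOffset + bytes.length > pageSize) return -1;
              System.arraycopy(bytes, 0, data, nextOffset, bytes.length);
              int at = nextOffset;
              nextOffset += bytes.length;
              isDirty = true;             // memory copy now differs from the disk copy
              return at;
          }

          // Re-read the page from disk if it has been spilled.
          void bringIntoMemory() throws IOException {
              if (data != null) return;   // already in memory
              data = new byte[pageSize];
              if (isOnDisk) {
                  DataInputStream in = new DataInputStream(new FileInputStream(diskCache));
                  try {
                      in.readFully(data, 0, nextOffset);
                  } finally {
                      in.close();
                  }
                  isOnDisk = false;
              }
              isDirty = false;            // memory and disk copies now match
          }

          // Write the page to its temp file so its memory can be reclaimed.
          // Skips the write when an up-to-date copy is already on disk.
          byte[] flush() throws IOException {
              if (isDirty) {
                  if (diskCache == null) {
                      diskCache = File.createTempFile("pig-page", ".spill");
                      diskCache.deleteOnExit();
                  }
                  FileOutputStream out = new FileOutputStream(diskCache);
                  try {
                      out.write(data, 0, nextOffset);
                  } finally {
                      out.close();
                  }
                  isDirty = false;
              }
              isOnDisk = (diskCache != null);
              byte[] freed = data;
              data = null;                // hand the memory back to the caller
              return freed;
          }
      }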
  
+ == Reader Feedback ==
+ 
+ Ted Dunning commented in http://mail-archives.apache.org/mod_mbox/hadoop-pig-dev/200905.mbox/%3cc7d45fc70905141943v72591b09u81009cf29b9f58b8@mail.gmail.com%3e
+ 
+ Response:  http://mail-archives.apache.org/mod_mbox/hadoop-pig-dev/200905.mbox/%3cE3AC3B63-ADB2-4C49-83FB-49A251CD95FB@yahoo-inc.com%3e
+ 
+ Thejas Nair commented in http://mail-archives.apache.org/mod_mbox/hadoop-pig-dev/200905.mbox/%3CC633011F.42186%25tejas@yahoo-inc.com%3E
+ 
+ Response:  http://mail-archives.apache.org/mod_mbox/hadoop-pig-dev/200905.mbox/%3C55CAA9B7-C415-4C63-B80B-62D7A753947B@yahoo-inc.com%3E
+ 
+ Chris Olston made a few comments in a conversation:
+ 
+ 1. LRU will sometimes be a bad replacement choice.  In cases where one batch of data
for the pipeline is larger than the total of all the memory pages, MRU would be a
+ better choice.
+ 
+ Response:  Agreed, but it seems that when an operator needs a few pages of memory
but not all, LRU may be a better choice (assuming there are no other operators taking
+ all the other pages of memory).  Since we don't ''a priori'' know the size of the input
to an operator, I don't know how to choose which is better.
+ 
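One way to keep both options open would be to make the replacement policy of the page pool
pluggable, so that LRU or MRU could be selected per workload.  The interface and class names in
this sketch are invented for illustration and are not part of the proposal:

    import java.util.*;

    // Hypothetical pluggable page-replacement policy.
    interface ReplacementPolicy<P> {
        void accessed(P page);      // note that a page was just used
        P chooseVictim();           // pick the next page to spill
    }

    // LRU: spill the page that has gone unused the longest.
    class LruPolicy<P> implements ReplacementPolicy<P> {
        private final LinkedHashSet<P> order = new LinkedHashSet<P>();
        public void accessed(P page) { order.remove(page); order.add(page); }
        public P chooseVictim() {
            Iterator<P> it = order.iterator();
            P victim = it.next();   // oldest entry = least recently used
            it.remove();
            return victim;
        }
    }

    // MRU: spill the page that was used most recently, which is better when one
    // pass over the data is larger than all of the available pages put together.
    class MruPolicy<P> implements ReplacementPolicy<P> {
        private final LinkedList<P> stack = new LinkedList<P>();
        public void accessed(P page) { stack.remove(page); stack.addFirst(page); }
        public P chooseVictim() { return stack.removeFirst(); }
    }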
+ 2. It might be useful to expand the interface to allow control of how many buffer pages
go to a given operator.  This has a couple of benefits.  One, it is possible to prevent one
+  operator from taking all of the resources from another.  Two, you can choose different
replacement algorithms based on what is best for that operator.
+ 
+ Response:  I agree that allowing assignments of memory pages to specific operators could
be useful.  I am concerned that Pig's planner is not sophisticated enough to make
+ intelligent choices here.  I would like to leave this as an area for future work.
+ 
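A sketch of what such an extension might look like (again with invented names; the MemoryManager
described above has no notion of operators):

    import java.util.*;

    // Hypothetical per-operator page quotas layered on top of the memory manager.
    class QuotaMemoryManager {
        private final int totalPages;
        // operator -> maximum pages it may hold
        private final Map<String, Integer> quota = new HashMap<String, Integer>();
        // operator -> pages it currently holds
        private final Map<String, Integer> inUse = new HashMap<String, Integer>();

        QuotaMemoryManager(int totalPages) { this.totalPages = totalPages; }

        // The planner (or a default policy) decides how many pages each operator may hold.
        void setQuota(String operator, int pages) { quota.put(operator, pages); }

        // Grant a page only if the operator is under its quota; otherwise the
        // caller must spill one of its own pages before asking again.
        synchronized boolean tryAllocatePage(String operator) {
            Integer held = inUse.get(operator);
            if (held == null) held = 0;
            Integer max = quota.get(operator);
            if (max == null) max = totalPages;   // no quota set: may use any page
            if (held >= max) return false;
            inUse.put(operator, held + 1);
            return true;
        }

        synchronized void releasePage(String operator) {
            Integer held = inUse.get(operator);
            if (held != null && held > 0) inUse.put(operator, held - 1);
        }
    }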
