pig-commits mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From cheol...@apache.org
Subject svn commit: r1582885 - in /pig/branches/tez: ./ bin/ contrib/piggybank/java/src/test/java/org/apache/pig/piggybank/test/storage/ contrib/piggybank/java/src/test/java/org/apache/pig/piggybank/test/storage/avro/ ivy/ src/ src/org/apache/pig/ src/org/apac...
Date Fri, 28 Mar 2014 21:26:45 GMT
Author: cheolsoo
Date: Fri Mar 28 21:26:44 2014
New Revision: 1582885

URL: http://svn.apache.org/r1582885
Log:
Merge lates trunk changes

Added:
    pig/branches/tez/test/org/apache/pig/builtin/avro/code/pig/projection_test_with_schema.pig
      - copied unchanged from r1582881, pig/trunk/test/org/apache/pig/builtin/avro/code/pig/projection_test_with_schema.pig
    pig/branches/tez/test/org/apache/pig/builtin/avro/data/json/projectionTestWithSchema.json
      - copied unchanged from r1582881, pig/trunk/test/org/apache/pig/builtin/avro/data/json/projectionTestWithSchema.json
    pig/branches/tez/test/org/apache/pig/builtin/avro/schema/projectionTestWithSchema.avsc
      - copied unchanged from r1582881, pig/trunk/test/org/apache/pig/builtin/avro/schema/projectionTestWithSchema.avsc
    pig/branches/tez/test/org/apache/pig/test/utils/WrongCustomPartitioner.java
      - copied unchanged from r1582881, pig/trunk/test/org/apache/pig/test/utils/WrongCustomPartitioner.java
Modified:
    pig/branches/tez/   (props changed)
    pig/branches/tez/CHANGES.txt
    pig/branches/tez/NOTICE.txt
    pig/branches/tez/bin/pig.cmd
    pig/branches/tez/build.xml
    pig/branches/tez/contrib/piggybank/java/src/test/java/org/apache/pig/piggybank/test/storage/TestMultiStorageCompression.java
    pig/branches/tez/contrib/piggybank/java/src/test/java/org/apache/pig/piggybank/test/storage/avro/TestAvroStorage.java
    pig/branches/tez/ivy/libraries.properties
    pig/branches/tez/src/org/apache/pig/PigServer.java
    pig/branches/tez/src/org/apache/pig/backend/hadoop/executionengine/mapReduceLayer/JobControlCompiler.java
    pig/branches/tez/src/org/apache/pig/backend/hadoop/executionengine/mapReduceLayer/MapReduceLauncher.java
    pig/branches/tez/src/org/apache/pig/backend/hadoop/executionengine/mapReduceLayer/SecondaryKeyOptimizer.java
    pig/branches/tez/src/org/apache/pig/builtin/AvroStorage.java
    pig/branches/tez/src/org/apache/pig/newplan/logical/relational/LOForEach.java
    pig/branches/tez/src/org/apache/pig/newplan/logical/relational/LORank.java
    pig/branches/tez/src/org/apache/pig/newplan/logical/relational/LogicalPlan.java
    pig/branches/tez/src/org/apache/pig/newplan/logical/rules/PushDownForEachFlatten.java
    pig/branches/tez/src/org/apache/pig/validator/BlackAndWhitelistFilter.java
    pig/branches/tez/src/org/apache/pig/validator/BlackAndWhitelistValidator.java
    pig/branches/tez/src/pig-default.properties   (props changed)
    pig/branches/tez/test/org/apache/pig/builtin/TestAvroStorage.java
    pig/branches/tez/test/org/apache/pig/test/TestEvalPipelineLocal.java
    pig/branches/tez/test/org/apache/pig/test/TestGrunt.java
    pig/branches/tez/test/org/apache/pig/test/TestNewPlanPushDownForeachFlatten.java
    pig/branches/tez/test/org/apache/pig/test/TestSecondarySort.java
    pig/branches/tez/test/org/apache/pig/test/data/bzipdir1.bz2/bzipdir2.bz2/recordLossblockHeaderEndsAt136500.txt.bz2
  (props changed)

Propchange: pig/branches/tez/
------------------------------------------------------------------------------
  Merged /pig/trunk:r1578680-1582881

Modified: pig/branches/tez/CHANGES.txt
URL: http://svn.apache.org/viewvc/pig/branches/tez/CHANGES.txt?rev=1582885&r1=1582884&r2=1582885&view=diff
==============================================================================
--- pig/branches/tez/CHANGES.txt (original)
+++ pig/branches/tez/CHANGES.txt Fri Mar 28 21:26:44 2014
@@ -30,7 +30,7 @@ PIG-2207: Support custom counters for ag
 
 IMPROVEMENTS
 
-PIG-3802: Fix TestBlackAndWhitelistValidator failures (prkommireddi)
+PIG-3449: Move JobCreationException to org.apache.pig.backend.hadoop.executionengine (cheolsoo)
 
 PIG-3765: Ability to disable Pig commands and operators (prkommireddi)
 
@@ -99,6 +99,20 @@ OPTIMIZATIONS
  
 BUG FIXES
 
+PIG-3837: ant pigperf target is broken in trunk (cheolsoo)
+
+PIG-3836: Pig signature has has guava version dependency (amatsukawa via cheolsoo)
+
+PIG-3832: Fix piggybank test compilation failure after PIG-3449 (rohini)
+
+PIG-3807: Pig creates wrong schema after dereferencing nested tuple fields with sorts (daijy)
+
+PIG-3802: Fix TestBlackAndWhitelistValidator failures (prkommireddi)
+
+PIG-3815: Hadoop bug causes to pig to fail silently with jar cache (aniket486)
+
+PIG-3816: Incorrect Javadoc for launchPlan() method (kyungho via prkommireddi)
+
 PIG-3673: Divide by zero error in runpigmix.pl script (suhassatish via daijy)
 
 PIG-3805: ToString(datetime [, format string]) doesn't work without the second argument (jennythompson
via daijy)
@@ -212,8 +226,6 @@ PIG-3542: Javadoc of REGEX_EXTRACT_ALL (
 
 PIG-3518: Need to ship jruby.jar in the release (daijy)
 
-PIG-3516: pig does not bring in joda-time as dependency in its pig-template.xml (daijy)
-
 PIG-3524: Clean up Launcher and MapReduceLauncher after PIG-3419 (cheolsoo)
 
 PIG-3515: Shell commands are limited from OS buffer (andronat via cheolsoo)
@@ -251,6 +263,22 @@ PIG-3480: TFile-based tmpfile compressio
 
 BUG FIXES
 
+PIG-3813: Rank column is assigned different uids everytime when schema is reset (cheolsoo)
+
+PIG-3833: Relation loaded by AvroStorage with schema is projected incorrectly in foreach
statement (jeongjinku via cheolsoo)
+
+PIG-3794: pig -useHCatalog fails using pig command line interface on HDInsight (ehans via
daijy)
+
+PIG-3827: Custom partitioner is not picked up with secondary sort optimization (daijy)
+
+PIG-3826: Outer join with PushDownForEachFlatten generates wrong result (daijy)
+
+PIG-3820: TestAvroStorage fail on some OS (daijy)
+
+PIG-3818: PIG-2499 is accidently reverted (daijy)
+
+PIG-3516: pig does not bring in joda-time as dependency in its pig-template.xml (daijy)
+
 PIG-3753: LOGenerate generates null schema (daijy)
 
 PIG-3782: PushDownForEachFlatten + ColumnMapKeyPrune with user defined schema failing due
to incorrect UID assignment (knoguchi via daijy)

Modified: pig/branches/tez/NOTICE.txt
URL: http://svn.apache.org/viewvc/pig/branches/tez/NOTICE.txt?rev=1582885&r1=1582884&r2=1582885&view=diff
==============================================================================
--- pig/branches/tez/NOTICE.txt (original)
+++ pig/branches/tez/NOTICE.txt Fri Mar 28 21:26:44 2014
@@ -1,5 +1,5 @@
 Apache Pig
-Copyright 2008 The Apache Software Foundation
+Copyright 2008-2014 The Apache Software Foundation
 
 This product includes software developed by The Apache Software
 Foundation (http://www.apache.org/).

Modified: pig/branches/tez/bin/pig.cmd
URL: http://svn.apache.org/viewvc/pig/branches/tez/bin/pig.cmd?rev=1582885&r1=1582884&r2=1582885&view=diff
==============================================================================
--- pig/branches/tez/bin/pig.cmd (original)
+++ pig/branches/tez/bin/pig.cmd Fri Mar 28 21:26:44 2014
@@ -58,11 +58,18 @@ set PIGARGS=
     )
 		goto :ProcessCmdLine 
   )
+	REM Account for quotes around %1 if needed when checking for -useHCatalog
+	REM because the string may come in quoted from WebHCat.
 	if %1==-useHCatalog (
         shift
         set HCAT_FLAG="true"
         goto :ProcessCmdLine 
 	)
+	if %1==^"-useHCatalog^" (
+        shift
+        set HCAT_FLAG="true"
+        goto :ProcessCmdLine
+	)
 	set PIGARGS=%PIGARGS% %1
     shift
     goto :ProcessCmdLine
@@ -95,6 +102,17 @@ set PIGARGS=
   if not defined HCAT_FLAG (
     goto HCAT_END
   )
+
+  REM Try to set HCAT_HOME if not set.  Use of HCATALOG_HOME is deprecated.
+  REM Future development should use HCAT_HOME for consistency with non-Windows
+  REM environments.
+  if not defined HCAT_HOME (
+    if defined HCATALOG_HOME (
+       set HCAT_HOME=%HCATALOG_HOME%
+    ) else (
+       echo "Warning: HCAT_HOME not set"
+    )
+  )
   
   if defined HCAT_HOME (
       call :AddJar %HCAT_HOME%\share\hcatalog hcatalog-*.jar
@@ -111,6 +129,16 @@ set PIGARGS=
       call :AddJar %HIVE_HOME%\lib slf4j-api-*.jar
       call :AddJar %HIVE_HOME%\lib hive-hbase-handler-*.jar
       call :AddJar %HIVE_HOME%\lib httpclient-*.jar
+
+      REM Include datanucleus to support embedded metastore use case via setting
+      REM hive.metastore.uris to ''
+      call :AddJar %HIVE_HOME%\lib datanucleus-*.jar
+
+      REM Include sqljdbc4.jar to support SQL server or Windows Azure SQLDB as embedded metastore.
+      call :AddJar %HIVE_HOME%\lib sqljdbc4.jar
+
+      REM Include derby to support local metastore as embedded metastore.
+      call :AddJar %HIVE_HOME%\lib derby*.jar
   ) else (
       echo "HIVE_HOME should be defined"
       exit /b 1

Modified: pig/branches/tez/build.xml
URL: http://svn.apache.org/viewvc/pig/branches/tez/build.xml?rev=1582885&r1=1582884&r2=1582885&view=diff
==============================================================================
--- pig/branches/tez/build.xml (original)
+++ pig/branches/tez/build.xml Fri Mar 28 21:26:44 2014
@@ -812,7 +812,7 @@
                 <include name="org/apache/pig/test/utils/datagen/*"/>
                 <include name="org/apache/pig/test/udf/storefunc/*"/>
             </fileset>
-            <zipfileset src="${lib.dir}/sdsuLibJKD12.jar" />
+            <zipfileset src="test/perf/pigmix/lib/sdsuLibJKD12.jar" />
         </jar>
     </target>
 

Modified: pig/branches/tez/contrib/piggybank/java/src/test/java/org/apache/pig/piggybank/test/storage/TestMultiStorageCompression.java
URL: http://svn.apache.org/viewvc/pig/branches/tez/contrib/piggybank/java/src/test/java/org/apache/pig/piggybank/test/storage/TestMultiStorageCompression.java?rev=1582885&r1=1582884&r2=1582885&view=diff
==============================================================================
--- pig/branches/tez/contrib/piggybank/java/src/test/java/org/apache/pig/piggybank/test/storage/TestMultiStorageCompression.java
(original)
+++ pig/branches/tez/contrib/piggybank/java/src/test/java/org/apache/pig/piggybank/test/storage/TestMultiStorageCompression.java
Fri Mar 28 21:26:44 2014
@@ -3,9 +3,9 @@
  * NOTICE file distributed with this work for additional information regarding copyright
ownership. The ASF
  * licenses this file to you under the Apache License, Version 2.0 (the "License"); you may
not use this file
  * except in compliance with the License. You may obtain a copy of the License at
- * 
+ *
  * http://www.apache.org/licenses/LICENSE-2.0
- * 
+ *
  * Unless required by applicable law or agreed to in writing, software distributed under
the License is
  * distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either
express or implied.
  * See the License for the specific language governing permissions and limitations under
the License.
@@ -26,6 +26,7 @@ import java.util.List;
 
 import junit.framework.TestCase;
 
+import org.apache.hadoop.conf.Configurable;
 import org.apache.hadoop.conf.Configuration;
 import org.apache.hadoop.io.compress.BZip2Codec;
 import org.apache.hadoop.io.compress.CompressionCodec;
@@ -37,7 +38,7 @@ import org.apache.pig.impl.logicalLayer.
 import org.apache.pig.test.Util;
 
 public class TestMultiStorageCompression extends TestCase {
-   
+
    private static String patternString = "(\\d+)!+(\\w+)~+(\\w+)";
    public static ArrayList<String[]> data = new ArrayList<String[]>();
    static {
@@ -127,13 +128,15 @@ public class TestMultiStorageCompression
             // Try to read the records using the codec
             CompressionCodec codec = null;
 
-            
+
             // Use the codec according to the test case
-            if (type.equals("bz2"))
+            if (type.equals("bz2")) {
                codec = new BZip2Codec();
-            else if (type.equals("gz")) {
+            } else if (type.equals("gz")) {
                codec = new GzipCodec();
-               ((GzipCodec)codec).setConf(new Configuration());
+            }
+            if(codec instanceof Configurable) {
+                ((Configurable)codec).setConf(new Configuration());
             }
 
             CompressionInputStream createInputStream = codec
@@ -158,7 +161,7 @@ public class TestMultiStorageCompression
 
    private void runQuery(String outputPath, String compressionType)
          throws Exception, ExecException, IOException, FrontendException {
-      
+
       // create a data file
       String filename = TestHelper.createTempFile(data, "");
       PigServer pig = new PigServer(LOCAL);
@@ -178,6 +181,6 @@ public class TestMultiStorageCompression
 
       pig.executeBatch();
    }
-  
+
 
 }

Modified: pig/branches/tez/contrib/piggybank/java/src/test/java/org/apache/pig/piggybank/test/storage/avro/TestAvroStorage.java
URL: http://svn.apache.org/viewvc/pig/branches/tez/contrib/piggybank/java/src/test/java/org/apache/pig/piggybank/test/storage/avro/TestAvroStorage.java?rev=1582885&r1=1582884&r2=1582885&view=diff
==============================================================================
--- pig/branches/tez/contrib/piggybank/java/src/test/java/org/apache/pig/piggybank/test/storage/avro/TestAvroStorage.java
(original)
+++ pig/branches/tez/contrib/piggybank/java/src/test/java/org/apache/pig/piggybank/test/storage/avro/TestAvroStorage.java
Fri Mar 28 21:26:44 2014
@@ -46,7 +46,7 @@ import org.apache.pig.PigServer;
 import org.apache.pig.backend.executionengine.ExecException;
 import org.apache.pig.backend.executionengine.ExecJob;
 import org.apache.pig.backend.executionengine.ExecJob.JOB_STATUS;
-import org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobCreationException;
+import org.apache.pig.backend.hadoop.executionengine.JobCreationException;
 import org.apache.pig.builtin.mock.Storage.Data;
 import org.apache.pig.data.Tuple;
 import org.apache.pig.impl.io.FileLocalizer;
@@ -69,6 +69,7 @@ public class TestAvroStorage {
     private static String outbasedir;
 
     public static final PathFilter hiddenPathFilter = new PathFilter() {
+        @Override
         public boolean accept(Path p) {
           String name = p.getName();
           return !name.startsWith("_") && !name.startsWith(".");

Modified: pig/branches/tez/ivy/libraries.properties
URL: http://svn.apache.org/viewvc/pig/branches/tez/ivy/libraries.properties?rev=1582885&r1=1582884&r2=1582885&view=diff
==============================================================================
--- pig/branches/tez/ivy/libraries.properties (original)
+++ pig/branches/tez/ivy/libraries.properties Fri Mar 28 21:26:44 2014
@@ -90,6 +90,6 @@ jsr311-api.version=1.1.1
 mockito.version=1.8.4
 jansi.version=1.9
 asm.version=3.3.1
-snappy.version=1.0.5-M3
+snappy.version=1.1.0.1
 tez.version=0.3.0-incubating
 parquet-pig-bundle.version=1.2.3

Modified: pig/branches/tez/src/org/apache/pig/PigServer.java
URL: http://svn.apache.org/viewvc/pig/branches/tez/src/org/apache/pig/PigServer.java?rev=1582885&r1=1582884&r2=1582885&view=diff
==============================================================================
--- pig/branches/tez/src/org/apache/pig/PigServer.java (original)
+++ pig/branches/tez/src/org/apache/pig/PigServer.java Fri Mar 28 21:26:44 2014
@@ -1381,8 +1381,8 @@ public class PigServer {
     }
 
     /**
-     * A common method for launching the jobs according to the physical plan
-     * @param pp The physical plan
+     * A common method for launching the jobs according to the logical plan
+     * @param lp The logical plan
      * @param jobName A String containing the job name to be used
      * @return The PigStats object
      * @throws ExecException

Modified: pig/branches/tez/src/org/apache/pig/backend/hadoop/executionengine/mapReduceLayer/JobControlCompiler.java
URL: http://svn.apache.org/viewvc/pig/branches/tez/src/org/apache/pig/backend/hadoop/executionengine/mapReduceLayer/JobControlCompiler.java?rev=1582885&r1=1582884&r2=1582885&view=diff
==============================================================================
--- pig/branches/tez/src/org/apache/pig/backend/hadoop/executionengine/mapReduceLayer/JobControlCompiler.java
(original)
+++ pig/branches/tez/src/org/apache/pig/backend/hadoop/executionengine/mapReduceLayer/JobControlCompiler.java
Fri Mar 28 21:26:44 2014
@@ -20,6 +20,7 @@ package org.apache.pig.backend.hadoop.ex
 import java.io.File;
 import java.io.FileOutputStream;
 import java.io.IOException;
+import java.io.InputStream;
 import java.io.OutputStream;
 import java.lang.reflect.Method;
 import java.net.URI;
@@ -1633,26 +1634,25 @@ public class JobControlCompiler{
             String checksum = DigestUtils.shaHex(url.openStream());
             FileSystem fs = FileSystem.get(conf);
             Path cacheDir = new Path(stagingDir, checksum);
-            FileStatus [] statuses = fs.listStatus(cacheDir);
-            if (statuses != null) {
-                for (FileStatus stat : statuses) {
-                    Path jarPath = stat.getPath();
-                    if(jarPath.getName().equals(filename)) {
-                        log.info("Found " + url + " in jar cache at "+ stagingDir);
-                        long curTime = System.currentTimeMillis();
-                        fs.setTimes(jarPath, -1, curTime);
-                        return jarPath;
-                    }
-                }
+            Path cacheFile = new Path(cacheDir, filename);
+            if (fs.exists(cacheFile)) {
+               log.info("Found " + url + " in jar cache at "+ stagingDir);
+               long curTime = System.currentTimeMillis();
+               fs.setTimes(cacheFile, -1, curTime);
+               return cacheFile;
             }
             log.info("Url "+ url + " was not found in jarcache at "+ stagingDir);
             // attempt to copy to cache else return null
             fs.mkdirs(cacheDir, FileLocalizer.OWNER_ONLY_PERMS);
-            Path cacheFile = new Path(cacheDir, filename);
-            OutputStream os = FileSystem.create(fs, cacheFile, FileLocalizer.OWNER_ONLY_PERMS);
+            OutputStream os = null;
+            InputStream is = null;
             try {
-                IOUtils.copyBytes(url.openStream(), os, 4096, true);
+                os = FileSystem.create(fs, cacheFile, FileLocalizer.OWNER_ONLY_PERMS);
+                is = url.openStream();
+                IOUtils.copyBytes(is, os, 4096, true);
             } finally {
+                org.apache.commons.io.IOUtils.closeQuietly(is);
+                // IOUtils should not close stream to HDFS quietly
                 os.close();
             }
             return cacheFile;
@@ -1688,12 +1688,15 @@ public class JobControlCompiler{
 
         Path dst = new Path(FileLocalizer.getTemporaryPath(pigContext).toUri().getPath(),
suffix);
         FileSystem fs = dst.getFileSystem(conf);
-        OutputStream os = fs.create(dst);
+        OutputStream os = null;
+        InputStream is = null;
         try {
-            IOUtils.copyBytes(url.openStream(), os, 4096, true);
+            is = url.openStream();
+            os = fs.create(dst);
+            IOUtils.copyBytes(is, os, 4096, true);
         } finally {
-            // IOUtils can not close both the input and the output properly in a finally
-            // as we can get an exception in between opening the stream and calling the method
+            org.apache.commons.io.IOUtils.closeQuietly(is);
+            // IOUtils should not close stream to HDFS quietly
             os.close();
         }
         return dst;

Modified: pig/branches/tez/src/org/apache/pig/backend/hadoop/executionengine/mapReduceLayer/MapReduceLauncher.java
URL: http://svn.apache.org/viewvc/pig/branches/tez/src/org/apache/pig/backend/hadoop/executionengine/mapReduceLayer/MapReduceLauncher.java?rev=1582885&r1=1582884&r2=1582885&view=diff
==============================================================================
--- pig/branches/tez/src/org/apache/pig/backend/hadoop/executionengine/mapReduceLayer/MapReduceLauncher.java
(original)
+++ pig/branches/tez/src/org/apache/pig/backend/hadoop/executionengine/mapReduceLayer/MapReduceLauncher.java
Fri Mar 28 21:26:44 2014
@@ -50,6 +50,7 @@ import org.apache.pig.backend.executione
 import org.apache.pig.backend.hadoop.datastorage.ConfigurationUtil;
 import org.apache.pig.backend.hadoop.executionengine.JobCreationException;
 import org.apache.pig.backend.hadoop.executionengine.HExecutionEngine;
+import org.apache.pig.backend.hadoop.executionengine.JobCreationException;
 import org.apache.pig.backend.hadoop.executionengine.Launcher;
 import org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MRCompiler.LastInputStreamingOptimizer;
 import org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.plans.DotMRPrinter;

Modified: pig/branches/tez/src/org/apache/pig/backend/hadoop/executionengine/mapReduceLayer/SecondaryKeyOptimizer.java
URL: http://svn.apache.org/viewvc/pig/branches/tez/src/org/apache/pig/backend/hadoop/executionengine/mapReduceLayer/SecondaryKeyOptimizer.java?rev=1582885&r1=1582884&r2=1582885&view=diff
==============================================================================
--- pig/branches/tez/src/org/apache/pig/backend/hadoop/executionengine/mapReduceLayer/SecondaryKeyOptimizer.java
(original)
+++ pig/branches/tez/src/org/apache/pig/backend/hadoop/executionengine/mapReduceLayer/SecondaryKeyOptimizer.java
Fri Mar 28 21:26:44 2014
@@ -152,6 +152,10 @@ public class SecondaryKeyOptimizer exten
         if (mr.isGlobalSort())
             return;
 
+        // Don't optimize when we already have a custom partitioner
+        if (mr.getCustomPartitioner()!=null)
+            return;
+
         List<PhysicalOperator> mapLeaves = mr.mapPlan.getLeaves();
         if (mapLeaves == null || mapLeaves.size() != 1) {
             log.debug("Expected map to have single leaf! Skip secondary key optimizing");

Modified: pig/branches/tez/src/org/apache/pig/builtin/AvroStorage.java
URL: http://svn.apache.org/viewvc/pig/branches/tez/src/org/apache/pig/builtin/AvroStorage.java?rev=1582885&r1=1582884&r2=1582885&view=diff
==============================================================================
--- pig/branches/tez/src/org/apache/pig/builtin/AvroStorage.java (original)
+++ pig/branches/tez/src/org/apache/pig/builtin/AvroStorage.java Fri Mar 28 21:26:44 2014
@@ -210,6 +210,7 @@ public class AvroStorage extends LoadFun
   public final void setUDFContextSignature(final String signature) {
     udfContextSignature = signature;
     super.setUDFContextSignature(signature);
+    updateSchemaFromInputAvroSchema();
   }
 
   /**
@@ -620,16 +621,24 @@ public class AvroStorage extends LoadFun
    */
   public final Schema getInputAvroSchema() {
     if (schema == null) {
-      String schemaString = getProperties().getProperty(INPUT_AVRO_SCHEMA);
-      if (schemaString != null) {
-        Schema s = new Schema.Parser().parse(schemaString);
-        schema = s;
-      }
+      updateSchemaFromInputAvroSchema();
     }
     return schema;
   }
 
-  /*
+  /**
+   * Utility function that gets the input avro schema from the udf
+   * properties and updates schema for this instance.
+   */
+  private final void updateSchemaFromInputAvroSchema() {
+    String schemaString = getProperties().getProperty(INPUT_AVRO_SCHEMA);
+    if (schemaString != null) {
+      Schema s = new Schema.Parser().parse(schemaString);
+      schema = s;
+    }
+  }
+
+  /**
    * @see org.apache.pig.LoadFunc#getInputFormat()
    */
   @Override

Modified: pig/branches/tez/src/org/apache/pig/newplan/logical/relational/LOForEach.java
URL: http://svn.apache.org/viewvc/pig/branches/tez/src/org/apache/pig/newplan/logical/relational/LOForEach.java?rev=1582885&r1=1582884&r2=1582885&view=diff
==============================================================================
--- pig/branches/tez/src/org/apache/pig/newplan/logical/relational/LOForEach.java (original)
+++ pig/branches/tez/src/org/apache/pig/newplan/logical/relational/LOForEach.java Fri Mar
28 21:26:44 2014
@@ -61,8 +61,19 @@ public class LOForEach extends LogicalRe
     @Override
     public LogicalSchema getSchema() throws FrontendException {
         List<Operator> ll = innerPlan.getSinks();
-        if (ll != null) {
-            schema = ((LogicalRelationalOperator)ll.get(0)).getSchema();
+        LogicalRelationalOperator generate = null;
+        // We can assume LOGenerate is the only sink of the inner plan, but
+        // only after DanglingNestedNodeRemover. LOForEach.getSchema will be
+        // run before DanglingNestedNodeRemover, so need to make sure we do
+        // get LOGenerate
+        for (Operator op : ll) {
+            if (op instanceof LOGenerate) {
+                generate = (LogicalRelationalOperator)op;
+                break;
+            }
+        }
+        if (generate != null) {
+            schema = generate.getSchema();
         }
         
         return schema;

Modified: pig/branches/tez/src/org/apache/pig/newplan/logical/relational/LORank.java
URL: http://svn.apache.org/viewvc/pig/branches/tez/src/org/apache/pig/newplan/logical/relational/LORank.java?rev=1582885&r1=1582884&r2=1582885&view=diff
==============================================================================
--- pig/branches/tez/src/org/apache/pig/newplan/logical/relational/LORank.java (original)
+++ pig/branches/tez/src/org/apache/pig/newplan/logical/relational/LORank.java Fri Mar 28
21:26:44 2014
@@ -70,14 +70,23 @@ public class LORank extends LogicalRelat
      */
     private boolean isRowNumber = false;
 
+    /**
+     * This is a uid which has been generated for the rank column. It is
+     * important to keep this so that the uid will be persistent between calls
+     * of resetSchema and getSchema.
+     */
+    private long rankColumnUid;
+
     public LORank( OperatorPlan plan) {
         super("LORank", plan);
+        this.rankColumnUid = -1;
     }
 
     public LORank( OperatorPlan plan, List<LogicalExpressionPlan> rankColPlans, List<Boolean>
ascCols) {
         this( plan );
         this.rankColPlans = rankColPlans;
         this.ascCols = ascCols;
+        this.rankColumnUid = -1;
     }
 
     public List<LogicalExpressionPlan> getRankColPlans() {
@@ -139,8 +148,9 @@ public class LORank extends LogicalRelat
 
         schema = new LogicalSchema();
 
-        schema.addField(new LogicalSchema.LogicalFieldSchema(RANK_COL_NAME+SEPARATOR+input.getAlias(),
null, DataType.LONG));
-        schema.getField(0).uid = LogicalExpression.getNextUid();
+        rankColumnUid = rankColumnUid == -1 ? LogicalExpression.getNextUid() : rankColumnUid;
+        schema.addField(new LogicalSchema.LogicalFieldSchema(RANK_COL_NAME + SEPARATOR +
input.getAlias(),
+                null, DataType.LONG, rankColumnUid));
 
         for(LogicalSchema.LogicalFieldSchema fieldSchema: fss) {
             schema.addField(fieldSchema);

Modified: pig/branches/tez/src/org/apache/pig/newplan/logical/relational/LogicalPlan.java
URL: http://svn.apache.org/viewvc/pig/branches/tez/src/org/apache/pig/newplan/logical/relational/LogicalPlan.java?rev=1582885&r1=1582884&r2=1582885&view=diff
==============================================================================
--- pig/branches/tez/src/org/apache/pig/newplan/logical/relational/LogicalPlan.java (original)
+++ pig/branches/tez/src/org/apache/pig/newplan/logical/relational/LogicalPlan.java Fri Mar
28 21:26:44 2014
@@ -81,12 +81,12 @@ public class LogicalPlan extends BaseOpe
             ps.println("<logicalPlan>XML Not Supported</logicalPlan>");
             return;
         }
-        
+
         ps.println("#-----------------------------------------------");
         ps.println("# New Logical Plan:");
         ps.println("#-----------------------------------------------");
 
-        if (this.size() == 0) { 
+        if (this.size() == 0) {
             ps.println("Logical plan is empty.");
         } else if (format.equals("dot")) {
             DotLOPrinter lpp = new DotLOPrinter(this, ps);
@@ -126,8 +126,8 @@ public class LogicalPlan extends BaseOpe
      */
     public String getSignature() throws FrontendException {
 
-        // Use a streaming hash function. goodFastHash(32) is murmur3 32 bits
-        HashFunction hf = Hashing.goodFastHash(32);
+        // Use a streaming hash function. We use a murmur_32 function with a constant seed,
0.
+        HashFunction hf = Hashing.murmur3_32(0);
         HashOutputStream hos = new HashOutputStream(hf);
         PrintStream ps = new PrintStream(hos);
 

Modified: pig/branches/tez/src/org/apache/pig/newplan/logical/rules/PushDownForEachFlatten.java
URL: http://svn.apache.org/viewvc/pig/branches/tez/src/org/apache/pig/newplan/logical/rules/PushDownForEachFlatten.java?rev=1582885&r1=1582884&r2=1582885&view=diff
==============================================================================
--- pig/branches/tez/src/org/apache/pig/newplan/logical/rules/PushDownForEachFlatten.java
(original)
+++ pig/branches/tez/src/org/apache/pig/newplan/logical/rules/PushDownForEachFlatten.java
Fri Mar 28 21:26:44 2014
@@ -159,6 +159,11 @@ public class PushDownForEachFlatten exte
                     for( int i = 0; i < preds.size(); i++ ) {
                         Operator op = preds.get( i );
                         if( op == foreach ) {
+                            // Don't optimize if the flattened side is outer side of an outer
join
+                            // See PIG-3826
+                            if (join.getInnerFlags()[i]==false) {
+                                return false;
+                            }
                             Collection<LogicalExpressionPlan> exprs = join.getJoinPlan(
i );
                             for( LogicalExpressionPlan expr : exprs ) {
                                 List<ProjectExpression> projs = getProjectExpressions(
expr );

Modified: pig/branches/tez/src/org/apache/pig/validator/BlackAndWhitelistFilter.java
URL: http://svn.apache.org/viewvc/pig/branches/tez/src/org/apache/pig/validator/BlackAndWhitelistFilter.java?rev=1582885&r1=1582884&r2=1582885&view=diff
==============================================================================
--- pig/branches/tez/src/org/apache/pig/validator/BlackAndWhitelistFilter.java (original)
+++ pig/branches/tez/src/org/apache/pig/validator/BlackAndWhitelistFilter.java Fri Mar 28
21:26:44 2014
@@ -23,8 +23,8 @@ import org.apache.pig.PigConfiguration;
 import org.apache.pig.PigServer;
 import org.apache.pig.impl.PigContext;
 import org.apache.pig.impl.logicalLayer.FrontendException;
-import org.python.google.common.base.Splitter;
-import org.python.google.common.collect.Sets;
+import com.google.common.base.Splitter;
+import com.google.common.collect.Sets;
 
 /**
  * 

Modified: pig/branches/tez/src/org/apache/pig/validator/BlackAndWhitelistValidator.java
URL: http://svn.apache.org/viewvc/pig/branches/tez/src/org/apache/pig/validator/BlackAndWhitelistValidator.java?rev=1582885&r1=1582884&r2=1582885&view=diff
==============================================================================
--- pig/branches/tez/src/org/apache/pig/validator/BlackAndWhitelistValidator.java (original)
+++ pig/branches/tez/src/org/apache/pig/validator/BlackAndWhitelistValidator.java Fri Mar
28 21:26:44 2014
@@ -45,8 +45,8 @@ import org.apache.pig.newplan.logical.re
 import org.apache.pig.newplan.logical.relational.LOUnion;
 import org.apache.pig.newplan.logical.relational.LogicalRelationalNodesVisitor;
 import org.apache.pig.newplan.logical.rules.LogicalRelationalNodeValidator;
-import org.python.google.common.base.Splitter;
-import org.python.google.common.collect.Sets;
+import com.google.common.base.Splitter;
+import com.google.common.collect.Sets;
 
 /**
  * This validator walks through the list of operators defined in {@link PigConfiguration#PIG_BLACKLIST}
and

Propchange: pig/branches/tez/src/pig-default.properties
------------------------------------------------------------------------------
  Merged /pig/trunk/src/pig-default.properties:r1578680-1582881

Modified: pig/branches/tez/test/org/apache/pig/builtin/TestAvroStorage.java
URL: http://svn.apache.org/viewvc/pig/branches/tez/test/org/apache/pig/builtin/TestAvroStorage.java?rev=1582885&r1=1582884&r2=1582885&view=diff
==============================================================================
--- pig/branches/tez/test/org/apache/pig/builtin/TestAvroStorage.java (original)
+++ pig/branches/tez/test/org/apache/pig/builtin/TestAvroStorage.java Fri Mar 28 21:26:44
2014
@@ -105,6 +105,7 @@ public class TestAvroStorage {
         "recordWithRepeatedSubRecords",
         "recursiveRecord",
         "projectionTest",
+        "projectionTestWithSchema",
         "recordsWithSimpleUnion",
         "recordsWithSimpleUnionOutput",
     };
@@ -335,10 +336,9 @@ public class TestAvroStorage {
                  "AVROSTORAGE_OUT_1", "records",
                  "AVROSTORAGE_OUT_2", "-n org.apache.pig.test.builtin -f " + basedir + "schema/recordsSubSchema.avsc",
                  "OUTFILE",           createOutputName())
-          );
-        verifyResults(createOutputName(),check);
-      }
-
+        );
+      verifyResults(createOutputName(),check);
+    }
 
     @Test public void testProjection() throws Exception {
         final String input = basedir + "data/avro/uncompressed/records.avro";
@@ -349,10 +349,23 @@ public class TestAvroStorage {
                 "AVROSTORAGE_OUT_1", "projectionTest",
                 "AVROSTORAGE_OUT_2", "-n org.apache.pig.test.builtin",
                 "OUTFILE",          createOutputName())
-          );
-        verifyResults(createOutputName(),check);
-      }
+        );
+      verifyResults(createOutputName(),check);
+    }
 
+    @Test public void testProjectionWithSchema() throws Exception {
+        final String input = basedir + "data/avro/uncompressed/records.avro";
+        final String check = basedir + "data/avro/uncompressed/projectionTestWithSchema.avro";
+        testAvroStorage(true, basedir + "code/pig/projection_test_with_schema.pig",
+                ImmutableMap.of(
+                        "INFILE",           input,
+                        "AVROSTORAGE_IN_2",  "-f " + basedir + "schema/records.avsc",
+                        "AVROSTORAGE_OUT_1", "projectionTest",
+                        "AVROSTORAGE_OUT_2", "-n org.apache.pig.test.builtin",
+                        "OUTFILE",          createOutputName())
+        );
+      verifyResults(createOutputName(),check);
+    }
 
     @Test public void testDates() throws Exception {
       final String input = basedir + "data/avro/uncompressed/records.avro";
@@ -768,7 +781,6 @@ public class TestAvroStorage {
         pigServerLocal.registerQuery("C = FOREACH B generate maps#'key1';");
         pigServerLocal.registerQuery("STORE C INTO 'out' USING mock.Storage();");
 
-
         List<Tuple> out = data.get("out");
         assertEquals(tuple("v11"), out.get(0));
         assertEquals(tuple("v21"), out.get(1));
@@ -891,7 +903,7 @@ public class TestAvroStorage {
             assertEquals(expected.size(), count);
           }
         }
-      }
+    }
 
     private Set<GenericData.Record> getExpected (String pathstr ) throws IOException
{
 
@@ -929,7 +941,7 @@ public class TestAvroStorage {
             }
         }
         return ret;
-  }
+    }
 
 }
 

Modified: pig/branches/tez/test/org/apache/pig/test/TestEvalPipelineLocal.java
URL: http://svn.apache.org/viewvc/pig/branches/tez/test/org/apache/pig/test/TestEvalPipelineLocal.java?rev=1582885&r1=1582884&r2=1582885&view=diff
==============================================================================
--- pig/branches/tez/test/org/apache/pig/test/TestEvalPipelineLocal.java (original)
+++ pig/branches/tez/test/org/apache/pig/test/TestEvalPipelineLocal.java Fri Mar 28 21:26:44
2014
@@ -1201,4 +1201,19 @@ public class TestEvalPipelineLocal {
         
         Assert.assertFalse(iter.hasNext());
     }
+    
+    // see PIG-3807
+    @Test
+    public void testDanglingNodeWrongSchema() throws Exception{
+        
+        pigServer.registerQuery("d1 = load 'test_data.txt' USING PigStorage() AS (f1: int,
f2: int, f3: int, f4: int);");
+        pigServer.registerQuery("d2 = load 'test_data.txt' USING PigStorage() AS (f1: int,
f2: int, f3: int, f4: int);");
+        pigServer.registerQuery("n1 = foreach (group d1 by f1) {sorted = ORDER d1 by f2;
generate group, flatten(d1.f3) as x3; };");
+        pigServer.registerQuery("n2 = foreach (group d2 by f1) {sorted = ORDER d2 by f2;
generate group, flatten(d2.f3) as q3; };");
+        pigServer.registerQuery("joined = join n1 by x3, n2 by q3;");
+        pigServer.registerQuery("final = foreach joined generate n1::x3;");
+        
+        Schema s = pigServer.dumpSchema("final");
+        Assert.assertEquals(s.toString(), "{n1::x3: int}");
+    }
 }

Modified: pig/branches/tez/test/org/apache/pig/test/TestGrunt.java
URL: http://svn.apache.org/viewvc/pig/branches/tez/test/org/apache/pig/test/TestGrunt.java?rev=1582885&r1=1582884&r2=1582885&view=diff
==============================================================================
--- pig/branches/tez/test/org/apache/pig/test/TestGrunt.java (original)
+++ pig/branches/tez/test/org/apache/pig/test/TestGrunt.java Fri Mar 28 21:26:44 2014
@@ -1070,14 +1070,10 @@ public class TestGrunt {
             assertFalse(new File("tempShFileToTestShCommand").exists());
 
             if (Util.WINDOWS) {
-               //FIXME
-               // We need to fix this because there is a race condition with pipes.
-               // dir command can potentially run before the TouchedFileInsideGrunt_61 is
written
-               // Solved for linux/unix below using xargs
-               strCmd = "sh echo foo > TouchedFileInsideGrunt_61 | dir /B | findstr TouchedFileInsideGrunt_61
> fileContainingTouchedFileInsideGruntShell_71";
+               strCmd = "sh echo foo > TouchedFileInsideGrunt_61 && dir /B | findstr
TouchedFileInsideGrunt_61 > fileContainingTouchedFileInsideGruntShell_71";
             }
             else {
-               strCmd = "sh touch TouchedFileInsideGrunt_61 | ls | grep TouchedFileInsideGrunt_61
> fileContainingTouchedFileInsideGruntShell_71";
+               strCmd = "sh touch TouchedFileInsideGrunt_61 && ls | grep TouchedFileInsideGrunt_61
> fileContainingTouchedFileInsideGruntShell_71";
             }
 
             cmd = new ByteArrayInputStream(strCmd.getBytes());

Modified: pig/branches/tez/test/org/apache/pig/test/TestNewPlanPushDownForeachFlatten.java
URL: http://svn.apache.org/viewvc/pig/branches/tez/test/org/apache/pig/test/TestNewPlanPushDownForeachFlatten.java?rev=1582885&r1=1582884&r2=1582885&view=diff
==============================================================================
--- pig/branches/tez/test/org/apache/pig/test/TestNewPlanPushDownForeachFlatten.java (original)
+++ pig/branches/tez/test/org/apache/pig/test/TestNewPlanPushDownForeachFlatten.java Fri Mar
28 21:26:44 2014
@@ -1215,5 +1215,21 @@ public class TestNewPlanPushDownForeachF
         Assert.assertTrue( sort instanceof LOSort );
         
     }
+    
+    @Test
+    // See PIG-3826
+    public void testOuterJoin() throws Exception {
+        String query = "A = load 'A.txt' as (id:chararray, value:double);" +
+        "B = load 'B.txt' as (id:chararray, name:chararray);" +
+        "t1 = group A by id;" +
+        "t2 = foreach t1 { r1 = filter $1 by (value>1); r2 = limit r1 1; generate group
as id, FLATTEN(r2.value) as value; }" +
+        "t3 = join B by id LEFT OUTER, t2 by id;" +
+        "store t3 into 'output';";
+        LogicalPlan newLogicalPlan = migrateAndOptimizePlan( query );
+        
+        Operator store = newLogicalPlan.getSinks().get( 0 );
+        Operator join = newLogicalPlan.getPredecessors(store).get(0);
+        Assert.assertTrue( join instanceof LOJoin );
+    }
 }
 

Modified: pig/branches/tez/test/org/apache/pig/test/TestSecondarySort.java
URL: http://svn.apache.org/viewvc/pig/branches/tez/test/org/apache/pig/test/TestSecondarySort.java?rev=1582885&r1=1582884&r2=1582885&view=diff
==============================================================================
--- pig/branches/tez/test/org/apache/pig/test/TestSecondarySort.java (original)
+++ pig/branches/tez/test/org/apache/pig/test/TestSecondarySort.java Fri Mar 28 21:26:44 2014
@@ -476,5 +476,36 @@ public abstract class TestSecondarySort 
 
         Util.deleteFile(cluster, clusterFilePath);
     }
+    
+    @Test
+    // Once custom partitioner is used, we cannot use secondary key optimizer, see PIG-3827
+    public void testCustomPartitionerWithSort() throws Exception {
+        File tmpFile1 = Util.createTempFileDelOnExit("test", "txt");
+        PrintStream ps1 = new PrintStream(new FileOutputStream(tmpFile1));
+        ps1.println("1\t2\t3");
+        ps1.println("1\t3\t4");
+        ps1.println("1\t4\t4");
+        ps1.println("1\t2\t4");
+        ps1.println("1\t8\t4");
+        ps1.println("2\t3\t4");
+        ps1.close();
+
+        String clusterPath = Util.removeColon(tmpFile1.getCanonicalPath());
+
+        Util.copyFromLocalToCluster(cluster, tmpFile1.getCanonicalPath(), clusterPath);
+        pigServer.registerQuery("A = LOAD '" + Util.encodeEscape(clusterPath) + "' AS (a0,
a1, a2);");
+        pigServer.registerQuery("B = group A by $0 PARTITION BY org.apache.pig.test.utils.WrongCustomPartitioner
parallel 2;");
+        pigServer.registerQuery("C = foreach B { D = order A by a1 desc; generate group,
D;};");
+        boolean captureException = false;
+        try {
+            pigServer.openIterator("C");
+        } catch (Exception e) {
+            captureException = true;
+        }
+        
+        assertTrue(captureException);
+        
+        Util.deleteFile(cluster, clusterPath);
+    }
 }
 

Propchange: pig/branches/tez/test/org/apache/pig/test/data/bzipdir1.bz2/bzipdir2.bz2/recordLossblockHeaderEndsAt136500.txt.bz2
------------------------------------------------------------------------------
  Merged /pig/trunk/test/org/apache/pig/test/data/bzipdir1.bz2/bzipdir2.bz2/recordLossblockHeaderEndsAt136500.txt.bz2:r1578680-1582881



Mime
View raw message