pig-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Rohini Palaniswamy (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (PIG-2729) Macro expansion does not use pig.import.search.path - UnitTest borked
Date Fri, 20 Jul 2012 16:57:34 GMT

    [ https://issues.apache.org/jira/browse/PIG-2729?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13419343#comment-13419343
] 

Rohini Palaniswamy commented on PIG-2729:
-----------------------------------------

Few comments:
1) Exception should not be thrown here as it will break Amazon s3 filesystem support.
{code}
File macroFile = QueryParserUtils.getFileFromSearchImportPath(fname);
+                if (macroFile == null) {
+                    throw new FileNotFoundException("Could not find the specified file '"
+                                                    + fname + "' using import search path");
+                }
+                localFileRet = FileLocalizer.fetchFile(pigContext.getProperties(),
+                                                       macroFile.getAbsolutePath());
{code}
  It should be
{code}
File localFile = QueryParserUtils.getFileFromSearchImportPath(fname);
localFileRet = localFile == null
 ? FileLocalizer.fetchFile(pigContext.getProperties(), fname)
   : new FetchFileRet(localFile.getCanonicalFile(), false);
{code}
   The reason is the macro path could be fully qualified s3 or some other supported file system
path. So if we could not find it in the local filesystem with getFileFromSearchImportPath,
then FileLocalizer.fetchFile will take care of looking at other filesystems and downloading
it locally and returning the local file path. Also it will throw the FileNotFoundException
if the file is missing.

2.  Again for the same reason of s3 support, it is incorrect to use getFileFromSearchImportPath
in this code. And getMacroFile already fetches the file.

{code}
FetchFileRet localFileRet = getMacroFile(fname);
File macroFile = QueryParserUtils.getFileFromSearchImportPath(
+                    localFileRet.file.getAbsolutePath());
         try {
-            in = QueryParserUtils.getImportScriptAsReader(localFileRet.file.getAbsolutePath());
+            in = new BufferedReader(new FileReader(macroFile));
{code}

should be

{code}
in = new BufferedReader(new FileReader(localFileRet.file));
{code}

3. For the tests, can you extract out the common code to a method to cut down on the repetition
of code. Something like

{code}
importUsingSearchPathTest() {
   verifyImportUsingSearchPath("/tmp/mytest2.pig", "mytest2.pig", "/tmp");
}

importUsingSearchPathTest2() {
   verifyImportUsingSearchPath("/tmp/mytest2.pig", "./mytest2.pig", "/tmp");
}

importUsingSearchPathTest3() {
   verifyImportUsingSearchPath("/tmp/mytest2.pig", "../mytest2.pig", "/tmp");
}

importUsingSearchPathTest4() {
   verifyImportUsingSearchPath("/tmp/mytest2.pig", "/tmp/mytest2.pig", "/foo/bar");
}

verifyImportUsingSearchPath(String macroFilePath, String importFilePath, String importSearchPath)
{
.....
}

{code}

4) negtiveUsingSearchPathTest2 and 3 are not very useful, unless some file with same name
and garbage text are created in the search path location. That way we can ensure that the
right file is being picked up and not the other file.
                
> Macro expansion does not use pig.import.search.path - UnitTest borked
> ---------------------------------------------------------------------
>
>                 Key: PIG-2729
>                 URL: https://issues.apache.org/jira/browse/PIG-2729
>             Project: Pig
>          Issue Type: Bug
>    Affects Versions: 0.9.2, 0.10.0
>         Environment: pig-0.9.2 and pig-0.10.0, hadoop-0.20.2 from Clouderas distribution
cdh3u3 on Kubuntu 12.04 64Bit.
>            Reporter: Johannes Schwenk
>             Fix For: 0.10.0
>
>         Attachments: PIG-2729.patch, PIG-2729.patch, test-macros.tar.gz, use-search-path-for-imports.patch
>
>
> org.apache.pig.test.TestMacroExpansion, in function importUsingSearchPathTest the import
statement is provided with the full path to /tmp/mytest2.pig so the pig.import.search.path
is never used. I changed the import to 
> import 'mytest2.pig';
> and ran the UnitTest again. This time the test failed as expected from my experience
from earlier this day trying in vain to get pig eat my pig.import.search.path property! Other
properties in the same custom properties file (provided via -propertyFile command line option)
like udf.import.list get read without any problem.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

Mime
View raw message