hadoop-pig-dev mailing list archives

From "Daniel Dai (JIRA)" <j...@apache.org>
Subject [jira] Resolved: (PIG-569) Inconsistency with Hadoop in Pig load statements involving globs with subdirectories
Date Sun, 01 Feb 2009 20:43:59 GMT

     [ https://issues.apache.org/jira/browse/PIG-569?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Daniel Dai resolved PIG-569.
----------------------------

    Resolution: Duplicate

It is a duplicate of [PIG-252|https://issues.apache.org/jira/browse/PIG-252]

> Inconsistency with Hadoop in Pig load statements involving globs with subdirectories
> ------------------------------------------------------------------------------------
>
>                 Key: PIG-569
>                 URL: https://issues.apache.org/jira/browse/PIG-569
>             Project: Pig
>          Issue Type: Bug
>          Components: impl
>    Affects Versions: types_branch
>         Environment: FC Linux x86/64, Pig revision 724576
>            Reporter: Kevin Weil
>             Fix For: types_branch
>
>
> Pig cannot handle LOAD statements with Hadoop globs where the globs have subdirectories. For example,
> A = LOAD 'dir/{dir1/subdir1,dir2/subdir2,dir3/subdir3}' USING ...
> A similar statement in Hadoop, hadoop dfs -ls dir/{dir1/subdir1,dir2/subdir2,dir3/subdir3}, does work correctly.
> The output of running the above load statement in pig, built from svn revision 724576, is:
> 2008-12-17 12:02:28,480 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - 0% complete
> 2008-12-17 12:02:28,480 [main] ERROR org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - Map reduce job failed
> 2008-12-17 12:02:28,480 [main] ERROR org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - java.io.IOException: Unable to get collect for pattern dir/{dir1/subdir1,dir2/subdir2,dir3/subdir3}} [Failed to obtain glob for dir/{dir1/subdir1,dir2/subdir2,dir3/subdir3}]
> 	at org.apache.pig.backend.hadoop.datastorage.HDataStorage.asCollection(HDataStorage.java:231)
> 	at org.apache.pig.backend.hadoop.datastorage.HDataStorage.asCollection(HDataStorage.java:40)
> 	at org.apache.pig.impl.io.FileLocalizer.globMatchesFiles(FileLocalizer.java:486)
> 	at org.apache.pig.impl.io.FileLocalizer.fileExists(FileLocalizer.java:455)
> 	at org.apache.pig.backend.executionengine.PigSlicer.validate(PigSlicer.java:108)
> 	at org.apache.pig.impl.io.ValidatingInputFileSpec.validate(ValidatingInputFileSpec.java:59)
> 	at org.apache.pig.impl.io.ValidatingInputFileSpec.<init>(ValidatingInputFileSpec.java:44)
> 	at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigInputFormat.getSplits(PigInputFormat.java:200)
> 	at org.apache.hadoop.mapred.JobClient.submitJob(JobClient.java:742)
> 	at org.apache.hadoop.mapred.jobcontrol.Job.submit(Job.java:370)
> 	at org.apache.hadoop.mapred.jobcontrol.JobControl.startReadyJobs(JobControl.java:247)
> 	at org.apache.hadoop.mapred.jobcontrol.JobControl.run(JobControl.java:279)
> 	at java.lang.Thread.run(Thread.java:619)
> Caused by: org.apache.pig.backend.datastorage.DataStorageException: Failed to obtain glob for dir/{dir1/subdir1,dir2/subdir2,dir3/subdir3}
> 	... 13 more
> Caused by: java.io.IOException: Illegal file pattern: Expecting set closure character or end of range, or } for glob {dir1 at 5
> 	at org.apache.hadoop.fs.FileSystem$GlobFilter.error(FileSystem.java:1084)
> 	at org.apache.hadoop.fs.FileSystem$GlobFilter.setRegex(FileSystem.java:1069)
> 	at org.apache.hadoop.fs.FileSystem$GlobFilter.<init>(FileSystem.java:987)
> 	at org.apache.hadoop.fs.FileSystem.globPathsLevel(FileSystem.java:953)
> 	at org.apache.hadoop.fs.FileSystem.globPathsLevel(FileSystem.java:962)
> 	at org.apache.hadoop.fs.FileSystem.globPathsLevel(FileSystem.java:962)
> 	at org.apache.hadoop.fs.FileSystem.globPathsLevel(FileSystem.java:962)
> 	at org.apache.hadoop.fs.FileSystem.globStatus(FileSystem.java:902)
> 	at org.apache.hadoop.fs.FileSystem.globStatus(FileSystem.java:862)
> 	at org.apache.pig.backend.hadoop.datastorage.HDataStorage.asCollection(HDataStorage.java:215)
> 	... 12 more
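
A note on the trace above: the innermost exception names the glob as `{dir1`, not the full pattern, which suggests the pattern was split on `/` before the braces were interpreted, so the brace group was never seen whole. The sketch below is illustrative Python only, not Pig or Hadoop code. The helper names `split_components` and `expand_braces` are hypothetical. It shows the per-component view that would produce an unclosed `{dir1`, and the three paths the reporter presumably expected the glob to match.

```python
from typing import List

# Pattern taken verbatim from the report.
PATTERN = "dir/{dir1/subdir1,dir2/subdir2,dir3/subdir3}"


def split_components(pattern: str) -> List[str]:
    """Split a path pattern on '/' without looking at braces.

    Assumption: this mimics a matcher that handles one path level at a time.
    """
    return pattern.split("/")


def expand_braces(pattern: str) -> List[str]:
    """Expand {a,b,c} groups (nesting allowed) before any path splitting."""
    start = pattern.find("{")
    if start == -1:
        return [pattern]

    # Find the '}' that closes the brace opened at `start`.
    depth = 0
    end = -1
    for i in range(start, len(pattern)):
        if pattern[i] == "{":
            depth += 1
        elif pattern[i] == "}":
            depth -= 1
            if depth == 0:
                end = i
                break
    if end == -1:
        raise ValueError("unclosed '{' in pattern: " + pattern)

    prefix = pattern[:start]
    body = pattern[start + 1:end]
    suffix = pattern[end + 1:]

    # Split the body on commas that are not inside a nested brace group.
    alternatives: List[str] = []
    current = ""
    depth = 0
    for ch in body:
        if ch == "," and depth == 0:
            alternatives.append(current)
            current = ""
            continue
        if ch == "{":
            depth += 1
        elif ch == "}":
            depth -= 1
        current += ch
    alternatives.append(current)

    # Recurse so later or nested groups are expanded as well.
    results: List[str] = []
    for alt in alternatives:
        results.extend(expand_braces(prefix + alt + suffix))
    return results
```

`split_components(PATTERN)` yields `dir`, `{dir1`, `subdir1,dir2`, `subdir2,dir3`, `subdir3}`; the second component is the five-character `{dir1` with no closing brace, consistent with the "at 5" in the error. `expand_braces(PATTERN)` yields `dir/dir1/subdir1`, `dir/dir2/subdir2` and `dir/dir3/subdir3`. Separately, an unquoted brace list typed at a Bash prompt is normally expanded by the shell before the command runs, which may be one reason the `hadoop dfs -ls` invocation behaved differently; that is an assumption, not something the report confirms.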

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

