hadoop-mapreduce-issues mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Daniel Templeton (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (MAPREDUCE-6846) Fragments specified for libjar paths are not handled correctly
Date Mon, 20 Mar 2017 16:52:41 GMT

    [ https://issues.apache.org/jira/browse/MAPREDUCE-6846?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15933034#comment-15933034
] 

Daniel Templeton commented on MAPREDUCE-6846:
---------------------------------------------

Thanks, [~ctrezzo].  Here's my comments:

# I found this code to be a little confusing: {code}          URI pathURI = getPathURI(newPath,
tmpURI.getFragment());
          if (!foundFragment) {
            foundFragment = pathURI.getFragment() != null;
          }
          DistributedCache.addFileToClassPath(new Path(pathURI.getPath()), conf,
              jtFs, false);
          libjarURIs.add(pathURI);{code}  It seems odd to create {{pathURI}} and then do nothing
with it that you couldn't do with {{tmpURI}} until the end.  For me it would be clearer as:{code}
         DistributedCache.addFileToClassPath(new Path(tmpURI.getPath()), conf,
              jtFs, false);
          if (!foundFragment) {
            foundFragment = tmpURI.getFragment() != null;
          }
          libjarURIs.add(getPathURI(newPath, tmpURI.getFragment()));{code}
# I really hate the practice of catching meaningful exceptions and rethrowing them wrapped
in something generic, but in the case of the {{URISyntaxException}} that shoudn't happen,
I don't see where the API gives you much choice.  Since it is an unexpected exception that
we're going to deliver ultimately to the application client, it would be nice to be really
explicit about what went wrong.  You're bundling the original exception, but that doesn't
always make it all the way through to the end user.
# Looks like you're not qualifying the URIs before adding them, as was done before: {code}
       for (URI uri : libjarURIs) {
          DistributedCache.addCacheFile(uri, conf);
        }{code}
# If you rearrange {{uploadLibJars()}} like I suggested above, I don't think you need the
change to {{getPathURI()}}.  Or am I missing something?
# From the perspective of the person trying to understand the unit test failure report, I
don't think replacing the {{fail()}} calls with {{assertTrue()}} and {{assertFalse()}} makes
anything clearer.  The assert message is (correctly) a plain statement of misconfiguration,
which doesn't make sense with an {{assertTrue()}}/{{assertFalse()}}.
# Why are you splitting the {{ResourceLimitsConf}} into two classes?  Inheritance among inner
classes always strikes me as a strange thing to do.
# Given that the {{ResourceLimitsConf.Builder}} is an inner class of {{ResourceLimitsConf}},
I don't think that renaming it to {{ResourceLimitsConfBuilder}} is particularly needful, and
it makes the code a bit more cluttered.  I assume it's because you now have two classes and
two builders, and you want to separate them, but see the previous point.
# Tiny nit, but I'd rather you spell out distributed cache than use DC.
# Might be nice to add a comment to the stub's {{mkdirs()}} to say why it's OK that it doesn't
actually do anything.  You also might want to add a comment to the {{JobResourceUploader.mkdirs()}}
method to explain that it's there so that it can be overridden.


> Fragments specified for libjar paths are not handled correctly
> --------------------------------------------------------------
>
>                 Key: MAPREDUCE-6846
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-6846
>             Project: Hadoop Map/Reduce
>          Issue Type: Bug
>    Affects Versions: 2.6.0, 2.7.3, 3.0.0-alpha2
>            Reporter: Chris Trezzo
>            Assignee: Chris Trezzo
>            Priority: Minor
>         Attachments: MAPREDUCE-6846-trunk.001.patch, MAPREDUCE-6846-trunk.002.patch
>
>
> If a user specifies a fragment for a libjars path via generic options parser, the client
crashes with a FileNotFoundException:
> {noformat}
> java.io.FileNotFoundException: File file:/home/mapred/test.txt#testFrag.txt does not
exist
> 	at org.apache.hadoop.fs.RawLocalFileSystem.deprecatedGetFileStatus(RawLocalFileSystem.java:638)
> 	at org.apache.hadoop.fs.RawLocalFileSystem.getFileLinkStatusInternal(RawLocalFileSystem.java:864)
> 	at org.apache.hadoop.fs.RawLocalFileSystem.getFileStatus(RawLocalFileSystem.java:628)
> 	at org.apache.hadoop.fs.FilterFileSystem.getFileStatus(FilterFileSystem.java:442)
> 	at org.apache.hadoop.fs.FileUtil.copy(FileUtil.java:363)
> 	at org.apache.hadoop.fs.FileUtil.copy(FileUtil.java:314)
> 	at org.apache.hadoop.mapreduce.JobResourceUploader.copyRemoteFiles(JobResourceUploader.java:387)
> 	at org.apache.hadoop.mapreduce.JobResourceUploader.uploadLibJars(JobResourceUploader.java:154)
> 	at org.apache.hadoop.mapreduce.JobResourceUploader.uploadResources(JobResourceUploader.java:105)
> 	at org.apache.hadoop.mapreduce.JobSubmitter.copyAndConfigureFiles(JobSubmitter.java:102)
> 	at org.apache.hadoop.mapreduce.JobSubmitter.submitJobInternal(JobSubmitter.java:197)
> 	at org.apache.hadoop.mapreduce.Job$11.run(Job.java:1344)
> 	at org.apache.hadoop.mapreduce.Job$11.run(Job.java:1341)
> 	at java.security.AccessController.doPrivileged(Native Method)
> 	at javax.security.auth.Subject.doAs(Subject.java:422)
> 	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1892)
> 	at org.apache.hadoop.mapreduce.Job.submit(Job.java:1341)
> 	at org.apache.hadoop.mapreduce.Job.waitForCompletion(Job.java:1362)
> 	at org.apache.hadoop.examples.QuasiMonteCarlo.estimatePi(QuasiMonteCarlo.java:306)
> 	at org.apache.hadoop.examples.QuasiMonteCarlo.run(QuasiMonteCarlo.java:359)
> 	at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:76)
> 	at org.apache.hadoop.examples.QuasiMonteCarlo.main(QuasiMonteCarlo.java:367)
> 	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> 	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
> 	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> 	at java.lang.reflect.Method.invoke(Method.java:498)
> 	at org.apache.hadoop.util.ProgramDriver$ProgramDescription.invoke(ProgramDriver.java:71)
> 	at org.apache.hadoop.util.ProgramDriver.run(ProgramDriver.java:144)
> 	at org.apache.hadoop.examples.ExampleDriver.main(ExampleDriver.java:74)
> 	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> 	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
> 	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> 	at java.lang.reflect.Method.invoke(Method.java:498)
> 	at org.apache.hadoop.util.RunJar.run(RunJar.java:239)
> 	at org.apache.hadoop.util.RunJar.main(RunJar.java:153)
> {noformat}
> This is actually inconsistent with the behavior for files and archives. Here is a table
showing the current behavior for each type of path and resource:
> | || Qualified path (i.e. file://home/mapred/test.txt#frag.txt) || Absolute path (i.e.
/home/mapred/test.txt#frag.txt) || Relative path (i.e. test.txt#frag.txt) ||
> || -libjars | FileNotFound | FileNotFound|FileNotFound|
> || -files | (/) | (/) | (/) |
> || -archives | (/) | (/) | (/) |



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

---------------------------------------------------------------------
To unsubscribe, e-mail: mapreduce-issues-unsubscribe@hadoop.apache.org
For additional commands, e-mail: mapreduce-issues-help@hadoop.apache.org


Mime
View raw message