hadoop-mapreduce-issues mailing list archives

From "Vinod K V (JIRA)" <j...@apache.org>
Subject [jira] Commented: (MAPREDUCE-572) If #link is missing from uri format of -cacheArchive then streaming does not throw error.
Date Mon, 31 May 2010 14:16:42 GMT

    [ https://issues.apache.org/jira/browse/MAPREDUCE-572?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12873667#action_12873667 ]

Vinod K V commented on MAPREDUCE-572:

Looked at the patch. When we 'check if there is any conflict in fragment names', we can do
better than O(n^2) comparisons to find duplicates. For example, while we are already
iterating over the files/archives to see whether any fragment is null, could we put each
fragment in a map keyed by fragment name and fail immediately when a later iteration hits
a duplicate?

Granted, this is not in any critical section; I am only checking whether we can incorporate
a minor performance improvement now that the code in question is being touched.
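
The single-pass check suggested above could look roughly like the sketch below. This is
not the code from the attached patches: the class name FragmentCheck, the method
findProblem, and the wording of the messages are all made up for illustration. It only
assumes that each cache entry is available as a java.net.URI, and it treats an empty
fragment the same as a missing one.

```java
import java.net.URI;
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper for illustration only; not taken from patch-572.
class FragmentCheck {

    // Returns null when every URI carries a non-empty, unique fragment (#link),
    // otherwise a message describing the first problem found.
    // One pass over the input, one map lookup per URI.
    static String findProblem(URI[] uris) {
        // Maps each fragment seen so far to the URI that introduced it.
        Map<String, URI> seen = new HashMap<String, URI>();
        for (URI uri : uris) {
            String fragment = uri.getFragment();
            if (fragment == null || fragment.isEmpty()) {
                return "Missing #link in " + uri;
            }
            // put() returns the previous value for this key, if there was one.
            URI previous = seen.put(fragment, uri);
            if (previous != null) {
                return "Link name '" + fragment + "' is used by both "
                        + previous + " and " + uri;
            }
        }
        return null;
    }

    public static void main(String[] args) {
        // Host, port and paths below are placeholders.
        URI[] uris = {
            URI.create("hdfs://host:8020/a.jar#lib"),
            URI.create("hdfs://host:8020/b.jar#lib")
        };
        System.out.println(findProblem(uris));
    }
}
```

Keying the map by fragment and storing the URI as the value lets the error name both
conflicting entries. If only the yes/no answer matters, a HashSet of fragments and the
boolean result of add() would do the same job.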

> If #link is missing from uri format of -cacheArchive then streaming does not throw error.
> -----------------------------------------------------------------------------------------
>                 Key: MAPREDUCE-572
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-572
>             Project: Hadoop Map/Reduce
>          Issue Type: Bug
>          Components: contrib/streaming
>            Reporter: Karam Singh
>            Assignee: Amareshwari Sriramadasu
>            Priority: Minor
>             Fix For: 0.22.0
>         Attachments: patch-572-1.txt, patch-572.txt
> Ran the hadoop streaming command as follows:
> bin/hadoop jar contrib/streaming/hadoop-*-streaming.jar -input in -output out -mapper "xargs cat"  -reducer "bin/cat" -cacheArchive hdfs://h:p/pathofJarFile
> Streaming submits the job to the jobtracker and the map fails.
> For a similar command with -cacheFile:
> bin/hadoop jar contrib/streaming/hadoop-*-streaming.jar -input in -output out -mapper "xargs cat"  -reducer "bin/cat" -cacheFile hdfs://h:p/pathofFile
> the following error is reported back:
> [
> You need to specify the uris as hdfs://host:port/#linkname,Please specify a different link name for all of your caching URIs
> ]
> Streaming should check that a #link is present after the uri of -cacheArchive and should throw a proper error.

