hadoop-pig-dev mailing list archives

From "Thejas M Nair (JIRA)" <j...@apache.org>
Subject [jira] Resolved: (PIG-1580) new syntax for native mapreduce operator
Date Wed, 01 Sep 2010 17:39:53 GMT

     [ https://issues.apache.org/jira/browse/PIG-1580?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Thejas M Nair resolved PIG-1580.
--------------------------------

    Resolution: Won't Fix

With the 'hadoop jar' command, the files to ship to the distributed cache are specified using
the -files command-line option. Since typical users would be moving an existing map-reduce job
that they already run with 'hadoop jar', it is easier for them to copy the existing command-line
options than to rewrite them as the SHIP/CACHE clauses in the proposed syntax.

Without the SHIP/CACHE clauses, the mapreduce operator has very little in common with the
stream operator. It is also better to use LOAD/STORE instead of INPUT/OUTPUT in the mapreduce
syntax, since those clauses specify the load/store functions and not the streaming
deserializer/serializer.

So I think it is better to go back to the old syntax. Resolving jira as won't-fix.
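
For illustration, here is a minimal sketch of the older syntax this resolution returns to, assuming the PIG-506 form of the operator. The jar name, main class, paths, schema, and lookup.txt are hypothetical. It also assumes the job's main class parses Hadoop's generic options (for example via ToolRunner), since otherwise -files is not interpreted.

```pig
-- Hypothetical input relation handed to the native job.
A = LOAD 'WordcountInput.txt';

-- STORE writes A where the native job reads its input;
-- LOAD reads the job's output back into Pig as relation B.
-- The backquoted string holds the arguments an existing
-- 'hadoop jar' invocation would have used, including -files.
B = MAPREDUCE 'wordcount.jar'
        STORE A INTO 'inputDir'
        LOAD 'outputDir' AS (word:chararray, count:int)
        `org.myorg.WordCount -files lookup.txt inputDir outputDir`;
```

The point of the sketch is that the command-line options are copied as-is into the backquoted string, so no SHIP/CACHE clause is needed.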


> new syntax for native mapreduce operator
> ----------------------------------------
>
>                 Key: PIG-1580
>                 URL: https://issues.apache.org/jira/browse/PIG-1580
>             Project: Pig
>          Issue Type: Task
>            Reporter: Thejas M Nair
>            Assignee: Thejas M Nair
>             Fix For: 0.8.0
>
>
> mapreduce operator (PIG-506) and stream operator have some similarities. It makes sense to use a similar syntax for both.
> Alan has proposed the following syntax for the mapreduce operator, and that we also move the stream operator to a similar syntax in a future release.
> MAPREDUCE id jar
>         INPUT  'path' USING LoadFunc
>         OUTPUT 'path' USING StoreFunc
>         [SHIP 'path' [, 'path' ...]]
>         [CACHE 'dfs_path#dfs_file' [, 'dfs_path#dfs_file' ...]]

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

