incubator-crunch-dev mailing list archives

From "Kiyan Ahmadizadeh (JIRA)" <j...@apache.org>
Subject [jira] [Updated] (CRUNCH-46) Scrunch jobs launched from repl using PipelineLike#done are not shipped with jar of repl code.
Date Tue, 14 Aug 2012 21:33:37 GMT

     [ https://issues.apache.org/jira/browse/CRUNCH-46?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Kiyan Ahmadizadeh updated CRUNCH-46:
------------------------------------

    Attachment: CRUNCH-46.patch

This patch fixes the problem by adding code to PipelineLike#done that ships the repl jar with
the job.
                
> Scrunch jobs launched from repl using PipelineLike#done are not shipped with jar of repl code.
> ----------------------------------------------------------------------------------------------
>
>                 Key: CRUNCH-46
>                 URL: https://issues.apache.org/jira/browse/CRUNCH-46
>             Project: Crunch
>          Issue Type: Bug
>            Reporter: Kiyan Ahmadizadeh
>         Attachments: CRUNCH-46.patch
>
>
> Suppose the following example code is run in the scrunch/scala repl:
> val pipeline = Pipeline()
> val textLines = pipeline.read(From.textFile("shakes.txt"))
> val alphaNumericTextLines = textLines.map(line => line.toLowerCase().replaceAll("[^A-Za-z ]", ""))
> val words = alphaNumericTextLines.flatMap(line => line.split("""\W+"""))
> val counts = words.count()
> counts.write(To.textFile("/user/kiyan/counts"))
> pipeline.done()
>  
> This code results in a ClassNotFoundException in MapReduce tasks.  However, changing the last line to pipeline.run() produces no errors.
> The problem is that the method org.apache.crunch.scrunch.PipelineLike#run adds a jar of generated repl code to the job's running tasks, but org.apache.crunch.scrunch.PipelineLike#done does not.  The done method should be modified to take the same actions as the run method when launching a job from the Scala repl.  This will ensure users can launch jobs from the repl regardless of how they conclude their pipelines.
>  
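
The fix described above could be sketched roughly as follows. This is a minimal, self-contained illustration and not the contents of CRUNCH-46.patch or the real Scrunch source: PipelineSketch, replJar, shipReplJar and shippedJars are hypothetical names, and the list of shipped jars merely stands in for whatever mechanism the real code uses to attach a jar to the job.

```scala
import scala.collection.mutable.ListBuffer

// Hypothetical stand-in for a pipeline trait; NOT the real PipelineLike.
trait PipelineSketch {
  // Jars recorded as shipped with the job (assumed stand-in for the job's classpath).
  val shippedJars: ListBuffer[String] = ListBuffer.empty[String]

  // Hypothetical hook: path to the jar of repl-generated classes, if running in a repl.
  def replJar: Option[String]

  // Shared preparation step so that run() and done() behave the same way.
  private def shipReplJar(): Unit =
    replJar.foreach { jar =>
      if (!shippedJars.contains(jar)) shippedJars += jar
    }

  def run(): Unit = {
    shipReplJar()
    // the planned jobs would be executed here
  }

  def done(): Unit = {
    shipReplJar() // the step the issue reports as missing from done
    // remaining jobs would be executed and temporary state cleaned up here
  }
}

object PipelineSketchDemo extends App {
  val p = new PipelineSketch { def replJar = Some("/tmp/repl-classes.jar") }
  p.done()
  println(p.shippedJars.mkString(","))
}
```

Routing both methods through one helper means a later change to the jar-shipping logic cannot make run and done diverge again.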

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        
