incubator-crunch-dev mailing list archives

From "Vinod Kumar Vavilapalli (JIRA)" <>
Subject [jira] [Updated] (CRUNCH-70) Simplify Pipeline API
Date Thu, 20 Sep 2012 05:45:07 GMT


Vinod Kumar Vavilapalli updated CRUNCH-70:

    Attachment: CRUNCH-70-20120919.txt

Here's a patch to do this.

I added a util called Pipelines following the convention.

One question though (left as a TODO in the patch): when writeTextFile() runs as part of an MR
pipeline, we do the following:
+      collection =
+          collection.parallelDo("asText", IdentityFn.<T> getInstance(),
+            WritableTypeFamily.getInstance().as(collection.getPType()));

Why do we do it? And do we really need MRPipeline to force the PTypeFamily to be Writables?
> Simplify Pipeline API
> ---------------------
>                 Key: CRUNCH-70
>                 URL:
>             Project: Crunch
>          Issue Type: Bug
>            Reporter: Vinod Kumar Vavilapalli
>            Assignee: Vinod Kumar Vavilapalli
>         Attachments: CRUNCH-70-20120919.txt
> Today the Pipeline interface has the following APIs, which really belong in a utils class:
>  - readTextFile
>  - writeTextFile
>  - enableDebug
> The implementation of these APIs is the same in both Pipeline types present today
> and is most likely going to be the same if we ever have one more impl.
> I propose we move these to a util/lib to make the core interface cleaner.
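
The proposal quoted above can be sketched roughly as follows. This is only an illustration of the pattern, not the contents of the attached patch: the slimmed-down Pipeline interface, the InMemoryPipeline class, and the method signatures on Pipelines are hypothetical stand-ins that do not match the real Crunch types, and enableDebug is left out for brevity.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical slimmed-down core interface: only generic read/write remain.
// A stand-in for illustration, NOT the real Crunch Pipeline interface.
interface Pipeline {
    List<String> read(String path);
    void write(List<String> lines, String path);
}

// Toy in-memory implementation (assumption), playing the role of one
// concrete pipeline type.
class InMemoryPipeline implements Pipeline {
    private final Map<String, List<String>> files = new HashMap<>();

    @Override
    public List<String> read(String path) {
        // Return a copy so callers cannot mutate the stored "file".
        return new ArrayList<>(files.getOrDefault(path, new ArrayList<>()));
    }

    @Override
    public void write(List<String> lines, String path) {
        files.put(path, new ArrayList<>(lines));
    }
}

// Utility class holding the convenience methods. They are written once
// against the interface, so every Pipeline implementation shares them.
final class Pipelines {
    private Pipelines() {}

    static List<String> readTextFile(Pipeline pipeline, String path) {
        return pipeline.read(path);
    }

    static void writeTextFile(Pipeline pipeline, List<String> lines, String path) {
        pipeline.write(lines, path);
    }
}
```

With this shape, a caller would write Pipelines.writeTextFile(pipeline, lines, path) instead of pipeline.writeTextFile(lines, path), and a new Pipeline implementation only has to provide the core read and write methods.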

