crunch-dev mailing list archives

From "Josh Wills (JIRA)" <>
Subject [jira] [Commented] (CRUNCH-337) Make it easier to use multiple input paths
Date Tue, 04 Feb 2014 15:36:09 GMT


Josh Wills commented on CRUNCH-337:

We do our best. I'll see if any of the other committers have an opinion, and if it looks good,
I'll commit it later today, Pacific Time.

> Make it easier to use multiple input paths
> ------------------------------------------
>                 Key: CRUNCH-337
>                 URL:
>             Project: Crunch
>          Issue Type: Improvement
>          Components: Core
>    Affects Versions: 0.9.0
>            Reporter: Matthew Basil
>            Assignee: Josh Wills
>            Priority: Minor
>         Attachments: CRUNCH-337.patch
> It would be more user-friendly, especially for newbies, to provide methods on {{From}}
> for creating sources from multiple {{Path}}s.  I'm currently attempting to write my first
> Crunch Pipeline, which needs to read from multiple paths using a custom input format, and
> I needed to dig into the source for {{From.formattedFile}} to see that I need to do something like:
> {code}
> PTableType<K, V> tableType = keyType.getFamily().tableOf(keyType, valueType);
> return new FileTableSourceImpl<K, V>(paths, tableType, formatClass);
> {code}
> I don't particularly mind, but other potential new users might be a bit put off by having
> to look at the source on the first line of their first pipeline. If it's undesirable to double
> the number of methods in {{From}} by doing this (which is understandable), it might be nice
> to add a note on multiple input paths to the section of the user guide on {{Source}}s.
> Thanks!

This message was sent by Atlassian JIRA
