crunch-dev mailing list archives

From "Josh Wills (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (CRUNCH-306) MultipleOutput Targets
Date Fri, 29 Nov 2013 18:44:35 GMT

    [ https://issues.apache.org/jira/browse/CRUNCH-306?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13835503#comment-13835503 ]

Josh Wills commented on CRUNCH-306:
-----------------------------------

I was thinking that this would require a custom meta output format that wraps some other
output format (and its associated config) and creates a new RecordWriter per key, without
having to know upfront which values the keys will take. Does that sound doable? I'm concerned
about how to make this work sensibly when the same key exists on multiple partitions, which
is possible if we support writing PTables as well as PGroupedTables.
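
To make the idea concrete, here is a rough sketch in plain Java, with no Hadoop or Crunch
classes involved. Everything in it is hypothetical: the PerKeyWriter class, the pathFor
helper, and the <baseDir>/<key>/part-<partition> layout are assumptions for illustration,
not an existing API. It opens a writer lazily the first time a key is seen, and it puts
the partition number in the file name so that two partitions holding the same key do not
overwrite each other's output.

```java
import java.io.BufferedWriter;
import java.io.Closeable;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.HashMap;
import java.util.Map;

/**
 * Hypothetical sketch: one writer per key, created on first use, so the
 * set of keys does not have to be known upfront.
 */
class PerKeyWriter implements Closeable {
    private final Path baseDir;
    private final int partition;
    private final Map<String, BufferedWriter> writers = new HashMap<>();

    PerKeyWriter(Path baseDir, int partition) {
        this.baseDir = baseDir;
        this.partition = partition;
    }

    /**
     * Assumed layout: <baseDir>/<key>/part-<partition>. Keys are used as
     * directory names as-is; a real version would need to sanitize them.
     */
    static Path pathFor(Path baseDir, String key, int partition) {
        return baseDir.resolve(key).resolve(String.format("part-%05d", partition));
    }

    void write(String key, String value) throws IOException {
        BufferedWriter writer = writers.get(key);
        if (writer == null) {
            // First record for this key in this partition: open its file.
            Path path = pathFor(baseDir, key, partition);
            Files.createDirectories(path.getParent());
            writer = Files.newBufferedWriter(path, StandardCharsets.UTF_8);
            writers.put(key, writer);
        }
        writer.write(value);
        writer.newLine();
    }

    @Override
    public void close() throws IOException {
        // Closes every writer opened so far; no error aggregation in this sketch.
        for (BufferedWriter writer : writers.values()) {
            writer.close();
        }
        writers.clear();
    }
}
```

With this layout, a key that shows up in partitions 0 and 1 ends up with two part files
under the same key directory, much like a regular job output directory. One open writer
per distinct key could get expensive for high-cardinality keys, so a real implementation
would probably need to cap or evict open writers.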

> MultipleOutput Targets
> ----------------------
>
>                 Key: CRUNCH-306
>                 URL: https://issues.apache.org/jira/browse/CRUNCH-306
>             Project: Crunch
>          Issue Type: Bug
>          Components: IO
>            Reporter: Josh Wills
>
> A commonly desired feature for Crunch is the ability to write an output file for each
> key in a PTable/PGroupedTable containing the values associated with that key. We should find
> a way to support that one-output-per-key model.



--
This message was sent by Atlassian JIRA
(v6.1#6144)
