crunch-dev mailing list archives

From "Adric Eckstein (JIRA)" <>
Subject [jira] [Created] (CRUNCH-543) AvroPathPerKeyTarget copy nested subdirectories
Date Thu, 16 Jul 2015 20:08:04 GMT
Adric Eckstein created CRUNCH-543:

             Summary: AvroPathPerKeyTarget copy nested subdirectories
                 Key: CRUNCH-543
             Project: Crunch
          Issue Type: Improvement
          Components: IO
            Reporter: Adric Eckstein

When AvroPathPerKeyTarget writes each record to a subpath of the output directory named by
its String key, the key may contain a path separator and so name several nested subfolders:

Pair<String, String> kv = new Pair<String, String>("foo/bar", "value");
PTable<String, String> kvs = pipeline.create(Arrays.asList(kv),
    Avros.tableOf(Avros.strings(), Avros.strings()));
PTables.asPTable(kvs).write(new AvroPathPerKeyTarget("output"));

This throws the error: java.lang.IllegalArgumentException: Reducer output name 'bar' cannot
be parsed

The handleOutputs method in AvroPathPerKeyTarget currently checks only the first level of
the output directory. To support keys that define multiple subfolders, it would need to copy
subfolders recursively.
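
A minimal sketch of the recursive traversal described above. It assumes a local filesystem
accessed through java.nio.file rather than the Hadoop FileSystem API that Crunch uses, and
the names NestedOutputMover and moveNested are hypothetical, not part of Crunch. It only
illustrates descending into nested directories instead of stopping at the first level:

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Hypothetical helper, not Crunch code: moves files from a working directory
// into an output directory while preserving nested subdirectories.
final class NestedOutputMover {

    // Returns the number of regular files moved from srcDir into dstDir.
    static int moveNested(Path srcDir, Path dstDir) throws IOException {
        // Snapshot the listing first so moving files does not disturb iteration.
        List<Path> entries;
        try (Stream<Path> listing = Files.list(srcDir)) {
            entries = listing.collect(Collectors.toList());
        }

        int moved = 0;
        for (Path entry : entries) {
            Path target = dstDir.resolve(entry.getFileName().toString());
            if (Files.isDirectory(entry)) {
                // Recurse so that a key such as "foo/bar" is handled at any depth.
                moved += moveNested(entry, target);
            } else {
                // Create the destination directory lazily, only when a file needs it.
                Files.createDirectories(dstDir);
                Files.move(entry, target, StandardCopyOption.REPLACE_EXISTING);
                moved++;
            }
        }
        return moved;
    }
}
```

With a part file at working/foo/bar/part-r-00000.avro, calling moveNested(working, output)
would place it at output/foo/bar/part-r-00000.avro.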
