crunch-dev mailing list archives

From "Gabriel Reid (JIRA)" <>
Subject [jira] [Commented] (CRUNCH-300) Support reflected Avro record writing from MemPipeline
Date Wed, 20 Nov 2013 18:50:35 GMT


Gabriel Reid commented on CRUNCH-300:

I've integrated this one locally, but I'm still working on putting together more substantial
integration tests for writing from MemPipeline in general. 

I'll wait until CRUNCH-293 is done before going further with it. 

A related issue with writing from MemPipeline in general is that there doesn't seem to be a
standard way of handling the output file naming and directory structure. I think that settling
on a standard will mean breaking some external code that depends on the current output
structure, but I'm thinking (hoping) that there aren't many people using it yet. 

> Support reflected Avro record writing from MemPipeline
> ------------------------------------------------------
>                 Key: CRUNCH-300
>                 URL:
>             Project: Crunch
>          Issue Type: Improvement
>          Components: Core
>            Reporter: David Whiting
>            Priority: Minor
>         Attachments: 0001-Allow-MemPipeline-to-write-Avro-files-by-reflection.patch
> MemPipeline doesn't support writing Avro records via reflection. It seems that this was
> half implemented but never finished, but I needed it to create some test data to run through
> a cluster MapReduce test. The current implementation correctly reflects the schema, but then
> uses a GenericDatumWriter to try and write the record, causing a ClassCastException. The correct
> way would be to get a ReflectDatumWriter from the ReflectDataFactory.
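
For illustration, the failure described above can be reproduced with plain Avro, outside of
Crunch. This is only a minimal sketch and not the attached patch: Person, ReflectWriteSketch
and the helper methods are hypothetical names, and the sketch constructs a ReflectDatumWriter
directly rather than obtaining one from Crunch's ReflectDataFactory.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

import org.apache.avro.Schema;
import org.apache.avro.generic.GenericDatumWriter;
import org.apache.avro.io.BinaryEncoder;
import org.apache.avro.io.DatumWriter;
import org.apache.avro.io.EncoderFactory;
import org.apache.avro.reflect.ReflectData;
import org.apache.avro.reflect.ReflectDatumWriter;

public class ReflectWriteSketch {

    // Hypothetical plain Java class; it does not implement IndexedRecord.
    public static class Person {
        String name;
        int age;

        public Person(String name, int age) {
            this.name = name;
            this.age = age;
        }
    }

    // Encodes one datum to Avro binary with the given writer.
    static byte[] encode(DatumWriter<Person> writer, Person p) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        BinaryEncoder encoder = EncoderFactory.get().binaryEncoder(out, null);
        try {
            writer.write(p, encoder);
            encoder.flush();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return out.toByteArray();
    }

    // Reflected schema + GenericDatumWriter: the writer expects an
    // IndexedRecord, so a plain object triggers a ClassCastException.
    public static boolean genericWriterThrowsClassCast(Person p) {
        Schema schema = ReflectData.get().getSchema(Person.class);
        try {
            encode(new GenericDatumWriter<Person>(schema), p);
            return false;
        } catch (ClassCastException e) {
            return true;
        }
    }

    // Reflected schema + ReflectDatumWriter: fields are read via reflection.
    public static byte[] reflectWrite(Person p) {
        Schema schema = ReflectData.get().getSchema(Person.class);
        return encode(new ReflectDatumWriter<Person>(schema), p);
    }

    public static void main(String[] args) {
        Person p = new Person("Ada", 36);
        System.out.println("generic writer fails: " + genericWriterThrowsClassCast(p));
        System.out.println("reflect writer bytes: " + reflectWrite(p).length);
    }
}
```

The schema is the same in both cases; only the writer differs, which matches the description
that the schema reflection was already correct and only the choice of datum writer was wrong.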

