crunch-dev mailing list archives

From "Ryan Blue (JIRA)" <>
Subject [jira] [Commented] (CRUNCH-293) Injection of reader into AvroRecordReader
Date Mon, 18 Nov 2013 19:09:20 GMT


Ryan Blue commented on CRUNCH-293:

Micah, Josh just pointed me at this issue, which is well-timed. I just ran into the same problem
recently and implemented a solution I was going to submit a patch for. My problem was that
I wanted to override the Avro generic classes rather than the specific ones. My implementation
is very similar to yours, only I changed the reflect factory to a ReaderWriterFactory and
added an AvroMode enum to handle each case (REFLECT, SPECIFIC, GENERIC). The enum approach
brings all of the code together like your AvroDataFactory, but keeps the handling for each
mode separate. Each mode can be individually overridden:
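
To illustrate the per-mode override idea, here is a minimal, self-contained sketch. It is not the code from the branch: SketchReader, SketchReaderFactory, SketchAvroMode and the override method are hypothetical stand-ins, and the real Avro reader and writer classes are replaced by a simple describe() string so the example runs without Avro on the classpath.

```java
import java.util.Locale;

// Hypothetical stand-in for an Avro datum reader; it only reports what it is.
interface SketchReader {
    String describe();
}

// Hypothetical stand-in for the ReaderWriterFactory mentioned above.
interface SketchReaderFactory {
    SketchReader newReader(String schemaName);
}

// One constant per mode; each constant holds its own replaceable factory.
enum SketchAvroMode {
    REFLECT, SPECIFIC, GENERIC;

    // Starts as the mode's default factory and can be swapped per mode.
    private volatile SketchReaderFactory factory = defaultFactory();

    private SketchReaderFactory defaultFactory() {
        final String label = name().toLowerCase(Locale.ROOT);
        return schemaName -> () -> label + " reader for " + schemaName;
    }

    // Replace the factory for this mode only; other modes are unaffected.
    void override(SketchReaderFactory custom) {
        this.factory = custom;
    }

    SketchReader newReader(String schemaName) {
        return factory.newReader(schemaName);
    }
}
```

In this sketch, calling SketchAvroMode.GENERIC.override(...) changes only the generic readers, while SPECIFIC and REFLECT keep their defaults. Holding mutable state in an enum is a design trade-off: it keeps all modes in one place, but the override is global to the JVM.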

It also cleans up some of the places that hard-code specific or reflect readers to use the
right one based on the AvroType:

This is what makes it possible for me to override the generics correctly. A lot of places
simply used Reflect because it is the most general... but that causes problems if you need
to change specific (e.g., use a different ClassLoader) or generic.
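
As a rough sketch of choosing the reader from the type instead of always falling back to Reflect, something like the following could work. Again, this is not the code from the branch: ReaderKind, SketchType and ReaderSelection are hypothetical names, and SketchType is a simplified stand-in for a type descriptor such as AvroType, whose real API may differ.

```java
// Hypothetical mode enum mirroring the three cases named above.
enum ReaderKind { REFLECT, SPECIFIC, GENERIC }

// Hypothetical, simplified stand-in for a type descriptor such as AvroType.
final class SketchType {
    private final boolean generic;
    private final boolean specific;

    SketchType(boolean generic, boolean specific) {
        this.generic = generic;
        this.specific = specific;
    }

    boolean isGeneric() { return generic; }

    boolean isSpecific() { return specific; }
}

final class ReaderSelection {
    private ReaderSelection() {}

    // Pick the narrowest matching mode; Reflect is only the fallback.
    static ReaderKind kindFor(SketchType type) {
        if (type.isGeneric()) {
            return ReaderKind.GENERIC;
        }
        if (type.isSpecific()) {
            return ReaderKind.SPECIFIC;
        }
        return ReaderKind.REFLECT;
    }
}
```

With a selection step like this, a custom generic or specific factory is actually consulted for the types it applies to, rather than being bypassed by a hard-coded reflect reader.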

Let me know what you think of the branch
and whether it might work for your needs. Please ignore all of the unnecessary import changes;
I still need to clean them up.

> Injection of reader into AvroRecordReader
> -----------------------------------------
>                 Key: CRUNCH-293
>                 URL:
>             Project: Crunch
>          Issue Type: Improvement
>          Components: Core
>    Affects Versions: 0.7.0, 0.8.0
>            Reporter: Micah Whitacre
>            Assignee: Micah Whitacre
>         Attachments: CRUNCH-293.patch, CRUNCH-293_v2.patch
> With CRUNCH-243, I wanted to support injecting custom readers to handle the cases like
> passivity between Avro Schema.  The changes made however were not complete as we also need
> to be able to inject a reader into the AvroRecordReader which constructs its own SpecificDatumReader.
> We could create a SpecificDataFactory which emulates the ReflectDataFactory.  Or simplify
> to a single DataFactory which will create either Reflect/Specific/Generic.  Thoughts?

This message was sent by Atlassian JIRA
