Return-Path: X-Original-To: apmail-crunch-dev-archive@www.apache.org Delivered-To: apmail-crunch-dev-archive@www.apache.org Received: from mail.apache.org (hermes.apache.org [140.211.11.3]) by minotaur.apache.org (Postfix) with SMTP id C62A2109E4 for ; Wed, 20 Nov 2013 18:32:38 +0000 (UTC) Received: (qmail 67732 invoked by uid 500); 20 Nov 2013 18:32:37 -0000 Delivered-To: apmail-crunch-dev-archive@crunch.apache.org Received: (qmail 67706 invoked by uid 500); 20 Nov 2013 18:32:36 -0000 Mailing-List: contact dev-help@crunch.apache.org; run by ezmlm Precedence: bulk List-Help: List-Unsubscribe: List-Post: List-Id: Reply-To: dev@crunch.apache.org Delivered-To: mailing list dev@crunch.apache.org Received: (qmail 67695 invoked by uid 500); 20 Nov 2013 18:32:36 -0000 Delivered-To: apmail-incubator-crunch-dev@incubator.apache.org Received: (qmail 67687 invoked by uid 99); 20 Nov 2013 18:32:35 -0000 Received: from arcas.apache.org (HELO arcas.apache.org) (140.211.11.28) by apache.org (qpsmtpd/0.29) with ESMTP; Wed, 20 Nov 2013 18:32:35 +0000 Date: Wed, 20 Nov 2013 18:32:35 +0000 (UTC) From: "Ryan Blue (JIRA)" To: crunch-dev@incubator.apache.org Message-ID: In-Reply-To: References: Subject: [jira] [Comment Edited] (CRUNCH-293) Injection of reader into AvroRecordReader MIME-Version: 1.0 Content-Type: text/plain; charset=utf-8 Content-Transfer-Encoding: 7bit X-JIRA-FingerPrint: 30527f35849b9dde25b450d4833f0394 [ https://issues.apache.org/jira/browse/CRUNCH-293?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13825634#comment-13825634 ] Ryan Blue edited comment on CRUNCH-293 at 11/20/13 6:31 PM: ------------------------------------------------------------ Micah, Josh just pointed me at this issue, which is well-timed. I just ran into the same problem recently and implemented a solution I was going to submit a patch for. My problem was that I wanted to override the avro generic classes rather than the specific. My implementation is very similar to yours, only I changed the reflect factory to a ReaderWriterFactory and added an AvroMode enum to handle each case (REFLECT, SPECIFIC, GENERIC). The enum approach brings all of the code together like your AvroDataFactory, but keeps the handling for each mode separate. Each mode can be individually overridden: {code:java} AvroMode.SPECIFIC.override(specificFactory); {code} It also cleans up some of the places that hard-code specific or reflect readers to use the right one based on the AvroType: {code:java} AvroMode.fromType(atype).configure(bundle) {code} This is what makes it possible for me to override the generics correctly. A lot of places simply used Reflect because it is the most general... but that causes problems if you need to change specific (e.g., use a different ClassLoader) or generic. Let me know what you think of the [branch|https://github.com/rdblue/crunch/commit/110614da91dc2b7609520883bb3cdfd40ea68700] and whether it might work for your needs. Please ignore all of the unnecessary import changes, I still need to clean it up. was (Author: rdblue): Micah, Josh just pointed me at this issue, which is well-timed. I just ran into the same problem recently and implemented a solution I was going to submit a patch for. My problem was that I wanted to override the avro generic classes rather than the specific. My implementation is very similar to yours, only I changed the reflect factory to a ReaderWriterFactory and added an AvroMode enum to handle each case (REFLECT, SPECIFIC, GENERIC). The enum approach brings all of the code together like your AvroDataFactory, but keeps the handling for each mode separate. Each mode can be individually overridden: {code:java} AvroMode.SPECIFIC.override(specificFactory); {code} It also cleans up some of the places that hard-code specific or reflect readers to use the right one based on the AvroType: {code:java} AvroMode.forType(atype).configure(bundle) {code} This is what makes it possible for me to override the generics correctly. A lot of places simply used Reflect because it is the most general... but that causes problems if you need to change specific (e.g., use a different ClassLoader) or generic. Let me know what you think of the [branch|https://github.com/rdblue/crunch/commit/110614da91dc2b7609520883bb3cdfd40ea68700] and whether it might work for your needs. Please ignore all of the unnecessary import changes, I still need to clean it up. > Injection of reader into AvroRecordReader > ----------------------------------------- > > Key: CRUNCH-293 > URL: https://issues.apache.org/jira/browse/CRUNCH-293 > Project: Crunch > Issue Type: Improvement > Components: Core > Affects Versions: 0.7.0, 0.8.0 > Reporter: Micah Whitacre > Assignee: Micah Whitacre > Attachments: CRUNCH-293.patch, CRUNCH-293_v2.patch > > > With CRUNCH-243, I wanted to support injecting custom readers to handle the cases like passivity between Avro Schema. The changes made however were not complete as we also need to be able to inject a reader into the AvroRecordReader which constructs its own SpecificDatumReader. > We could create a SpecificDataFactory which emulates the ReflectDataFactory. Or simplify to a single DataFactory which will create either Reflect/Specific/Generic. Thoughts? -- This message was sent by Atlassian JIRA (v6.1#6144)