avro-dev mailing list archives

From "Leigh L. Klotz, Jr (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (AVRO-1345) Python Codegen
Date Thu, 13 Jun 2013 00:07:20 GMT

    [ https://issues.apache.org/jira/browse/AVRO-1345?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13681784#comment-13681784 ]

Leigh L. Klotz, Jr commented on AVRO-1345:

Codegen lets you prevent the creation of data messages that don't correspond to the schema.
With a plain Python dict, a mismatch surfaces only as a runtime error when the data is
serialized. Generated classes give you the ability to detect data validation errors earlier.
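
To make the difference concrete, here is a minimal sketch of what a generated record class could look like. It is not taken from the attached patch: the class name `User`, its fields, and the `SchemaError` exception are hypothetical, and a real generator would derive the field table from the Avro schema.

```python
class SchemaError(TypeError):
    """Raised when a field name or value does not match the record definition."""


class User(object):
    """Hypothetical stand-in for a generated record class (not from the patch)."""

    # Field name -> accepted Python type. Assumed for illustration; a real
    # generator would build this table from the Avro schema.
    _FIELDS = {"name": str, "favorite_number": int}

    def __init__(self, **kwargs):
        self._data = {}
        for key, value in kwargs.items():
            setattr(self, key, value)

    def __setattr__(self, key, value):
        # Internal attributes bypass the field checks.
        if key.startswith("_"):
            object.__setattr__(self, key, value)
            return
        if key not in self._FIELDS:
            raise SchemaError("unknown field: %r" % key)
        if not isinstance(value, self._FIELDS[key]):
            raise SchemaError("field %r expects %s, got %s" % (
                key, self._FIELDS[key].__name__, type(value).__name__))
        self._data[key] = value

    def to_dict(self):
        # Plain dict that could be handed to an Avro writer.
        return dict(self._data)
```

With a class like this, `User(nmae="Ann")` or `user.favorite_number = "seven"` fails at the assignment, whereas the equivalent plain dict would be accepted until something tries to serialize it.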

> Python Codegen
> --------------
>                 Key: AVRO-1345
>                 URL: https://issues.apache.org/jira/browse/AVRO-1345
>             Project: Avro
>          Issue Type: New Feature
>          Components: java, python
>            Reporter: Tal Levy
>         Attachments: AVRO-1345.patch
> I recently started using Avro at work, and we found it difficult to keep
> track of which Python dict matched which schema. Instead of having
> arbitrary dicts populated and then serialized to Avro, I thought
> it would be more readable and less error-prone to codegen the Python dict
> for developers. The generated classes are type-checked field by field. Although this does not
> have the advantage of compile-time type checking as in the Java codegen, it is a
> friendly wrapper around Python dicts representing Avro records to be serialized.
> Let me know what you think about this; I am still tweaking how it behaves.
> I understand it is a bit unpythonic to enforce types in this way, but the readability
> is worth it nonetheless.
> Here is an example record:
> https://gist.github.com/talevy/5696236
> I extended the Avro compiler/tools to provide both Java and Python codegen functionality,
> so if this sounds like something others would use, maybe it makes sense to include it
> in the main repo.
> Here are the changes:
> https://github.com/talevy/avro/tree/python-codegen
> A few caveats and thoughts about my current version:
> 1. I do not know how best to handle constructors, because some fields are not allowed
> to be null... maybe a builder pattern would work here, but it's kind of weird in Python.
> 2. I copy/pasted a lot of the code from SpecificCompiler to make the PythonCompiler...
> some renaming and code reuse via inheritance would make it read better.
> 3. I wanted to reuse the validate methods already provided in Avro to verify the record,
> but that takes away from some of the class type correctness for nested records and such.
> 4. I do not know what the best way of outputting multiple files is; I currently use the
> same packaging as the Java classes, into their namespace directories.
> 5. I am not familiar with the avro-protocol format, so I only implemented enums and records.
> I updated the SpecificCompilerTool to have the following usage:
> ```
> "Usage: [-string] (schema|protocol) (python|java) input... outputdir"
> ```
> So generating the Python classes is as easy as generating the Java ones.
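
Caveat 1 in the quoted description raises a builder pattern as a way to handle fields that must not be null. A minimal sketch of that idea follows, assuming a hypothetical `UserBuilder` with one required and one optional field; none of these names come from the patch, and the built record is returned as a plain dict for simplicity.

```python
class UserBuilder(object):
    """Hypothetical builder for a record with required (non-null) fields."""

    # Assumed field lists for illustration; a generator would derive them
    # from the schema (fields without a null branch or default are required).
    REQUIRED = ("name",)
    OPTIONAL = {"favorite_color": None}

    def __init__(self):
        # Optional fields start at their defaults.
        self._values = dict(self.OPTIONAL)

    def set(self, field, value):
        if field not in self.REQUIRED and field not in self.OPTIONAL:
            raise KeyError(field)
        self._values[field] = value
        return self  # allow chained calls

    def build(self):
        # Required fields are checked once, when the record is finalized.
        missing = [f for f in self.REQUIRED if self._values.get(f) is None]
        if missing:
            raise ValueError("missing required fields: %s" % ", ".join(missing))
        return dict(self._values)
```

Usage would look like `UserBuilder().set("name", "Ann").build()`; calling `build()` before every required field is set raises `ValueError`, so the constructor itself never has to accept a half-populated record.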

