avro-dev mailing list archives

From "ASF GitHub Bot (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (AVRO-2046) avro-python3: Very restricted set of data types which are allowed in AvroSchemaFromJSONData
Date Mon, 17 Jul 2017 08:54:02 GMT

    [ https://issues.apache.org/jira/browse/AVRO-2046?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16089511#comment-16089511 ]

ASF GitHub Bot commented on AVRO-2046:
--------------------------------------

GitHub user manu-chroma opened a pull request:

    https://github.com/apache/avro/pull/235

    schema.py: No sys traceback in parse exception

    In ``SchemaParseException``, do not provide the sys traceback.
    
    Our project, CWL Tool, uses `avro/py` in its Python 3 builds. The background is discussed
    here: https://issues.apache.org/jira/browse/AVRO-2046
    
    To do this, we use the `autotranslate` tool, which converts the `avro/py` code at runtime
    into code that is compatible with both Python 2 and Python 3.
    The problem arises when it tries to convert this `raise Exception` statement: there is no
    way to express it in a cross-compatible way without an external library.
     
    This PR is a very minimal change that solves our problem for the time being. We hope
    you will consider it, or at least give us your feedback.
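
A minimal sketch of the kind of change described above, assuming the statement in question is Python 2's three-expression `raise` (exception, value, traceback), which is a syntax error on Python 3. The `SchemaParseException` class below is a local stand-in, and `parse_json_text` and its error message are hypothetical names used only for illustration; this is not the code from the PR.

```python
import json


class SchemaParseException(Exception):
    """Local stand-in for avro's exception class (illustration only)."""


def parse_json_text(json_string):
    """Hypothetical helper: parse JSON text, wrapping failures in SchemaParseException."""
    try:
        return json.loads(json_string)
    except ValueError as exn:
        # Python 2 only (a SyntaxError on Python 3):
        #   raise SchemaParseException(msg), None, sys.exc_info()[2]
        # Raising the new exception on its own parses on both Python 2 and 3,
        # at the cost of not explicitly re-attaching the original traceback.
        raise SchemaParseException(
            'Error parsing JSON: %r, error = %r' % (json_string, exn))
```

On Python 3 the original error is still reachable through implicit exception chaining (`__context__`), so the single-argument form mainly changes what Python 2 users see.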

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/manu-chroma/avro patch-1

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/avro/pull/235.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #235
    
----
commit 92525fda5cbae1ea7b9e5e255a52ad7e8f0ff71f
Author: Manvendra Singh <manvendra0310@gmail.com>
Date:   2017-07-17T08:53:28Z

    schema.py: No sys traceback in parse exception
    

----


> avro-python3: Very restricted set of data types which are allowed in AvroSchemaFromJSONData
> -------------------------------------------------------------------------------------------
>
>                 Key: AVRO-2046
>                 URL: https://issues.apache.org/jira/browse/AVRO-2046
>             Project: Avro
>          Issue Type: Bug
>          Components: python
>    Affects Versions: 1.8.2
>         Environment: avro-python3 (1.8.2)
>            Reporter: Manvendra Singh
>
> Hey, I come from the CWL project (https://github.com/common-workflow-language/cwltool), and
> as part of my GSoC project I'm working on adding Python 3 compatibility to the *cwltool* codebase.
> We've been using avro-python2 for a long time, and it has worked great for us in our projects
> schema_salad and cwltool.
> While porting cwltool, I've run into issues with the avro-python3 library. I've
> found the following bug:
> Minimal reproducible example:
> {code:none}
> from collections import OrderedDict
> import avro.schema
> AvroSchemaFromJSONData = avro.schema.SchemaFromJSONData
> a = {
>   "fields": [
>     {
>       "name": "name",
>       "type": "string"
>     },
>     {
>       "name": "favorite_number",
>       "type": [
>         "int",
>         "null"
>       ]
>     },
>     {
>       "name": "favorite_color",
>       "type": [
>         "string",
>         "null"
>       ]
>     }
>   ],
>   "name": "User",
>   "namespace": "example.avro",
>   "type": "record"
> }
> b = OrderedDict(a)
> AvroSchemaFromJSONData(a)
> AvroSchemaFromJSONData(b)
> {code}
> Output:
> {code}
> ~/Desktop/test/venv3/lib/python3.5/site-packages/avro/schema.py in SchemaFromJSONData(json_data, names)
>    1252   if parser is None:
>    1253     raise SchemaParseException(
> -> 1254         'Invalid JSON descriptor for an Avro schema: %r.' % json_data)
>    1255   return parser(json_data, names=names)
>    1256 
> SchemaParseException: Invalid JSON descriptor for an Avro schema: OrderedDict([('namespace',
> 'example.avro'), ('type', 'record'), ('name', 'User'), ('fields', [{'type': 'string', 'name':
> 'name'}, {'type': ['int', 'null'], 'name': 'favorite_number'}, {'type': ['string', 'null'],
> 'name': 'favorite_color'}])]).
> {code}
>  
> h5. The current implementation of this function does not accept *dict-like data
> types* such as OrderedDict. The same input does, however, work in avro-python2.
> Relevant line of code: https://github.com/apache/avro/blob/master/lang/py3/avro/schema.py#L1250
> Apart from this, I've run the ``2to3`` tool on avro-python2 and tested our project
> against the result, and it works perfectly. Through this issue, I therefore also want to
> motivate the following PR: https://github.com/apache/avro/pull/234
> I don't expect a unified codebase for avro python2 and python3 now or in the near future.
> There has been a discussion of this before: https://github.com/apache/avro/pull/133
> But having avro-python2 work on both py2 and py3 would be really helpful
> for our project and would let us complete our porting process. Thanks.
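
The failure reported above is consistent with a dispatcher that looks up a parser by the exact type of its input. The sketch below assumes that behaviour rather than quoting the library: the function names and the string labels standing in for parser functions are hypothetical. It contrasts an exact-type lookup, which rejects `OrderedDict`, with an `isinstance` check against `collections.abc.Mapping`, which accepts any dict-like object.

```python
from collections import OrderedDict
from collections.abc import Mapping

# Hypothetical labels; a real dispatcher would map to parser functions.
EXACT_TYPE_PARSERS = {str: "string", list: "array", dict: "object"}


def pick_parser_exact(json_data):
    """Exact-type lookup: subclasses of dict (e.g. OrderedDict) are not found."""
    return EXACT_TYPE_PARSERS.get(type(json_data))


def pick_parser_isinstance(json_data):
    """isinstance-based dispatch: any Mapping is treated as a JSON object."""
    if isinstance(json_data, str):
        return "string"
    if isinstance(json_data, Mapping):
        return "object"
    if isinstance(json_data, list):
        return "array"
    return None


schema = {"type": "record", "name": "User", "fields": []}
ordered_schema = OrderedDict(schema)
```

With these definitions, `pick_parser_exact(ordered_schema)` returns `None`, which corresponds to the `parser is None` branch in the traceback, while `pick_parser_isinstance(ordered_schema)` returns `"object"`.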



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
