avro-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Thiruvalluvan M. G. (JIRA)" <j...@apache.org>
Subject [jira] Commented: (AVRO-533) .NET implementation of Avro
Date Sat, 21 Aug 2010 16:46:17 GMT

    [ https://issues.apache.org/jira/browse/AVRO-533?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12901050#action_12901050
] 

Thiruvalluvan M. G. commented on AVRO-533:
------------------------------------------

Hi Jeremy,

I could get the code to compile on my machine. I was trying to understand the code from the
bottom. Since the schema and encoding/decoding at the very bottom, the following comments
apply to those parts of the code. Please bear with me, it'll take a while to go through the
whole codebase. Since we wish and hope that the code will live for several years, a couple
of weeks spent in getting it right should be worth it.

Most of the tests passed on my machine, some failed. One test even crashes NTest UI.

The class hierarchy of Schema is:
{code}
Schema
    ArraySchema
    MapSchema
    NamedSchema
        EnumSchema
        FixedSchema        
        RecordSchema
            ErrorSchema
    PrimitiveSchema
        BooleanSchema
        NullSchema
    UnionSchema
{code}

It's not clear why only NullSchema and BooleanSchema are specified and the rest of the primitive
schemas are left alone. I think a a more appropriate hierarchy would be:

{code}
Schema
    NamedSchema
        EnumSchema
        FixedSchema        
        RecordSchema/ErrorSchema
    UnnamedSchema
        ArraySchema
        MapSchema
        PrimitiveSchema
        UnionSchema
{code}

Instead of having SchemaType as a string, it'd be better to have it as an enum, which, unlike
string, is close-ended.

Testing appears inadequate. In TestSchema, for example the test only covers parsing.

In the Encoder and Decoder interfaces, we are accepting the Stream in every function. Typically,
a lot of encoding/decoding happens on the same stream. So attaching the stream the interface
once will be better, the way we do in the Java implementation.

I see some files have both Windows and Unix line-endings. I don't know where the problem is.
Do you think keeping Windows line-ending throughout will keep us from these troubles?

Most of the places the tabs are expanded into spaces, which is perfect. But certain tabs seem
to have escaped. We try to avoid hard-tabs.

We can save quite a bit of code if we use parameterized tests. There is attempt to achieve
parameterization in TestSchema. But it leads to some problems, I think. When tests fail, it
becomes hard to find which test has failed.

I tired to capture all the above ideas (regarding the Schema not encoder/decoder) by refactoring
the code. I kept it at git@github.com:thirumg/Avro.NET.git in the master branch.
Please feel free to pull it and try. I have commented out certain tests and code in Avrogen
because of this refactroing. Please don't mistake me, my intention was not to arbitrarily
change your code. I thought some ideas are better shown by demonstration rather than pages
and pages of text.

Please let me know what you think.

Do you have any suggestion for a code coverage tool? It'll be nice to know how much coverage
we get with our tests.

Thanks

Thiru

   

> .NET implementation of Avro
> ---------------------------
>
>                 Key: AVRO-533
>                 URL: https://issues.apache.org/jira/browse/AVRO-533
>             Project: Avro
>          Issue Type: New Feature
>    Affects Versions: 1.4.0
>            Reporter: Jeff Hammerbacher
>         Attachments: AVRO-533.zip, dotnet.patch
>
>


-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


Mime
View raw message