avro-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Doug Cutting (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (AVRO-570) python implementation of mapreduce connector
Date Fri, 19 Aug 2011 17:28:28 GMT

    [ https://issues.apache.org/jira/browse/AVRO-570?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13087815#comment-13087815

Doug Cutting commented on AVRO-570:

Great to see this working!

Some cosmetic comments on the patch:
 - please don't include comments with your name, the date, stack traces, etc.
 - please don't use tabs for indentation, but only spaces, 2 per indent level
 - please check your patch to make sure no whitespace-only changes are made
 - please put a space after // and before {

These guidelines make it much easier to review patches.

For the enum TRANSPROTO:
 - this might better be called simply TRANSPORT or PROTOCOL or maybe TRANSPORT_PROTOCOL.
 - if it's public in a public, non-test class then it needs javadoc
 - i'd prefer when values of this are processed, rather than using an if-else structure, a
switch statement is used with a 'default' clause that throws an exception.  this makes it
clearer that all values are handled.
 - do we need a NONE value, or should we rather simply default the value to HTTP or SASL?

The need for the flush in TetherKeySerialization is mysterious, since that's using a DirectBinaryEncoder,
which has no buffering.  So the flush only has an effect on the underlying InputStream.  If
that's required then it may be a bug in Hadoop.  Perhaps rather than including the entire
stacktrace here, file a separate Jira issue about this with the stack trace, then just add
an end-of-line comment referring to this issue, e.g., 'flush(); // Possible bug, see: AVRO-XXX'

We shouldn't reference Jira issue numbers in sources except in comments.  So, instead of adding
a test-570 target in lang/py/build.xml you might define test.include and test.exclude properties
near the top of that build.xml, then run it with something like 'ant -Dtest.include=...'.

It would be great if someone more fluent in Python could look over the Python stuff here.

And it would be great if someone more fluent in Maven could look over the Maven stuff here.

> python implementation of mapreduce connector
> --------------------------------------------
>                 Key: AVRO-570
>                 URL: https://issues.apache.org/jira/browse/AVRO-570
>             Project: Avro
>          Issue Type: New Feature
>          Components: python
>    Affects Versions: 1.6.0
>            Reporter: Doug Cutting
>            Assignee: Jeremy Lewi
>            Priority: Critical
>              Labels: hadoop
>         Attachments: AVRO-570.patch, AVRO-570.patch, AVRO-570.patch
> AVRO-512 defines protocols for implementing mapreduce tasks.  It would be good to have
a Python implementation of this.

This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira


View raw message