hadoop-common-issues mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Matt Foley (JIRA)" <j...@apache.org>
Subject [jira] Updated: (HADOOP-6949) Reduces RPC packet size for primitive arrays, especially long[], which is used at block reporting
Date Sat, 19 Mar 2011 00:20:29 GMT

     [ https://issues.apache.org/jira/browse/HADOOP-6949?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel

Matt Foley updated HADOOP-6949:

    Attachment: arrayprim_v7.patch

# ArrayPrimitiveWriable

    * Class javadoc is not very clear - not sure what you mean by boxed type not supported
by ObjectWritable and why that should be mentioned in the javadoc. I am not sure also how
clients can provide external locking; more appropriate would be to say, pass a copy of the
array, if concurrent modifications are expected.

Javadoc comments simplified.

    * Comments related to declaredComponentType is not clear. Can you please add description
on what componentType and declaredComponentType are and how they are used, bef adding further


    * Is PrimitiveArrayWritable a better name?

By common usage, "PrimitiveArrayWritable" would imply a sub-class of ArrayWritable, which
this isn't.  So I thought it best to use a different word order.

    * Add @InterfaceAudience.Public and @InterfaceStability.Stable


    * Throw HadoopIllegalArgumentException instead of NullPointerException and IllegalArgumentException


    * Optional: The read write methods could be made static methods in WritableUtils class,
so it can be used by others. Also PrimitiveType map, isCanditdatePrimitive() could also move
to helper.

After discussion, we agreed there is no benefit in efficiency or code clarity, so no change
in this regard.

    * Rename isCandidatePrimitive() to isPrimitive()?

Agreed. Originally, isCandidatePrimitive() provided a layer of indirection so that ArrayPrimitiveWritable
didn't have to contract to support ALL primitive types.  However, since it ended up supporting
them all, the distinction is irrelevant.
Changed to use Class.isPrimitive().

    * Is constructor with componentType as argument used any where?

No, it is provided to make ArrayPrimitiveWritable more generally useful.

    * Some methods could be private - for example - getDeclaredComponentType(), isDeclaredComponentType(),

No, if the user elects to use the typechecking capabilities of declaredComponentType, he should
be able to make both of these queries.

    * Can you use Text instead of UTF8? This avoids deprecation warnings. You could also use
WritableUtils to write and read string. This can be done in many other places in the patch.

I do not want to change the usage from that of ObjectWritable.  Essentially all our Writable
classes use UTF8.  If we want to change that, please open a different Jira.

    * I prefer MalformedInputException instead of ProtocolException (see Text.java for example)

I agree ProtocolException isn't exactly right.  
Problem with MalformedInputException: it doesn't take a "message" argument, only an integer,
so we can't use that.
Changed to just use IOException, since that's what most of the Writable code uses.


# ObjectWritable.java

    * Can you call set() method while setting instance and declaredClass in #writeObject()

My understanding is that ObjectWritable.writeObject() is a static method that does not instantiate
an ObjectWritable instance.  The variables "instance" and "declaredClass" are both local variables
in #writeObject() context.  So calling the non-static "set()" would not be okay.

    * Not sure about the comment: // This case must come after the "isArray()" case

You're right, it's not order-dependent since it uses /if/else if/else/ rather than a sequence
of "if"s.
Removed the comment.

    * Not sure why this condition is required in #writeObject() - instance.getClass().getName().equals(declaredClass.getName())

I'm being paranoid.  There can be a range of behaviors if declaredClass isn't exactly the
same as instance.getClass(), so I'm refusing to allowCompactArrays if the caller isn't declaring
the true type.


# TestArrayPrimitiveWritable

    * In the class javadoc can you add link to PrimitiveArrayWritable?

Done.  Good catch.

    * Can you make in, out, bigSet, resultSet and expectedSet final?


    * Should we prefer reading and writing objects using readFields and writeFields - since
they are the methods supporting Writable contract?

For "normal" i/o I do use readFields() and write().  Then I test to see that the wire format
is as expected, by going around those methods and directly invoking lower-level functionality.
 I think this is the only way to get confidence in the wire format encoders and decoders.

> Reduces RPC packet size for primitive arrays, especially long[], which is used at block
> -------------------------------------------------------------------------------------------------
>                 Key: HADOOP-6949
>                 URL: https://issues.apache.org/jira/browse/HADOOP-6949
>             Project: Hadoop Common
>          Issue Type: Improvement
>          Components: io
>    Affects Versions: 0.22.0
>            Reporter: Navis
>            Assignee: Matt Foley
>             Fix For: 0.23.0
>         Attachments: ObjectWritable.diff, arrayprim.patch, arrayprim_v4.patch, arrayprim_v5.patch,
arrayprim_v6.patch, arrayprim_v7.patch
>   Original Estimate: 10m
>  Remaining Estimate: 10m
> Current implementation of oah.io.ObjectWritable marshals primitive array types as general
object array ; array type string + array length + (element type string + value)*n
> It would not be needed to specify each element types for primitive arrays.

This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

View raw message