commons-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Gary Gregory <garydgreg...@gmail.com>
Subject Re: [codec] Generics added in a SVN branch
Date Sun, 14 Aug 2011 14:47:29 GMT
On Sat, Aug 13, 2011 at 10:02 PM, sebb <sebbaz@gmail.com> wrote:
> On 12 August 2011 20:56, Gary Gregory <garydgregory@gmail.com> wrote:
>> Hello All:
>>
>> For a first cut at generics support in Codec, please checkout the
>> branch https://svn.apache.org/repos/asf/commons/proper/codec/branches/generics
>>
>> I wrote a migration guide in the root of the project called
>> Codec2-Migration.htm.
>>
>> Let's discuss.
>
> The original code used Encoder and Decoder interfaces which operated
> on Object, and extended these with
> BinaryEncoder (byte[])
> StringEncoder(String)
>
> So for example StringEncoder classes need to implement
> encode(Object) : Object
> encode(String) : String
>
> As far as I can tell, those interfaces cannot be directly generified,
> because type erasure causes a method signature clash.
> So adding generics here means breaking binary compatibilty.

Hello Sebb,

Thank you for taking the time to look this over.

You are correct, a class cannot implement an interface twice, so it
cannot implement the generics version of StringEncoder AND
BinaryEncoder.

>
> Question is, what does adding generics to these interfaces actually provide?

As you know. generics let us define precisely what types are supported
and avoids the current bug prone guessing game for implementers and
call sites.

For example, look at the current mess in trunk's BinaryCodec:

If I call encode(byte[]), I get a byte[] back.

If I call encode(Object) with a byte[], I get a char[] back.

Uh? Is the code wrong, is the Javadoc wrong, how am I supposed to know
what is right? I say it's a mess.

I would flip the question around: If I implemented this today (1) why
would I not use generics, and (2) why would I want to support "Object
encode(Object)". Codec 2.0 is the opportunity to clean this up. What I
write about Encoders here applies to Decoders too.

Most encode/decode methods look like this:

    public Object encode(Object raw) throws EncoderException {
        if (!(raw instanceof byte[])) {
            throw new EncoderException("argument not a byte array");
        }
        return toAsciiChars((byte[]) raw);
    }

There is always one (or more) if instanceof expr for a typed method
call, and codec-styled class cast exception in the form an
EncoderException.This does not smell OO to me.

Tangent: Wouldn't this be better typed as a ClassCastException or an
IllegalArgumentException anyway?

Any advantage of this pattern (is there one? ;) is ruined by the foggy
example above. I say foggy and not black box because we are FOSS.

I also want typed asymmetric encoders, for example, for the new BM
encoder, I want "Set<String> encode(String)", or "Set<CharSequence>
encode(CharSequence)"

Hex decoding is a similar but differently confusing example:

If I call encode(byte[]), the result is:

StringUtils.getBytesUnchecked(encodeHexString(array), getCharsetName());

encodeHexString(array) creates a string which encode(byte[]) uses with
getBytesUnchecked and the char set name. So encodeHexString(array)
calls "new String(encodeHex(data))".

If I call encode(Object) with a byte[], the result is:

encodeHex(byteArray)

Why the difference with the charset name? Do I want to spend the time
telling the difference? Is the code wrong? Is the Javadoc wrong?

The Hex decode(byte[]|Object) methods suffer from the same problem.

Every encode/decode(Object) typed method creates duplication and
opportunities for bugs.

So we have /at least/ three problems today with the current system
which I claim would be eliminated by using generics because there
would be one entry point for each type and it would get rid of
encode/decode(Object) methods.

> Would it not be just as effective to deprecate Decoder and Encoder,
> and code can then use either BinaryEncoder or StringEncoder (etc)?

We could do that too, but I would still want typed Decoder and Encoder
to define codecs that do not fit in BinaryEncoder or StringEncoder.

For an example different from the new BM encoder, we could recode the
DoubleMetaphone encoder as DoubleMetaphoneResult<String,
DoubleMetaphoneResult>.

Today, if you want both encodings (normal and alternate,) you have to
encode twice with a different parameter to encode(String, boolean)
which is not in any interface (and both values are computed internally
each time!) Granted, this problem could be solved without generics,
but generics makes it obvious what the result type is, instead of
reading possibly buggy Javadocs for a return typed as Object. Only
changing the result type in 2.0 of "Object encode(Object)" from String
to DoubleMetaphoneResult would be a runtime 'surprise' as opposed to a
compile time 'surprise.' Catching problems the earlier, the better
when migrating.

Thank you for the discussion, :)
Gary

>
> At the moment test code is of the form:
>
> Encoder enc = new Base64();
> ...
> byte [] ba = (byte[]) enc.encode(binary)); // unchecked cast
>
> However this could be written:
>
> BinaryEncoder enc = new Base64();
> ...
> byte [] ba = enc.encode(binary)); // no cast needed
>
>
>> I plan on not touching trunk until the generics code is flushed out
>> and brought back to trunk.
>>
>> Thank you,
>> Gary
>>
>> --
>> http://garygregory.wordpress.com/
>> http://garygregory.com/
>> http://people.apache.org/~ggregory/
>> http://twitter.com/GaryGregory
>>
>> ---------------------------------------------------------------------
>> To unsubscribe, e-mail: dev-unsubscribe@commons.apache.org
>> For additional commands, e-mail: dev-help@commons.apache.org
>>
>>
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: dev-unsubscribe@commons.apache.org
> For additional commands, e-mail: dev-help@commons.apache.org
>
>



-- 
Thank you,
Gary

http://garygregory.wordpress.com/
http://garygregory.com/
http://people.apache.org/~ggregory/
http://twitter.com/GaryGregory

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@commons.apache.org
For additional commands, e-mail: dev-help@commons.apache.org


Mime
View raw message