commons-dev mailing list archives

From Chris Lambrou <m...@chrislambrou.com>
Subject Re: Commons-Collections enhanced with Java Generics Support
Date Thu, 26 May 2005 00:19:15 GMT
Hi all,

Sometime last summer, there was a discussion about providing a generic 
port of collections. The upshot was that no Apache committers could 
commit to providing time to handle this at Apache, and so I started the 
collections15.sf.net project over on SourceForge. A lot of the issues 
raised in recent messages (maintaining two separate codebases, the use 
of Retroweaver, etc.) were discussed back then. The general 
consensus back then was that a generic port of collections should free 
itself from the constraints of pre-1.5 code and should concentrate on 
providing broadly the same functionality as collections, but with a 
focus on the new language features, and a number of other changes as 
appropriate - e.g. a rearranged package structure to avoid the clutter 
that was then present in the root package of collections; either ignore 
or cut back the Buffer classes and package in light of the new Queue 
API in Java 5.0; change APIs where the current collections API was 
inappropriate. As a result of these issues and a discussion of them last 
summer, it was felt that a simple generification of the existing 
collections API was not suitable. And so work on collections15.sf.net 
began. I started with an empty source tree, and began by copying over 
all of the interfaces defined in collections, and then generifying them. 
After that, I started to copy packages over one by one, giving each 
class a thorough working over to fully generify its public API, its 
internal implementation, its javadoc and its associated unit tests. 
Although work was initially quite fast, it should be apparent what a 
large task this is. A couple of other developers joined the effort along 
the way, which improved progress, but to counter that there were some 
significant health problems that have all but brought my efforts to a 
halt since early this year. Basically, the code that's been written is 
pretty solid, and passes a thorough set of unit tests (well, apart from 
some of the classes that are still work in progress), but there's plenty 
more to be done.

In contrast to this, Matt and John announced a generified port of 
collections a couple of weeks ago - collections.sf.net. They've 
obviously put a lot of time and effort into this, which is to be 
commended. I've checked out the collections.sf.net project and had a 
bit of a look round, and have come up with a number of issues that I'd 
like to raise, some of which need to be addressed.

1. The port appears to me to be a direct attempt to take the existing 
collections codebase and generify its API. It's an approach I initially 
took but abandoned after a while when I realised that much of the 
existing codebase was inappropriate for generifying. To clarify this, 
much of the current collections API is not typesafe, and raises problems 
when trying to generify it. For example, the ChainedTransformer class 
has a constructor that accepts a Collection. The javadoc indicates that 
this should be a collection of Transformer instances, and the resulting 
ChainedTransformer's transform method takes the input object and 
transforms it using each Transformer in the chain, returning the result. 
When generifying the class to ChainedTransformer<I, O>, it's not 
possible to use the following constructor

    public ChainedTransformer(List<Transformer<I, O>> transformers)

because it's generally not possible to take the output of each of the 
chained Transformers and pass it into the next Transformer in the chain: 
each one produces an O, but the next one expects an I. 
After much consideration, it was decided that to maintain compile-time 
type safety, the behaviour of ChainedTransformer had to fundamentally 
change. This e-mail is already long enough, so I don't want to elaborate 
any further. The point I wish to make is that the collections.sf.net 
project addresses such issues by compromising type safety - its 
constructor for ChainedTransformer<I, O> is as follows

    public ChainedTransformer(Transformer[] transformers)

In this instance, the difficulties that generification raises have been 
skirted by sacrificing type-safety, and it's an approach that is taken 
throughout the collections.sf.net port of collections. I think this is 
an important point to consider, as probably the most important point of 
generics is to provide compile-time type-safety.
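
As an illustration of the trade-off described in point 1, here is a 
minimal sketch. It is not code from either project: the Transformer 
interface below is a hypothetical stand-in with a single transform 
method, and ChainSketch, rawChain and compose are names invented for 
this example. It contrasts raw-typed chaining, which compiles but can 
fail at run time, with pairwise composition, which is one possible way 
(an assumption, not necessarily the design chosen in 
collections15.sf.net) to let the compiler check each link.

```java
// Hypothetical stand-in for the collections Transformer interface.
interface Transformer<I, O> {
    O transform(I input);
}

final class ChainSketch {
    // Raw-typed chaining, as a Transformer[] parameter permits: nothing
    // checks that one link's output type matches the next link's input.
    @SuppressWarnings({"rawtypes", "unchecked"})
    public static Object rawChain(Transformer[] transformers, Object input) {
        Object current = input;
        for (Transformer transformer : transformers) {
            current = transformer.transform(current);
        }
        return current;
    }

    // Pairwise composition: the shared type parameter B forces the output
    // of the first link to match the input of the second at compile time.
    public static <A, B, C> Transformer<A, C> compose(
            final Transformer<A, B> first, final Transformer<B, C> second) {
        return new Transformer<A, C>() {
            public C transform(A input) {
                return second.transform(first.transform(input));
            }
        };
    }

    @SuppressWarnings("rawtypes")
    public static void main(String[] args) {
        Transformer<String, Integer> length = new Transformer<String, Integer>() {
            public Integer transform(String input) { return input.length(); }
        };
        Transformer<Integer, String> describe = new Transformer<Integer, String>() {
            public String transform(Integer input) { return "len=" + input; }
        };

        // Type-checked chain: String -> Integer -> String.
        System.out.println(compose(length, describe).transform("abc"));

        // Links in the wrong order still compile with raw types, but the
        // first link receives a String where it expects an Integer.
        try {
            rawChain(new Transformer[] { describe, length }, "abc");
        } catch (ClassCastException e) {
            System.out.println("raw chain failed at run time");
        }
    }
}
```

With compose, swapping the two links is rejected by the compiler; with 
rawChain, the same mistake only surfaces as a ClassCastException when 
the chain is executed.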

2. Whilst the public API of the collections.sf.net port has been 
generified, the internal implementation is largely untouched. It's still 
the non-generic code that is present in the current commons-collections 
codebase. From a black-box approach, this isn't especially important 
provided that the implementation honours the documented API. As I've 
mentioned earlier, this isn't the approach I've taken in 
collections15.sf.net, where all of the code has been fully generified, 
rather than just the API. This isn't a particular criticism of 
collections.sf.net - Sun's own implementation of ArrayList<E> takes the 
same pragmatic approach - but it's just a difference I wanted to point 
out. I must say, however, that in the process of generifying all of the 
code in collections15.sf.net, a number of subtle improvements to the 
generified API became apparent that would not have been so had I only 
generified the public API. That this level of attention hasn't been paid 
to the implementation code in collections.sf.net leads me to worry that 
the generification of the API isn't optimal, though I admit that this 
may be because my first stab at generifying the interfaces was not the 
best and so had to be changed a lot as I generified the implementing 
classes and the problems in the API became more apparent.
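
To make the distinction in point 2 concrete, the following is a rough 
sketch of a class whose public API is generic while its internals are 
not. SimpleList is a hypothetical class written for this example, not 
code from either project or from the JDK; it only mirrors the general 
pattern of keeping elements in an Object[] and casting on the way out.

```java
import java.util.Arrays;

// Hypothetical example class: generic public API, non-generic storage.
final class SimpleList<E> {
    // Internally the elements are plain Objects; E appears only in the API.
    private Object[] elements = new Object[4];
    private int size;

    public void add(E element) {
        if (size == elements.length) {
            // Grow the backing array when it is full.
            elements = Arrays.copyOf(elements, size * 2);
        }
        elements[size++] = element;
    }

    // The cast cannot be verified by the compiler; it is safe only because
    // add(E) is the sole way for elements to enter the array.
    @SuppressWarnings("unchecked")
    public E get(int index) {
        if (index < 0 || index >= size) {
            throw new IndexOutOfBoundsException("index " + index + ", size " + size);
        }
        return (E) elements[index];
    }

    public int size() {
        return size;
    }

    public static void main(String[] args) {
        SimpleList<String> names = new SimpleList<String>();
        names.add("alpha");
        names.add("beta");
        String first = names.get(0); // no cast needed by the caller
        System.out.println(first + " of " + names.size());
    }
}
```

Callers see a fully typed API, but the compiler gives no help inside the 
class itself, which is where a full generification of the implementation 
would differ.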

3. Here's my biggest worry. The unit tests in collections.sf.net don't 
appear to have been modified to reflect the generification of the APIs. 
The 100% success rate of the unit tests is therefore misleading, as it's 
more of an indication that the original commons-collections code on 
which the collections.sf.net port was based doesn't fail any of its unit 
tests. What's missing in the unit tests is an attempt to exercise the 
generic modifications made to the APIs. Whilst updating the unit tests 
in collections15.sf.net, a fair number of minor errors were uncovered. 
They were typically problems whereby it became apparent when writing the 
unit tests that the generic arguments of various methods weren't 
sufficiently flexible. I'm worried that since the unit tests in 
collections.sf.net don't exercise the generic modifications that have 
been made, the modifications may not have been exercised at all.
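
The kind of problem described in point 3 can be sketched as follows. 
This is an illustrative example only: the Predicate interface is a 
hypothetical stand-in with a single evaluate method, and 
FlexibilitySketch, selectStrict and select are names invented here. 
The strict signature looks reasonable until a test tries to call it 
with a more general predicate; the wildcard version accepts that call.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-in for the collections Predicate interface.
interface Predicate<T> {
    boolean evaluate(T object);
}

final class FlexibilitySketch {
    // Strict signature: the input and the predicate must agree on exactly T.
    public static <T> List<T> selectStrict(Iterable<T> input, Predicate<T> predicate) {
        List<T> result = new ArrayList<T>();
        for (T item : input) {
            if (predicate.evaluate(item)) {
                result.add(item);
            }
        }
        return result;
    }

    // Flexible signature: the input may hold subtypes of T, and the
    // predicate may be written against any supertype of T.
    public static <T> List<T> select(Iterable<? extends T> input,
                                     Predicate<? super T> predicate) {
        List<T> result = new ArrayList<T>();
        for (T item : input) {
            if (predicate.evaluate(item)) {
                result.add(item);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        Predicate<Object> notNull = new Predicate<Object>() {
            public boolean evaluate(Object object) { return object != null; }
        };
        List<String> names = Arrays.asList("a", null, "b");

        // selectStrict(names, notNull) does not compile: T cannot be both
        // String (from the list) and Object (from the predicate).
        List<String> kept = select(names, notNull);
        System.out.println(kept); // prints [a, b]
    }
}
```

A unit test that only passes raw or exactly matching types would never 
reveal that the strict signature rejects this perfectly sensible call.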

4. The javadoc in collections.sf.net doesn't appear to have been updated 
to reflect the generification of the API.

Assuming that both Matt and John have been monitoring this mailing list 
since their recent announcement, I hope these issues are taken as the 
constructive criticism that I intend them to be. They've clearly put in 
a significant amount of effort in creating collections.sf.net, which I 
applaud. I just think that their project needs more work. I can't speak 
for the other two developers on collections15.sf.net, but my personal 
feeling is that it's wasteful to continue two separate attempts to 
create a generics port of collections, and that we should be 
collaborating to bring together the good work and good ideas present in 
both projects. I'd really like to hear what Matt and John have to say 
regarding the difference in approaches that have been taken by the two 
projects, particularly regarding the 1.5-specific tailoring of 
collections15.sf.net compared to the more direct port of collections.sf.net.

Regarding the idea of pulling the collections.sf.net code into the 
commons SVN repository, I think that the original reasons for developing 
collections15.sf.net at SourceForge still apply. The regular 
contributors to the project are not commons committers, and until they 
become so, or an existing commons committer is prepared to take on the 
burden of acting as a broker between the active developers and SVN, the 
project has a better chance of thriving at SourceForge.

Chris


