commons-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Henry Story <henry.st...@bblfish.net>
Subject An improvement to AbstractHashedMap, and a new HashedSet class
Date Mon, 09 Feb 2004 18:58:47 GMT
Hi,

I was looking at an interesting framework for Value Object written by 
Dirk Riehle
at available under the lgpl at http://www.jvalue.org/. There was a 
small bug in his code which I thought I might be able to fix using the 
commons library here.

Essentially jvalue requires what could be thought of as a 
WeakHashedSet, in order to allow for reuse of his Value Objects. It is 
not possible to extend the java.util classes efficiently, so I thought 
it would be easier to do this with the collections package.

I first looked to create a HashedSet class, an equivalent of 
java.net.HashSet. The best way to do this would be to extend the 
AbstractHashedMap. The JavaDoc for this class says:

 > This class implements all the features necessary for a subclass 
hash-based map.
 > Key-value entries are stored in instances of the HashEntry class,  
which can be
 > overridden and replaced. The iterators can similarly be replaced,  
without the
 > need to replace the KeySet, EntrySet and Values view classes.

But in fact it is not quite as extensible as it could be because the 
HashEntry is currently a class, not an interface. Let me explain. I 
want to create a HashedSet. that would use a HashedMap to store its 
objects. Since keys in a hash are unique, adding a key to the HashedMap 
guarantees uniqueness. So I could just wrap a HashedMap around my 
HashedSet and be done with it. But now every time I add an object to my 
HashedSet I am adding it to a HashedMap which creates a new HashEntry, 
which save a pointer for a hash value that is never used. So we are 
wasting 4bytes per entry on a 32bit machine. This is acceptable for run 
of the mill applications, but not for libraries that get used a lot.

The solution is simple. I turned the HashEntry inner class into an 
inner interface of AbstractHashedMap. I added a few methods to get/set 
the key and the value. These methods should be inlined by all modern 
compilers, so I don't think there is any efficiency problem here.

I then wrote a HashedSet which has an inner class that extends 
AbstractHashedMap, but which takes a modified HashEntry class where the 
key and the value are the same object. The resulting 
net.bblfish.util.HashedSet is thus more memory efficient than 
java.util.HashSet. Perhaps it would be worth adding these changes to 
the library.

I have attached the code that required changing in the attached 
changed.tar file.

If the change to 
org.apache.commons.collections.map.AbstractHashedMap.HashEntry is 
accepted then it will make it relatively easy to write a WeakHashedSet. 
It will be just a matter of extending WeakReference by implementing the 
HashEntry methods.

Henry

I have placed the changes in a tar file at:
http://bblfish.net/tmp/changed.tar


---------------------------------------------------------------------
To unsubscribe, e-mail: commons-user-unsubscribe@jakarta.apache.org
For additional commands, e-mail: commons-user-help@jakarta.apache.org


Mime
View raw message