commons-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Alex Karasulu" <aok...@bellsouth.net>
Subject RE: [primitives] Looking for a primitive hashtable
Date Thu, 01 Apr 2004 04:16:41 GMT
Joel,

> -----Original Message-----
> From: Joel Shellman [mailto:joel@ikestrel.com]
> Subject: Re: [primitives] Looking for a primitive hashtable
> 
> Not necessarily so. I just started writing a little int keyed Object
> "map" backed by an object array last week.
> 
> The key difference is that if it is okay for the map to define the key,
> then it's no problem at all. For each new object, just append it to the
> end of the array. So the only holes that show up are when someone
> removes an object from the "map". And if you like, keep track of the
> holes so you can reuse them later.

> One might argue that what I'm talking about is simply a thin layer
> making it easier to work with an extendable Object array, and that's
> fine. It just depends on how you want to look at it.

Yeah I think that your implementation is more for arrays or lists.  I'm
still failing to see how it would fit a Map.  Your example has a couple of
things that make me worry about memory usage:

> public class IntMap implements Serializable {
> 
> 	private Object[] objects;
> 
> 	private int cursor = 0;
> 
> 	public IntMap() {
> 		objects = new Object[7];
> 		holes = new Stack();
> 	}
> 
> 	public IntMap(int initialCapacity) {
> 		objects = new Object[initialCapacity];
> 	}
> 
> 	public int add(Object value) {
> 		return append(value);
> 	}
> 
> 	/**
> 	 * @param value
> 	 */
> 	private int append(Object value) {
> 		if (cursor == objects.length) {
> 			expandCapacity();
> 		}
> 		objects[cursor] = value;
> 		return cursor++;
> 	}
> 
> 	private void expandCapacity() {
> 		throw new UnsupportedOperationException("Not done yet");
> 	}
> 
> 	public void put(int key, Object value) {
> 		objects[key] = value;
> 	}

You'll run into index exceptions if the key is higher here.  You cannot
expect to allocate memory as large as your key size.  You'll consume and
waste too much of it.  Plus copy operations will take a long time every time
you allocate a larger array and copy your existing values.

Perhaps if you could somehow map an int key to an int index then you might
be able to pull off a primitive Map implementation in the style that you're
presenting here.  That way you don't have to have an array as large as the
key.

Two extra integer arrays might be used to achieve this address translation.
WARNING: thinking out loud here.  So if you had these two int arrays and an
object array then could you manage a lookup using index translation?  

Let's experiment with the idea using an example.  Presume these put and
remove operations are performed on such an implementation:

ArrayIntMap map = new ArrayIntMap() ;
map.put( 23, "Rat" ) ;
map.put( 59, "Cow" ) ;
map.put( 94, "Dog" ) ;
map.put( 2048, "Cat" ) ;
map.remove( 59 ) ;

Now the arrays would look like so after these operations are performed:

  Keys    Indices   Object Values
 ------    -----       -----
 |    |    |   |       | +-|-----> "Cat"
 ------    -----       -----
 |2048|    | 3 |       | +-|-----> "Dog"
 ------    -----       -----
 | 94 |    | 2 |       |   |       a hole (null) 
 ------    -----       -----
 | 23 |    | 0 |       | +-|-----> "Rat"
 ------    -----       -----

Basically the remove operation leaves a hole that can be optimistically
filled with the next put operation.  The questions we need to ask our selves
are:

1). How much more space does this consume?
2). How much more time does it consume?

Notice the "more" in the questions so it is relative to something: a
conventional hashing mechanism.

#1 is easy.  This consumes 3x the space however there is no object creation
going on.  There is no Entry so to speak.  This raises the question of
whether one is required?  

#2 is where we get snagged.  Basically it takes a full scan of the keys to
find the index to find the object.  That sucks.  So we're not really gaining
anything here with a lookup time of O(n).  We could have just scanned the
"Object Values" array looking for the value.

So what can we conclude?  

Hashing basically allows us to lookup a value by key without having to scan
the entire key set.  We have no choice but to do this to enable lookups in
constant time.  So Joel you're better off focusing on a solid hashing
mechanism.

Alex



---------------------------------------------------------------------
To unsubscribe, e-mail: commons-user-unsubscribe@jakarta.apache.org
For additional commands, e-mail: commons-user-help@jakarta.apache.org


Mime
View raw message