commons-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Michael Smith" <mich...@iammichael.org>
Subject [collections][PATCH] SequencedHashMap violates Map contract, has poor performance
Date Fri, 01 Feb 2002 06:07:23 GMT
I've submitted this patch twice now (once back in mid-November, once a
week ago), but it still has yet to be committed.  The patch is
essentially a complete rewrite of the SequencedHashMap class and fixes
the following problems:

 - Adding and removing from the map is O(n) because the linked list
needs to be traversed to find the object to remove (when adding, the
object needs to be removed so it can be readded at the front as the most
recently used).  There's been recent discussion on the commons list
about a LRU/MRU thing that is implemented using a linked list/hashtable
combination that had similar performance problems.  This class
demonstrates a manner in which the O(n) performance of using the
LinkedList can be avoided by implementing a custom linked list in the
map entries themselves.

 - The sets and collections returned by keySet(), values(), and
entries() are not properly backed by the map (in violation of the
java.util.Map contract).

The patch also adds an additional test case to verify that the remove()
method of the Iterator returned from keySet().iterator() is properly
backed by the map.

Regards,
Michael

p.s. Since the changes to SequencedHashMap were significant, I'm also
attaching a full version of the file for ease of glancing over the new
code.

Index:
D:/home/michael/dev/jakarta/jakarta-commons/collections/src/java/org/apa
che/commons/collections/SequencedHashMap.java
===================================================================
RCS file:
/home/cvspublic/jakarta-commons/collections/src/java/org/apache/commons/
collections/SequencedHashMap.java,v
retrieving revision 1.1
diff -u -r1.1 SequencedHashMap.java
---
D:/home/michael/dev/jakarta/jakarta-commons/collections/src/java/org/apa
che/commons/collections/SequencedHashMap.java	17 Sep 2001 16:43:49 -0000
1.1
+++
D:/home/michael/dev/jakarta/jakarta-commons/collections/src/java/org/apa
che/commons/collections/SequencedHashMap.java	1 Feb 2002 06:00:35 -0000
@@ -54,258 +54,507 @@
  * <http://www.apache.org/>.
  */

+import java.util.Collections;
 import java.util.Collection;
 import java.util.HashMap;
 import java.util.Iterator;
-import java.util.LinkedList;
+import java.util.AbstractCollection;
+import java.util.AbstractMap;
+import java.util.AbstractSet;
+import java.util.ArrayList;
 import java.util.List;
 import java.util.Map;
 import java.util.Set;
+import java.util.NoSuchElementException;

 /**
- * <p>A {@link java.util.HashMap} whose keys are sequenced.  The
- * sequencing of the keys allow easy access to the values in the order
- * which they were added in.  This class is thread safe.</p>
+ *  A map of objects whose mapping entries are sequenced based on the
order in
+ *  which they were added.  This data structure has fast <I>O(1)</I>
search
+ *  time, deletion time, and insertion time.
  *
- * <p>Implementing the List interface is not possible due to a instance
- * method name clash between the Collection and the List interface:
+ *  This class inherits from {@link java.util.HashMap} purely for
backwards
+ *  compatibility.  It should really be inheriting from {@link
+ *  java.util.AbstractMap}, or with a tiny extra bit of work, implement
the
+ *  full map interface on its own. APIs should not rely on this class
being an
+ *  actual {@link java.util.HashMap}, and instead should recognize it
only as a
+ *  generic {@link java.util.Map} (unless, of course, you need the
sequencing
+ *  functionality, but even in that case, this class should not be
referred to
+ *  as a HashMap).
  *
- * <table>
- * <tr><td>Collections</td><td>boolean remove(Object o)</td></tr>
- * <tr><td>Lists</td><td>Object remove(Object o)</td></tr>
- * </table>
- * </p>
+ *  <P>Although this map is sequenced, it cannot implement {@link
+ *  java.util.List} because of incompatible interface definitions.  The
remove
+ *  methods in List and Map have different return values (see: {@link
+ *  java.util.List#remove(Object)} and {@link
java.util.Map#remove(Object)}).
  *
- * <p>So one cannot implement both interfaces at the same, which is
- * unfortunate because the List interface would be very nice in
- * conjuction with <a
- * href="http://jakarta.apache.org/velocity/">Velocity</a>.</p>
- *
- * <p>A slightly more complex implementation and interface could
involve
- * the use of a list of <code>Map.Entry</code> objects.</p>
+ *  <P>This class is not thread safe.  When a thread safe
implementation is
+ *  required, wrap this class using {@link
Collections#synchronizedMap(Map)},
+ *  or use explicit synchronization controls.
  *
+ *  @author <a href="mailto:michael@iammichael.org">Michael A.
Smith</A>
  * @author <a href="mailto:dlr@collab.net">Daniel Rall</a>
  * @author <a href="mailto:hps@intermeta.de">Henning P.
Schmiedehausen</a>
  */
-public class SequencedHashMap extends HashMap
-{
-    /**
-     * The index of the eldest element in the collection.
-     */
-    protected static final int ELDEST_INDEX = 0;
-
-    /**
-     * Indicator for an unknown index.
-     */
-    private static final int UNKNOWN_INDEX = -1;
-
-    /**
-     * The sequence used to keep track of the hash keys.  Younger
objects are
-     * kept towards the end of the list.  Does not allow duplicates.
-     */
-    private LinkedList keySequence;
-
-    /**
-     * Creates a new instance with default storage.
-     */
-    public SequencedHashMap ()
-    {
-        keySequence = new LinkedList();
-    }
-
-    /**
-     * Creates a new instance with the specified storage.
-     *
-     * @param size The storage to allocate up front.
-     */
-    public SequencedHashMap (int size)
-    {
-        super(size);
-        keySequence = new LinkedList();
-    }
-
-    /**
-     * Clears all elements.
-     */
-    public void clear ()
-    {
-        super.clear();
-        keySequence.clear();
-    }
-
-    /**
-     * Creates a shallow copy of this object, preserving the internal
-     * structure by copying only references.  The keys, values, and
-     * sequence are not <code>clone()</code>'d.
-     *
-     * @return A clone of this instance.
-     */
-    public Object clone ()
-    {
-        SequencedHashMap seqHash = (SequencedHashMap) super.clone();
-        seqHash.keySequence = (LinkedList) keySequence.clone();
-        return seqHash;
-    }
-
-    /**
-     * Returns the key at the specified index.
-     */
-    public Object get (int index)
-    {
-        return keySequence.get(index);
-    }
-
-    /**
-     * Returns the value at the specified index.
-     */
-    public Object getValue (int index)
-    {
-        return get(get(index));
-    }
-
-    /**
-     * Returns the index of the specified key.
-     */
-    public int indexOf (Object key)
-    {
-        return keySequence.indexOf(key);
-    }
-
-    /**
-     * Returns a key iterator.
-     */
-    public Iterator iterator ()
-    {
-        return keySequence.iterator();
-    }
-
-    /**
-     * Returns the last index of the specified key.
-     */
-    public int lastIndexOf (Object key)
-    {
-        return keySequence.lastIndexOf(key);
-    }
-
-    /**
-     * Returns the ordered sequence of keys.
-     *
-     * This method is meant to be used for retrieval of Key / Value
pairs
-     * in e.g. Velocity:
-     * <PRE>
-     * ## $table contains a sequenced hashtable
-     * #foreach ($key in $table.sequence())
-     * &lt;TR&gt;
-     * &lt;TD&gt;Key: $key&lt;/TD&gt;
-     * &lt;/TD&gt;Value: $table.get($key)&lt;/TD&gt;
-     * &lt;/TR&gt;
-     * #end
-     * </PRE>
-     *
-     * @return The ordered list of keys.
-     */
-    public List sequence()
-    {
-        return keySequence;
-    }
-
-    /**
-     * Stores the provided key/value pair.  Freshens the sequence of
existing
-     * elements.
-     *
-     * @param key   The key to the provided value.
-     * @param value The value to store.
-     * @return      The previous value for the specified key, or
-     *              <code>null</code> if none.
-     */
-    public Object put (Object key, Object value)
-    {
-        Object prevValue = super.put(key, value);
-        freshenSequence(key, prevValue);
-        return prevValue;
-    }
-
-    /**
-     * Freshens the sequence of the element <code>value</code> if
-     * <code>value</code> is not <code>null</code>.
-     *
-     * @param key   The key whose sequence to freshen.
-     * @param value The value whose existance to check before removing
the old
-     *              key sequence.
-     */
-    protected void freshenSequence(Object key, Object value)
-    {
-        if (value != null)
-        {
-            // Freshening existing element's sequence.
-            keySequence.remove(key);
-        }
-        keySequence.add(key);
-    }
+public class SequencedHashMap extends HashMap {

-    /**
-     * Stores the provided key/value pairs.
-     *
-     * @param t The key/value pairs to store.
-     */
-    public void putAll (Map t)
-    {
-        Set set = t.entrySet();
-        for (Iterator iter = set.iterator(); iter.hasNext(); )
-        {
-            Map.Entry e = (Map.Entry)iter.next();
-            put(e.getKey(), e.getValue());
+  /**
+   *  {@link java.util.Map.Entry} that doubles as a node in the linked
list
+   *  of sequenced mappings.
+   **/
+  private static class Entry implements Map.Entry {
+    private final Object key_;
+    private Object value_;
+
+    Entry next = null;
+    Entry prev = null;
+
+    public Entry(Object key, Object value) {
+      this.key_ = key;
+      this.value_ = value;
+    }
+
+    public Object getKey() {
+      return this.key_;
+    }
+
+    public Object getValue() {
+      return this.value_;
+    }
+
+    public Object setValue(Object value) {
+      Object oldValue = this.value_;
+      this.value_ = value;
+      return oldValue;
+    }
+
+    public int hashCode() {
+      // per Map.Entry.hashCode() api docs
+      return ((getKey() == null ? 0 : getKey().hashCode()) ^
+              (getValue()==null ? 0 : getValue().hashCode()));
+    }
+
+    public boolean equals(Object obj) {
+      if(obj == null) return false;
+      if(obj == this) return true;
+      if(!(obj instanceof Map.Entry)) return false;
+
+      Map.Entry other = (Map.Entry)obj;
+
+      // per Map.Entry.equals(Object api docs)
+      return((getKey() == null ?
+              other.getKey() == null :
+              getKey().equals(other.getKey()))  &&
+             (getValue() == null ?
+              other.getValue() == null :
+              getValue().equals(other.getValue())));
+    }
+    public String toString() {
+      return "[" + getKey() + "=" + getValue() + "]";
+    }
+  }
+
+  private static final Entry createSentinel() {
+    Entry s = new Entry(null, null);
+    s.prev = s;
+    s.next = s;
+    return s;
+  }
+
+  private Entry sentinel;
+  private HashMap entries;
+
+  public SequencedHashMap() {
+    sentinel = createSentinel();
+    entries = new HashMap();
+  }
+
+  public SequencedHashMap(int initialSize) {
+    sentinel = createSentinel();
+    entries = new HashMap(initialSize);
+  }
+
+  public SequencedHashMap(int initialSize, float loadFactor) {
+    sentinel = createSentinel();
+    entries = new HashMap(initialSize, loadFactor);
+  }
+
+  public SequencedHashMap(Map m) {
+    this();
+    putAll(m);
+  }
+
+  /**
+   *  Removes an internal entry from the linked list.  This does not
remove
+   *  it from the underlying map.
+   **/
+  private void removeEntry(Entry entry) {
+    entry.next.prev = entry.prev;
+    entry.prev.next = entry.next;
+  }
+
+  /**
+   *  Inserts a new internal entry to the linked list.  This does not
add the
+   *  entry to the underlying map.
+   **/
+  private void insertEntry(Entry entry) {
+    entry.next = sentinel;
+    entry.prev = sentinel.prev;
+    sentinel.prev.next = entry;
+    sentinel.prev = entry;
+  }
+
+  public int size() {
+    return entries.size();
+  }
+
+  public boolean isEmpty() {
+    return entries.isEmpty();
+  }
+
+  public boolean containsKey(Object key) {
+    return entries.containsKey(key);
+  }
+
+  public boolean containsValue(Object value) {
+    return entries.containsValue(value);
+  }
+
+  public Object get(Object o) {
+    // find entry for the specified key
+    Entry entry = (Entry)entries.get(o);
+    if(entry == null) return null;
+
+    return entry.getValue();
+  }
+
+  public Object put(Object key, Object value) {
+
+    Object oldValue = null;
+
+    Entry e = (Entry)entries.get(key);
+    if(e != null) {
+      // remove from list so the entry gets "moved" to the front of
list
+      removeEntry(e);
+
+      // update value in map
+      oldValue = e.setValue(value);
+    } else {
+      // add new entry
+      e = new Entry(key, value);
+      entries.put(key, e);
+    }
+    // entry in map, but not list
+
+    // add to list
+    insertEntry(e);
+
+    return oldValue;
+  }
+
+  public Object remove(Object key) {
+    Entry e = (Entry)entries.remove(key);
+    if(e == null) return null;
+    removeEntry(e);
+    return e.getValue();
+  }
+
+  // use abstract map impl
+  //void putAll(Map t)
+
+  public void clear() {
+    entries.clear();
+    sentinel.next = sentinel;
+    sentinel.prev = sentinel;
+  }
+
+  public Set keySet() {
+    return new AbstractSet() {
+
+      // required impl
+      public Iterator iterator() { return new OrderedIterator(KEY); }
+      public boolean remove(Object o) {
+        return SequencedHashMap.this.remove(o) != null;
+      }
+
+      // more efficient impls than abstract set
+      public void clear() {
+        SequencedHashMap.this.clear();
+      }
+      public int size() {
+        return SequencedHashMap.this.size();
+      }
+      public boolean isEmpty() {
+        return SequencedHashMap.this.isEmpty();
+      }
+      public boolean contains(Object o) {
+        return SequencedHashMap.this.containsKey(o);
+      }
+
+    };
+  }
+
+  public Collection values() {
+    return new AbstractCollection() {
+      // required impl
+      public Iterator iterator() { return new OrderedIterator(VALUE); }
+      public boolean remove(Object o) {
+        for(Entry pos = sentinel.next; pos != sentinel; pos = pos.next)
{
+          if((pos.getValue() == null && o == null) ||
+             (pos.getValue() != null && pos.getValue().equals(o))) {
+            SequencedHashMap.this.remove(pos.getKey());
+            return true;
+          }
         }
-    }
+        return false;
+      }

-    /**
-     * Removes the element at the specified index.
-     *
-     * @param index The index of the object to remove.
-     * @return      The previous value coressponding the
<code>key</code>, or
-     *              <code>null</code> if none existed.
-     */
-    public Object remove (int index)
-    {
-        return remove(index, null);
-    }
-
-    /**
-     * Removes the element with the specified key.
-     *
-     * @param key   The <code>Map</code> key of the object to remove.
-     * @return      The previous value coressponding the
<code>key</code>, or
-     *              <code>null</code> if none existed.
-     */
-    public Object remove (Object key)
-    {
-        return remove(UNKNOWN_INDEX, key);
-    }
-
-    /**
-     * Removes the element with the specified key or index.
-     *
-     * @param index The index of the object to remove, or
-     *              <code>UNKNOWN_INDEX</code> if not known.
-     * @param key   The <code>Map</code> key of the object to remove.
-     * @return      The previous value coressponding the
<code>key</code>, or
-     *              <code>null</code> if none existed.
-     */
-    private final Object remove (int index, Object key)
-    {
-        if (index == UNKNOWN_INDEX)
-        {
-            index = indexOf(key);
-        }
-        if (key == null)
-        {
-            key = get(index);
-        }
-        if (index != UNKNOWN_INDEX)
-        {
-            keySequence.remove(index);
-        }
-        return super.remove(key);
-    }
-}
+      // more efficient impls than abstract collection
+      public void clear() {
+        SequencedHashMap.this.clear();
+      }
+      public int size() {
+        return SequencedHashMap.this.size();
+      }
+      public boolean isEmpty() {
+        return SequencedHashMap.this.isEmpty();
+      }
+      public boolean contains(Object o) {
+        return SequencedHashMap.this.containsValue(o);
+      }
+    };
+  }
+
+  public Set entrySet() {
+    return new AbstractSet() {
+      // helper
+      private Entry findEntry(Object o) {
+        if(o == null) return null;
+        if(!(o instanceof Map.Entry)) return null;
+
+        Map.Entry e = (Map.Entry)o;
+        Entry entry = (Entry)entries.get(e.getKey());
+        if(entry.equals(e)) return entry;
+        else return null;
+      }
+
+      // required impl
+      public Iterator iterator() {
+        return new OrderedIterator(ENTRY);
+      }
+      public boolean remove(Object o) {
+        Entry e = findEntry(o);
+        if(e == null) return false;
+
+        return SequencedHashMap.this.remove(e.getKey()) != null;
+      }
+
+      // more efficient impls than abstract collection
+      public void clear() {
+        SequencedHashMap.this.clear();
+      }
+      public int size() {
+        return SequencedHashMap.this.size();
+      }
+      public boolean isEmpty() {
+        return SequencedHashMap.this.isEmpty();
+      }
+      public boolean contains(Object o) {
+        return findEntry(o) != null;
+      }
+    };
+  }
+
+  private static final int KEY = 0;
+  private static final int VALUE = 1;
+  private static final int ENTRY = 2;
+
+  private class OrderedIterator implements Iterator {
+
+    private int returnType;
+    private Entry pos = sentinel;
+    boolean removable = false;
+
+    public OrderedIterator(int returnType) {
+      this.returnType = returnType;
+    }
+
+    public boolean hasNext() {
+      return pos.next != sentinel;
+    }
+
+    public Object next() {
+      if(pos.next == sentinel) {
+        throw new NoSuchElementException();
+      }
+      pos = pos.next;
+      removable = true;
+      switch(returnType) {
+      case KEY:
+        return pos.getKey();
+      case VALUE:
+        return pos.getValue();
+      case ENTRY:
+        return pos;
+      default:
+        throw new Error("bad iterator type");
+      }
+
+    }
+
+    public void remove() {
+      if(!removable) {
+        throw new IllegalStateException("remove() must follow next()");
+      }
+
+      SequencedHashMap.this.remove(pos.getKey());
+    }
+  }
+
+  // APIs maintained from previous version of SequencedHashMap for
backwards
+  // compatibility
+
+  /**
+   * Creates a shallow copy of this object, preserving the internal
+   * structure by copying only references.  The keys, values, and
+   * sequence are not <code>clone()</code>'d.
+   *
+   * @return A clone of this instance.
+   */
+  public Object clone () {
+    // yes, calling super.clone() silly since we're just blowing away
all
+    // the stuff that super might be doing anyway, but for motivations
on
+    // this, see:
+    // http://www.javaworld.com/javaworld/jw-01-1999/jw-01-object.html
+    SequencedHashMap map = (SequencedHashMap)super.clone();
+    map.sentinel = createSentinel();
+
+    // note: this does not preserve the initial capacity and load
factor.
+    map.entries = new HashMap();
+
+    map.putAll(this);
+
+    return map;
+  }
+
+  /**
+   *  Returns the Map.Entry at the specified index
+   *
+   *  @exception ArrayIndexOutOfBoundsException if the specified index
is
+   *  <code>&lt; 0</code> or <code>&gt;</code> the size of
the map.
+   **/
+  private Map.Entry getEntry(int index) {
+    Entry pos = sentinel;
+
+    if(index < 0) {
+      throw new ArrayIndexOutOfBoundsException(index + " < 0");
+    }
+
+    // loop to one before the position
+    int i = -1;
+    while(i < index && pos.next != sentinel) {
+      i++;
+      pos = pos.next;
+    }
+    // pos.next is the requested position
+
+    // if sentinel is next, past end of list
+    if(pos.next == sentinel) {
+      throw new ArrayIndexOutOfBoundsException(index + " >= " + (i +
1));
+    }
+
+    return pos.next;
+  }
+
+  /**
+   * Returns the key at the specified index.
+   *
+   *  @exception ArrayIndexOutOfBoundsException if the
<code>index</code> is
+   *  <code>&lt; 0</code> or <code>&gt;</code> the size of
the map.
+   */
+  public Object get (int index)
+  {
+    return getEntry(index).getKey();
+  }
+
+  /**
+   * Returns the value at the specified index.
+   *
+   *  @exception ArrayIndexOutOfBoundsException if the
<code>index</code> is
+   *  <code>&lt; 0</code> or <code>&gt;</code> the size of
the map.
+   */
+  public Object getValue (int index)
+  {
+    return getEntry(index).getValue();
+  }
+
+  /**
+   * Returns the index of the specified key.
+   */
+  public int indexOf (Object key)
+  {
+    Entry e = (Entry)entries.get(key);
+    int pos = 0;
+    while(e.prev != sentinel) {
+      pos++;
+      e = e.prev;
+    }
+    return pos;
+  }
+
+  /**
+   * Returns a key iterator.
+   */
+  public Iterator iterator ()
+  {
+    return keySet().iterator();
+  }
+
+  /**
+   * Returns the last index of the specified key.
+   */
+  public int lastIndexOf (Object key)
+  {
+    // keys in a map are guarunteed to be unique
+    return indexOf(key);
+  }
+
+  /**
+   * Returns a List view of the keys rather than a set view.  The
returned
+   * list is unmodifiable.  This is required because changes to the
values of
+   * the list (using {@link java.util.ListIterator#set(Object)}) will
+   * effectively remove the value from the list and reinsert that value
at
+   * the end of the list, which is an unexpected side effect of
changing the
+   * value of a list.  This occurs because changing the key, changes
when the
+   * mapping is added to the map and thus where it appears in the list.
+   *
+   * <P>An alternative to this method is to use {@link #keySet()}
+   *
+   * @see #keySet()
+   * @return The ordered list of keys.
+   */
+  public List sequence()
+  {
+    List l = new ArrayList(size());
+    Iterator iter = keySet().iterator();
+    while(iter.hasNext()) {
+      l.add(iter.next());
+    }
+
+    return Collections.unmodifiableList(l);
+  }
+
+  /**
+   * Removes the element at the specified index.
+   *
+   * @param index The index of the object to remove.
+   * @return      The previous value coressponding the
<code>key</code>, or
+   *              <code>null</code> if none existed.
+   *
+   * @exception ArrayIndexOutOfBoundsException if the
<code>index</code> is
+   * <code>&lt; 0</code> or <code>&gt;</code> the size of
the map.
+   */
+  public Object remove (int index)
+  {
+    return remove(get(index));
+  }

+}
Index:
D:/home/michael/dev/jakarta/jakarta-commons/collections/src/test/org/apa
che/commons/collections/TestSequencedHashMap.java
===================================================================
RCS file:
/home/cvspublic/jakarta-commons/collections/src/test/org/apache/commons/
collections/TestSequencedHashMap.java,v
retrieving revision 1.1
diff -u -r1.1 TestSequencedHashMap.java
---
D:/home/michael/dev/jakarta/jakarta-commons/collections/src/test/org/apa
che/commons/collections/TestSequencedHashMap.java	17 Sep 2001
16:43:49 -0000	1.1
+++
D:/home/michael/dev/jakarta/jakarta-commons/collections/src/test/org/apa
che/commons/collections/TestSequencedHashMap.java	1 Feb 2002
06:00:36 -0000
@@ -107,6 +107,42 @@
         return new Object[] { "bar", "frob", new Object() };
     }

+    public void testKeySetIteratorRemove() {
+        Object[] keys = getKeys();
+        Object[] values = getValues();
+        for(int i = 0; i < keys.length; i++) {
+            labRat.put(keys[i], values[i]);
+        }
+
+        for(int i = keys.length - 1; i >= 0; i--) {
+            // go to last key in the key set view of the map
+            Iterator iter = labRat.keySet().iterator();
+            while(iter.hasNext()) {
+                iter.next();
+            }
+
+            // and remove it
+            iter.remove();
+
+            assertEquals("size() does not match expected size after " +
+                         "remove using key set iterator",
+                         i, labRat.size());
+
+            // make sure that iterator() has the correct number of
entries in
+            // it as well.
+            int count = 0;
+            iter = labRat.iterator();
+            while(iter.hasNext()) {
+                iter.next();
+                count++;
+            }
+
+            assertEquals("iterator() does not have the same number of "
+
+                         "values as the map",
+                         labRat.size(), count);
+        }
+    }
+
     public void testSequenceMap() throws Throwable {
         Object[] keys = getKeys();
         int expectedSize = keys.length;
@@ -150,4 +186,4 @@
     protected void tearDown() {
         labRat = null;
     }
-}
\ No newline at end of file
+}

Mime
View raw message