commons-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Rob Oxspring" <roxspr...@imapmail.org>
Subject RE: [Collections] [SUBMIT] Trie
Date Sat, 07 Dec 2002 01:29:06 GMT


> -----Original Message-----
> From: Rich Dougherty [mailto:rich@rd.gen.nz] 
-8<---
> Ok, I'll jump in and level a few criticisms at the interface 
> and implementation I submitted. :-)
> 
> --- Trie - is it the right name? ---
> 
> I must admit, the more I've looked at this, the more 
> uncomfortable I am with the name. In my reading, I've only 
> ever heard 'trie' applied to a data structure indexed by 
> strings. However, a trie does seem to fit into a more general 
> class of data structures. It looks like these structures are 
> called prefix trees--although I can't find a solid 
> definition. However, it may be a very elementary data structure:
> 
>   http://proteacher.net/archive/posts/2001/09/26/21017.html
> 
> So conceptually, we have this sort of breakdown:
> 
> Prefix trees
>  - Prefix trees for handling strings
>     - Trie
>     - The data structure used in suffix trees
>  - Prefix trees for handling binary data
>     - Patricia trees
>  - Prefix trees for handling sequences of anything
>     - AbstractObjectArrayTrie (which should be renamed)
> 
> --- Separating the interface from the implementation ---
> 
> Perhaps the interface doesn't need to be named after the 
> implementation at all. There may be other data structures 
> which provide efficient operations on the prefix of keys, but 
> aren't implemented as a prefix tree.
> 
> --- Suggestion ---
> 
> Here's a rough suggestion.
> 
> /**
>  * A Map which provides several operations based on the
>  * prefix of keys.
>  */
> public interface PrefixMap extends Map {
> 
>     public PrefixMap subPrefixMap(Object keyPrefix);
> 
>     public Map.Entry getLongest(Object candidateKey);
> 
> }
> 
> There is a nice symmetry in providing a SortedPrefixMap, 
> although actual implementation would involve a reasonable 
> amount of work. (But I'll do it if there is consensus about it.)
> 
> public interface SortedPrefixMap extends PrefixMap, SortedMap {
> 
>     // This would return a SortedPrefixMap. I can't override the
>     // return type, can I?
>     public PrefixMap subPrefixMap(Object keyPrefix);
> 
> }
> 
> /**
>  * An implementation of PrefixMap. Subclasses must provide an
>  * implementation for translating keys to and from object arrays
>  * and a mechanism for storing outgoing references.
>  */
> public abstract class AbstractObjectArrayPrefixTree
>         implements PrefixMap {
>     ...
> }
> 
> public abstract class AbstractObjectArraySortedPrefixTree
>         extends AbstractObjectArrayPrefixTree
>         implements SortedPrefixMap {
> 
>     // extra SortedMap methods
> 
>     // requires subclasses to return items in order with
>     // outgoingTransitionIterator().
> 
>     ...
> }
> 
> /**
>  * Stores outgoing references in a Map, subclasses must
>  * still provide an implementation for translating keys to
>  * and from object arrays.
>  */
> public abstract class AbstractOutgoingTransitionPrefixMap
>         extends AbstractObjectArrayPrefixTree {
>     ...
> }
> 
> public abstract class AbstractOutgoingTransitionSortedPrefixMap
>         extends AbstractObjectArraySortedPrefixTree {
>     // needs to be passed in comparator
>     ...
> }
> 
> /**
>  * A SortedPrefixMap for strings.
>  */
> public class Trie implements SortedPrefixMap
>         extends AbstractObjectArraySortedPrefixTree {
>     // Future implementations may not extend this
>     // class. They may be more efficient, and based
>     // on native chars or ints.
>     ...
> }
> 
> Any comments on this? Particularly:
> 
> - The hierarchy of abstract classes. Should it be flattened? At the
>   moment I'm separating out storage of the transitions. This is nice,
>   but perhaps it clutters the package too much?
> 
> - Stephen has mentioned some sort of TrieUtils? Perhaps this should be
>   called PrefixMapUtils, or go into MapUtils.
> 
> Rich

Firstly, I think the name PrefixMap is probably clearer than Trie and
should probably be adopted.  My second thought is that the hierachy
seems pretty deep - more so than other collections implementations I can
think of.  

I'm not sure how much benefit there is to making Sorted guarentees, code
such as the following would probably suit my needs equally well: (and I
prefer the flexibility of changing the comparator at the point of use)
List values = new ArrayList(trie.prefixTrie(prefix).values())
Collections.sort(values, myComparator)

I'm not sure I understand the benefits of separating out the transition
storage but hopefully I'll have time to play and understand better in
the morning.

Rob

> 
> 
> 
> --
> To unsubscribe, e-mail:   
> <mailto:commons-dev-> unsubscribe@jakarta.apache.org>
> For 
> additional commands, 
> e-mail: <mailto:commons-dev-help@jakarta.apache.org>
> 
> 
> 


--
To unsubscribe, e-mail:   <mailto:commons-dev-unsubscribe@jakarta.apache.org>
For additional commands, e-mail: <mailto:commons-dev-help@jakarta.apache.org>


Mime
View raw message