lucene-java-user mailing list archives

From "Kevin A. Burton" <>
Subject Re: DocumentWriter, StopFilter should use HashMap... (patch)
Date Wed, 10 Mar 2004 19:59:46 GMT
Erik Hatcher wrote:

>> Also... while you're at it... the private variable name is 'table' 
>> which this HashSet certainly is *not* ;)
> Well, depends on your definition of 'table' I suppose :)  I changed it 
> to a type-agnostic stopWords.

Did you know that internally HashSet uses a HashMap?

I sure didn't!

HashSet.contains() maps to HashMap.containsKey()

Each element is stored as a key mapped to a single shared PRESENT Object... hm. 
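
To make that delegation concrete, here is a minimal sketch of a set backed by a map. It is not the JDK source: the class name MapBackedSet is made up for illustration, and it uses generics for readability even though this thread predates them. Only the idea of a shared PRESENT placeholder and the contains() -> containsKey() delegation comes from the observation above.

```java
import java.util.HashMap;

// Hypothetical illustration class, not java.util.HashSet itself.
final class MapBackedSet<E> {
    // One shared dummy value; every element is stored as key -> PRESENT.
    private static final Object PRESENT = new Object();

    private final HashMap<E, Object> map = new HashMap<>();

    // put() returns the previous value, so null means the key was new.
    boolean add(E element) {
        return map.put(element, PRESENT) == null;
    }

    // Membership is just a key lookup on the backing map.
    boolean contains(Object element) {
        return map.containsKey(element);
    }

    int size() {
        return map.size();
    }
}
```

So a membership test costs the same as a HashMap key lookup; the set only spends one extra reference per entry on the placeholder value.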

>> Probably makes sense to just call this variable 'hashset' and then 
>> force the type to be HashSet since it's necessary for this to be a 
>> HashSet to maintain any decent performance.  You'll need to update 
>> your second constructor to require a HashSet too.. would be very bad 
>> to let callers use another set impl... TreeSet and SortedSet would 
>> still be too slow...
> I refuse to expose HashSet... sorry!  :)  But I did wrap what is 
> passed in, like above, in a HashSet in my latest commit. 

Hm... You're doing this EVEN if the caller passes a HashSet directly?!

Why do you have a problem exposing a HashSet/Map... it SHOULD be a 
hash-based implementation.  Doing anything else is just wrong and would 
seriously slow down Lucene indexing.

Also... your HashSet constructor has to copy values from the original 
HashSet into the new HashSet ... not very clean, and the copy can simply be 
removed by forcing the caller to use a HashSet (which they should).
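
One possible middle ground, sketched below under stated assumptions: keep Set in the constructor signature, but skip the copy when the caller already passed a HashSet. This is not the committed StopFilter code. The class StopWordSet and its methods are hypothetical names for illustration, and generics are used for readability.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical holder for stop words; not Lucene's StopFilter.
final class StopWordSet {
    private final Set<String> stopWords;

    StopWordSet(Set<String> words) {
        // Reuse a caller-supplied HashSet as-is; copy any other Set
        // implementation (e.g. TreeSet) into a HashSet exactly once.
        this.stopWords = (words instanceof HashSet) ? words : new HashSet<>(words);
    }

    boolean isStopWord(String term) {
        return stopWords.contains(term);
    }

    // Exposed only so the sketch can show whether a copy was made.
    boolean sharesStorageWith(Set<String> other) {
        return stopWords == other;
    }
}
```

The trade-off is that a reused HashSet is shared with the caller, so later changes the caller makes to it would be visible here; an unconditional copy avoids that at the cost of the extra allocation.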




Please reply using PGP.    
    NewsMonster -
Kevin A. Burton, Location - San Francisco, CA, Cell - 415.595.9965
       AIM/YIM - sfburtonator,  Web -
GPG fingerprint: 5FB2 F3E2 760E 70A8 6174 D393 E84D 8D04 99F1 4412
  IRC - #infoanarchy | #p2p-hackers | #newsmonster
