lucene-pylucene-dev mailing list archives

From Andi Vajda <>
Subject Re: how to instantiate a Set?
Date Mon, 23 Feb 2009 08:51:04 GMT

On Sun, 22 Feb 2009, Andi Vajda wrote:

> On Sun, 22 Feb 2009, Bill Janssen wrote:
>> I'm probably missing something incredibly obvious here...
>> I'm trying to call MoreLikeThis.setStopWords(Set words).  I've got a
>> list of stop words in Python, but I can't figure out how to turn that
>> into a Java Set.  I tried "lucene.HashSet(set(words))",
>> "lucene.HashSet(lucene.ArrayList(JArray("string")(words)))", and so
>> forth, without much luck.
> PyLucene doesn't wrap the java.util.Arrays class that bridges the Java gap 
> between arrays and collections. That should be considered an oversight of 
> mine. I should add it to the JCC invocation in PyLucene's Makefile. Then you 
> would be able to pass your JArray instance to the Arrays.asList() method to 
> make a List and finally feed that to a HashSet.
> Another alternative is to implement a Python extension of the Java Set 
> interface. Guess what? That is already part of PyLucene: the PythonSet class 
> is the extension point for implementing a Java Set in Python.
> I even have such a Python implementation of a Java Set, called, 
> ready here but it's not currently shipping with PyLucene, another oversight 
> of mine. I should add it to the distribution.
> Until then, here it is below. It takes a Python set instance as constructor 
> argument and implements the complete Java Set interface. This example also 
> illustrates a Python implementation of the Java Iterator interface.

I added a module to the PyLucene distribution.
To use it:
   >>> from lucene.collections import JavaSet
   >>> from lucene import initVM, CLASSPATH
   >>> initVM(CLASSPATH)
   >>> a = JavaSet(set(['foo', 'bar', 'baz']))

I also added some missing proxies for the mapping and sequence protocols so 
that JavaSet can be iterated and used with the 'in' operator from Python.
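
For readers without a JVM at hand, the general shape of such a wrapper can be 
sketched in plain Python. This is only an illustration, not the code shipped in 
PyLucene: the class names JavaStyleSet and SetIteratorSketch are hypothetical, 
nothing here extends PythonSet, and only a handful of the Java Set and Iterator 
methods are covered. It shows a Python set wrapped behind Java-style method 
names, plus the Python protocol methods that make iteration and 'in' work.

```python
class SetIteratorSketch:
    """Hypothetical Java Iterator-style object (hasNext/next)."""

    def __init__(self, items):
        # Iterate over a snapshot so later changes to the set do not
        # disturb an iteration already in progress.
        self._items = list(items)
        self._pos = 0

    def hasNext(self):
        return self._pos < len(self._items)

    def next(self):
        # Assumption: StopIteration stands in for Java's
        # NoSuchElementException in this JVM-free sketch.
        if not self.hasNext():
            raise StopIteration("no more elements")
        item = self._items[self._pos]
        self._pos += 1
        return item


class JavaStyleSet:
    """Hypothetical wrapper exposing part of the Java Set interface."""

    def __init__(self, py_set):
        self._set = py_set

    def add(self, obj):
        # Like java.util.Set.add: True only if the set changed.
        if obj in self._set:
            return False
        self._set.add(obj)
        return True

    def remove(self, obj):
        # Like java.util.Set.remove: True only if obj was present.
        if obj in self._set:
            self._set.remove(obj)
            return True
        return False

    def contains(self, obj):
        return obj in self._set

    def size(self):
        return len(self._set)

    def isEmpty(self):
        return not self._set

    def iterator(self):
        return SetIteratorSketch(self._set)

    # Python-side protocols, so the wrapper works with 'in', len()
    # and for-loops.
    def __contains__(self, obj):
        return self.contains(obj)

    def __len__(self):
        return self.size()

    def __iter__(self):
        it = self.iterator()
        while it.hasNext():
            yield it.next()


words = JavaStyleSet(set(['foo', 'bar', 'baz']))
print('foo' in words, len(words), sorted(words))
```

The Java-style methods are what a Java caller would see, while the double 
underscore methods serve Python callers; both go through the same underlying 
Python set.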

