lucene-pylucene-dev mailing list archives

From "Thomas Koch" <k...@orbiteam.de>
Subject AW: AW: PyLucene use JCC shared object by default
Date Mon, 30 Apr 2012 17:02:27 GMT
Dear Andi,
I again had a look at the patch I submitted recently and would like to get back to it.  An
updated version of the patch is attached to this email - the patch is against the branch_3x
repo http://svn.apache.org/repos/asf/lucene/pylucene/branches/branch_3x

The patch mainly
- adds two Java classes: PythonList and PythonListIterator
- adds the corresponding Python classes (JavaListIterator and JavaList in collections.py)

Purpose: 
- provide a Java-based List implementation in JCC/PyLucene (similar to existing PythonSet/JavaSet)
- allow passing Python lists via Java Collections into PyLucene

To summarize briefly: PythonSet/JavaSet already existed, but there was nothing similar
for lists. I implemented PythonList/JavaList, and with your help this is now
basically working, except for an open issue that affects both JavaSet and JavaList: initializing
an ArrayList with a JavaSet (or JavaList) may cause trouble.

As you said: "There is a bug somewhere with constructing an ArrayList from a python collection
like JavaSet or JavaList."

I tried to change the toArray() method as you suggested, but that didn't help. As far as I
understood, there are two options to box python values into a typed JArray:

1)  use the object based JArray class and box python values by wrapping them with the corresponding
Java object (e.g. type<int> -> lucene.Integer):

>>> x = lucene.JArray('object')([lucene.Boolean(True),lucene.Boolean(False)])
JArray<object>[<Object: true>, <Object: false>]
>>> type(x[0])
<type 'Object'>

2)  use the correct array type (int, float, etc.) and pass the list of Python elements (or
literals) to the JArray constructor, e.g.

>>> y = lucene.JArray('bool')([True,False])
JArray<bool>[True, False]
>>> type(y[0])
<type 'bool'>
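
For option 2 the element type has to be chosen before the JArray is built. Below is a minimal sketch of that choice in plain Python (no PyLucene needed). The helper name jarray_type_name is hypothetical, and the type names 'int', 'double' and 'string' are my assumption about what JArray accepts; only 'object' and 'bool' appear in the examples above.

```python
def jarray_type_name(values):
    """Pick a JArray element type name for a homogeneous Python list.

    Hypothetical helper: returns 'object' for empty or mixed lists, in
    which case the values would have to be boxed as in option 1.
    """
    # Assumed mapping from Python types to JArray type names.
    candidates = [(bool, 'bool'), (int, 'int'),
                  (float, 'double'), (str, 'string')]
    if not values:
        return 'object'
    for py_type, name in candidates:
        # Exact type match, so True/False are never treated as ints.
        if all(type(v) is py_type for v in values):
            return name
    return 'object'
```

With such a helper the call would follow the pattern of option 2, e.g. lucene.JArray(jarray_type_name(items))(items), as long as the result is not 'object'.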

I tried both of them (see the _pyList2JArray methods in collections.py), but neither did the
trick. The 'empty objects in ArrayList' problem remains when dealing with strings:
the ArrayList object that is initialized with a JavaSet or JavaList of string items has
the same number of objects as the original JavaSet/JavaList, but all objects are the same
(it looks like an array of empty objects). Furthermore, another issue comes into play with
integer lists: here the initialization of the ArrayList with the Collection fails with a Java stack trace
(lucene.JavaError: org.apache.jcc.PythonException).

The simplest test case is as follows:

--%< --
import lucene
lucene.initVM()
from lucene.collections import JavaList

# using strings: the ArrayList is created, but initialized with empty objects
jl = JavaList(['a','b'])
al = lucene.ArrayList(jl)
assert (not al.get(0).equals(al.get(1))), "unique values"

# using ints: the ArrayList is not created,  but an error occurs instead:
# Java stacktrace: org.apache.jcc.PythonException: ('while calling toArray')
jl = JavaList(range(3))
al = lucene.ArrayList(jl)
--%< --
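
Both failures pass through toArray(): the Collection constructor of java.util.ArrayList takes its contents from the array that toArray() returns, and the error message above names that call. The following plain-Python model (no PyLucene; all class and function names are made up for illustration) only reproduces the first symptom, a result of the right size whose slots all hold the same placeholder. It is a sketch of the symptom, not a diagnosis of where the conversion goes wrong in JCC.

```python
class ModelCollection(object):
    """Stand-in for a Java Collection backed by a Python list."""

    def __init__(self, items):
        self._items = list(items)

    def size(self):
        return len(self._items)

    def toArray(self):
        # Correct behaviour: one array slot per element, in order.
        return list(self._items)


class BrokenCollection(ModelCollection):
    """Same collection, but toArray() loses the element values."""

    def toArray(self):
        # Right length, but every slot holds the same placeholder object.
        placeholder = object()
        return [placeholder] * len(self._items)


def copy_construct(collection):
    """Model of ArrayList(Collection): contents come from toArray()."""
    return list(collection.toArray())


good = copy_construct(ModelCollection(['a', 'b']))
bad = copy_construct(BrokenCollection(['a', 'b']))
```

In this model, good holds the original strings, while bad has the same length but its two entries are the same object, which is what the assert in the test case above would catch.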

I currently feel like I'm stabbing around in the dark trying to find out what's going on here and
would welcome any suggestions. This needs a JCC expert, I guess ;-)

Of course we can leave the patch out - but still there's the same issue with JavaSet.


kind regards

Thomas 
--
OrbiTeam Software GmbH & Co. KG, Germany
http://www.orbiteam.de


> -----Original Message-----
> From: Andi Vajda [mailto:vajda@apache.org]
> Sent: Wednesday, 18 April 2012 20:37
> To: pylucene-dev@lucene.apache.org
> Subject: Re: AW: PyLucene use JCC shared object by default
> 
> 
> Hi Thomas,
> ...
> Lucene 3.6 just got released a few days ago. Apart from your patch, the
> PyLucene 3.6 release is ready. I'm about to go offline (email only) for a week.
> Let's revisit this patch then (first week of May). It's not blocking the release
> right now as, even if I sent out a release candidate for a vote, the three
> business days required for this would take this into the time I'm away.
> ...
> Andi..

