lucene-pylucene-dev mailing list archives

From Andi Vajda <va...@apache.org>
Subject Re: AW: PyLucene use JCC shared object by default
Date Wed, 18 Apr 2012 18:37:02 GMT

Hi Thomas, 

On Apr 18, 2012, at 6:31, "Thomas Koch" <koch@orbiteam.de> wrote:

> Hi,
> sounds like an interesting project. May I ask what you actually implemented and what the
> motivation is (e.g. performance)?
> 
> I’ve started to experiment with the facet support in Lucene (actually in PyLucene, where I
> ported an example to Python) and found that faceted search support in Lucene looks powerful
> (though the API is still said to be ‘experimental’ and I can’t say anything about performance
> yet). I’m talking about the org.apache.lucene.facet.* packages, which are part of the contrib
> part of Lucene and available as JARs that are accessible in PyLucene as well. I’m not that
> familiar with Solr, but AFAIK it’s based on Lucene (Java) and should (hopefully) use the
> same Java code for its facet search support. Of course Solr adds some nice configuration
> support and a web GUI to Lucene, but the ‘core’ search is built on Lucene (to my knowledge).
> So did you re-implement the Lucene facet search/index code (like TaxonomyReader/Writer,
> FacetRequest stuff etc.) in C++, or which part of Solr?
> 
> Regarding facet support in PyLucene, I can share the samples I’ve ‘ported’ to Python
> so far. There’s still a patch pending for JavaList (required by the facet features), which I
> will come back to later on this list (there are still some open issues). Hopefully this can
> be included in the PyLucene 3.6 version …

Lucene 3.6 just got released a few days ago. Apart from your patch, the PyLucene 3.6 release
is ready. I'm about to go offline (email only) for a week. Let's revisit this patch then (first
week of May). It's not blocking the release right now as, even if I sent out a release candidate
for a vote, the three business days required for this would take this into the time I'm away.

Out of curiosity, why is this patch tied to the faceting module? Can't you use the regular
Java List implementations with it instead of a wrapped Python list? If there are no wrappers
for the classes you want, it's certainly easier to add them, and they would provide more
efficient operation, as Java code (the facet module) working with them wouldn't have to cross
the VM barrier for each and every access into these lists.
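
To illustrate the cost argument above, here is a minimal pure-Python sketch. It is not PyLucene or JCC code: `BoundaryCountingList`, `consume_per_access` and `consume_native` are hypothetical names, and a "crossing" is simply a counter standing in for a call across the Java/Python boundary. It assumes a consumer that reads a list of category names one element at a time.

```python
class BoundaryCountingList:
    """Hypothetical stand-in for a Python list wrapped for use from Java.

    Every method call is counted as one simulated VM-boundary crossing.
    """

    def __init__(self, items):
        self._items = list(items)
        self.crossings = 0

    def size(self):
        self.crossings += 1
        return len(self._items)

    def get(self, index):
        self.crossings += 1
        return self._items[index]


def consume_per_access(wrapped):
    """Simulate Java code reading a wrapped Python list element by element."""
    total = 0
    for i in range(wrapped.size()):
        total += len(wrapped.get(i))
    return total


def consume_native(items):
    """Simulate Java code reading a native list that was filled once up front."""
    crossings = 1  # assumption: a single bulk transfer across the boundary
    native = list(items)
    total = sum(len(s) for s in native)
    return total, crossings


categories = ["author", "year", "publisher"]
wrapped = BoundaryCountingList(categories)
total_wrapped = consume_per_access(wrapped)
total_native, native_crossings = consume_native(categories)
print(wrapped.crossings, native_crossings)
```

In this toy model the wrapped list is charged one crossing per `size()` or `get()` call, so the count grows with the list length, while the native list is charged once regardless of how often it is read afterwards.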

Andi..

> 
> Regards
> Thomas
> --
> OrbiTeam Software GmbH & Co. KG
> Germany  http://www.orbiteam.de
> 
> 
> From: Caleb Burns [mailto:caleb@ridersdiscount.com]
> Sent: Tuesday, April 17, 2012 21:16
> To: pylucene-dev@lucene.apache.org
> Subject: PyLucene use JCC shared object by default
> 
> Hi,
> 
> I've finished the process at my organization of re-implementing Solr's faceting algorithm
> (in C++).
> 
> We would like the public at large to have access to the work we've done and plan to do.
> For this to be a real possibility, the code needs to be built against, and use, the
> same JVM as the PyLucene installation does. We feel the most logical way to accomplish this
> is to have PyLucene's default installation use JCC as a shared object.
> 
> We have further plans to extend and provide utilities that work with PyLucene, but this
> all hinges on having the shared object. The only alternative would require bundling
> our source with the PyLucene project itself as a fork.
> 
> We are eager to start open sourcing our work, so please let us know what would be the
> best way to integrate our work.
> 
> -- 
> Caleb Burns
> Developer | Riders Discount
> 866.931.6644 x851 | www.RidersDiscount.com 
> 
