lucene-pylucene-dev mailing list archives

From Andi Vajda <va...@apache.org>
Subject Re: Stempel stemmer ported to Python
Date Sun, 21 Jul 2019 16:35:48 GMT

  Hi Maciej,

On Fri, 19 Jul 2019, Maciej Gawinecki wrote:

> I have ported your Stempel stemmer [1] for the Polish language from Java
> to Python [2]. I know you also have a Python wrapper for Lucene
> (PyLucene), so I was curious whether you would be interested in a native
> implementation of a single stemmer?
>
> It has the same accuracy as the original version and only slightly better
> performance compared to the wrapped version (compared with pyjini),
> but it uses only one language (no need to switch between languages when
> debugging), which was quite important in my NLP project. I understand
> that it introduces the need to maintain two code bases, though.

PyLucene is not a port of Lucene to Python but a Python/C++ wrapper library 
auto-generated via JCC:
   http://lucene.apache.org/pylucene/jcc/
Users of PyLucene in fact embed an actual, unchanged, Apache Java Lucene 
jar file and a JVM into their Python VM.

The Stempel stemmer is part of PyLucene already since it is included in the 
wrapper generation (look for stempel):
   https://svn.apache.org/repos/asf/lucene/pylucene/tags/pylucene_7_7_1/Makefile
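
For illustration, a minimal sketch of reaching the wrapped Stempel code from 
Python. It assumes a PyLucene build whose wrappers include the stempel jar; 
the function name polish_stems and the field name "body" are made up for this 
example, and the class and method names are those of Java Lucene as they are 
normally exposed through JCC, not verified against every PyLucene release:

```python
def polish_stems(text):
    """Return the stemmed tokens that PolishAnalyzer produces for `text`."""
    # Imports live inside the function so this file still loads on a
    # machine without PyLucene; they raise ImportError there.
    import lucene

    # Start the embedded JVM once per process (assumption: default VM args).
    if lucene.getVMEnv() is None:
        lucene.initVM()

    from org.apache.lucene.analysis.pl import PolishAnalyzer
    from org.apache.lucene.analysis.tokenattributes import CharTermAttribute

    analyzer = PolishAnalyzer()
    # "body" is an arbitrary field name chosen for this sketch.
    stream = analyzer.tokenStream("body", text)
    term = stream.addAttribute(CharTermAttribute.class_)
    stems = []
    try:
        stream.reset()
        while stream.incrementToken():
            stems.append(term.toString())
        stream.end()
    finally:
        stream.close()
        analyzer.close()
    return stems
```

A call such as polish_stems("domy i ogrody") returns a Python list of 
strings, but every token is produced by Java code running in the embedded JVM.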

Your native port, which I'm sure is valid and useful, does not fit that 
auto-wrapper model, however. There is little to no maintenance done on 
PyLucene proper, as all its useful code is in Java Lucene and JCC. Adding 
native Python code to PyLucene would break that no-maintenance convenience.

Thank you for thinking of PyLucene for hosting it, though!

Andi..

>
> Regards,
> Maciej Gawinecki
>
>
>
> [1]: https://github.com/apache/lucene-solr/tree/master/lucene/analysis/stempel/src/java/org
> [2]: https://github.com/dzieciou/pystempel/tree/feature/1
>
