lucene-pylucene-dev mailing list archives

From Andi Vajda <va...@apache.org>
Subject Re: ShingleAnalyzerWrapper in PyLucene
Date Sun, 29 Jan 2017 18:53:25 GMT

> On Jan 29, 2017, at 10:24, marco turchi <marco.turchi@gmail.com> wrote:
> 
> It is strange because I can see the attached files in the email I sent you... 
> 
> I am attaching the Java code again. In case the attachment is lost again,
> you can download it from this link:
> https://www.dropbox.com/s/o7ocygrdv8dqksl/CopyOfTest.java?dl=0
> The file is called CopyOfTest.java

Indeed. No attachment was received here. Probably some security feature somewhere. The link
you included should be good enough.

Thanks !

Andi..

> 
> Thanks a lot!
> Marco
> 
> 
> 
>> On Sun, Jan 29, 2017 at 7:14 PM, Andi Vajda <vajda@apache.org> wrote:
>> 
>> > On Jan 29, 2017, at 03:50, marco turchi <marco.turchi@gmail.com> wrote:
>> >
>> > Dear Andi,
>> > please find attached the Java and Python code. Both programs create an
>> > index with two records using the Shingle analyzer and then query it,
>> > printing the query and the terms of the query.
>> 
>> It looks like you attached only the Python program; there is only one attachment.
>> 
>> Andi..
>> 
>> >
>> > Thanks a lot for your help
>> > Marco
>> >
>> >
>> >
>> >> On Sun, Jan 29, 2017 at 3:10 AM, Andi Vajda <vajda@apache.org> wrote:
>> >>
>> >> On Sat, 28 Jan 2017, marco turchi wrote:
>> >>
>> >>> Dear All,
>> >>> I need to use the ShingleAnalyzerWrapper in PyLucene.
>> >>>
>> >>> I have built the analyzer similar to Lucene:
>> >>> self.analyzer = ShingleAnalyzerWrapper(WhitespaceAnalyzer(), 2, 4, " ",
>> >>>                                        True, False, None)
>> >>>
>> >>> and I have used it inside QueryParser:
>> >>> query = QueryParser("source", self.analyzer).parse("welcome world is at on")
>> >>>
>> >>> the output is:
>> >>> source:welcome source:world source:is source:at source:on
>> >>>
>> >>> I have run the same code in Java and the output is how I would expect it:
>> >>> source:welcome source:welcome world source:welcome world is
>> >>> source:welcome world is at source:world source:world is
>> >>> source:world is at source:world is at on source:is content:is at
>> >>> source:is at on source:at source:at on source:on
>> >>>
>> >>> Do you have any idea what I'm doing wrong in PyLucene?
>> >>
>> >> Please, help me help you by including two simple programs that I can run
>> >> to reproduce the problem. One in Java producing the output you expect,
>> >> one in Python producing the output you're reporting.
>> >>
>> >> Thanks !
>> >>
>> >> Andi..
>> >>
>> >>
>> >>>
>> >>> Thanks a lot in advance for your help
>> >>> Marco
>> >>>
>> >
>> > <TestShingle.py>
> 
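
For readers following the thread, the shingle terms expected in the original question can be illustrated with a minimal pure-Python sketch. This is not the Lucene or PyLucene implementation and does not explain the difference reported above; the function name `shingles` and its parameters are hypothetical, chosen to mirror the arguments passed in the question (minimum size 2, maximum size 4, a single space as separator, unigrams included).

```python
def shingles(tokens, min_size=2, max_size=4, separator=" ", output_unigrams=True):
    """Return word n-grams (shingles) for each start position.

    Hypothetical helper for illustration only; it mimics the ordering
    shown in the expected Java output, not Lucene's internals.
    """
    result = []
    for start in range(len(tokens)):
        # Emit the single token first when unigrams are requested.
        if output_unigrams:
            result.append(tokens[start])
        # Then emit every shingle of min_size..max_size tokens that fits.
        for size in range(min_size, max_size + 1):
            end = start + size
            if end > len(tokens):
                break
            result.append(separator.join(tokens[start:end]))
    return result


# Whitespace tokenization of the query string from the question.
terms = shingles("welcome world is at on".split())
print(" ".join("source:" + term for term in terms))
```

With the five-word query from the question, this yields fourteen terms in the same order as the Java output quoted above, with every term under the `source` field.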
