lucene-pylucene-dev mailing list archives

From marco turchi <marco.tur...@gmail.com>
Subject Re: ShingleAnalyzerWrapper in PyLucene
Date Sun, 29 Jan 2017 18:24:22 GMT
It is strange because I can see the attached files in the email I sent
you...

I am attaching the Java code again. In case the attachment is dropped again,
you can download it from this link:
https://www.dropbox.com/s/o7ocygrdv8dqksl/CopyOfTest.java?dl=0
The file is called CopyOfTest.java.

Thanks a lot!
Marco



On Sun, Jan 29, 2017 at 7:14 PM, Andi Vajda <vajda@apache.org> wrote:

>
> > On Jan 29, 2017, at 03:50, marco turchi <marco.turchi@gmail.com> wrote:
> >
> > Dear Andi,
> > Please find attached the Java and the Python code. Both create an index
> > with two records using the Shingle analyzer and then query it, printing
> > the query and the terms of the query.
>
> It looks like you attached only the python program, only one attachment.
>
> Andi..
>
> >
> > Thanks a lot for your help
> > Marco
> >
> >
> >
> >> On Sun, Jan 29, 2017 at 3:10 AM, Andi Vajda <vajda@apache.org> wrote:
> >>
> >> On Sat, 28 Jan 2017, marco turchi wrote:
> >>
> >>> Dear All,
> >>> I need to use the ShingleAnalyzerWrapper in PyLucene.
> >>>
> >>> I have built the analyzer the same way as in Lucene:
> >>> self.analyzer = ShingleAnalyzerWrapper(WhitespaceAnalyzer(), 2, 4, " ",
> >>>                                        True, False, None)
> >>>
> >>> and I have used it inside QueryParser:
> >>> query = QueryParser("source", self.analyzer).parse("welcome world is at on")
> >>>
> >>> the output is:
> >>> source:welcome source:world source:is source:at source:on
> >>>
> >>> I have run the same code in Java and the output is how I would expect it:
> >>> source:welcome source:welcome world source:welcome world is
> >>> source:welcome world is at source:world source:world is
> >>> source:world is at source:world is at on source:is content:is at
> >>> source:is at on source:at source:at on source:on
> >>>
> >>> Do you have any idea what I'm doing wrong in PyLucene?
> >>
> >> Please, help me help you by including two simple programs that I can
> >> run to reproduce the problem. One in Java producing the output you
> >> expect, one in Python producing the output you're reporting.
> >>
> >> Thanks !
> >>
> >> Andi..
> >>
> >>
> >>>
> >>> Thanks a lot in advance for your help
> >>> Marco
> >>>
> >
> > <TestShingle.py>
>
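
For reference, the shingle expansion that the Java output in the quoted message shows can be reproduced without Lucene. The sketch below is plain Python and only illustrates the expected term list for a minimum shingle size of 2, a maximum of 4, a single-space separator, and unigrams enabled. The function name `shingles` and its parameters are hypothetical and are not part of the Lucene or PyLucene API, and the code makes no claim about how ShingleAnalyzerWrapper is implemented.

```python
def shingles(tokens, min_size=2, max_size=4, separator=" ", output_unigrams=True):
    """Return word n-grams in position order.

    Illustrative helper only: the name and parameters are assumptions
    chosen to mirror the analyzer settings quoted in the thread.
    """
    terms = []
    for start in range(len(tokens)):
        # Emit the single token first, as the quoted Java output does.
        if output_unigrams:
            terms.append(tokens[start])
        # Then emit every n-gram of the allowed sizes that fits.
        for size in range(min_size, max_size + 1):
            if start + size <= len(tokens):
                terms.append(separator.join(tokens[start:start + size]))
    return terms


if __name__ == "__main__":
    # Whitespace tokenization of the query string from the thread.
    for term in shingles("welcome world is at on".split()):
        print("source:" + term)
```

For the query string in the thread this produces 14 terms with the same text and order as the Java output quoted above, which makes it a convenient baseline when comparing what each QueryParser call returns.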
