lucene-solr-user mailing list archives

From Markus Jelsma <markus.jel...@openindex.io>
Subject RE: Solr Shingle is not working properly in solr 6.5.0
Date Wed, 05 Apr 2017 21:43:07 GMT
Steve - please include a broad description of this feature in the next CHANGES.txt. I will
forget about this thread, but I will need a reminder of why I might need it :)

Thanks,
Markus
 
 
-----Original message-----
> From:Steve Rowe <sarowe@gmail.com>
> Sent: Wednesday 5th April 2017 23:26
> To: solr-user@lucene.apache.org
> Subject: Re: Solr Shingle is not working properly in solr 6.5.0
> 
> Aman,
> 
> In the forthcoming Solr 6.5.1, this problem will be addressed by setting a new <fieldType>
> option named "enableGraphQueries" to "false".
> 
> Your fieldtype will look like this:
> 
> -----
> <fieldType name="cust_shingle" class="solr.TextField" positionIncrementGap="100"
>            enableGraphQueries="false">
>   <analyzer>
>     <tokenizer class="solr.StandardTokenizerFactory"/>
>     <filter class="solr.ShingleFilterFactory" outputUnigrams="false" maxShingleSize="4"/>
>     <filter class="solr.LowerCaseFilterFactory"/>
>   </analyzer>
> </fieldType>
> -----
> 
> --
> Steve
> www.lucidworks.com
> 
> > On Apr 4, 2017, at 5:32 PM, Steve Rowe <sarowe@gmail.com> wrote:
> > 
> > Hi Aman,
> > 
> > I’ve created <https://issues.apache.org/jira/browse/SOLR-10423> for this problem.
> > 
> > --
> > Steve
> > www.lucidworks.com
> > 
> >> On Mar 31, 2017, at 7:34 AM, Aman Deep Singh <amandeep.cool99@gmail.com> wrote:
> >> 
> >> Hi Rick,
> >> The query tokens are created correctly. The only problem is the Boolean +
> >> operator added while building the Lucene query, which requires every token
> >> in a clause to match in the document (the equivalent of mm=100%). Even though
> >> I use mm=1, the Boolean + is still applied. For example:
> >> Normal query - one plus one abc
> >> Lucene query -
> >> +(((+nameShingle:one plus +nameShingle:plus one +nameShingle:one abc))
> >> ((+nameShingle:one plus +nameShingle:plus one abc)) ((+nameShingle:one plus
> >> one +nameShingle:one abc)) (nameShingle:one plus one abc))
> >> 
> >> My doc contains only "one plus one", so its shingles are
> >> one plus, plus one, one plus one
> >> and because of the Boolean + it does not match.
> >> Thanks,
> >> Aman Deep Singh
> >> 
> >> On Fri, Mar 31, 2017 at 4:41 PM Rick Leir <rleir@leirtech.com> wrote:
> >> 
> >>> Hi Aman
> >>> Did you try the Admin Analysis tool? It will show you which filters are
> >>> effective at index and query time. It will help you understand why you are
> >>> not getting a match.
> >>> Cheers -- Rick
> >>> 
> >>> On March 31, 2017 2:36:33 AM EDT, Aman Deep Singh <
> >>> amandeep.cool99@gmail.com> wrote:
> >>>> Hi,
> >>>> I am trying to use the shingle filter, but it is not creating the
> >>>> query I expect.
> >>>> 
> >>>> my schema is
> >>>> <fieldType name="cust_shingle" class="solr.TextField" positionIncrementGap="100">
> >>>>   <analyzer>
> >>>>     <tokenizer class="solr.StandardTokenizerFactory"/>
> >>>>     <filter class="solr.ShingleFilterFactory" outputUnigrams="false" maxShingleSize="4"/>
> >>>>     <filter class="solr.LowerCaseFilterFactory"/>
> >>>>   </analyzer>
> >>>> </fieldType>
> >>>> <field name="nameShingle" type="cust_shingle" indexed="true" stored="true"/>
> >>>> 
> >>>> my solr query is
> >>>> 
> >>>> http://localhost:8983/solr/productCollection/select?defType=edismax&debugQuery=true&q=one%20plus%20one%20four&qf=nameShingle&sow=false&wt=xml
> >>>> 
> >>>> and it was creating the parsed query as
> >>>> <str name="parsedquery">
> >>>> (+(DisjunctionMaxQuery(((+nameShingle:one plus +nameShingle:plus one +nameShingle:one four)))
> >>>> DisjunctionMaxQuery(((+nameShingle:one plus +nameShingle:plus one four)))
> >>>> DisjunctionMaxQuery(((+nameShingle:one plus one +nameShingle:one four)))
> >>>> DisjunctionMaxQuery((nameShingle:one plus one four)))~1)/no_coord
> >>>> </str>
> >>>> <str name="parsedquery_toString">
> >>>> +((((+nameShingle:one plus +nameShingle:plus one +nameShingle:one four))
> >>>> ((+nameShingle:one plus +nameShingle:plus one four))
> >>>> ((+nameShingle:one plus one +nameShingle:one four))
> >>>> (nameShingle:one plus one four))~1)
> >>>> </str>
> >>>> 
> >>>> 
> >>>> So the tokens are created correctly, but the query uses the Boolean +
> >>>> operator, which causes the problem. If I have a document with the name
> >>>> "one plus one", it should match according to the shingles, since its
> >>>> tokens will be ("one plus", "one plus one", "plus one").
> >>>> I have tried q.op and played around with mm as well, but nothing
> >>>> gives me the correct response.
> >>>> Any idea how I can fetch that document even if it is missing some of
> >>>> the query tokens?
> >>>> 
> >>>> My expected response is to get the document "one plus one" even when
> >>>> the user query has an additional term, like "one plus one two" and
> >>>> so on.
> >>>> 
> >>>> 
> >>>> Thanks,
> >>>> Aman Deep Singh
> >>> 
> >>> --
> >>> Sent from my Android device with K-9 Mail. Please excuse my brevity.
> > 
> 
> 
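
The behaviour discussed in this thread can be reproduced with a small Python
model. This is only a sketch under stated assumptions, not Lucene's or Solr's
actual implementation. All function names are hypothetical. The path rule in
graph_paths (the next shingle starts at the last word of the previous one) is
inferred from the parsed query quoted above. matches_flat assumes that with
enableGraphQueries="false" the shingles become independent optional clauses
under mm=1, which is the result Aman expected.

```python
def shingles_at(tokens, start, min_size=2, max_size=4):
    """Return (size, shingle) pairs for word n-grams beginning at `start`."""
    return [
        (size, " ".join(tokens[start:start + size]))
        for size in range(min_size, max_size + 1)
        if start + size <= len(tokens)
    ]


def all_shingles(text, min_size=2, max_size=4):
    """Set of every lowercased shingle of the text (no unigrams)."""
    tokens = text.lower().split()
    return {
        shingle
        for start in range(len(tokens))
        for _, shingle in shingles_at(tokens, start, min_size, max_size)
    }


def graph_paths(text, min_size=2, max_size=4):
    """Enumerate shingle sequences the way the quoted parsed query groups them.

    Assumption: each following shingle begins at the last word of the
    previous one, and a path is complete when a shingle reaches the end.
    """
    tokens = text.lower().split()
    paths = []

    def walk(start, path):
        for size, shingle in shingles_at(tokens, start, min_size, max_size):
            if start + size == len(tokens):
                paths.append(path + [shingle])
            else:
                walk(start + size - 1, path + [shingle])

    walk(0, [])
    return paths


def matches_graph(query, doc):
    """mm=1 over paths, but every shingle inside a path is required (+)."""
    indexed = all_shingles(doc)
    return any(all(s in indexed for s in path) for path in graph_paths(query))


def matches_flat(query, doc):
    """Assumed non-graph behaviour: any single shared shingle is a match."""
    return bool(all_shingles(query) & all_shingles(doc))


if __name__ == "__main__":
    for path in graph_paths("one plus one four"):
        print(path)
    print(matches_graph("one plus one four", "one plus one"))  # False
    print(matches_flat("one plus one four", "one plus one"))   # True
```

Every path for "one plus one four" ends in a shingle containing "four", and
that shingle is required, so the document "one plus one" cannot satisfy any
path even though it shares "one plus", "plus one" and "one plus one" with the
query.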
