lucene-solr-user mailing list archives

From Alessandro Benedetti <benedetti.ale...@gmail.com>
Subject Re: Selective field query
Date Tue, 13 Oct 2015 11:00:36 GMT
The first thing I would suggest is using the Analysis tool to explore your
analysis chain at both index and query time.
That is the first step towards understanding whether you are actually
tokenising and token filtering as expected.

Then you should experiment with separate fields (if the original field is
single-valued, you are not going to lose the relation).
Each field can then provide the kind of search you expect, for example:

Service Name: NGram token filtered (or whatever you need)
Service Id: KeywordTokenizer (to keep only one token)
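
As a rough sketch of that split, the schema could look something like the
fragment below. The field and type names (ServiceName, ServiceName_ngram,
ServiceId, text_ngram, exact_id) are only placeholders, and text_general is
assumed to be the type from the example schema; adapt them to your own setup.

```xml
<!-- Partial-word matching: n-grams are produced at index time only -->
<fieldType name="text_ngram" class="solr.TextField" positionIncrementGap="100">
  <analyzer type="index">
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.NGramFilterFactory" minGramSize="3" maxGramSize="7"/>
  </analyzer>
  <analyzer type="query">
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
</fieldType>

<!-- Exact matching: the whole value is kept as a single token -->
<fieldType name="exact_id" class="solr.TextField">
  <analyzer>
    <tokenizer class="solr.KeywordTokenizerFactory"/>
  </analyzer>
</fieldType>

<!-- Placeholder field names; the n-gram field is filled by the copyField -->
<field name="ServiceName" type="text_general" indexed="true" stored="true"/>
<field name="ServiceName_ngram" type="text_ngram" indexed="true" stored="false"/>
<field name="ServiceId" type="exact_id" indexed="true" stored="true"/>
<copyField source="ServiceName" dest="ServiceName_ngram"/>
```

With maxGramSize="7", a query term longer than seven characters will not
match the n-gram field, so the gram sizes probably need tuning against real
service names. Check the result in the Analysis tool before reindexing.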

Can you give some additional details?

Cheers

On 13 October 2015 at 10:36, Colin Hunter <greenfibre@gmail.com> wrote:

> Thanks Scott.
> That is definitely moving things in the right direction
>
> I have another question that relates to this. It is also requested to
> implement a partial word search on the service name field.
> However, each service also has a unique identifier (string). This field
> requires exact string matching.
> I have attempted making a copy field for Service Name using the
> NGramTokenizerFactory, as below.
>
>     <!-- n-gram tokenization -->
>     <fieldType name="_ngram" class="solr.TextField" positionIncrementGap="100">
>       <analyzer type="index">
>         <tokenizer class="org.apache.solr.analysis.NGramTokenizerFactory" minGramSize="3" maxGramSize="7"/>
>         <filter class="solr.LowerCaseFilterFactory"/>
>       </analyzer>
>       <analyzer type="query">
>         <tokenizer class="solr.StandardTokenizerFactory"/>
>         <filter class="solr.LowerCaseFilterFactory"/>
>       </analyzer>
>     </fieldType>
>
> While the debugQuery info showed the _ngram results, I was having issues
> building a query that would return these results along with the regular
> search. (Your previous response may well clarify this).
> When I set this to return on all fields, then the full string match
> required for the service UI no longer works.
>
> I certainly have to explore further re the eDisMax parser.
> However, any advice that can be offered, regarding meeting these different
> requirements in a single query would be very helpful.
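
One possible way to combine these in a single query is to let eDisMax search
several fields at once, each with its own analysis. The fragment below is only
a sketch of request handler defaults for solrconfig.xml; the field names
(ServiceId, ServiceName, ServiceName_ngram) and the boost values are
assumptions, not something taken from your configuration.

```xml
<requestHandler name="/select" class="solr.SearchHandler">
  <lst name="defaults">
    <!-- Use the Extended DisMax parser for the user's q parameter -->
    <str name="defType">edismax</str>
    <!-- Exact id match ranks highest, then whole words, then n-grams -->
    <str name="qf">ServiceId^10 ServiceName^3 ServiceName_ngram</str>
    <!-- Boost documents where the terms appear together as a phrase -->
    <str name="pf">ServiceName^5</str>
  </lst>
</requestHandler>
```

The user input then goes into q without any field prefix, and debug=all
should show how it was expanded across the three fields.
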
>
> Many Thanks
> Colin
>
> On Tue, Oct 13, 2015 at 5:49 AM, Scott Stults <
> sstults@opensourceconnections.com> wrote:
>
> > Colin,
> >
> > The other thing you'll want to keep in mind (and you'll find this out with
> > debugQuery) is that the query parser is going to take your
> > ServiceName:(Search Service) and turn it into two queries --
> > ServiceName:(Search) ServiceName:(Service). That's because the query parser
> > breaks on whitespace. My bet is you have a lot of entries with a name of "X
> > Service" and the second part of your query is hitting them. Phrase Field
> > might be your friend here:
> >
> > https://wiki.apache.org/solr/ExtendedDisMax#pf_.28Phrase_Fields.29
> >
> >
> > -Scott
> >
> > On Mon, Oct 12, 2015 at 4:15 AM, Colin Hunter <greenfibre@gmail.com>
> > wrote:
> >
> > > Thanks Erick, I'm sure this will be valuable in implementing ngram filter
> > > factory
> > >
> > > On Fri, Oct 9, 2015 at 4:38 PM, Erick Erickson <erickerickson@gmail.com>
> > > wrote:
> > >
> > > > Colin:
> > > >
> > > > > Adding &debug=all to your query is your friend here; the
> > > > > parsedquery_toString entry in the debug output will show you exactly
> > > > > what is searched against.
> > > >
> > > > Best,
> > > > Erick
> > > >
> > > > > On Fri, Oct 9, 2015 at 2:09 AM, Colin Hunter <greenfibre@gmail.com>
> > > > > wrote:
> > > > > Ah ha...   the copy field...  makes sense.
> > > > > Thank You.
> > > > >
> > > > > > On Fri, Oct 9, 2015 at 10:04 AM, Upayavira <uv@odoko.co.uk>
> > > > > > wrote:
> > > > >
> > > > >>
> > > > >>
> > > > >> On Fri, Oct 9, 2015, at 09:54 AM, Colin Hunter wrote:
> > > > >> > Hi
> > > > >> >
> > > > > >> > I am working on a complex search utility with an index created via
> > > > > >> > data import from an extensive MySQL database.
> > > > > >> > There are many ways in which the index is searched. One of the
> > > > > >> > utility input fields searches only on a Service Name. However, if I
> > > > > >> > target the query as q=ServiceName:"Searched service", this only
> > > > > >> > returns an exact string match. If q=Searched Service, the query
> > > > > >> > still returns results from all indexed data.
> > > > > >> >
> > > > > >> > Is there a way to construct a query to only return results from one
> > > > > >> > field of a doc?
> > > > > >> > I have tried setting indexed=false, stored=true on unwanted fields,
> > > > > >> > but these appear to have still been returned in results.
> > > > >>
> > > > >> q=ServiceName:(Searched Service)
> > > > >>
> > > > >> That'll look in just one field.
> > > > >>
> > > > > >> Remember changing indexed to false doesn't impact the stuff already
> > > > > >> in your index. And the reason you are likely getting all that stuff
> > > > > >> is because you have a copyField that copies it over into the 'text'
> > > > > >> field. If you'll never want to search on some fields, switch them to
> > > > > >> indexed=false, make sure you aren't doing a copyField on them, and
> > > > > >> then reindex.
> > > > >>
> > > > >> Upayavira
> > > > >>
> > > > >
> > > > >
> > > > >
> > > > > --
> > > > > www.gfc.uk.net
> > > >
> > >
> > >
> > >
> > > --
> > > www.gfc.uk.net
> > >
> >
> >
> >
> > --
> > Scott Stults | Founder & Solutions Architect | OpenSource Connections, LLC
> > | 434.409.2780
> > http://www.opensourceconnections.com
> >
>
>
>
> --
> www.gfc.uk.net
>



-- 
--------------------------

Benedetti Alessandro
Visiting card - http://about.me/alessandro_benedetti
Blog - http://alexbenedetti.blogspot.co.uk

"Tyger, tyger burning bright
In the forests of the night,
What immortal hand or eye
Could frame thy fearful symmetry?"

William Blake - Songs of Experience -1794 England
