lucenenet-user mailing list archives

From Omri Suissa <omri.sui...@diffdoof.com>
Subject Re: Unexpected MultiFieldQueryParser result
Date Sun, 17 Feb 2013 13:48:05 GMT
Hi,
Thanks, I understand your solution, but in my case I don't want to parse the
fields myself; the query parser should do this.
Basically, what I'm asking is why I got the result that I got. It doesn't
make any sense to me...

Thanks,
Omri

Omri Suissa, VP R&D
DiffDoof Ltd.
11, Galgaley Haplada Street, P.O.Box 2150
Herzlia Pituach 46120, Israel
Tel:  +972 9 7724228
Cell: +972 54 5395206
Fax:  +972 9 9512577
www.DiffDoof.com <http://www.DiffDoof.com>


On Thu, Feb 14, 2013 at 7:02 PM, Simon Svensson <sisve@devhost.se> wrote:

> Hi,
>
> I've built something similar, basically a large free-text textbox for
> simple queries. I chose to have one big concatenated fulltext field
> instead of searching separate fields. I call QueryParser.Parse several
> times, first with the fulltext field, then once each for my title and
> contributor fields. I then combine them in a BooleanQuery where
> only the fulltext field is required (must). This boosts documents with
> words found in the title or contributor field without requiring them to
> be present. I've since learned about building custom queries and weights
> and generating your own boosts that way, but never bothered to rebuild how
> my search works.
>
> However, one step is to detect whether the user has entered any fields
> and, if so, use the user-provided query as-is. This works because I've
> only sent it through the QueryParser, not the MultiFieldQueryParser. Here's
> the code I have that takes a user-provided string and does all this. Some
> methods are not present and are left as an exercise for the implementor. ;)
>
> The FieldsInQuery method uses a QueryVisitor to iterate the query tree and
> extract all field names. You can find one at
> <https://github.com/devhost/Corelicious/blob/master/Corelicious.Lucene/QueryVisitor.cs>.
> You would basically need to override VisitField(String field) to store all
> fields in a HashSet<String>. The Parse method is just a wrapper for
> QueryParser.Parse that handles any exceptions thrown. MarkClausesAsOptional
> uses another QueryVisitor to set all clauses to optional (override
> VisitOccur and return Occur.SHOULD).
>
> private Query FreeTextQuery(String value, Analyzer analyzer) {
>     if (String.IsNullOrEmpty(value))
>         return null;
>
>     var ftQuery = Parse(IndexFields.FullTextField, value, analyzer,
>         QueryParser.Operator.OR);
>     if (ftQuery == null)
>         return null;
>
>     // Use the parsed query if there's a manually specified field.
>     if (FieldsInQuery(ftQuery).Any(f => f != IndexFields.FullTextField))
>         return ftQuery;
>
>     // We have a query without any specified fields. We should parse this
>     // as all-words-required.
>     ftQuery = Parse(IndexFields.FullTextField, value, analyzer,
>         QueryParser.Operator.AND);
>
>     var outerQuery = new BooleanQuery();
>     outerQuery.Add(ftQuery, Occur.MUST);
>
>     var titleQuery = Parse(IndexFields.TitleField, value, analyzer);
>     if (titleQuery != null) {
>         titleQuery = MarkClausesAsOptional(titleQuery);
>         outerQuery.Add(titleQuery, Occur.SHOULD);
>     }
>
>     var contributorQuery = Parse(IndexFields.ContributorField, value,
>         analyzer);
>     if (contributorQuery != null) {
>         contributorQuery = MarkClausesAsOptional(contributorQuery);
>         outerQuery.Add(contributorQuery, Occur.SHOULD);
>     }
>
>     return outerQuery;
> }
>
> // Simon
>
>
> On 2013-02-14 15:05, Omri Suissa wrote:
>
>> No one? :)
>>
>> On Wed, Feb 6, 2013 at 12:41 PM, Omri Suissa <omri.suissa@diffdoof.com>
>> wrote:
>>
>>> Hi,
>>>
>>> I'm using MultiFieldQueryParser to allow advanced search in my
>>> application.
>>>
>>> In some cases the user doesn't send the field names and in some cases the
>>> user sends them.
>>>
>>> In this case the user sent the following query:
>>>
>>> (((name:10th AND name:10th) AND (name:10th AND name:10th) AND name:10th
>>> AND name:10th) AND name:10th)
>>>
>>> As you can see all the conditions are the same (name:10th).
>>>
>>> My code looks like this:
>>>
>>> MultiFieldQueryParser queryParser = new MultiFieldQueryParser(
>>>     Lucene.Net.Util.Version.LUCENE_30, fields, analyzer, boosts);
>>>
>>> queryParser.AllowLeadingWildcard = true;
>>>
>>> try
>>> {
>>>     objQuery = queryParser.Parse(realQuery);
>>>     return objQuery;
>>> }
>>> catch (ParseException pe)
>>> {
>>>     return null;
>>> }
>>>
>>> Here [fields] is a list of default fields used when the user didn't
>>> send any fields (not relevant in this case), [analyzer] is
>>> StandardAnalyzer(Lucene.Net.Util.Version.LUCENE_30), and [boosts] holds
>>> some default boosts (also not relevant in this case).
>>>
>>> The result I got back is:
>>>
>>> +(+(+(((name:"name 10th" +(((name:10th AND name:"name 10th ? name 10th")
>>> +(+(((name:10th AND name:10th) AND (name:"name 10th ? name 10th ? name
>>> 10th" +(((name:10th AND name:10th) AND (name:10th AND name:"name 10th ?
>>> name 10th ? name 10th ? name 10th") +(((name:10th AND name:10th) AND
>>> (name:10th AND name:10th) AND name:"name 10th ? name 10th ? name 10th ?
>>> name 10th ? name 10th" +(((name:10th AND name:10th) AND (name:10th AND
>>> name:10th) AND name:10th AND name:"name 10th ? name 10th ? name 10th ?
>>> name 10th ? name 10th ? name 10th") +(((name:10th AND name:10th) AND
>>> (name:10th AND name:10th) AND name:10th AND name:10th) AND name:"name 10th
>>> ? name 10th ? name 10th ? name 10th ? name 10th ? name 10th ? name 10th"
>>>
>>> As you can see, this is not what I expected. The query searches for
>>> different things:
>>>
>>> -          name:10th
>>>
>>> -          name:"name 10th ? name 10th"
>>>
>>> -          name:"name 10th ? name 10th ? name 10th"
>>>
>>> -          name:"name 10th ? name 10th ? name 10th ? name 10th"
>>>
>>> -          and so on…
>>>
>>> Why is that? The way I see it, the user sent the same condition over and
>>> over again, with some brackets and ANDs between them that should not
>>> affect a thing…
>>>
>>> If this were an "if" condition in C#, it would be just like saying "if
>>> (name.Contains("10th") == true)".
>>>
>>>
>>>
>>> Thanks,
>>>
>>> Omri
>>>
>>>
>
