lucenenet-user mailing list archives

From Omri Suissa <omri.sui...@diffdoof.com>
Subject Re: Unexpected MultiFieldQueryParser result
Date Sun, 17 Feb 2013 17:54:33 GMT
I've simplified the test to the following query:
*name:10th AND test:10th*

The result is:
*+name:10th +name:10th AND test:"10th and test 10th"*

The expected result is (in my opinion):
*+name:10th +test:10th*

I would be glad if someone could explain the logic behind this result.

Thanks!
Omri

On Sun, Feb 17, 2013 at 3:48 PM, Omri Suissa <omri.suissa@diffdoof.com> wrote:

> Hi,
> Thanks, I understand your solution, but in my case I don't want to parse
> the fields myself; the query parser should do this.
> What I'm basically asking is why I got the result that I got. It doesn't
> make any sense to me...
>
> Thanks,
> Omri
>
>
> On Thu, Feb 14, 2013 at 7:02 PM, Simon Svensson <sisve@devhost.se> wrote:
>
>> Hi,
>>
>> I've built something similar, basically a large free-text textbox for
>> simple queries. I chose to have one big concatenated fulltext field
>> instead of searching separate fields. I call QueryParser.Parse several
>> times, first with the fulltext field, then twice more for my title and
>> contributor fields. I then combine them in a BooleanQuery where
>> only the fulltext field is required (must). This boosts documents with
>> words found in the title or contributor field, without requiring them to
>> be present. I've since learned about building custom queries and weights and
>> generating your own boosts that way, but never bothered to rebuild how my
>> search works.
>>
>> However, one step is to detect whether the user has entered any fields and,
>> if so, to use the user-provided query as-is. This works since
>> I've only sent it through the QueryParser, not the MultiFieldQueryParser.
>> Here's the code I have that takes a user-provided string and does all this.
>> Some methods are not present and are left as an exercise for the
>> implementor. ;)
>>
>> The FieldsInQuery method uses a QueryVisitor to iterate the query tree
>> and extract all field names. You can find one at
>> https://github.com/devhost/Corelicious/blob/master/Corelicious.Lucene/QueryVisitor.cs
>> You would basically need to override VisitField(String field) to store all
>> fields in a HashSet<String>. The Parse method is just a wrapper for
>> QueryParser.Parse that handles any exceptions thrown. MarkClausesAsOptional
>> uses another QueryVisitor to set all clauses to optional (override
>> VisitOccur and return Occur.SHOULD).
>>
>> private Query FreeTextQuery(String value, Analyzer analyzer) {
>>     if (String.IsNullOrEmpty(value))
>>         return null;
>>
>>     var ftQuery = Parse(IndexFields.FullTextField, value, analyzer,
>>         QueryParser.Operator.OR);
>>     if (ftQuery == null)
>>         return null;
>>
>>     // Use the parsed query if there's a manually specified field.
>>     if (FieldsInQuery(ftQuery).Any(f => f != IndexFields.FullTextField))
>>         return ftQuery;
>>
>>     // We have a query without any specified fields. We should parse this
>>     // as all-words-required.
>>     ftQuery = Parse(IndexFields.FullTextField, value, analyzer,
>>         QueryParser.Operator.AND);
>>
>>     var outerQuery = new BooleanQuery();
>>     outerQuery.Add(ftQuery, Occur.MUST);
>>
>>     var titleQuery = Parse(IndexFields.TitleField, value, analyzer);
>>     if (titleQuery != null) {
>>         titleQuery = MarkClausesAsOptional(titleQuery);
>>         outerQuery.Add(titleQuery, Occur.SHOULD);
>>     }
>>
>>     var contributorQuery = Parse(IndexFields.ContributorField, value,
>>         analyzer);
>>     if (contributorQuery != null) {
>>         contributorQuery = MarkClausesAsOptional(contributorQuery);
>>         outerQuery.Add(contributorQuery, Occur.SHOULD);
>>     }
>>
>>     return outerQuery;
>> }
>>
>> // Simon
>>
>>
>> On 2013-02-14 15:05, Omri Suissa wrote:
>>
>>> No one? :)
>>>
>>> On Wed, Feb 6, 2013 at 12:41 PM, Omri Suissa <omri.suissa@diffdoof.com> wrote:
>>>
>>>> Hi,
>>>>
>>>> I'm using MultiFieldQueryParser to allow advanced search in my
>>>> application.
>>>>
>>>> In some cases the user doesn't send the field names, and in some cases
>>>> the user sends them.
>>>>
>>>> In this case the user sent the following query:
>>>>
>>>> (((name:10th AND name:10th) AND (name:10th AND name:10th) AND name:10th
>>>> AND name:10th) AND name:10th)
>>>>
>>>> As you can see all the conditions are the same (name:10th).
>>>>
>>>> My code looks like this:
>>>>
>>>> MultiFieldQueryParser queryParser = new MultiFieldQueryParser(
>>>>     Lucene.Net.Util.Version.LUCENE_30, fields, analyzer, boosts);
>>>>
>>>> queryParser.AllowLeadingWildcard = true;
>>>>
>>>> try
>>>> {
>>>>     objQuery = queryParser.Parse(realQuery);
>>>>     return objQuery;
>>>> }
>>>> catch (ParseException pe)
>>>> {
>>>>     return null;
>>>> }
>>>>
>>>> Where [fields] is a list of default fields used if the user didn't
>>>> send fields (not relevant in this case), [analyzer] is
>>>> StandardAnalyzer(Lucene.Net.Util.Version.LUCENE_30), and [boosts] holds
>>>> some default boosts (also not relevant in this case).
>>>>
>>>> The result I got back is:
>>>>
>>>> +(+(+(((name:"name 10th" +(((name:10th AND name:"name 10th ? name 10th")
>>>> +(+(((name:10th AND name:10th) AND (name:"name 10th ? name 10th ? name
>>>> 10th" +(((name:10th AND name:10th) AND (name:10th AND name:"name 10th ?
>>>> name 10th ? name 10th ? name 10th") +(((name:10th AND name:10th) AND
>>>> (name:10th AND name:10th) AND name:"name 10th ? name 10th ? name 10th ?
>>>> name 10th ? name 10th" +(((name:10th AND name:10th) AND (name:10th AND
>>>> name:10th) AND name:10th AND name:"name 10th ? name 10th ? name 10th ?
>>>> name
>>>> 10th ? name 10th ? name 10th") +(((name:10th AND name:10th) AND
>>>> (name:10th
>>>> AND name:10th) AND name:10th AND name:10th) AND name:"name 10th ? name
>>>> 10th
>>>> ? name 10th ? name 10th ? name 10th ? name 10th ? name 10th"
>>>>
>>>> As you can see, this is not what I expected. The query searches for
>>>> different things:
>>>>
>>>> - name:10th
>>>> - name:"name 10th ? name 10th"
>>>> - name:"name 10th ? name 10th ? name 10th"
>>>> - name:"name 10th ? name 10th ? name 10th ? name 10th"
>>>> - and so on…
>>>>
>>>> Why is that? The way I see it, the user sent the same condition over and
>>>> over again, with some brackets and ANDs between them that should not
>>>> affect a thing…
>>>>
>>>> If this were an "if" condition in C#, it would be just like saying "if
>>>> (name.Contains("10th") == true)".
>>>>
>>>>
>>>>
>>>> Thanks,
>>>>
>>>> Omri
>>>>
>>>>
>>
>
