lucene-java-user mailing list archives

From Grant Ingersoll <gsing...@apache.org>
Subject Re: Multi Field search without Multifieldqueryparser
Date Tue, 23 Sep 2008 14:55:32 GMT

On Sep 23, 2008, at 8:35 AM, Anshul jain wrote:

> Yes, you are partly correct.
>
> What I need is for Lucene to support two types of queries for the
> following document:
> name: abc^10
> organization: xyz^3
>
> structured query:
> name: abc and organization: xyz
>
> unstructured query:
> default_field: abc ^5 and xyz

And what field(s) should "xyz" be searched against? Again, I ask: how
do you know what fields "xyz" should go against, and why does abc go
against the default_field? You've said it shouldn't go against all
fields (b/c there are thousands of them), and you've said it shouldn't
go against a catch-all field, but otherwise I still have no clue what
your criteria are for which fields xyz should search. Are you saying
that you want it to intelligently know that when "xyz" comes in, it
should search the organization field?
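
To make that question concrete, here is a minimal sketch of what
"intelligently knowing" would take: something has to remember which
fields each term was seen in and rewrite bare terms accordingly. The
class TermRouter and its methods observe/route are hypothetical names
for illustration only; nothing here is a Lucene API, and the in-memory
map is an assumption that may not be practical at 10 million documents.

```java
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;
import java.util.Set;
import java.util.StringJoiner;
import java.util.TreeSet;

public class TermRouter {
    // Hypothetical dictionary filled while indexing: term -> fields it appeared in.
    private final Map<String, Set<String>> termFields = new HashMap<>();

    // Record that `term` was indexed under `field`.
    public void observe(String field, String term) {
        termFields
            .computeIfAbsent(term.toLowerCase(Locale.ROOT), k -> new TreeSet<>())
            .add(field);
    }

    // Rewrite a bare term into a fielded clause in query-parser syntax.
    // Unknown terms fall back to the default field. No escaping is done.
    public String route(String term, String defaultField) {
        Set<String> fields = termFields.get(term.toLowerCase(Locale.ROOT));
        if (fields == null || fields.isEmpty()) {
            return defaultField + ":" + term;
        }
        if (fields.size() == 1) {
            return fields.iterator().next() + ":" + term;
        }
        StringJoiner or = new StringJoiner(" OR ", "(", ")");
        for (String f : fields) {
            or.add(f + ":" + term);
        }
        return or.toString();
    }

    public static void main(String[] args) {
        TermRouter router = new TermRouter();
        router.observe("name", "abc");
        router.observe("organization", "xyz");
        // prints: name:abc AND organization:xyz
        System.out.println(router.route("abc", "default_field")
            + " AND " + router.route("xyz", "default_field"));
    }
}
```

The resulting string could then be handed to the regular query parser,
but note that this only moves the "which field?" decision into a
lookup table that you have to build and maintain yourself.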

Other than seconding Umesh's or Dino's suggestions of using machine  
learning or heuristics or using some type of templating system, I'm  
not sure what else to offer.  You might look at Solr's Dismax Query  
Parser, which allows you to specify the field structure of queries in  
a multi-field way, but again, I doubt that is wholly what you are  
looking for.
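
As a rough illustration of the templating idea, the sketch below
expands each unfielded term over a small, hand-picked set of fields
with per-field boosts and emits an ordinary query-parser string. The
class name MultiFieldExpander, the method expand, and the field/boost
choices are assumptions made up for this example. It is a plain
boolean expansion, not DisMax scoring, and it is only reasonable when
the field list is short rather than thousands of entries.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.StringJoiner;

public class MultiFieldExpander {
    // Expand every bare term into (f1:term^b1 OR f2:term^b2 ...) and AND the groups.
    // Terms are assumed to be already analyzed and free of special characters.
    public static String expand(List<String> terms, Map<String, Float> fieldBoosts) {
        StringJoiner and = new StringJoiner(" AND ");
        for (String term : terms) {
            StringJoiner or = new StringJoiner(" OR ", "(", ")");
            for (Map.Entry<String, Float> e : fieldBoosts.entrySet()) {
                or.add(e.getKey() + ":" + term + "^" + e.getValue());
            }
            and.add(or.toString());
        }
        return and.toString();
    }

    public static void main(String[] args) {
        // Hypothetical template: the few fields an unstructured query may hit.
        Map<String, Float> boosts = new LinkedHashMap<>();
        boosts.put("name", 10.0f);
        boosts.put("organization", 3.0f);
        // prints: (name:abc^10.0 OR organization:abc^3.0) AND (name:xyz^10.0 OR organization:xyz^3.0)
        System.out.println(expand(Arrays.asList("abc", "xyz"), boosts));
    }
}
```

In Solr, the rough equivalent is to list the fields and boosts in the
dismax qf parameter (for example qf=name^10 organization^3) and let
the parser do the expansion.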

>
>
> But I do not want to create one more field (default_field) that will
> contain all the values concatenated in it. Also, even if I collect all
> the fields during indexing and use them with the multi-field query
> parser, the query will become very inefficient, as there can be
> thousands of fields. I think that should clarify my point.
>
>
>
> On Tue, Sep 23, 2008 at 1:58 PM, Grant Ingersoll  
> <gsingers@apache.org> wrote:
>> So, the piece I'm missing is how you know which field goes with which
>> terms. In other words, how do you know xyz goes against organization
>> and abc against name? Your wording implies that you don't know this
>> beforehand, yet you are somehow suggesting that Lucene should be able
>> to do it. Correct me if I'm wrong.
>>
>> -Grant
>>
>>
>> On Sep 23, 2008, at 6:51 AM, Anshul jain wrote:
>>
>>> Here is what I'm trying to do:
>>>
>>> say a lucene document:
>>> name: abc ^10
>>> organization: xyz ^3
>>>
>>> ^10 and ^3 are boosts in the document.
>>>
>>> now if I query name: abc ^5 AND organization: xyz this will work.
>>>
>>> but if I query (default_field): abc^5 AND xyz this won't work.
>>>
>>> Now what I want is that a text can be associated with more than one
>>> field.
>>> i.e.
>>>
>>> (field1,field2,field3):value
>>> name,(default_field),title: abc^10
>>> organization,(default_field),institute: xyz^3
>>>
>>> then both of my queries will work.
>>>
>>> Is it possible to do so in Lucene without changing the source?
>>> If not, can anyone please explain Lucene's indexing and searching
>>> mechanism, so that I can start working on it?
>>>
>>> The solution given by the java-users list won't work for me, as I do
>>> not want to add all the contents of the document to a single field
>>> and then search that field: this would increase the index size, and
>>> I have to index more than 10 million documents. Also,
>>> MultiFieldQueryParser will make query execution inefficient, as
>>> there will be thousands of fields.
>>>
>>> If I store just a single field such as (default_field): "name abc
>>> organization xyz", then some other documents that are not relevant
>>> might get selected. Also, I want to boost individual fields in a
>>> document.
>>>
>>> Anshul
>>>
>>> ---------------------------------------------------------------------
>>> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
>>> For additional commands, e-mail: java-user-help@lucene.apache.org
>>>
>>
>> --------------------------
>> Grant Ingersoll
>> http://www.lucidimagination.com
>>
>> Lucene Helpful Hints:
>> http://wiki.apache.org/lucene-java/BasicsOfPerformance
>> http://wiki.apache.org/lucene-java/LuceneFAQ
>>
>>
>
>
>
> -- 
> Anshul Jain

--------------------------
Grant Ingersoll
http://www.lucidimagination.com

Lucene Helpful Hints:
http://wiki.apache.org/lucene-java/BasicsOfPerformance
http://wiki.apache.org/lucene-java/LuceneFAQ