lucene-java-user mailing list archives

From "Umesh Prasad" <umesh.i...@gmail.com>
Subject Re: Multi Field search without Multifieldqueryparser
Date Mon, 22 Sep 2008 09:29:34 GMT
Hi,
Having an extra indexed but unstored field is equivalent to having a bag of
words, so the quality of the search results will suffer.
Consider an example with two documents:

Text : ----  President  of USA--
Other Fields ..

Text : --
Occupation: President of USA

In both cases the searchable-mash field is just a bag of words that contains
"President of USA", so both documents will score almost the same, which is
undesirable.
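
To make the point concrete, here is a minimal plain-Java sketch (no Lucene
involved). The class name MashDemo, the helper methods and the filler field
values ("Journalist", "A short biography") are all made up for illustration,
and counting matched terms is only a crude stand-in for real scoring:

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Illustration only: a toy "searchable-mash" built without Lucene.
class MashDemo {
    // Concatenate all field values into one lowercase token list.
    // The field each token came from is thrown away here.
    static List<String> mash(Map<String, String> doc) {
        List<String> tokens = new ArrayList<>();
        for (String value : doc.values()) {
            for (String t : value.toLowerCase().split("\\W+")) {
                if (!t.isEmpty()) tokens.add(t);
            }
        }
        return tokens;
    }

    // Count how many query terms occur anywhere in the mash.
    static int matches(List<String> mash, String query) {
        int hits = 0;
        for (String q : query.toLowerCase().split("\\W+")) {
            if (!q.isEmpty() && mash.contains(q)) hits++;
        }
        return hits;
    }

    public static void main(String[] args) {
        // Hypothetical document that only mentions the phrase in its text.
        Map<String, String> mention = new LinkedHashMap<>();
        mention.put("Text", "A report that mentions the President of USA");
        mention.put("Occupation", "Journalist");

        // Hypothetical document whose Occupation field holds the phrase.
        Map<String, String> holder = new LinkedHashMap<>();
        holder.put("Text", "A short biography");
        holder.put("Occupation", "President of USA");

        String query = "President of USA";
        System.out.println(matches(mash(mention), query)); // prints 3
        System.out.println(matches(mash(holder), query));  // prints 3
    }
}
```

Both documents match all three query terms, so nothing in the mash field
tells them apart.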


Another solution is to learn which field each term of the unstructured
query belongs to, and then build the query programmatically.
You will have to write two additional subsystems:
1. A field learning system
2. A customized query tokenizer and query parser
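
A rough plain-Java sketch of how the two subsystems could fit together. All
names here (FieldGuessDemo, learn, toStructured) are hypothetical, and the
lookup table learned from a single sample document is only a toy stand-in
for a real field learning system. "value" is the default field from the
original question, used for terms that were never seen:

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Illustration only: guess a field for each query term, then emit field:term clauses.
class FieldGuessDemo {
    // Toy "field learning": remember the first field in which each term was seen.
    static Map<String, String> learn(Map<String, String> doc) {
        Map<String, String> termToField = new HashMap<>();
        for (Map.Entry<String, String> e : doc.entrySet()) {
            for (String t : e.getValue().toLowerCase().split("\\W+")) {
                if (!t.isEmpty()) termToField.putIfAbsent(t, e.getKey());
            }
        }
        return termToField;
    }

    // Toy query tokenizer/parser: split on whitespace, drop the AND operator,
    // and prefix every remaining term with its guessed field.
    static String toStructured(String query, Map<String, String> termToField,
                               String defaultField) {
        List<String> clauses = new ArrayList<>();
        for (String t : query.trim().split("\\s+")) {
            if (t.isEmpty() || t.equalsIgnoreCase("AND")) continue;
            String term = t.toLowerCase();
            clauses.add(termToField.getOrDefault(term, defaultField) + ":" + term);
        }
        return String.join(" AND ", clauses);
    }

    public static void main(String[] args) {
        // The sample document from the original question.
        Map<String, String> doc = new LinkedHashMap<>();
        doc.put("name", "George Bush");
        doc.put("Sex", "Male");
        doc.put("Occupation", "President of USA");

        Map<String, String> learned = learn(doc);
        System.out.println(toStructured("George Bush AND President", learned, "value"));
        // prints: name:george AND name:bush AND Occupation:president
    }
}
```

The resulting string is only meant to show the shape of the rewritten query.
A real system would have to cope with terms that occur in several fields.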

That said, the best solution depends on your requirements.

Thanks
Umesh
On Mon, Sep 22, 2008 at 2:18 PM, Dino Korah <dckorah@gmail.com> wrote:

> I would think, with the current capabilities of Lucene, denormalisation is
> the solution. Create an extra indexed but not stored field called
> "searchable-mash" which will hold the values from all fields, with added
> words to connect the data, like "Male named George Bush whose occupation is
> President of USA ... Etc", so that you can run that generic query on that
> field.
>
> So you pass "searchable-mash: George bush and president" to query parser.
>
> You will pay a penalty here: a bigger index and slower indexing.
>
> -----Original Message-----
> From: Anshul jain [mailto:anshulnirvana@gmail.com]
> Sent: 21 September 2008 20:27
> To: java-user@lucene.apache.org
> Subject: Multi Field search without Multifieldqueryparser
>
> Hi!
>
> I have a Lucene document structured like:
> Field: Text
> name: George Bush
> Sex: Male
> Occupation: President of USA
>
> Now I can have two types of queries:
> Structured query:
> name: George Bush AND Occupation: President
>
> Unstructured Query:
> George Bush AND President.
>
> After parsing it will become, value: George bush and president.
> "value" is some default field that has to be defined during parsing.
>
> But as you can see, this unstructured query would not work because of
> the structure of the Lucene document. What I want is that when a user
> gives an unstructured query, Lucene should search in all fields.
> (MultiFieldQueryParser is an option, but we have to define all the fields
> first, and it can be expensive as the query can get really big.)
>
> I would really appreciate it if you could help me out with this.
>
> Regards,
> Anshul Jain
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
> For additional commands, e-mail: java-user-help@lucene.apache.org
>


-- 
Thanking you

Regards
Umesh Prasad
