lucene-java-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Kainth, Sachin" <Sachin.Kai...@atkinsglobal.com>
Subject RE: Fields
Date Mon, 19 Feb 2007 16:14:48 GMT
Hi Erik,

I looked at the QueryParser API doc but I can't seem to find what the
default field is. Also, how would the syntax of the index code differ
when indexing a word to the default field from this:

Doc.Add(Field.Text("album", Album));

Cheers

Sachin

-----Original Message-----
From: Erick Erickson [mailto:erickerickson@gmail.com] 
Sent: 19 February 2007 16:05
To: java-user@lucene.apache.org
Subject: Re: Fields

See below.

On 2/19/07, Kainth, Sachin <Sachin.Kainth@atkinsglobal.com> wrote:
>
> Hi all,
>
> I have a few question regarding indexing documents.
>
> 1. With my experience of indexing documents with lucene so far I have 
> done things like:
>
> Doc.Add(Field.Text("album", Album));
>
> Where Album is a string representing an album name.  Now with this 
> sort of indexing what I do is a search such as:
>
> "album:Thriller"
>
> a) Does this mean that I cannot do an search across all fields  by 
> submitting the query:
>
> "Thriller"?  In other words by submitting this query would my code 
> search all fields?


No. If you just submit "Thriller", you'll only search the default field.
See QueryParser for the default field.


b) Is there a way in which I can index elements of a document without
> naming the field.  What would the impact of such a use of the indexing

> capabilities of Lucene be?


I don't think this makes sense in Lucene terms. All elements in a
document have a field. You can index everything into one field if you
need an aggregate, which gives you this same result.

Do note, however, that there's no requirement that all documents have
the same fields.


2. Is there a limit to the number of
> a) named fields per document that I can store


I think there is, but it's absurdly high. Don't worry about this....


b) non-named fields per document that I can store


0 since I don't think you can.


3.
>
> a) Is it possible in Lucene to conduct searches that are very complex 
> such as:
>
> ((album = Thriller AND artist = (Michael OR Jackson)) OR (date between
X
> AND Y)) AND (label = sony OR Epic)   etc...


Yes


b) For such a query what are the performance penalties compared to a
> simple search involving 1 term?


In the immortal words of Mr. Hatcher.. .it depends. You'll really just
have to experiment and find out. It can probably be approximated by
taking the sum of the individual queries as the upper limit. The real
killer is wildcards..... The real question isn't "what is the effect on
performance", it's "is the performance good enough for my application".
Which varies as the characteristics of the database change.

I would argue that a 1M index will process arbitrarily complex queries
"fast enough". The same may not be true for a 100G index. So this
question is really unanswerable in the abstract.


Cheers
>
> Sachin
>
>
>
> This email and any attached files are confidential and copyright 
> protected. If you are not the addressee, any dissemination of this 
> communication is strictly prohibited. Unless otherwise expressly 
> agreed in writing, nothing stated in this communication shall be
legally binding.
>
> The ultimate parent company of the Atkins Group is WS Atkins plc.  
> Registered in England No. 1885586.  Registered Office Woodcote Grove, 
> Ashley Road, Epsom, Surrey KT18 5BW.
>
> Consider the environment. Please don't print this e-mail unless you 
> really need to.
>


This message has been scanned for viruses by MailControl - (see
http://bluepages.wsatkins.co.uk/?6875772)

---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org


Mime
View raw message