lucene-dev mailing list archives

From "Robert Engels" <reng...@ix.netcom.com>
Subject RE: non indexed field searching?
Date Wed, 17 May 2006 23:42:33 GMT
Having an indexed field seems to incur significant overhead when merging,
and if the index is highly interactive, the merging process occurs quite
often.

Maybe I am incorrect regarding the overhead of indexed fields? I have
attempted to keep the number of indexed fields to a minimum.

I think it boils down to whether being able to do a range query (for date
filtering as an example) is worth the cost of maintaining that index. If the
other terms are mildly rare, then inspecting the documents to match against
the needed range seems more efficient (thus the need to turn Filter into an
interface). But if the term they are looking for is common, then the date
range would be needed (to avoid a scan of all documents matching the term).
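
To make the trade-off concrete, here is a minimal sketch of the two strategies in plain Java. It uses java.util.BitSet as a stand-in for a doc set and a long[] as a stand-in for stored date values; it does not use any Lucene API, and the class name, method names, and sample data are all hypothetical. Strategy A inspects only the documents that already matched the term, so its cost grows with the number of term matches. Strategy B intersects with a doc set for the date range, which is cheap at query time but presumes the date field is indexed (or the range set is cached).

```java
import java.util.BitSet;

public class DateFilterSketch {

    // Strategy A: the date field is NOT indexed. Visit each document that
    // matched the term and check its stored date against the range.
    // Cost is proportional to the number of term matches.
    static BitSet filterByInspection(BitSet termMatches, long[] storedDates,
                                     long lo, long hi) {
        BitSet result = new BitSet(storedDates.length);
        for (int doc = termMatches.nextSetBit(0); doc >= 0;
                 doc = termMatches.nextSetBit(doc + 1)) {
            long date = storedDates[doc];
            if (date >= lo && date <= hi) {
                result.set(doc);
            }
        }
        return result;
    }

    // Strategy B: the date field IS indexed, so a doc set for the range
    // is assumed to be available. Intersect it with the term matches.
    static BitSet filterByRangeSet(BitSet termMatches, BitSet rangeMatches) {
        BitSet result = (BitSet) termMatches.clone();
        result.and(rangeMatches);
        return result;
    }

    public static void main(String[] args) {
        // Hypothetical stored dates (yyyyMMdd), indexed by document number.
        long[] storedDates = {20060101L, 20060310L, 20060415L, 20060516L, 20060517L};
        long lo = 20060401L;
        long hi = 20060516L;

        // Documents matching some term.
        BitSet termMatches = new BitSet();
        termMatches.set(1);
        termMatches.set(3);
        termMatches.set(4);

        // Doc set an index on the date field would provide for [lo, hi].
        BitSet rangeMatches = new BitSet();
        for (int doc = 0; doc < storedDates.length; doc++) {
            if (storedDates[doc] >= lo && storedDates[doc] <= hi) {
                rangeMatches.set(doc);
            }
        }

        System.out.println(filterByInspection(termMatches, storedDates, lo, hi));
        System.out.println(filterByRangeSet(termMatches, rangeMatches));
    }
}
```

Both strategies return the same documents; they differ only in where the cost is paid (at query time per matching document, or at index and merge time for the extra field).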

It may just be that all fields need to be indexed in order to cover all
cases (and that the cost of doing a range filter on an indexed field is far
less in ALL cases than inspecting any documents).

-----Original Message-----
From: Erik Hatcher [mailto:erik@ehatchersolutions.com]
Sent: Wednesday, May 17, 2006 6:19 PM
To: java-dev@lucene.apache.org
Subject: Re: non indexed field searching?



On May 17, 2006, at 11:20 AM, Robert Engels wrote:

> I reviewed the Solr source (a LOT of the code is amazingly similar
> to our
> own search server).
>
> I don't see anything related to searching using non-indexed fields.
> Could
> you maybe point me at the class(es) that implement this functionality?

Sorry, I missed the "non" part of "non-indexed fields".  I don't
quite understand why you wouldn't just index every field if you
needed that capability though.

	Erik



>
> -----Original Message-----
> From: Erik Hatcher [mailto:erik@ehatchersolutions.com]
> Sent: Tuesday, May 16, 2006 6:35 PM
> To: java-dev@lucene.apache.org
> Subject: Re: non indexed field searching?
>
>
>
> On May 16, 2006, at 3:37 PM, Robert Engels wrote:
>> It seems that maybe a query could be separated into Filter and
>> Query clauses
>> (similar to how the query optimizer works in Nutch). Clauses that
>> were based
>> on non-indexed fields would be converted to a Filter.
>>
>> The problem is that something like
>>
>> (indexed:somevalue OR nonindexed:somevalue)
>>
>> would require a complete visit to every document.
>
> Not necessarily.  A query optimizer could extract these term
> query clauses, look up cached doc sets (bit sets) and union them.
> Scoring is the trickier part - I'm now curious to dig into Solr and
> see how it handles this.
>
>> I understand that this is moving Lucene closer to a database, but
>> it is just
>> very difficult to perform some complex queries efficiently without
>> it.
>
> Check out Solr - I think you'll find it fits this niche nicely.
>
>> *** As an aside, I still don't understand why Filter is not an
>> interface
>
> I saw that Paul Elschot has just done some refactoring work attached
> to a JIRA issue on this very topic.
>
> 	Erik
>
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: java-dev-unsubscribe@lucene.apache.org
> For additional commands, e-mail: java-dev-help@lucene.apache.org

