lucene-java-user mailing list archives

From Chris Hostetter <hossman_luc...@fucit.org>
Subject RE: Combining results of multiple indexes
Date Wed, 24 Dec 2008 09:00:35 GMT

: a) once a doc is added to an index, it will not get modified/deleted
: b) all the fields added are keywords (mostly numbers) - no analysis is
: required. 
: c) indexing speed is more important than querying speed. 
: d) every document is the same - there is no boost or relevancy required.
: 
: e) Query results should be sorted in the order they were indexed. 

given those statements, it really doesn't sound like Lucene (or any 
inverted index structure) is useful to you at all.  if you really have an 
unbounded preference for indexing speed over query speed you should use a 
data structure where "add" is a constant time operation, even if that 
means querying is done via a linear scan of every doc -- which actually 
aids you by automatically returning everything in the order it was added.

have you considered just using delimited files and something like perl for 
finding every record where the specified columns match your input 
criteria?
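
a minimal sketch of that delimited-file idea, written in Python rather than perl. the tab delimiter, the column names, and the helper names add_record/scan are assumptions for illustration, not anything from the original thread:

```python
import csv

def add_record(path, values):
    """Append one record; the cost does not depend on how many records exist."""
    with open(path, "a", newline="") as f:
        csv.writer(f, delimiter="\t").writerow(values)

def scan(path, columns, criteria):
    """Linear scan yielding rows whose named columns equal the given values.

    `columns` lists the column names in file order (hypothetical schema);
    `criteria` maps column name -> required value, all of which must match.
    Rows come back in file order, i.e. the order they were added.
    """
    wanted = [(columns.index(name), value) for name, value in criteria.items()]
    with open(path, newline="") as f:
        for row in csv.reader(f, delimiter="\t"):
            if all(row[i] == v for i, v in wanted):
                yield row
```

this only handles AND of equality tests; an OR or NOT would mean swapping the all(...) check for whatever boolean expression you need.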

or if you are a *little* concerned about query performance: use Hadoop 
map/reduce to scan multiple text files spread across many boxes.
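
to show the shape of that scan without a cluster, here is a rough single-machine stand-in using a Python thread pool. it is not the Hadoop API; the function names, the tab delimiter, and the assumption that `paths` is listed oldest file first are all mine:

```python
from concurrent.futures import ThreadPoolExecutor

def map_file(path, column, value, delimiter="\t"):
    """Map step: scan one file and return its matching lines in file order."""
    matches = []
    with open(path) as f:
        for line in f:
            record = line.rstrip("\n")
            fields = record.split(delimiter)
            if column < len(fields) and fields[column] == value:
                matches.append(record)
    return matches

def scan_files(paths, column, value, workers=4):
    """Reduce step: concatenate per-file matches in the order of `paths`."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # pool.map yields results in input order, so added-order is kept
        # as long as `paths` is sorted oldest file first (an assumption).
        per_file = pool.map(lambda p: map_file(p, column, value), paths)
        return [line for matches in per_file for line in matches]
```

on a real cluster each map task would run on the box holding the file, and the ordering would have to be restored by sorting on something like a sequence number stored in each record.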


(Disclaimer: haven't been following the whole thread, but did spot check 
the first message to see that the query types are all simple field 
equality tests combined in a boolean expression.)

-Hoss



