lucene-java-user mailing list archives

From Chris Hostetter
Subject Re: Serving remote lucene client - RMI vs HTTP
Date Mon, 16 Jul 2007 05:15:13 GMT

: And our final queries sent to lucene are quite complicated. This is because
: we need to conform to a lot of criteria (some are set by users and some are
: internal logic). I don't think we can simplify our queries.

in my experience, your queries can always be made simpler by making your
indexing more complex -- that's not necessarily a good tradeoff to make
in every situation, but it's usually possible to "denormalize" your index
to help decrease query complexity.
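
One way to picture that kind of denormalization is the rough sketch below. It is not taken from the original poster's system: the record layout, the field names (status, flagged, region, searchable) and the rule itself are all hypothetical. The idea is only that a rule which would otherwise be spelled out in every query gets evaluated once per document at index time and stored as a single flag field.

```python
# Hypothetical record layout and field names, for illustration only.

def build_index_doc(record):
    """Flatten a raw record into the fields that would be indexed."""
    doc = {"id": record["id"], "title": record["title"]}
    # Evaluate the multi-part internal rule once, at index time,
    # and store the outcome as a single precomputed field.
    passes_rule = (
        record["status"] == "published"
        and not record["flagged"]
        and record["region"] in {"us", "eu"}
    )
    doc["searchable"] = "true" if passes_rule else "false"
    return doc


def complex_query(user_terms):
    """Query string needed when the rule is applied at search time."""
    return (
        f"+({user_terms}) +status:published -flagged:true "
        "+(region:us region:eu)"
    )


def simple_query(user_terms):
    """Query string needed once the rule is baked into the index."""
    return f"+({user_terms}) +searchable:true"
```

The cost is the tradeoff mentioned above: if the rule changes, every affected document has to be reindexed so the flag is recomputed.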

: We are not sure about implementing using Solr because we crawl only specific
: type of sites and our crawling mechanism has proven to be quite stable. We

whether or not you want to use Solr is really independent of what kind of
crawler you currently have -- you'd still use the same crawler, but
depending on how you wanted to leverage Solr, either you'd change your
crawler to POST your new documents to Solr instead of writing to the
index directly, or you could keep writing to the index directly and just
use Solr to serve the searches (assuming you configure the Solr schema.xml
to match up with the fields/analyzers your crawler uses when writing to
the index so it can query on the right fields).
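
For the first option, a crawler-side sketch might look roughly like this. Solr's update handler accepts an XML message of the form <add><doc><field name="...">...</field></doc></add>; the URL, the field names and both function names below are assumptions for illustration, and the fields would have to exist in your schema.xml.

```python
import urllib.request
import xml.etree.ElementTree as ET

# Assumption: a Solr instance whose update handler lives at this URL.
SOLR_UPDATE_URL = "http://localhost:8983/solr/update"


def build_add_message(docs):
    """Serialize dicts of field name -> value as a Solr <add> XML message."""
    add = ET.Element("add")
    for fields in docs:
        doc = ET.SubElement(add, "doc")
        for name, value in fields.items():
            field = ET.SubElement(doc, "field", name=name)
            field.text = str(value)  # ElementTree escapes &, < and >
    return ET.tostring(add, encoding="unicode")


def post_to_solr(xml_body, url=SOLR_UPDATE_URL):
    """POST an XML update message and return the HTTP status code."""
    request = urllib.request.Request(
        url,
        data=xml_body.encode("utf-8"),  # supplying data makes this a POST
        headers={"Content-Type": "text/xml; charset=utf-8"},
    )
    with urllib.request.urlopen(request) as response:
        return response.status
```

A crawler would call post_to_solr(build_add_message(batch)) for each batch of documents and then post "<commit/>" so the new documents become visible to searches.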

even if you have no interest in using Solr to manage your index, or to
search your index over HTTP, you might want to take a look at the
distribution scripts that come with Solr to provide replication. At the
core they are incredibly simple, just taking advantage of rsync, hard
links, and properties of the Lucene file format to help minimize the
amount of data that needs to go over the wire when you want to index on
one box and then replicate that index to 10 other boxes to distribute the
search load.
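
The general idea can be sketched as follows. This is not the actual Solr scripts, just an illustration of the two ingredients: the function names, paths and host name are hypothetical. Lucene never modifies an index file once it has been written, so a snapshot made of hard links is nearly free, and rsync only has to transfer files the receiving box does not already have.

```python
import os


def snapshot_index(index_dir, snapshot_dir):
    """Hard-link every file in index_dir into a new snapshot_dir.

    No file data is copied: each snapshot entry is just a second name
    for the same bytes on disk, so it stays valid even after the
    indexer later deletes the original name.
    """
    os.makedirs(snapshot_dir)
    for name in sorted(os.listdir(index_dir)):
        source = os.path.join(index_dir, name)
        if os.path.isfile(source):
            os.link(source, os.path.join(snapshot_dir, name))
    return snapshot_dir


def rsync_command(snapshot_dir, previous_dir, destination):
    """Build an rsync invocation that sends only files the receiver lacks.

    previous_dir is a path on the receiving side: unchanged files are
    hard-linked from it (--link-dest) instead of being sent again.
    """
    return [
        "rsync", "-a", "--delete",
        "--link-dest=" + previous_dir,
        snapshot_dir.rstrip("/") + "/",  # trailing slash: copy contents
        destination,
    ]
```

For example, rsync_command("/data/snapshot.2", "/data/snapshot.1", "search01:/data/snapshot.2") returns an argument list that could be passed to subprocess.run on the indexing box, once per search box.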

