lucene-solr-user mailing list archives

From "Nagelberg, Kallin" <KNagelb...@globeandmail.com>
Subject index corruption / deployment strategy
Date Thu, 08 Apr 2010 17:33:42 GMT
Hi everyone,

I've been evaluating Solr for use on a high-traffic website for some time, and things
are looking positive. My higher-ups have some concerns that I need to address. I have
suggested that we use a single index to keep things simple, but there are suggestions
to split our documents among different indexes.

The primary motivation for this split is a worry about potential index corruption, i.e., if
we only have one index and it becomes corrupt, what do we do? I never considered this to be
an issue since we would have backups etc., but I think they have had issues with other search
technology in the past, where one big index resulted in frequent and difficult-to-recover-from
corruption. Do you think this is a concern with Solr? If so, what would you suggest to mitigate
the risk?

My second question involves general deployment strategy. We expect about 50 million documents,
each a few paragraphs long on average, and our website receives maybe 10 million hits a day. Can
anyone give an idea of the number of servers, clustering/replication setup, etc. that might be appropriate
for this scenario? I'm interested to hear what others' experience has been with similar situations.

Thanks,
-Kallin Nagelberg

