lucene-java-user mailing list archives

From "Jeff Rodenburg" <jeff.rodenb...@gmail.com>
Subject Re: 30 million+ docs on a single server
Date Sun, 13 Aug 2006 19:28:41 GMT
On 8/12/06, Mark Miller <markrmiller@gmail.com> wrote:
>
> The single server is important because I think it will take a lot of
> work to scale it to multiple servers. The index must allow for close to
> real-time updates and additions. It must also remain searchable at all
> times (other than during the brief period of single updates and
> additions). If it is easy to scale this to multiple servers please tell
> me how.
>

It can take quite a bit of work to implement a multiple-server index system;
we did it last year, building an operational wrapper around Lucene.  Wish
Solr had been around then.  ;-)

I've done both the Windows and the Linux route.  Windows certainly comes
from a scale-up mentality, though we made it work in a scale-out model.  Our
requirements were the same as yours: near real-time updates & additions,
always-on searchability, etc.  It takes work, but it can be done.  We're
serving searches across 6 different types of indexes, with the indexes
spread across the server farm (no single server has the full composite
index).  Our search availability for this year is damn near 5 nines.  If you
haven't looked at Windows 64-bit, let me save you some time.  You don't gain
as much as you might expect; we seem to have hit the point of diminishing
returns with Windows Server.  We'll apply a similar strategy
to Solr, in that we'll likely run Solr clusters for our composite index.

The best way to explain "how" is to simply refer you to Solr, from an
operational perspective.  The only thing we have that Solr doesn't is rolling
together results from multiple searchers, and that's simply a matter of Solr's
out-of-the-box configuration; it's not a major ordeal to change that to meet
our needs.
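
To make "rolling together results from multiple searchers" concrete, here is a
minimal sketch of one way to do it: collect the hits each searcher returns and
keep the k best by score.  This is not our production code and it uses no
Lucene or Solr API; ShardHit, ResultMerger and mergeTopK are hypothetical names
made up for illustration.  It also assumes scores are comparable across
searchers, which generally holds only if every index scores documents the same
way.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.PriorityQueue;

// Hypothetical type for illustration; not part of Lucene or Solr.
final class ShardHit {
    final String shard;  // which searcher returned the hit
    final int docId;     // document id local to that searcher's index
    final float score;   // relevance score reported by that searcher

    ShardHit(String shard, int docId, float score) {
        this.shard = shard;
        this.docId = docId;
        this.score = score;
    }
}

// Hypothetical helper; assumes scores from different searchers are comparable.
final class ResultMerger {
    /** Merge per-searcher hit lists into one list of at most k hits, best first. */
    static List<ShardHit> mergeTopK(List<List<ShardHit>> perShard, int k) {
        List<ShardHit> merged = new ArrayList<>();
        if (k <= 0) {
            return merged;
        }
        // Min-heap on score: the weakest of the hits kept so far is at the head.
        PriorityQueue<ShardHit> heap =
            new PriorityQueue<>(Comparator.comparingDouble((ShardHit h) -> h.score));
        for (List<ShardHit> hits : perShard) {
            for (ShardHit hit : hits) {
                heap.offer(hit);
                if (heap.size() > k) {
                    heap.poll();  // discard the current weakest hit
                }
            }
        }
        merged.addAll(heap);
        // Heap iteration order is arbitrary, so sort explicitly, highest score first.
        merged.sort(Comparator.comparingDouble((ShardHit h) -> h.score).reversed());
        return merged;
    }
}
```

In practice each searcher only needs to return its own top k hits, since no
hit ranked below k on its own searcher can make the merged top k.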

Hope this helps.

-- j
